Performance By Design A blog devoted to Windows performance, application responsiveness and scala

Sunday, 14 July 2013

Virtual memory management in VMware: Final thoughts

Posted on 17:00 by Unknown
This is the final blog post in a series on VMware memory management. The previous post in the series is here:

Final Thoughts

Throughout this series I have been discussing in some detail a case study we constructed in which VMware memory over-commitment led to guest machine memory ballooning and swapping, which, in turn, had a substantial impact on the performance of the applications that were running. When memory contention was present, the benchmark application took three times as long to execute to completion as the same application run standalone. The difference was entirely due to memory management “overhead,” the cost of demand paging when the supply of machine memory was insufficient to the task.

Analysis of the case study results unequivocally shows that the cost equation associated with aggressive server consolidation using VMware needs to be adjusted for the performance risks that arise when memory is over-committed. When configuring the memory on a VMware Host machine for optimal performance, it is important to realize that virtual memory systems do not degrade gracefully. When virtual memory workloads overflow the amount of physical memory available for them to execute, they are subject to punishing page fault resolution delays. The delay incurred while performing the disk I/O needed to bring a block of code or data from the paging file into memory to resolve a page fault is several orders of magnitude larger than almost any other sort of execution time delay a running thread is ever likely to encounter.
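To put rough numbers on why paging delays are so punishing, here is a minimal sketch of the standard effective access time calculation. The latencies are assumptions chosen for illustration (about 100 nanoseconds for a memory reference, about 5 milliseconds to resolve a page fault from disk); they are not measurements from the benchmark, and the function names are hypothetical.

```python
def effective_access_time_ns(fault_rate, mem_ns=100.0, fault_ns=5_000_000.0):
    """Average cost of one memory reference when a fraction `fault_rate`
    of references must be resolved from the paging file on disk.

    mem_ns and fault_ns are assumed, illustrative latencies.
    """
    if not 0.0 <= fault_rate <= 1.0:
        raise ValueError("fault_rate must be between 0 and 1")
    return (1.0 - fault_rate) * mem_ns + fault_rate * fault_ns


def slowdown(fault_rate, mem_ns=100.0, fault_ns=5_000_000.0):
    """Ratio of the average reference time to the fault-free reference time."""
    return effective_access_time_ns(fault_rate, mem_ns, fault_ns) / mem_ns
```

Under these assumed latencies, slowdown(4e-5) comes out at roughly 3, which is to say that four page faults per 100,000 memory references would be enough to triple the average cost of a reference. The point of the sketch is only the shape of the curve: a very small fault rate dominates the total.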

VMware implements a policy of memory over-commitment in order to support aggressive server consolidation. In many operational environments, like server hosting or application testing, guest machines are frequently dormant. But when they are active, they are extremely active in bursts. These kinds of environments are well-served by aggressive guest machine consolidation on server hardware that is massively over-provisioned.

On the other hand, implementing overly aggressive server consolidation of active production workloads with more predictable levels of activity presents a very different set of operational challenges. One entirely unexpected result of the benchmark was the data on Transparent Memory Sharing reported in an earlier post that showed the benefits of memory sharing evaporating almost completely in the face of guest machines actively using their allotted physical memory. Since the guest machines used in the benchmark were configured identically, down to running the same exact application code, it was surprising to see how ineffective memory sharing proved to be once the benchmark applications started to execute on their respective guest machines. Certainly the same memory over-commitment mechanism is extremely effective when guest machines are idle for extended periods of time. This finding that memory sharing is ineffective when the guest machines are active, if it can be replicated in other environments, would call for re-evaluating the value of the whole approach, especially since idle machines can be swapped out of memory entirely.

Moreover, the performance-related risks for critical workloads that arise when memory over-commitment leads to ballooning and swapping are substantial. Consider that if an appropriate amount of physical memory was chosen for a guest machine configuration at the outset, removing any pages from the guest machine memory footprint via ballooning and/or swapping is potentially very damaging. For this reason, for example, warnings from SQL Server DBAs about VMware’s policy of over-committing machine memory are very prominent in blog posts. See http://www.sqlskills.com/blogs/jonathan/the-accidental-dba-day-5-of-30-vm-considerations/ for an example.

In the benchmark test discussed here, each of the guest machines ran identical workloads that, when a sufficient number of them were run in tandem, combined to stress the virtual memory management capabilities of the VMware Host. Using the ballooning technique, VMware successfully transmitted the external memory contention in effect to the individual guest machines. This successful transmission diffused the response to the external problem, but did not in any way lessen its performance impact.

More typical of a production environment, perhaps, is the case where a single guest machine is the primary source of the memory contention. Just as, within a single OS image, one user process consuming an excess of physical memory can create a resource shortage with a global impact, a single guest machine consuming an excess of machine memory can generate a resource shortage that impacts multiple tenants in the virtualization environment.


Memory Reservations

In VMware, customers do have the ability to prioritize guest machines so that all tenants sharing an over-committed virtualization Host machine are not penalized equally when there is a resource shortage. The most effective way to protect a critical guest machine from being subjected to ballooning and swapping due to a co-resident guest is to set up a machine memory Reservation. A machine memory Reservation establishes a floor guaranteeing that a certain amount of machine memory is always granted to the guest. With a Reservation value set, VMware will not subject a guest machine to ballooning or swapping that will result in the machine memory granted to the guest falling below that minimum. 

But in order to set an optimal memory Reservation size, it is first necessary to understand how much physical memory the guest machine requires, not always an easy task. A Reservation value that is set too high on a Host machine experiencing memory contention will have the effect of increasing the level of memory reclamation activity on the remaining co-tenants of the VMware Host.
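The effect of a Reservation on co-tenants can be illustrated with a toy model. This is a sketch only: it spreads a machine memory shortfall across guests in proportion to the memory each one holds above its Reservation, which is a simplifying assumption and not VMware's actual reclamation policy. The Guest class, the plan_reclamation function and all of the sizes used with them are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    name: str
    granted_mb: int          # machine memory currently backing the guest
    reservation_mb: int = 0  # floor below which nothing is reclaimed

    @property
    def reclaimable_mb(self):
        # Only memory above the Reservation is eligible for reclamation.
        return max(self.granted_mb - self.reservation_mb, 0)


def plan_reclamation(guests, shortfall_mb):
    """Return {guest name: MB to reclaim}, splitting the shortfall in
    proportion to each guest's unreserved memory (an assumed policy)."""
    total = sum(g.reclaimable_mb for g in guests)
    if shortfall_mb > total:
        raise ValueError("shortfall exceeds the unreserved memory available")
    if total == 0:
        return {g.name: 0.0 for g in guests}
    return {g.name: shortfall_mb * g.reclaimable_mb / total for g in guests}
```

In this model, with four guests each granted 8192 MB and a 4096 MB shortfall, every guest gives up 1024 MB. Fully reserving one guest's 8192 MB protects it completely, but the other three must then give up about 1365 MB apiece to cover the same shortfall.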

Another challenge is how to set an optimal Reservation value for guest machines running applications that, like the .NET Framework application used in the benchmark discussed here, dynamically expand their working set to grab as much physical memory as possible on the machine. Microsoft SQL Server is one of the more prominent Windows server applications that do this, but others include the MS Exchange Store process (fundamentally also a database application) and ASP.NET web sites. Like the benchmark application, SQL Server and Store listen for Low Memory notifications from the OS, and will trim back their working set of resident pages in response. If the memory remaining proves inadequate to the task, there are performance ramifications.
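The expand-then-trim behavior can be sketched with a small cache that grows until it hits a ceiling and gives back its least recently used entries when told memory is low. This is an illustration under assumptions, not how SQL Server, Store or the benchmark application is implemented. The TrimmableCache class and its on_low_memory method are hypothetical, and the notification is simulated by a direct method call; a real Windows application would typically obtain the signal from the OS, for instance through the Win32 CreateMemoryResourceNotification function.

```python
from collections import OrderedDict


class TrimmableCache:
    """LRU cache that grows up to max_entries and shrinks on demand."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.misses = 0                # number of times load() had to run
        self._entries = OrderedDict()  # oldest entry first

    def get(self, key, load):
        if key in self._entries:
            self._entries.move_to_end(key)    # mark as most recently used
            return self._entries[key]
        self.misses += 1
        value = load(key)                     # the expensive path
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False) # evict least recently used
        return value

    def on_low_memory(self, keep_fraction=0.5):
        """Simulated low-memory callback: keep only the most recently
        used fraction of the entries and return the new size."""
        target = int(len(self._entries) * keep_fraction)
        while len(self._entries) > target:
            self._entries.popitem(last=False)
        return len(self._entries)

    def __len__(self):
        return len(self._entries)
```

After a trim, every discarded entry that is referenced again has to be reloaded, which is where the performance ramifications come from: the misses counter climbs again even though the workload itself has not changed.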

With server applications like SQL Server that expand to fill the size of RAM, it is often very difficult to determine how much RAM is optimal, except through trial and error. The configuration flexibility inherent in virtualization technology does offer a way to experiment with different machine memory configurations. Once the appropriate set of performance “experiments” has been run, the results can then be used to reserve the right amount of machine memory for these guest machines. Of course, these workloads are also subject to growth and change over time, so once memory reservation parameters are set, they need to be actively monitored at both the VMware Host and guest machine and application levels.
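One simple way to turn a set of such experiments into a Reservation candidate is to pick the smallest memory configuration whose run time is within some tolerance of the best run observed. The sketch below assumes the experiments have been reduced to a mapping from configured memory size to elapsed time; the function name and the 10% default tolerance are hypothetical choices.

```python
def smallest_adequate_memory(results, tolerance=0.10):
    """Pick the smallest memory size whose elapsed time is within
    `tolerance` of the fastest run.

    results maps memory size in MB to elapsed seconds for one run
    of the workload at that size.
    """
    if not results:
        raise ValueError("no experiment results supplied")
    best = min(results.values())
    limit = best * (1.0 + tolerance)
    return min(mb for mb, secs in results.items() if secs <= limit)
```

For example, given made-up results of {2048: 900.0, 4096: 420.0, 6144: 310.0, 8192: 300.0}, the function returns 6144, since 310 seconds is within 10% of the best time of 300 seconds while 420 seconds is not. These figures are invented for illustration and are not results from the benchmark. Because workloads grow and change, the same calculation would need to be repeated as new measurements arrive.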
Posted in memory management, VMware | No comments