SlideShare une entreprise Scribd logo
1  sur  42
Télécharger pour lire hors ligne
<Insert Picture Here>




                        Update on
                        Transcendent
    2010                Memory on Xen
                            Speaker: Dan Magenheimer
                                   Oracle Corporation
Agenda

         •   Motivation and Challenge
         •   BRIEF overview of Physical Memory Management
         •   BRIEF Transcendent Memory (“tmem”) Overview
         •   Tmem Progress since Xen Summit 2009
         •   Self-ballooning + Tmem Performance Analysis




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Motivation
        • Memory is increasingly becoming a
          bottleneck in virtualized system
        • Existing mechanisms have major holes
      Four underutilized 2-cpu virtual servers
                                                                                 ballooning
                           each with 1GB RAM
                                                 One 4-CPU physical
                                                 server w/4GB RAM




                                   X
                                                            X
                               memory
                           overcommitment                                   page
                                                                           sharing


Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
The Virtualized Physical Memory
                      Resource Optimization Challenge
         Optimize, across time, the distribution of machine
           memory among a maximal set of virtual machines by:
         • measuring the current and future memory need of
           each running VM and
         • reclaiming memory from those VMs that have an
           excess of memory and either:
               • providing it to VMs that need more memory or
               • using it to provision additional new VMs.
         • without suffering a significant performance penalty




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Agenda

         • Motivation and Challenge
         • BRIEF Overview of Physical Memory Management
               • in an operating system
               • in a virtual machine monitor (Xen)
         • BRIEF Transcendent Memory (“tmem”) Overview
         • Tmem Progress since Xen Summit 2009
         • Self-ballooning + Tmem Performance Analysis




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management



                                                                           • Operating systems
                                                                             are memory hogs!
                               OS




        Memory constraint




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management

                                                                           • Operating systems are
                                                                             memory hogs!
                           OS


                                                                           If you give an
                                                                             operating system
                                                                             more memory…..



            New larger memory
                constraint

Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management

                                                                           • Operating systems are
                                                                             memory hogs!

                                        My name is
                                        Linux and I                        If you give an OS more
                                           am a
                                        memory hog                           memory
                                                                           …it uses up any
                                                                             memory you give it!



             Memory constraint

Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management

                                                                           • What does an OS do
                                                                             with all that memory?
                                                                             • Kernel code and data
                                   Kernel code                               • User code and data
                                        User data
                                                                  Page       • Page cache!
       Kernel
                                        User code                 cache
        data
                           Everything
                              else




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management


                                                                           • What does an OS do
                                                                             with all that memory?
                         page
                         cache
                                                                            Page cache attempts to
                                                                            predict future needs of
                                                                            pages from the disk…
                                                                            sometimes it gets it right
                                                                              “good” pages



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management


                                                                           • What does an OS do
                                                                             with all that memory?
                         page
                         cache
                                                                           Page cache attempts to
                                                                           predict future needs of
                                                                           pages from the disk…
                                                                           sometimes it gets it wrong
                                                                             “wasted” pages




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
OS Physical Memory Management


                                                                           • What does an OS do
                                                                             with all that memory?
                           page cache
                                                                           …much of the time
                                                                             mostly page cache
                                                                           … some of which will
                                                                             be useful in the future
                                                                           … and some (or
                                                                             maybe most…)
                                                                             of which is wasted
               Everything else


Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Agenda

         • Motivation and Challenge
         • BRIEF Overview of Physical Memory Management
               • in an operating system
               • in a virtual machine monitor (Xen)
         • BRIEF Transcendent Memory (“tmem”) Overview
         • Tmem Progress since Xen Summit 2009
         • Self-ballooning + Tmem Performance Analysis




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
VMM Physical Memory Management


                                                    fallow
                                                                           • Xen partitions memory
                                                                             among more guests
                                                         gues    fallow
                              guest                          t                •   Xen memory
                                                                              •   dom0 memory
                                                                 guest
                                                                              •   guest 1 memory
                                                                              •   guest 2 memory
                                                                              •   guest 3…
                                           fallow




                         guest
                                                                           • BUT still fallow memory
                                                                             leftover

                                                                           fallow, adj., land left without a
                                                                           crop for one or more years



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
VMM Physical Memory Management
                            in the presence of migration
                                                                                                     fallow


                                                                                                          gue           fallow
                                                                                    guest                     t
                                                                                                              s



                                                                    Physical                                            gues
                                                                   machine “B”
  • migration                                                                                                             t


        • requires fallow memory                                                 fallow
          in the target machine
        • leaves behind fallow                                                                        fallow



          memory in the                                                                                           gue

                                                                                                                  st
                                                                                                                         fallow



          originating machine                                                                                           gues
                                                                                                                          t




                                                                                            fallow
                                                                Physical
                                                                                  guest
                                                               machine “A”



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
VMM Physical Memory Management
                            in the presence of ballooning


                                  guest
                                                      fallow
                                                                           Ballooning works great for
                                                           guest
                                                                            giving more memory TO
                                                                            a guest OS…
                                                               guest
                                                                                Look ma! No more
                  gues                                                        fallow memory! (*burp*)
                    t
                                        gues
                                          t




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
VMM Physical Memory Management
                            in the presence of ballooning
                                                        But not so great to take memory away
                                                        • not instantaneous (memory inertia)
                                                        • guest can’t predict future needs
                                                              • good pages are evicted along with the bad
                                                        • host doesn’t know how much/fast to balloon
                                                              • too much or too fast has dire results
                                                                   thrashing or the dreaded OOM killer




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
VMM Physical Memory Management
                            in the presence of migration AND ballooning

                                                                           • Look ma! No more
                                             fallow
                             guest                                           fallow memory!
                                                guest
                                                                           ….But now live migration
                                                      guest
                                                                             crashes and burns
                  gue
                   st              gue
                                    st




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Why this IS a hard problem!
                                         Summary
         • OS’s use as much memory as they are given
               • but cannot predict the future so often guess wrong
               • and often much memory owned by an OS is wasted
         • Xen leaves large amounts of memory fallow
               • fixed partitioning results in fragmentation
               • migration requires fallow memory to succeed
         • Ballooning helps but:
               • can’t predict future memory needs of guests
               • memory has inertia
               • the price of incorrect guesses can be dire
              NEED A NEW APPROACH TO VIRTUALIZED
               PHYSICAL MEMORY MANAGEMENT!!
Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Agenda

         •   Motivation and Challenge
         •   BRIEF Overview of Physical Memory Management
         •   BRIEF Transcendent Memory (“tmem”) Overview
         •   Tmem Progress since Xen Summit 2009
         •   Self-ballooning + Tmem Performance Analysis




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent memory
                    creating the transcendent memory pool
                                                     fallow
                                                                            • Step 1a: reclaim all fallow memory
                                                                            • Step 1b: reclaim wasted guest
                                                                   fallow
                                                              t
                             guest                        gues
                                                                                   memory (e.g. via self-ballooning)
                                                                            • Step 1c: collect it all into a pool
                                                                  guest
                                            fallow




                       guest
                                                                                          Transcendent
                                                                                             memory
                                                                                              pool




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent memory
                    creating the transcendent memory pool

                                                                           • Step 2: provide indirect
                  guest                                                      access, strictly controlled by
                                                            t
                                                         gues                the hypervisor and dom0
                                   data
          data
                                                                             control
             Transcendent                                       guest
                memory data
                 pool

                       data                         control


               guest




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent memory
                                              API characteristics

                                                                           Transcendent memory API
              guest                         guest
                                                                           • paravirtualized (lightly)
                                                                           • narrow
                                                                           • well-specified
                                                                           • operations are:
                                                                              • synchronous
                                                                              • page-oriented (one page per op)
                                                                              • copy-based
                                                                           • multi-faceted
                            Transcendent                                   • extensible
                               memory
                                pool




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent memory
                                    four different subpool types
                                          four different uses
                                                                                         Legend:

           flags                     ephemeral                             persistent    Implemented and
               private “second-chance”                              Fast swap
                                                                                           working today
                       clean-page cache!!                           “device”!!             (Linux + Xen)
                                    “cleancache”                           “frontswap”     Now in 2.6.32
               shared server-side cluster                           inter-guest shared        patch
                      filesystem cache                              memory?
                          “shared                                                              Under
                      cleancache”                                                           investigation

        eph-em-er-al, adj., … transitory, existing only briefly, short-lived (i.e. NOT persistent)



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Agenda

         •   Motivation and Challenge
         •   BRIEF Overview of Physical Memory Management
         •   BRIEF Transcendent Memory (“Tmem”) Overview
         •   Tmem Progress since Xen Summit 2009
         •   Self-ballooning + Tmem Performance Analysis




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent Memory
                            update since Xen Summit 2009 (Feb’09)
    • Tmem support officially released in Xen 4.0.0
          • Xen boot option: tmem
    •   Enterprise-quality concurrency
    •   Complete save/restore and live migration support
    •   Page deduplication support (post-4.0.0)
    •   Linux-side patches posted, including
          • ocfs2, btrfs, ext4 filesystem support (was ext3 only)
          • sysfs support for in-guest tmem statistics
    • Other tmem releases:
          •   Oracle VM 2.2 (10/2009)
          •   OpenSuSE 11.2 (11/2009); SLE11 SP1 later this year
          •   Oracle Enterprise Linux 5 update 4/5 rpm’s available soon
          •   targeting upstream Linux 2.6.35 (aka cleancache and frontswap)

Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Tmem page deduplication
   • Now in xen-unstable (for 4.1) and xen-4.0-testing soon (for 4.0.1)
         • Xen boot option: tmem_dedup
   • Similar in intent to “page sharing” but
         •   very different in implementation
         •   completely transparent to tmem-enabled guests
         •   neither copy-on-write nor host swapping required
         •   all tmem ephemeral pages automatically are sharing candidates
   • Can optionally be combined with:
         • compression (tmem_compress); or
         • “trailing zero elimination” (tmem_tze)
   • Statistics available through existing “xm tmem-list”
         • e.g., dedup+compression increase time per tmem op by ~10x



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Known outstanding issue

    Fragmentation! (ominous music plays here…)
    • Xen page allocator allows “order>0” allocations
          • order==1             2 contiguous pages, order==2              4 contiguous pages, etc.

    • Some Xen features depend on order>0 allocations
          • some are resilient and fall back to multiple one-page allocations
          • some fail badly if not available (e.g., shadow code, domain creation)
    • Ballooning+tmem quickly fragments all Xen RAM
    • Hacky workaround in place for 4.0, but…
    CALL-TO-ACTION:
        Post-dom0-boot multi-page allocations are fundamentally
        incompatible with all dynamic memory technologies and must
        either be made resilient or eliminated from Xen!

Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Agenda

         •   Motivation and Challenge
         •   BRIEF Overview of Physical Memory Management
         •   BRIEF Transcendent Memory (“Tmem”) Overview
         •   Tmem Progress since Xen Summit 2009
         •   Self-ballooning + Tmem Performance Analysis




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Test workload (overcommitted!)
   • Dual core (Conroe) processor, 2GB RAM, IDE disk
   • Four single vcpu PV VMs, in-kernel self-ballooning+tmem
         • Oracle Enterprise Linux 5 update 4; two 32-bit + two 64-bit
         • mem=384MB (maxmem=512MB)… total = 1.5GB (2GB maxmem)
         • virtual block device is tap:aio (file contains 3 LVM partitions: ext3+ext3+swap)
   • Each VM waits for all VMs to be ready, then simultaneously
         • two Linux kernel compiles (2.6.32 source), then force crash:
               • make clean; make –j8; make clean; make –j8
               • echo c > /proc/sysrq-trigger

   • Dom0: 256MB fixed, 2 vcpus
         • automatically launches all domains
         • checks every 60s, waiting for all to be crashed
         • saves away statistics, then reboots



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Measurement methodology
     • Four statistics measured for each run
           • Temporal: (1) wallclock time to completion; (2) total vcpu including dom0
           • Disk access: vbd sectors (3) read and (4) written
     • Test workload run five times for each configuration
           • high and low sample of each statistic discarded
           • use average of middle three samples for “single-value” statistic
     • Five different configurations:
                                      Features Self-                       Tmem   Page    Compression
                                       enabled ballooning                         Dedup
              Configuration
              Unchanged                                 NO                 NO     NO      NO

              Self-ballooning                           YES                NO     NO      NO

              Tmem                                      YES                YES    NO      NO

              Tmem w/dedup                              YES                YES    YES     NO

              Tmem w/dedup+ comp                        YES                YES    YES     YES



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Self-ballooning recap                                           (see Xen Summit 2008)

                                                      • Goal: Allow memory overcommit
                                                      • Use in-guest feedback to resize balloon
                                                            •   aggressively
                                                            •   frequently
                                                            •   independently
                                                            •   configurably
                                                      • For Linux, size to maximum of:
                             guest                          • /proc/meminfo “CommittedAS”
                                                            • memory floor enforced by Xen balloon driver
                                                      • Userland daemon or patched kernel

       Committed_AS: An estimate of how much RAM you would need to make a 99.99%
       guarantee that there never is OOM (out of memory) for this workload. Normally the kernel
       will overcommit memory. The Committed_AS is a guesstimate of how much RAM/swap you
       would need worst-case. (From http://www.redhat.com/advice/tips/meminfo.html)



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Unchanged vs. Self-ballooning only
                                           Temporal stats




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Unchanged vs. Self-ballooning only
                            Virtual block device stats




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
*Sigh* Why is there a performance hit?
         Aggressive ballooning (by itself) doesn’t work very well!
         • (Self-)ballooning indiscriminately shrinks the guest OS’s
           page cache, causing refaults!
         • Insufficiently responsive (self-)ballooning when guest OS
           needs memory now, results in swapping… or OOMs!
            PERFORMANCE WILL GET WORSE WHEN LARGE-
           MEMORY GUESTS ARE AGGRESSIVELY BALLOONED




                            guest



Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Self-ballooning AND Transcendent Memory
                          …go together like a horse and carriage
                                                                           • Self-ballooned memory is returned
                                                     fallow
                                                                             to Xen and absorbed by tmem
                                                                  fallow
                                                                           • Most tmem memory can be
                                                          guest              instantly reclaimed when needed
                                                                             for a memory-needy or new guest
                            guest                                          • Tmem also provides a safety valve
                                                                  guest
                                                                             when ballooning is not fast enough
                                            fallow




                                                                                        Transcendent
                       guest                                                               memory
                                                                                            pool




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Self-ballooning AND Tmem
                                            Temporal stats


                                                                                 79% utilization*




                     5%-8% faster completion                               72% utilization*




                                                                                                * 2 cores

Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Self-ballooning AND Tmem
                            virtual block device stats




                  31-52% reduction in sectors read

                                                                           (no significant change in sectors written)




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
WOW! Why is tmem so good?
   • Tmem-enabled guests statistically multiplex one shared
     virtual page cache to reduce disk refaults!
         • 252068 page (984MB) max (NOTE: actual tmem measurement)
   • Deduplication and compression together transparently
     QUADRUPLE apparent size of this virtual page cache!
         • 953166 page (3723MB) max (actually measured by tmem… on 2GB system!)
   • Swapping-to-disk (e.g. due to insufficiently responsive ballooning)
     is converted to in-memory copies and statistically multiplexed
         • 82MB at workload completion, 319MB combined max (actual measurement)
         • uses compression but not deduplication
   • CPU “costs” entirely hidden by increased CPU utilization
           RESULTS MAY BE EVEN BETTER WHEN
       WORKLOAD IS TEMPORALLY DISTRIBUTED/SPARSE

Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent Memory Update
                    Summary
   Tmem advantages:
   • greatly increased memory utilization/flexibility
   • dramatic reduction in I/O bandwidth requirements
   • more effective CPU utilization
   • faster completion of (some?) workloads
   Tmem disadvantages:
   • tmem-modified kernel required (so only Linux now)
   • higher power consumption due to higher CPU utilization




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
Transcendent Memory Update
                    Acknowledgements

   Special thanks to Jeremy Fitzhardinge for many
    reviews and improvements in the Linux-side tmem
    code!




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
For more information

           http://oss.oracle.com/projects/tmem
           or xen-unstable.hg/docs/misc/tmem-internals.html
               dan.magenheimer@oracle.com




Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer

Contenu connexe

Tendances

Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1
Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1
Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1
The Linux Foundation
 
Structure for scale: Dialing in your apps for optimal performance
Structure for scale: Dialing in your apps for optimal performanceStructure for scale: Dialing in your apps for optimal performance
Structure for scale: Dialing in your apps for optimal performance
Atlassian
 
Dell opti plex_3010_spec_sheet
Dell opti plex_3010_spec_sheetDell opti plex_3010_spec_sheet
Dell opti plex_3010_spec_sheet
Arsalan Qureshi
 

Tendances (17)

Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
 
IBM Upgrades SVC with Solid State Drives — Achieves Better Storage Utilization
IBM Upgrades SVC with Solid State Drives — Achieves Better Storage UtilizationIBM Upgrades SVC with Solid State Drives — Achieves Better Storage Utilization
IBM Upgrades SVC with Solid State Drives — Achieves Better Storage Utilization
 
Ronnie Oomen (EMC)
Ronnie Oomen (EMC)Ronnie Oomen (EMC)
Ronnie Oomen (EMC)
 
Virtualization And Disk Performance
Virtualization And Disk PerformanceVirtualization And Disk Performance
Virtualization And Disk Performance
 
Computer Hardware
Computer HardwareComputer Hardware
Computer Hardware
 
Persistence of memory: In-memory Is Not Often the Answer
Persistence of memory: In-memory Is Not Often the AnswerPersistence of memory: In-memory Is Not Often the Answer
Persistence of memory: In-memory Is Not Often the Answer
 
Dsmp Whitepaper Release 3
Dsmp Whitepaper Release 3Dsmp Whitepaper Release 3
Dsmp Whitepaper Release 3
 
Practical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM iPractical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM i
 
Genie Backup Manager 8 Discovery
Genie Backup Manager 8 DiscoveryGenie Backup Manager 8 Discovery
Genie Backup Manager 8 Discovery
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java Developers
 
4. Memory virtualization and management
4. Memory virtualization and management4. Memory virtualization and management
4. Memory virtualization and management
 
Dpm 2012
Dpm 2012Dpm 2012
Dpm 2012
 
Lect13
Lect13Lect13
Lect13
 
Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1
Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1
Evaluation and Enhancement to Memory Sharing and Swapping in Xen 4.1
 
Structure for scale: Dialing in your apps for optimal performance
Structure for scale: Dialing in your apps for optimal performanceStructure for scale: Dialing in your apps for optimal performance
Structure for scale: Dialing in your apps for optimal performance
 
Dell opti plex_3010_spec_sheet
Dell opti plex_3010_spec_sheetDell opti plex_3010_spec_sheet
Dell opti plex_3010_spec_sheet
 
Sansymphony v-r9
Sansymphony v-r9Sansymphony v-r9
Sansymphony v-r9
 

En vedette

Xen summit 2010 extending xen into embedded
Xen summit 2010 extending xen into embeddedXen summit 2010 extending xen into embedded
Xen summit 2010 extending xen into embedded
The Linux Foundation
 
Xen summit spring2010_tom_woller_amd
Xen summit spring2010_tom_woller_amdXen summit spring2010_tom_woller_amd
Xen summit spring2010_tom_woller_amd
The Linux Foundation
 

En vedette (9)

Xen summit 2010 extending xen into embedded
Xen summit 2010 extending xen into embeddedXen summit 2010 extending xen into embedded
Xen summit 2010 extending xen into embedded
 
XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D...
 XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D... XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D...
XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D...
 
Xen summit spring2010_tom_woller_amd
Xen summit spring2010_tom_woller_amdXen summit spring2010_tom_woller_amd
Xen summit spring2010_tom_woller_amd
 
XPDS16: CPUID handling for guests - Andrew Cooper, Citrix
XPDS16:  CPUID handling for guests - Andrew Cooper, CitrixXPDS16:  CPUID handling for guests - Andrew Cooper, Citrix
XPDS16: CPUID handling for guests - Andrew Cooper, Citrix
 
Xen summit amd_2010v3
Xen summit amd_2010v3Xen summit amd_2010v3
Xen summit amd_2010v3
 
My Principle Of The Unity In The Human Act
My Principle Of The Unity In The Human ActMy Principle Of The Unity In The Human Act
My Principle Of The Unity In The Human Act
 
XPDS16: Keeping coherency on ARM - Julien Grall, ARM
XPDS16: Keeping coherency on ARM - Julien Grall, ARMXPDS16: Keeping coherency on ARM - Julien Grall, ARM
XPDS16: Keeping coherency on ARM - Julien Grall, ARM
 
XPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARM
XPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARMXPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARM
XPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARM
 
XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...
XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...
XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...
 

Similaire à Transcendent memoryupdate xensummit2010-final

XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
Peter Ocasek
 
Pm 01 bradley stone_openstorage_openstack
Pm 01 bradley stone_openstorage_openstackPm 01 bradley stone_openstorage_openstack
Pm 01 bradley stone_openstorage_openstack
OpenCity Community
 
[Harvard CS264] 05 - Advanced-level CUDA Programming
[Harvard CS264] 05 - Advanced-level CUDA Programming[Harvard CS264] 05 - Advanced-level CUDA Programming
[Harvard CS264] 05 - Advanced-level CUDA Programming
npinto
 

Similaire à Transcendent memoryupdate xensummit2010-final (11)

XS Boston 2008 Memory Overcommit
XS Boston 2008 Memory OvercommitXS Boston 2008 Memory Overcommit
XS Boston 2008 Memory Overcommit
 
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
 
Performance in a virtualized environment
Performance in a virtualized environmentPerformance in a virtualized environment
Performance in a virtualized environment
 
Chen Haibo
Chen HaiboChen Haibo
Chen Haibo
 
Openstorage with OpenStack, by Bradley
Openstorage with OpenStack, by BradleyOpenstorage with OpenStack, by Bradley
Openstorage with OpenStack, by Bradley
 
Pm 01 bradley stone_openstorage_openstack
Pm 01 bradley stone_openstorage_openstackPm 01 bradley stone_openstorage_openstack
Pm 01 bradley stone_openstorage_openstack
 
Memory management
Memory managementMemory management
Memory management
 
Zu e x5 training 04 06-2010
Zu e x5 training 04 06-2010Zu e x5 training 04 06-2010
Zu e x5 training 04 06-2010
 
Membase Meetup - San Diego
Membase Meetup - San DiegoMembase Meetup - San Diego
Membase Meetup - San Diego
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
 
[Harvard CS264] 05 - Advanced-level CUDA Programming
[Harvard CS264] 05 - Advanced-level CUDA Programming[Harvard CS264] 05 - Advanced-level CUDA Programming
[Harvard CS264] 05 - Advanced-level CUDA Programming
 

Plus de The Linux Foundation

Plus de The Linux Foundation (20)

ELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made SimpleELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made Simple
 
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
 
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
 
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
 
XPDDS19 Keynote: Unikraft Weather Report
XPDDS19 Keynote:  Unikraft Weather ReportXPDDS19 Keynote:  Unikraft Weather Report
XPDDS19 Keynote: Unikraft Weather Report
 
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
 
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, XilinxXPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
 
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
 
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, BitdefenderXPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
 
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
 
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making... OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, CitrixXPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
 
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltdXPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
 
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
 
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&DXPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
 
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM SystemsXPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
 
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
 
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
 
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSEXPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
 

Dernier

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Dernier (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

Transcendent memoryupdate xensummit2010-final

  • 1. <Insert Picture Here> Update on Transcendent 2010 Memory on Xen Speaker: Dan Magenheimer Oracle Corporation
  • 2. Agenda • Motivation and Challenge • BRIEF overview of Physical Memory Management • BRIEF Transcendent Memory (“tmem”) Overview • Tmem Progress since Xen Summit 2009 • Self-ballooning + Tmem Performance Analysis Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 3. Motivation • Memory is increasingly becoming a bottleneck in virtualized system • Existing mechanisms have major holes Four underutilized 2-cpu virtual servers ballooning each with 1GB RAM One 4-CPU physical server w/4GB RAM X X memory overcommitment page sharing Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 4. The Virtualized Physical Memory Resource Optimization Challenge Optimize, across time, the distribution of machine memory among a maximal set of virtual machines by: • measuring the current and future memory need of each running VM and • reclaiming memory from those VMs that have an excess of memory and either: • providing it to VMs that need more memory or • using it to provision additional new VMs. • without suffering a significant performance penalty Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 5. Agenda • Motivation and Challenge • BRIEF Overview of Physical Memory Management • in an operating system • in a virtual machine monitor (Xen) • BRIEF Transcendent Memory (“tmem”) Overview • Tmem Progress since Xen Summit 2009 • Self-ballooning + Tmem Performance Analysis Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 6. OS Physical Memory Management • Operating systems are memory hogs! OS Memory constraint Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 7. OS Physical Memory Management • Operating systems are memory hogs! OS If you give an operating system more memory….. New larger memory constraint Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 8. OS Physical Memory Management • Operating systems are memory hogs! My name is Linux and I If you give an OS more am a memory hog memory …it uses up any memory you give it! Memory constraint Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 9. OS Physical Memory Management • What does an OS do with all that memory? • Kernel code and data Kernel code • User code and data User data Page • Page cache! Kernel User code cache data Everything else Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 10. OS Physical Memory Management • What does an OS do with all that memory? page cache Page cache attempts to predict future needs of pages from the disk… sometimes it gets it right “good” pages Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 11. OS Physical Memory Management • What does an OS do with all that memory? page cache Page cache attempts to predict future needs of pages from the disk… sometimes it gets it wrong “wasted” pages Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 12. OS Physical Memory Management • What does an OS do with all that memory? page cache …much of the time mostly page cache … some of which will be useful in the future … and some (or maybe most…) of which is wasted Everything else Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 13. Agenda • Motivation and Challenge • BRIEF Overview of Physical Memory Management • in an operating system • in a virtual machine monitor (Xen) • BRIEF Transcendent Memory (“tmem”) Overview • Tmem Progress since Xen Summit 2009 • Self-ballooning + Tmem Performance Analysis Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 14. VMM Physical Memory Management fallow • Xen partitions memory among more guests gues fallow guest t • Xen memory • dom0 memory guest • guest 1 memory • guest 2 memory • guest 3… fallow guest • BUT still fallow memory leftover fallow, adj., land left without a crop for one or more years Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 15. VMM Physical Memory Management in the presence of migration fallow gue fallow guest t s Physical gues machine “B” • migration t • requires fallow memory fallow in the target machine • leaves behind fallow fallow memory in the gue st fallow originating machine gues t fallow Physical guest machine “A” Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 16. VMM Physical Memory Management in the presence of ballooning guest fallow Ballooning works great for guest giving more memory TO a guest OS… guest Look ma! No more gues fallow memory! (*burp*) t gues t Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 17. VMM Physical Memory Management in the presence of ballooning But not so great to take memory away • not instantaneous (memory inertia) • guest can’t predict future needs • good pages are evicted along with the bad • host doesn’t know how much/fast to balloon • too much or too fast has dire results thrashing or the dreaded OOM killer Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 18. VMM Physical Memory Management in the presence of migration AND ballooning • Look ma! No more fallow guest fallow memory! guest ….But now live migration guest crashes and burns gue st gue st Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 19. Why this IS a hard problem! Summary • OS’s use as much memory as they are given • but cannot predict the future so often guess wrong • and often much memory owned by an OS is wasted • Xen leaves large amounts of memory fallow • fixed partitioning results in fragmentation • migration requires fallow memory to succeed • Ballooning helps but: • can’t predict future memory needs of guests • memory has inertia • the price of incorrect guesses can be dire NEED A NEW APPROACH TO VIRTUALIZED PHYSICAL MEMORY MANAGEMENT!! Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 20. Agenda • Motivation and Challenge • BRIEF Overview of Physical Memory Management • BRIEF Transcendent Memory (“tmem”) Overview • Tmem Progress since Xen Summit 2009 • Self-ballooning + Tmem Performance Analysis Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 21. Transcendent memory creating the transcendent memory pool fallow • Step 1a: reclaim all fallow memory • Step 1b: reclaim wasted guest fallow t guest gues memory (e.g. via self-ballooning) • Step 1c: collect it all into a pool guest fallow guest Transcendent memory pool Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 22. Transcendent memory creating the transcendent memory pool • Step 2: provide indirect guest access, strictly controlled by t gues the hypervisor and dom0 data data control Transcendent guest memory data pool data control guest Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 23. Transcendent memory API characteristics Transcendent memory API guest guest • paravirtualized (lightly) • narrow • well-specified • operations are: • synchronous • page-oriented (one page per op) • copy-based • multi-faceted Transcendent • extensible memory pool Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 24. Transcendent memory four different subpool types four different uses Legend: flags ephemeral persistent Implemented and private “second-chance” Fast swap working today clean-page cache!! “device”!! (Linux + Xen) “cleancache” “frontswap” Now in 2.6.32 shared server-side cluster inter-guest shared patch filesystem cache memory? “shared Under cleancache” investigation eph-em-er-al, adj., … transitory, existing only briefly, short-lived (i.e. NOT persistent) Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 25. Agenda • Motivation and Challenge • BRIEF Overview of Physical Memory Management • BRIEF Transcendent Memory (“Tmem”) Overview • Tmem Progress since Xen Summit 2009 • Self-ballooning + Tmem Performance Analysis Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 26. Transcendent Memory update since Xen Summit 2009 (Feb’09) • Tmem support officially released in Xen 4.0.0 • Xen boot option: tmem • Enterprise-quality concurrency • Complete save/restore and live migration support • Page deduplication support (post-4.0.0) • Linux-side patches posted, including • ocfs2, btrfs, ext4 filesystem support (was ext3 only) • sysfs support for in-guest tmem statistics • Other tmem releases: • Oracle VM 2.2 (10/2009) • OpenSuSE 11.2 (11/2009); SLE11 SP1 later this year • Oracle Enterprise Linux 5 update 4/5 rpm’s available soon • targeting upstream Linux 2.6.35 (aka cleancache and frontswap) Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 27. Tmem page deduplication • Now in xen-unstable (for 4.1) and xen-4.0-testing soon (for 4.0.1) • Xen boot option: tmem_dedup • Similar in intent to “page sharing” but • very different in implementation • completely transparent to tmem-enabled guests • neither copy-on-write nor host swapping required • all tmem ephemeral pages automatically are sharing candidates • Can optionally be combined with: • compression (tmem_compress); or • “trailing zero elimination” (tmem_tze) • Statistics available through existing “xm tmem-list” • e.g., dedup+compression increase time per tmem op by ~10x Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 28. Known outstanding issue Fragmentation! (ominous music plays here…) • Xen page allocator allows “order>0” allocations • order==1 2 contiguous pages, order==2 4 contiguous pages, etc. • Some Xen features depend on order>0 allocations • some are resilient and fall back to multiple one-page allocations • some fail badly if not available (e.g., shadow code, domain creation) • Ballooning+tmem quickly fragments all Xen RAM • Hacky workaround in place for 4.0, but… CALL-TO-ACTION: Post-dom0-boot multi-page allocations are fundamentally incompatible with all dynamic memory technologies and must either be made resilient or eliminated from Xen! Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 29. Agenda • Motivation and Challenge • BRIEF Overview of Physical Memory Management • BRIEF Transcendent Memory (“Tmem”) Overview • Tmem Progress since Xen Summit 2009 • Self-ballooning + Tmem Performance Analysis Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 30. Test workload (overcommitted!) • Dual core (Conroe) processor, 2GB RAM, IDE disk • Four single vcpu PV VMs, in-kernel self-ballooning+tmem • Oracle Enterprise Linux 5 update 4; two 32-bit + two 64-bit • mem=384MB (maxmem=512MB)… total = 1.5GB (2GB maxmem) • virtual block device is tap:aio (file contains 3 LVM partitions: ext3+ext3+swap) • Each VM waits for all VMs to be ready, then simultaneously • two Linux kernel compiles (2.6.32 source), then force crash: • make clean; make –j8; make clean; make –j8 • echo c > /proc/sysrq-trigger • Dom0: 256MB fixed, 2 vcpus • automatically launches all domains • checks every 60s, waiting for all to be crashed • saves away statistics, then reboots Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 31. Measurement methodology • Four statistics measured for each run • Temporal: (1) wallclock time to completion; (2) total vcpu including dom0 • Disk access: vbd sectors (3) read and (4) written • Test workload run five times for each configuration • high and low sample of each statistic discarded • use average of middle three samples for “single-value” statistic • Five different configurations: Features Self- Tmem Page Compression enabled ballooning Dedup Configuration Unchanged NO NO NO NO Self-ballooning YES NO NO NO Tmem YES YES NO NO Tmem w/dedup YES YES YES NO Tmem w/dedup+ comp YES YES YES YES Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 32. Self-ballooning recap (see Xen Summit 2008) • Goal: Allow memory overcommit • Use in-guest feedback to resize balloon • aggressively • frequently • independently • configurably • For Linux, size to maximum of: guest • /proc/meminfo “CommittedAS” • memory floor enforced by Xen balloon driver • Userland daemon or patched kernel Committed_AS: An estimate of how much RAM you would need to make a 99.99% guarantee that there never is OOM (out of memory) for this workload. Normally the kernel will overcommit memory. The Committed_AS is a guesstimate of how much RAM/swap you would need worst-case. (From http://www.redhat.com/advice/tips/meminfo.html) Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 33. Unchanged vs. Self-ballooning only Temporal stats Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 34. Unchanged vs. Self-ballooning only Virtual block device stats Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 35. *Sigh* Why is there a performance hit? Aggressive ballooning (by itself) doesn’t work very well! • (Self-)ballooning indiscriminately shrinks the guest OS’s page cache, causing refaults! • Insufficiently responsive (self-)ballooning when guest OS needs memory now, results in swapping… or OOMs! PERFORMANCE WILL GET WORSE WHEN LARGE- MEMORY GUESTS ARE AGGRESSIVELY BALLOONED guest Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 36. Self-ballooning AND Transcendent Memory …go together like a horse and carriage • Self-ballooned memory is returned fallow to Xen and absorbed by tmem fallow • Most tmem memory can be guest instantly reclaimed when needed for a memory-needy or new guest guest • Tmem also provides a safety valve guest when ballooning is not fast enough fallow Transcendent guest memory pool Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 37. Self-ballooning AND Tmem Temporal stats 79% utilization* 5%-8% faster completion 72% utilization* * 2 cores Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 38. Self-ballooning AND Tmem virtual block device stats 31-52% reduction in sectors read (no significant change in sectors written) Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 39. WOW! Why is tmem so good? • Tmem-enabled guests statistically multiplex one shared virtual page cache to reduce disk refaults! • 252068 page (984MB) max (NOTE: actual tmem measurement) • Deduplication and compression together transparently QUADRUPLE apparent size of this virtual page cache! • 953166 page (3723MB) max (actually measured by tmem… on 2GB system!) • Swapping-to-disk (e.g. due to insufficiently responsive ballooning) is converted to in-memory copies and statistically multiplexed • 82MB at workload completion, 319MB combined max (actual measurement) • uses compression but not deduplication • CPU “costs” entirely hidden by increased CPU utilization RESULTS MAY BE EVEN BETTER WHEN WORKLOAD IS TEMPORALLY DISTRIBUTED/SPARSE Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 40. Transcendent Memory Update Summary Tmem advantages: • greatly increased memory utilization/flexibility • dramatic reduction in I/O bandwidth requirements • more effective CPU utilization • faster completion of (some?) workloads Tmem disadvantages: • tmem-modified kernel required (so only Linux now) • higher power consumption due to higher CPU utilization Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 41. Transcendent Memory Update Acknowledgements Special thanks to Jeremy Fitzhardinge for many reviews and improvements in the Linux-side tmem code! Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer
  • 42. For more information http://oss.oracle.com/projects/tmem or xen-unstable.hg/docs/misc/tmem-internals.html dan.magenheimer@oracle.com Update on Transcendent Memory on Xen (Xen Summit 2010) - Dan Magenheimer