SlideShare une entreprise Scribd logo
1  sur  21
Télécharger pour lire hors ligne
Controlling Memory Footprint at All Layers:
         Linux Kernel, Applications,
          Libraries, and Toolchain


                      Xi Wang
                Broadcom Corporation


         Embedded Linux Conference 2011
Introduction
●   Search with more breadth and depth
     –   Visit more layers
           ●   This talk is not about vmlinux or kmalloc
     –   Not just reducing size. Goal is to add features
         without adding memory
           ●   E.g. improve system performance under low memory
               situation

●   About medium-sized embedded systems
     –   With DRAM + Flash, no swap partition
     –   Lightly configured kernel image fits in comfortably,
         but need more space for everything else
     –   Based on 2.6.2x, let me know if there are new
         changes
Introduction
                                                                              Kernel Dynamic Memory



                                    SLAB                                           Kernel Dynamic Data
                                                kmalloc
                                   Allocator
                                  (SLOB/SLUB
All Physical Memory                 Optional)
(Embedded System, ignoring
   Zones, NUMA here)                                                            Kernel Module Code Segment
                                                          vmalloc




                                                                                User Space Memory
                             alloc_pages
                                                                                       Environment

       Paged Memory                                                                       Stack

                                       Buddy                                         Shared Memory
                                       System
                                      Allocator
                               Reserved
                                Atomic                                                    Heap
                               Memory
         Early Statically
        Allocated Memory
                                                             Virtual Memory
      (add_memory_region)
                                                              Management
                                     Page
                                    Cache
                                    Reclaim                                          Shared Libraries
      Kernel Code and
        Static Data
         (vmlinux)



                                                                                          Code
Kernel Memory Lifecycle
●   Understanding kernel memory lifecycle
     –   Memory is an active subsystem with its own behaviors
     –   Kernel gives away memory carelessly and let everyone
         keep it
           ●   E.g., file caching, free memory in slab not returned
               immediately
     –   Sometimes try to get it back
           ●   Periodic reclaiming from kswapd
           ●   When there is not enough memory
●   Most apparent effect of memory life cycle: Free
    memory in /proc/meminfo is usually low unless you
    have a lot of memory
     –   Never use the free memory number in /proc/meminfo as
         free memory is a fuzzy concept in Linux kernel. Think in
         terms of memory pressure instead
Improve System Low-Memory Behavior
Low memory system behavior can be poor in
embedded systems
● Symptoms

     –   System slows down significantly when memory is
         low, but there is still enough memory
     –   OOM kills processes too eagerly
●   Reasons
     –   Target system for typical kernel developers is DRAM
         + hard drive, so some behaviors may not be the test
         fit for DRAM + Flash embedded systems
Improve System Low-Memory Behavior

 –   When memory is low, kernel spends a lot of time in
     kswapd trying swap out pages, but only with limited
     success – dirty pages have nowhere to go in a
     swapless system
       ●   Performance impact
       ●   More chances for memory-introduced deadlock

 –   Incremental search for free-able pages would not be
     optimal for better speed or reducing fragmentation
     under low memory conditions
 mmzone.h
  /*
   * The "priority" of VM scanning is how much of the queues we will scan in one
   * go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the
   * queues ("queue_length >> 12") during an aging round.
   */
   #define DEF_PRIORITY 12
Improve System Low-Memory Behavior
●   Keep system running when memory is low
     –   Lower the priority kswap kernel thread
           ●   Rationale: In a swapless system, it cannot really swap
                  –   Made a big difference, no serious side effects
     –   When trying to free pages, look at more pages at the
         first try (make DEF_PRIORITY smaller)
           ●   Not enough data to tell the effect
     –   Turn off OOM killer
           ●   For small closed embedded system, every process is
               essential, as simple solution is to reboot the system
               instead of killing processes
●   Excessive page frame reclaim activity can
    cause other problems as well
     –   Power management
     –   Real-time latency
Tap Into Reserves
●   When kernel thinks memory is low?
●   Parameters of interest: Minimum free memory
    standards maintained by kernel
     –   Reserved memory pool for GFP_ATOMIC
           ●   /proc/sys/vm/min_free_kbytes
           ●   Default value is calculated based on memory size
     –   Zone watermarks put a threshold on fragmentation
           ●   Certain number of free pages at certain size
           ●   See zone_watermark_ok function for details
●   When the minimum standard cannot be met,
    kswapd will keep running
     –   Until it meets the standard or calls OOM killer
Tap Into Reserves
●   Tweak possible
     –   Reduce reserved memory pool
           ●   From sysfs in newer kernels,
               /proc/sys/vm/min_free_kbytes
     –   Lower fragmentation threshold
           ●   Can change algorithm in zone_watermark_ok function
●   Trade-offs
     –   Running out of memory in interrupt context
     –   Not enough big free memory blocks
     –   But for embedded systems we have better control,
         can be running closer to the limit
Fight Fragmentation
●   Fragmented free memory can be as bad as no
    memory – don't ignore fragmentation
●   Try not to create fragmentation
     –   Preallocate memory when possible
     –   Write a task-specific memory allocator if needed
●   Able to use fragmented memory
     –   Use vmalloc for allocating bigger memory blocks
     –   Move code to user space
●   Already discussed
     –   Lower fragmentation threshold
     –   When trying to free pages, look at more pages at the
         first try
           ●   Not enough data to tell the effect
More on Kernel Memory
●   Alternative allocators
     –   SLOB
     –   SLUB
●   Reference
     –   “Understanding the Linux Kernel” Bovet & Cesati.
         O'Reilly Media
System Design Issue:
           MMU Is An Advantage, Use It
uClinux may look like an interesting option, but
regular kernel with virtual memory is a better
bet if you have many user applications
●   The default behavior, read-only “swapping”
    between Flash and DRAM is beneficial
     –   More programs with less DRAM – load from Flash
         into DRAM when needed
     –   Slows down when overloaded, but graceful
         degradation
     –   Only possible with virtual memory
●   Execute in place may not be a good trade-off
     –   A lot more flash space for a little less DRAM space
Problems with Shared Library
Shifting focus into user space
● Lower code density of .so

     –   Function need to be ABI-compliant
     –   Position independent code
     –   Functions cannot be inlined
     –   Limit inter-function compiler optimizations
●   Need per-instance memory pages for dynamic
    linking
     –   Dynamic linking tables (GOT, PLT)
     –   Global variables
     –   Need to watch this overhead: If 10 processes in the
         system and each of them is linked to 10 shared
         libraries, ~400k of free memory is consumed by
         dynamic linking
Move Away From Shared Library
●   Tweak: Do not link unnecessary shared
    libraries
     –   Check your linker options

●   Better Tweak: Use client-server model in
    place of shared libraries when possible
     –   From: m application processes linked to n shared
         libraries
     –   To: Create a server process with n statically linked
         libraries. Use IPC mechanisms to connect m
         application processes to the server process
Toolchain Considerations
●   Plain old malloc allocator
     –   Two level memory allocation for user applications:
         malloc allocator on top of kernel page allocation
           ●   For performance reasons, malloc does not return all the
               free pages to kernel, it is done only when free memory
               exceed a threshold
                 –   These are dirty pages so kernel has no way to reuse it in a
                     swapless system
     –   Tweak: We can adjust parameters to make malloc
         allocator return unused pages to kernel more
         eagerly
           ●   libc: Call mallocopt function with M_TRIM_THRESHOLD
               as parameter
           ●   uClibc: No mallocopt function, change
               DEFAULT_TRIM_THRESHOLD at compile time
                 –   Default is 256K, too big for quite a few systems
Tools
●   Some proc entries
     –   Ignore free memory shown in /proc/meminfo
     –   echo 3 > /proc/sys/vm/drop_caches followed
         /proc/meminfo is better
     –   /proc/<pid>/maps for user processes
●   Tools from Matt Mackall
     –   /proc/kpagemap, /proc/<pid>/pagemap: Page walk
         tool
     –   /proc/<pid>/smaps
           ●   Aware of shared memory pages
           ●   “Proportional Set Size (PSS)”: Divide shared pages by
               number of process sharing and attribute to each process
     –   Matt Mackall's talks at ELC 2007, 2008, 2009
           ●   http://elinux.org/ELC_2009_Presentations
           ●   http://lwn.net/Articles/230975
Tools
●   My (experimental) metric: Kernel Mobile
    Memory
     –   A number that simple math actually works
           ●   Kernel Mobile Memory = 5000k → kmalloc 100k →
               Kernel Mobile Memory = 4900k
     –   Kernel Mobile Memory = # of memory pages
         that are free, or can be recycled by the kernel
         at anytime. For DRAM + Flash swapless
         system
           ●   Include: user application code segments (file backed
               pages that are not dirty), other cached files, etc.
           ●   Exclude: all dirty pages, memory consumed by
               kmalloc, user space malloc, etc.
Tools

–   How to measure
      ●   Non-intrusive: Should be able to measure by walk
          through page table and get statistics
      ●   Intrusive: Let kernel memory reclaim mechanisms until it
          cannot find any free pages (super version of
          drop_caches), then do /proc/meminfo.
             –   I wrote some code for older kernels, but it is not production-
                 quality and it did not get updated for newer kernels
             –   You can try to reuse power management code that puts the
                 system into hibernation (pm_suspend_disk), as that part of
                 kernel code does very similar things
–   Use
      ●   If it is very low in absolute terms (for example, <1.5M),
          the system is about to lock up
      ●   If it is small compared to the working set (Don't have a
          way to measure:)), the system runs slowly
Revisit
                                                                              Kernel Dynamic Memory



                                                                                   Kernel Dynamic Data
                                    SLAB
                                                kmalloc
                                   Allocator
                                  (SLOB/SLUB
All Physical Memory                 Optional)
(Embedded System, ignoring                                                      Kernel Module Code Segment
   Zones, NUMA here)
                                                          vmalloc



                                                                                User Space Memory

                             alloc_pages                                                Environment

                                                                                            Stack
       Paged Memory
                                                                                       Shared Memory
                                       Buddy
                                       System
                                      Allocator
                               Reserved                                                     Heap
                                Atomic
                               Memory
        Early Statically
       Allocated Memory                                      Virtual Memory
     (add_memory_region)                                      Management
                                                                               Jump Tables For Shared Libraries
                                     Page
                                    Cache
                                    Reclaim
      Kernel Code and
        Static Data                                                                    Shared Libraries
         (vmlinux)




                                                                                            Code
Don't Forget the Basics
●   Squash + LZMA is usually an efficient read-
    only root memory system
●   Remove unused symbols in shared libraries
     –   Debian mklibs script
           ●   http://packages.debian.org/unstable/devel/mklibs
●   Turn off unused kernel options
●   Check all GCC options for your platform
     –   Choose an efficient ABI if available
●   These are low-level optimizations. Don't
    forget algorithm and data structure
    optimizations
     –   O(..) level improvements
     –   Use arrays instead of linked list, trees when possible
         to avoid overheads
Conclusion
●   This may be tedious work, but if we avoid
    eyesight fixation or tunnel vision, it can be
    interesting too
     –   Memory consumed by kmalloc is important, but not
         everything. Application layer malloc, shared libraries
         need to be considered
     –   Size is not everything. Fragmentation and low-
         memory CPU utilization are also important
     –   It is interesting to find out that client-server model
         can be preferred for memory reasons

Contenu connexe

Tendances

Os solaris memory management
Os  solaris memory managementOs  solaris memory management
Os solaris memory managementTech_MX
 
31 address binding, dynamic loading
31 address binding, dynamic loading31 address binding, dynamic loading
31 address binding, dynamic loadingmyrajendra
 
Linux memorymanagement
Linux memorymanagementLinux memorymanagement
Linux memorymanagementpradeepelinux
 
Memory management in Linux kernel
Memory management in Linux kernelMemory management in Linux kernel
Memory management in Linux kernelVadim Nikitin
 
Windows memory manager internals
Windows memory manager internalsWindows memory manager internals
Windows memory manager internalsSisimon Soman
 
Linux memory consumption
Linux memory consumptionLinux memory consumption
Linux memory consumptionhaish
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory ManagementRajan Kandel
 
Swap-space Management
Swap-space ManagementSwap-space Management
Swap-space ManagementAgnas Jasmine
 
Swap space management and protection in os
Swap space management and protection  in osSwap space management and protection  in os
Swap space management and protection in osrajshreemuthiah
 
Unix Memory Management - Operating Systems
Unix Memory Management - Operating SystemsUnix Memory Management - Operating Systems
Unix Memory Management - Operating SystemsDrishti Bhalla
 

Tendances (20)

Os solaris memory management
Os  solaris memory managementOs  solaris memory management
Os solaris memory management
 
virtual memory - Computer operating system
virtual memory - Computer operating systemvirtual memory - Computer operating system
virtual memory - Computer operating system
 
31 address binding, dynamic loading
31 address binding, dynamic loading31 address binding, dynamic loading
31 address binding, dynamic loading
 
Linux memorymanagement
Linux memorymanagementLinux memorymanagement
Linux memorymanagement
 
Memory management in Linux kernel
Memory management in Linux kernelMemory management in Linux kernel
Memory management in Linux kernel
 
Windows memory manager internals
Windows memory manager internalsWindows memory manager internals
Windows memory manager internals
 
Linux memory consumption
Linux memory consumptionLinux memory consumption
Linux memory consumption
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory Management
 
Memory virtualization
Memory virtualizationMemory virtualization
Memory virtualization
 
Memory comp
Memory compMemory comp
Memory comp
 
Swap-space Management
Swap-space ManagementSwap-space Management
Swap-space Management
 
Cache memory
Cache memoryCache memory
Cache memory
 
Memory managment
Memory managmentMemory managment
Memory managment
 
SQL 2005 Memory Module
SQL 2005 Memory ModuleSQL 2005 Memory Module
SQL 2005 Memory Module
 
Cache memory
Cache memoryCache memory
Cache memory
 
Os
OsOs
Os
 
Memory System
Memory SystemMemory System
Memory System
 
Swap space management and protection in os
Swap space management and protection  in osSwap space management and protection  in os
Swap space management and protection in os
 
Unix Memory Management - Operating Systems
Unix Memory Management - Operating SystemsUnix Memory Management - Operating Systems
Unix Memory Management - Operating Systems
 
Memory
MemoryMemory
Memory
 

En vedette

Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...Anne Nicolas
 
Abi capabilities
Abi capabilitiesAbi capabilities
Abi capabilitiesABI
 
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingKernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingAnne Nicolas
 
Introduction to spartakus and how it can help fight linux kernel ABI breakages
Introduction to spartakus and how it can help fight linux kernel ABI breakagesIntroduction to spartakus and how it can help fight linux kernel ABI breakages
Introduction to spartakus and how it can help fight linux kernel ABI breakagesSamikshan Bairagya
 
FISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux LoaderFISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux LoaderJohn Tortugo
 
Crash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenCrash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenLex Yu
 
Linux binary analysis and exploitation
Linux binary analysis and exploitationLinux binary analysis and exploitation
Linux binary analysis and exploitationDharmalingam Ganesan
 
Devel::NYTProf v5 at YAPC::NA 201406
Devel::NYTProf v5 at YAPC::NA 201406Devel::NYTProf v5 at YAPC::NA 201406
Devel::NYTProf v5 at YAPC::NA 201406Tim Bunce
 
PCD - Process control daemon - Presentation
PCD - Process control daemon - PresentationPCD - Process control daemon - Presentation
PCD - Process control daemon - Presentationhaish
 
Android Memory , Where is all My RAM
Android Memory , Where is all My RAM Android Memory , Where is all My RAM
Android Memory , Where is all My RAM Yossi Elkrief
 
Perl Memory Use 201209
Perl Memory Use 201209Perl Memory Use 201209
Perl Memory Use 201209Tim Bunce
 
Process control daemon
Process control daemonProcess control daemon
Process control daemonhaish
 
Workshop - Linux Memory Analysis with Volatility
Workshop - Linux Memory Analysis with VolatilityWorkshop - Linux Memory Analysis with Volatility
Workshop - Linux Memory Analysis with VolatilityAndrew Case
 
Александр Терещук - Memory Analyzer Tool and memory optimization tips in Android
Александр Терещук - Memory Analyzer Tool and memory optimization tips in AndroidАлександр Терещук - Memory Analyzer Tool and memory optimization tips in Android
Александр Терещук - Memory Analyzer Tool and memory optimization tips in AndroidUA Mobile
 
Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Tim Bunce
 
4. Memory virtualization and management
4. Memory virtualization and management4. Memory virtualization and management
4. Memory virtualization and managementHwanju Kim
 

En vedette (20)

Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
 
Abi capabilities
Abi capabilitiesAbi capabilities
Abi capabilities
 
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingKernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
 
Introduction to spartakus and how it can help fight linux kernel ABI breakages
Introduction to spartakus and how it can help fight linux kernel ABI breakagesIntroduction to spartakus and how it can help fight linux kernel ABI breakages
Introduction to spartakus and how it can help fight linux kernel ABI breakages
 
FISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux LoaderFISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux Loader
 
Crash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenCrash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_Tizen
 
Linux binary analysis and exploitation
Linux binary analysis and exploitationLinux binary analysis and exploitation
Linux binary analysis and exploitation
 
Devel::NYTProf v5 at YAPC::NA 201406
Devel::NYTProf v5 at YAPC::NA 201406Devel::NYTProf v5 at YAPC::NA 201406
Devel::NYTProf v5 at YAPC::NA 201406
 
PCD - Process control daemon - Presentation
PCD - Process control daemon - PresentationPCD - Process control daemon - Presentation
PCD - Process control daemon - Presentation
 
Linux Memory
Linux MemoryLinux Memory
Linux Memory
 
Android Memory , Where is all My RAM
Android Memory , Where is all My RAM Android Memory , Where is all My RAM
Android Memory , Where is all My RAM
 
Perl Memory Use 201209
Perl Memory Use 201209Perl Memory Use 201209
Perl Memory Use 201209
 
Poster_Jan
Poster_JanPoster_Jan
Poster_Jan
 
Process control daemon
Process control daemonProcess control daemon
Process control daemon
 
Workshop - Linux Memory Analysis with Volatility
Workshop - Linux Memory Analysis with VolatilityWorkshop - Linux Memory Analysis with Volatility
Workshop - Linux Memory Analysis with Volatility
 
Александр Терещук - Memory Analyzer Tool and memory optimization tips in Android
Александр Терещук - Memory Analyzer Tool and memory optimization tips in AndroidАлександр Терещук - Memory Analyzer Tool and memory optimization tips in Android
Александр Терещук - Memory Analyzer Tool and memory optimization tips in Android
 
Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Perl Memory Use - LPW2013
Perl Memory Use - LPW2013
 
Memory in Android
Memory in AndroidMemory in Android
Memory in Android
 
4. Memory virtualization and management
4. Memory virtualization and management4. Memory virtualization and management
4. Memory virtualization and management
 
Memory management in linux
Memory management in linuxMemory management in linux
Memory management in linux
 

Similaire à Controlling Memory Footprint at All Layers: Linux Kernel, Applications, Libraries, and Toolchain

Driver development – memory management
Driver development – memory managementDriver development – memory management
Driver development – memory managementVandana Salve
 
Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)Marcos García
 
Jug Lugano - Scale over the limits
Jug Lugano - Scale over the limitsJug Lugano - Scale over the limits
Jug Lugano - Scale over the limitsDavide Carnevali
 
Caching principles-solutions
Caching principles-solutionsCaching principles-solutions
Caching principles-solutionspmanvi
 
SanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and CassandraSanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and CassandraDataStax Academy
 
Recent advances in the Linux kernel resource management
Recent advances in the Linux kernel resource managementRecent advances in the Linux kernel resource management
Recent advances in the Linux kernel resource managementOpenVZ
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedEqunix Business Solutions
 
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time SystemsTaming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time SystemsHeechul Yun
 
Dsmp Whitepaper Release 3
Dsmp Whitepaper Release 3Dsmp Whitepaper Release 3
Dsmp Whitepaper Release 3gelfstrom
 
network ram parallel computing
network ram parallel computingnetwork ram parallel computing
network ram parallel computingNiranjana Ambadi
 
Tips and Tricks for SAP Sybase IQ
Tips and Tricks for SAP  Sybase IQTips and Tricks for SAP  Sybase IQ
Tips and Tricks for SAP Sybase IQDon Brizendine
 
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1IBM India Smarter Computing
 
Vmreport
VmreportVmreport
Vmreportmeru2ks
 
Spectrum Scale Memory Usage
Spectrum Scale Memory UsageSpectrum Scale Memory Usage
Spectrum Scale Memory UsageTomer Perry
 

Similaire à Controlling Memory Footprint at All Layers: Linux Kernel, Applications, Libraries, and Toolchain (20)

Driver development – memory management
Driver development – memory managementDriver development – memory management
Driver development – memory management
 
Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)
 
Jug Lugano - Scale over the limits
Jug Lugano - Scale over the limitsJug Lugano - Scale over the limits
Jug Lugano - Scale over the limits
 
UNIT IV.pptx
UNIT IV.pptxUNIT IV.pptx
UNIT IV.pptx
 
Linux Huge Pages
Linux Huge PagesLinux Huge Pages
Linux Huge Pages
 
Caching principles-solutions
Caching principles-solutionsCaching principles-solutions
Caching principles-solutions
 
SanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and CassandraSanDisk: Persistent Memory and Cassandra
SanDisk: Persistent Memory and Cassandra
 
Recent advances in the Linux kernel resource management
Recent advances in the Linux kernel resource managementRecent advances in the Linux kernel resource management
Recent advances in the Linux kernel resource management
 
Cache memory presentation
Cache memory presentationCache memory presentation
Cache memory presentation
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
 
Linux%20 memory%20management
Linux%20 memory%20managementLinux%20 memory%20management
Linux%20 memory%20management
 
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time SystemsTaming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
 
Dsmp Whitepaper Release 3
Dsmp Whitepaper Release 3Dsmp Whitepaper Release 3
Dsmp Whitepaper Release 3
 
network ram parallel computing
network ram parallel computingnetwork ram parallel computing
network ram parallel computing
 
Tips and Tricks for SAP Sybase IQ
Tips and Tricks for SAP  Sybase IQTips and Tricks for SAP  Sybase IQ
Tips and Tricks for SAP Sybase IQ
 
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
Linux on System z Optimizing Resource Utilization for Linux under z/VM - Part1
 
Shignled disk
Shignled diskShignled disk
Shignled disk
 
Vmreport
VmreportVmreport
Vmreport
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
Spectrum Scale Memory Usage
Spectrum Scale Memory UsageSpectrum Scale Memory Usage
Spectrum Scale Memory Usage
 

Dernier

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 

Dernier (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 

Controlling Memory Footprint at All Layers: Linux Kernel, Applications, Libraries, and Toolchain

  • 1. Controlling Memory Footprint at All Layers: Linux Kernel, Applications, Libraries, and Toolchain Xi Wang Broadcom Corporation Embedded Linux Conference 2011
  • 2. Introduction ● Search with more breadth and depth – Visit more layers ● This talk is not about vmlinux or kmalloc – Not just reducing size. Goal is to add features without adding memory ● E.g. improve system performance under low memory situation ● About medium-sized embedded systems – With DRAM + Flash, no swap partition – Lightly configured kernel image fits in comfortably, but need more space for everything else – Based on 2.6.2x, let me know if there are new changes
  • 3. Introduction Kernel Dynamic Memory SLAB Kernel Dynamic Data kmalloc Allocator (SLOB/SLUB All Physical Memory Optional) (Embedded System, ignoring Zones, NUMA here) Kernel Module Code Segment vmalloc User Space Memory alloc_pages Environment Paged Memory Stack Buddy Shared Memory System Allocator Reserved Atomic Heap Memory Early Statically Allocated Memory Virtual Memory (add_memory_region) Management Page Cache Reclaim Shared Libraries Kernel Code and Static Data (vmlinux) Code
  • 4. Kernel Memory Lifecycle ● Understanding kernel memory lifecycle – Memory is an active subsystem with its own behaviors – Kernel gives away memory carelessly and let everyone keep it ● E.g., file caching, free memory in slab not returned immediately – Sometimes try to get it back ● Periodic reclaiming from kswapd ● When there is not enough memory ● Most apparent effect of memory life cycle: Free memory in /proc/meminfo is usually low unless you have a lot of memory – Never use the free memory number in /proc/meminfo as free memory is a fuzzy concept in Linux kernel. Think in terms of memory pressure instead
  • 5. Improve System Low-Memory Behavior Low memory system behavior can be poor in embedded systems ● Symptoms – System slows down significantly when memory is low, but there is still enough memory – OOM kills processes too eagerly ● Reasons – Target system for typical kernel developers is DRAM + hard drive, so some behaviors may not be the test fit for DRAM + Flash embedded systems
  • 6. Improve System Low-Memory Behavior – When memory is low, kernel spends a lot of time in kswapd trying swap out pages, but only with limited success – dirty pages have nowhere to go in a swapless system ● Performance impact ● More chances for memory-introduced deadlock – Incremental search for free-able pages would not be optimal for better speed or reducing fragmentation under low memory conditions mmzone.h /* * The "priority" of VM scanning is how much of the queues we will scan in one * go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the * queues ("queue_length >> 12") during an aging round. */ #define DEF_PRIORITY 12
  • 7. Improve System Low-Memory Behavior ● Keep system running when memory is low – Lower the priority kswap kernel thread ● Rationale: In a swapless system, it cannot really swap – Made a big difference, no serious side effects – When trying to free pages, look at more pages at the first try (make DEF_PRIORITY smaller) ● Not enough data to tell the effect – Turn off OOM killer ● For small closed embedded system, every process is essential, as simple solution is to reboot the system instead of killing processes ● Excessive page frame reclaim activity can cause other problems as well – Power management – Real-time latency
  • 8. Tap Into Reserves ● When kernel thinks memory is low? ● Parameters of interest: Minimum free memory standards maintained by kernel – Reserved memory pool for GFP_ATOMIC ● /proc/sys/vm/min_free_kbytes ● Default value is calculated based on memory size – Zone watermarks put a threshold on fragmentation ● Certain number of free pages at certain size ● See zone_watermark_ok function for details ● When the minimum standard cannot be met, kswapd will keep running – Until it meets the standard or calls OOM killer
  • 9. Tap Into Reserves ● Tweak possible – Reduce reserved memory pool ● From sysfs in newer kernels, /proc/sys/vm/min_free_kbytes – Lower fragmentation threshold ● Can change algorithm in zone_watermark_ok function ● Trade-offs – Running out of memory in interrupt context – Not enough big free memory blocks – But for embedded systems we have better control, can be running closer to the limit
  • 10. Fight Fragmentation ● Fragmented free memory can be as bad as no memory – don't ignore fragmentation ● Try not to create fragmentation – Preallocate memory when possible – Write a task-specific memory allocator if needed ● Able to use fragmented memory – Use vmalloc for allocating bigger memory blocks – Move code to user space ● Already discussed – Lower fragmentation threshold – When trying to free pages, look at more pages at the first try ● Not enough data to tell the effect
  • 11. More on Kernel Memory ● Alternative allocators – SLOB – SLUB ● Reference – “Understanding the Linux Kernel” Bovet & Cesati. O'Reilly Media
  • 12. System Design Issue: MMU Is An Advantage, Use It uClinux may look like an interesting option, but regular kernel with virtual memory is a better bet if you have many user applications ● The default behavior, read-only “swapping” between Flash and DRAM is beneficial – More programs with less DRAM – load from Flash into DRAM when needed – Slows down when overloaded, but graceful degradation – Only possible with virtual memory ● Execute in place may not be a good trade-off – A lot more flash space for a little less DRAM space
  • 13. Problems with Shared Library Shifting focus into user space ● Lower code density of .so – Function need to be ABI-compliant – Position independent code – Functions cannot be inlined – Limit inter-function compiler optimizations ● Need per-instance memory pages for dynamic linking – Dynamic linking tables (GOT, PLT) – Global variables – Need to watch this overhead: If 10 processes in the system and each of them is linked to 10 shared libraries, ~400k of free memory is consumed by dynamic linking
  • 14. Move Away From Shared Library ● Tweak: Do not link unnecessary shared libraries – Check your linker options ● Better Tweak: Use client-server model in place of shared libraries when possible – From: m application processes linked to n shared libraries – To: Create a server process with n statically linked libraries. Use IPC mechanisms to connect m application processes to the server process
  • 15. Toolchain Considerations ● Plain old malloc allocator – Two level memory allocation for user applications: malloc allocator on top of kernel page allocation ● For performance reasons, malloc does not return all the free pages to kernel, it is done only when free memory exceed a threshold – These are dirty pages so kernel has no way to reuse it in a swapless system – Tweak: We can adjust parameters to make malloc allocator return unused pages to kernel more eagerly ● libc: Call mallocopt function with M_TRIM_THRESHOLD as parameter ● uClibc: No mallocopt function, change DEFAULT_TRIM_THRESHOLD at compile time – Default is 256K, too big for quite a few systems
  • 16. Tools ● Some proc entries – Ignore free memory shown in /proc/meminfo – echo 3 > /proc/sys/vm/drop_caches followed /proc/meminfo is better – /proc/<pid>/maps for user processes ● Tools from Matt Mackall – /proc/kpagemap, /proc/<pid>/pagemap: Page walk tool – /proc/<pid>/smaps ● Aware of shared memory pages ● “Proportional Set Size (PSS)”: Divide shared pages by number of process sharing and attribute to each process – Matt Mackall's talks at ELC 2007, 2008, 2009 ● http://elinux.org/ELC_2009_Presentations ● http://lwn.net/Articles/230975
  • 17. Tools ● My (experimental) metric: Kernel Mobile Memory – A number that simple math actually works ● Kernel Mobile Memory = 5000k → kmalloc 100k → Kernel Mobile Memory = 4900k – Kernel Mobile Memory = # of memory pages that are free, or can be recycled by the kernel at anytime. For DRAM + Flash swapless system ● Include: user application code segments (file backed pages that are not dirty), other cached files, etc. ● Exclude: all dirty pages, memory consumed by kmalloc, user space malloc, etc.
  • 18. Tools – How to measure ● Non-intrusive: Should be able to measure by walk through page table and get statistics ● Intrusive: Let kernel memory reclaim mechanisms until it cannot find any free pages (super version of drop_caches), then do /proc/meminfo. – I wrote some code for older kernels, but it is not production- quality and it did not get updated for newer kernels – You can try to reuse power management code that puts the system into hibernation (pm_suspend_disk), as that part of kernel code does very similar things – Use ● If it is very low in absolute terms (for example, <1.5M), the system is about to lock up ● If it is small compared to the working set (Don't have a way to measure:)), the system runs slowly
  • 19. Revisit Kernel Dynamic Memory Kernel Dynamic Data SLAB kmalloc Allocator (SLOB/SLUB All Physical Memory Optional) (Embedded System, ignoring Kernel Module Code Segment Zones, NUMA here) vmalloc User Space Memory alloc_pages Environment Stack Paged Memory Shared Memory Buddy System Allocator Reserved Heap Atomic Memory Early Statically Allocated Memory Virtual Memory (add_memory_region) Management Jump Tables For Shared Libraries Page Cache Reclaim Kernel Code and Static Data Shared Libraries (vmlinux) Code
  • 20. Don't Forget the Basics ● Squash + LZMA is usually an efficient read- only root memory system ● Remove unused symbols in shared libraries – Debian mklibs script ● http://packages.debian.org/unstable/devel/mklibs ● Turn off unused kernel options ● Check all GCC options for your platform – Choose an efficient ABI if available ● These are low-level optimizations. Don't forget algorithm and data structure optimizations – O(..) level improvements – Use arrays instead of linked list, trees when possible to avoid overheads
  • 21. Conclusion ● This may be tedious work, but if we avoid eyesight fixation or tunnel vision, it can be interesting too – Memory consumed by kmalloc is important, but not everything. Application layer malloc, shared libraries need to be considered – Size is not everything. Fragmentation and low- memory CPU utilization are also important – It is interesting to find out that client-server model can be preferred for memory reasons