Linux Memory Management
Kamal Maiti
Sr. Linux System Engineer
Amdocs DVCI, Pune, India
AGENDA
 Basic concept of computer
 Hardware, firmware, driver, software, application
 CPU, RAM, How RAM is used
 Moving Information within Computer
 Primary & Other Memory
 Segment of RAM
 Memory Mapping, Process Address Space
 Page, Frame, Hugepage, MMU etc.
 Virtual Memory, PageCache
 Memory nodes, zones, lowmem
 NUMA
 Kernel Memory allocator
 Pagefault Handling, Tools, Memory leak, Memory related issues
 Hands-on Troubleshooting : sysrq, backtrace analysis, OOM messages investigation etc
BASIC CONCEPTS OF COMPUTER HARDWARE
 This model of the typical digital computer is often called the von
Neumann computer.
 Programs and data are stored in the same memory: primary memory
[Figure: block diagram of the von Neumann model showing Input Units, the CPU (Central Processing Unit), Output Units, and Primary Memory]
HARDWARE, FIRMWARE, DRIVER, SOFTWARE, APPLICATION
Hardware: all physical computer devices, such as input and output
devices, the motherboard, mouse, and keyboard.
Firmware: vendor-provided low-level code that interacts with the
hardware to carry out the instructions passed to the device.
Driver: sits on top of the firmware and interacts with the firmware
or with the hardware directly.
Software/Application: uses system calls to ask the kernel for
services; the kernel in turn interacts with the driver to get the
output.
CPU
 The three major components of the CPU are:
1. Arithmetic Unit (Computations performed)
Accumulator (Results of computations kept here)
2. Control Unit (Has two locations where numbers are kept)
Instruction Register (Instruction placed here for
analysis)
Program Counter (Which instruction will be
performed next?)
3. Instruction Decoding Unit (Decodes the instruction)
 Motherboard: The place where most of the electronics including
the CPU are mounted.
RAM
 Commonly known as random access memory, or just
RAM
 Holds instructions and data needed for programs that
are currently running
 RAM is usually a volatile type of memory
 Contents of RAM are lost when power is turned off
HOW RAM IS USED
Memory is used to store:
 i) instructions -> to execute a program
 ii) data -> while the computer is doing a job, the data to be
processed is stored in primary memory. This data may come from an
input device such as a keyboard or from a secondary storage device
such as a floppy disk.
MOVING INFORMATION WITHIN THE COMPUTER
 How do binary numerals move into, out of, and within the computer?
 Information is moved about in bytes, or multiple bytes called
words.
 Words are the fundamental units of information.
 The number of bits per word may vary per computer.
 The word length for most large IBM computers is 32 bits.
MOVING INFORMATION WITHIN THE COMPUTER …
 Bits that compose a word are passed in parallel from place to
place.
 Ribbon cables:
 Consist of several wires, molded together.
 One wire for each bit of the word or byte.
 Additional wires coordinate the activity of moving
information.
 Each wire sends information in the form of a voltage
pulse.
MOVING INFORMATION WITHIN THE COMPUTER …
 Example of sending the word WOW over the ribbon cable
 Voltage pulses corresponding to the ASCII codes would pass
through the cable.
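The WOW example can be made concrete with a small sketch that prints the ASCII code and 8-bit pattern of each character, i.e. the values that would travel as voltage pulses, one wire per bit. The helper names `to_bits` and `word_bits` are illustrative and do not come from the slides.

```sh
# Convert a number (0-255) to its 8-bit binary pattern.
to_bits() {
    n=$1
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$(( n % 2 ))$bits"   # prepend the lowest bit
        n=$(( n / 2 ))
        i=$(( i + 1 ))
    done
    printf '%s\n' "$bits"
}

# Print every character of a word with its ASCII code and bit pattern.
word_bits() {
    word=$1
    while [ -n "$word" ]; do
        rest=${word#?}               # everything after the first character
        ch=${word%"$rest"}           # the first character
        code=$(printf '%d' "'$ch")   # POSIX printf: a leading quote yields the character code
        printf '%s %d %s\n' "$ch" "$code" "$(to_bits "$code")"
        word=$rest
    done
}

word_bits WOW
# W 87 01010111
# O 79 01001111
# W 87 01010111
```

With an 8-wire ribbon cable, each line of output corresponds to one parallel transfer of a byte.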
PRIMARY MEMORY
 Primary storage or memory: Where the data & program that are
currently in operation or being accessed are stored during use.
 Consists of electronic circuits: Extremely fast and expensive.
 Two types:
 RAM (non-permanent)
 Programs and data can be stored here for the
computer’s use.
 Volatile: All information will be lost once the computer
shuts down.
 ROM (permanent)
 Contents do not change.
OTHER MEMORY
 ROM: read-only memory built from transistors [used to store video game software and in electronic musical instruments]. ROM is mostly used to hold firmware.
 EPROM: Erasable Programmable Read-Only Memory
 EEPROM: Electrically Erasable Programmable Read-Only Memory
 Cache: a location in RAM where data is kept for a certain amount of time so that it can be reused.
 Registers: built from flip-flops [RS, D, JK, shift registers, etc.] that hold information
 Swap: disk space used to meet the demand for more memory than the RAM provides.
SEGMENT OF RAM
 Low mem, high mem, Normal mem, DMA, DMA32
 On a 32-bit architecture [DMA, Normal & HighMem], the
address range for addressing RAM is:
0x00000000 - 0xffffffff (top address 4,294,967,295), i.e. 4 GB.
The user space range: 0x00000000 - 0xbfffffff or 3 GB
The kernel space range: 0xc0000000 - 0xffffffff or 1 GB
Linux splits the 1 GB kernel space into 2 pieces: LOWMEM and HIGHMEM.
 On a 64-bit machine [DMA, DMA32 & Normal]: Normal
memory is the memory available beyond 4 GB
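The 3 GB / 1 GB split can be checked directly from the address ranges on the slide. This is a minimal sketch; `range_mib` is a hypothetical helper name that returns the size of an inclusive address range in MiB.

```sh
# Size in MiB of the inclusive address range [start, end].
range_mib() {
    echo $(( ($2 - $1 + 1) / 1024 / 1024 ))
}

range_mib 0x00000000 0xbfffffff   # user space:   3072 MiB (3 GB)
range_mib 0xc0000000 0xffffffff   # kernel space: 1024 MiB (1 GB)
```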
MEMORY MAPPING
 Linux uses only 4 segments on a 32-bit architecture:
 2 segments (code and data/stack) for KERNEL SPACE from [0xC000 0000] (3 GB) to [0xFFFF FFFF] (4 GB)
 2 segments (code and data/stack) for USER SPACE from [0] (0 GB) to [0xBFFF FFFF] (3 GB)
See the virtual map: $ pmap <PID>; see the stack: $ pstack <PID>
 Segmentation, Paging [paging overcomes the flaws of segmentation]:
 each process is allocated small virtual pages so that they fit in RAM without wasting it.
PROCESS ADDRESS SPACE - 32 BIT ARCH
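[Figure: process address space, with Kernel Space above User Space]
4 GB --> top of Kernel Space
3 GB --> 0xC0000000, start of Kernel Space
User Space regions labelled in the figure: File name, Environment and Arguments; Stack; Unused Memory; Shared Libs; Heap; Bss [Block Started by Symbol], between _bss_start and _end; Data, ending at _edata; Text/code, ending at _etext; Header
Address label in the figure: 0x84000000
0 GB --> bottom of User Space
Text/Code Segment: contains the actual code
Data: contains initialized global variables
BSS: contains uninitialized global variables
Heap: dynamic memory
Stack: collection of frames/functions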
PAGE & FRAME
 Paging, Demand Paging, Swapping
 Page tables [64-bit: 4 levels, 32-bit: 2 levels]: Page Global Directory, Page Upper Directory,
Page Middle Directory, Page Table
 Min page size: getconf -a | grep -i page
 Life cycle of a page: active --> inactive list --> dirty --> clean
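As a rough illustration of what the page size means for an allocation, the sketch below rounds a byte count up to whole pages. `pages_needed` is a hypothetical helper; when no page size is passed it falls back to `getconf PAGESIZE`.

```sh
# Number of pages needed to hold a given number of bytes (rounded up).
# Usage: pages_needed BYTES [PAGE_SIZE]
pages_needed() {
    bytes=$1
    page=${2:-$(getconf PAGESIZE)}   # default: the system page size
    echo $(( (bytes + page - 1) / page ))
}

pages_needed 10000 4096   # 3 pages of 4 KB
pages_needed 4096 4096    # exactly 1 page
```

Even a 1-byte request occupies a full page, which is why small page sizes waste less memory but need more page-table entries.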
SWAP, HUGE PAGE, MMU,TLB
 SWAP: not all pages fit in RAM, so data has to be moved to and from the storage
disk
 Hugepage: the default page size is 4 KB, but large programs use big chunks of memory.
Hence, larger pages are allowed. [sysctl -a | grep -i huge]
 MMU/TLB: the MMU is responsible for translating logical addresses to physical addresses. The TLB is a buffer
of recent translations that is used by the MMU.
 Active/Inactive regions [cat /proc/meminfo]
 Shmem: shared memory area [ipcs -m]
 Buddyinfo: view memory fragmentation/allocation [cat /proc/buddyinfo]
 Cache: speeds up I/O; sync flushes it and forces the data to be written to disk, bdflush does this
in the background [flush-253:0 in RHEL 6]
The buffer policy is first-in, first-out
The cache policy is Least Recently Used [LRU] [$ vmstat -S M 1]
VIRTUAL MEMORY, HOW PROGRAM MAPS?
 Executable text
 Executable data
 Heap space
 Stack
 Get the exact memory required by a process:
 $ pmap -x <pid>
 $ cat /proc/<pid>/status
PAGE CACHE MEMORY CONTROL
 vm.dirty_expire_centisecs=2000 // age after which dirty data becomes eligible for writeout
 vm.dirty_writeback_centisecs=400 // how long the flusher threads wait between wakeups
 vm.dirty_background_ratio=5 // when this percentage of memory is filled with dirty pages, the pdflush/flush daemon
starts writing dirty data to disk
 vm.dirty_ratio=20 // when this percentage of memory is filled with dirty pages, the writing process itself starts writing data to disk
 vfs_cache_pressure [100]: controls the tendency of the kernel to reclaim the memory that is
used for caching directory and inode objects
 Swappiness [60]: controls how aggressively the kernel uses swap space.
 To free cached memory:
To free pagecache: echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes: echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes: echo 3 > /proc/sys/vm/drop_caches
 Cache writes are done by the kernel thread pdflush/bdflush; in RHEL 6 it is flush.
 Life cycle of pages:
active --> inactive list --> dirty --> clean
Link : https://www.kernel.org/doc/Documentation/sysctl/vm.txt
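To get a feel for the two dirty-page ratios above, the sketch below turns a ratio into an approximate threshold in MB. It is a simplification: the 4096 MB figure is an assumed example, and the kernel applies the ratios to available (free plus reclaimable) memory rather than to a fixed number. `dirty_threshold_mb` is a hypothetical helper name.

```sh
# Approximate dirty-page threshold in MB.
# Usage: dirty_threshold_mb AVAILABLE_MB RATIO_PERCENT
dirty_threshold_mb() {
    echo $(( $1 * $2 / 100 ))
}

dirty_threshold_mb 4096 5    # background writeback starts at about 204 MB
dirty_threshold_mb 4096 20   # writers are throttled at about 819 MB
```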
PHYSICAL MEMORY ALLOCATION LIMIT
 CommitLimit: total memory that can be allocated, based on overcommit_ratio
 Committed_AS: memory currently allocated
 overcommit_memory: from 0 to 2
0 = heuristic overcommit: obvious overcommits of the available memory are refused // default
1 = always overcommit: no overcommit handling, allocations are never refused
2 = strict: allocations are limited to CommitLimit, which is based on overcommit_ratio
 overcommit_ratio: % of RAM counted when overcommit_memory is set to 2, default value 50
Example: 4 GB RAM, 2 GB swap, overcommit_memory=2, overcommit_ratio=50, so
CommitLimit = 2 + (4*50/100) = 2 + 2 = 4 GB
Issue : Application failed to start due to shortage of memory, Needed to disable
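The CommitLimit example above can be reproduced with a few lines of shell arithmetic. This is a sketch of the slide's formula only (swap + RAM * overcommit_ratio / 100); it ignores huge pages, and `commit_limit_kb` is a hypothetical helper name.

```sh
# CommitLimit in kB for overcommit_memory=2.
# Usage: commit_limit_kb RAM_KB SWAP_KB OVERCOMMIT_RATIO
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# 4 GB RAM, 2 GB swap, overcommit_ratio=50
commit_limit_kb 4194304 2097152 50   # 4194304 kB = 4 GB
```

On a live system the computed value can be compared with the CommitLimit line in /proc/meminfo.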
WHY MEMORY CACHE IS REALLY REQUIRED
Speed up processing:
 $ cat > XYZ // create a test file
 $ echo 3 > /proc/sys/vm/drop_caches // drop the cache
 $ time cat XYZ // takes more time: read from disk
 $ time cat XYZ // takes less time: served from the page cache
MEMORY NODES, ZONES IN 32 BIT & 64 BIT
 The zones below exist on 32-bit:
 ZONE_DMA (0-16 MB)
 ZONE_NORMAL (16 MB-896 MB)
 ZONE_HIGHMEM (896 MB and above)
HIGHMEM's lower zones are NORMAL+DMA; NORMAL's lower zone is DMA.
 The zones below exist on 64-bit:
 DMA: up to 16 MB
 DMA32: up to 4 GB
 Normal: beyond 4 GB
 $ cat /proc/zoneinfo
 $ cat /proc/pagetypeinfo
 $ cat /proc/<pid>/numa_maps
 $ cat /proc/buddyinfo
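The 32-bit zone boundaries listed above amount to a simple range lookup. The sketch below maps a physical address, given in MB, to its zone; `zone32` is a hypothetical helper and the boundaries are the ones from the slide.

```sh
# Zone of a physical address (in MB) on a 32-bit system.
zone32() {
    mb=$1
    if [ "$mb" -lt 16 ]; then
        echo ZONE_DMA        # 0-16 MB
    elif [ "$mb" -lt 896 ]; then
        echo ZONE_NORMAL     # 16 MB-896 MB
    else
        echo ZONE_HIGHMEM    # 896 MB and above
    fi
}

zone32 8      # ZONE_DMA
zone32 512    # ZONE_NORMAL
zone32 2048   # ZONE_HIGHMEM
```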
LOW MEMORY, ZONE_RECLAIM
 "lowmem" often means NORMAL+DMA
 “lowmem” is not present on 64-bit RHEL 6
 Reservation is controlled by: lowmem_reserve_ratio [DMA NORMAL HIGHMEM]
 cat /proc/sys/vm/lowmem_reserve_ratio
256 256 32 // (1/256)*100 % = 0.39% of the nearest zone is reserved
 zone_reclaim_mode: sets how aggressively memory is reclaimed
when a zone runs out of memory (values can be ORed together)
1 = Zone reclaim on
2 = Zone reclaim writes dirty pages out
4 = Zone reclaim swaps pages
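Both settings on this slide are easy to decode by hand. The sketch below reproduces the slide's 1/ratio percentage and splits a zone_reclaim_mode value into its bits. The function names and the short labels it prints are illustrative, not kernel output.

```sh
# Percentage reserved for a given lowmem_reserve_ratio entry (slide's 1/ratio rule).
reserve_percent() {
    awk -v r="$1" 'BEGIN { printf "%.2f\n", 100 / r }'
}

# Decode zone_reclaim_mode (a bitmask of 1, 2 and 4) into labels.
decode_zone_reclaim() {
    mode=$1
    out=""
    if [ $(( mode & 1 )) -ne 0 ]; then out="$out reclaim-on"; fi
    if [ $(( mode & 2 )) -ne 0 ]; then out="$out write-dirty"; fi
    if [ $(( mode & 4 )) -ne 0 ]; then out="$out swap"; fi
    if [ -z "$out" ]; then out=" off"; fi
    echo "${out# }"          # drop the leading space
}

reserve_percent 256       # 0.39
decode_zone_reclaim 3     # reclaim-on write-dirty
```

On a live system the current values come from /proc/sys/vm/lowmem_reserve_ratio and /proc/sys/vm/zone_reclaim_mode.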
NON-UNIFORM MEMORY ACCESS(NUMA)
 NUMA concept:
NUMA placement: placement of processor & memory; manual placement by the application,
MPI (Message Passing Interface)
 Place the application on the correct node
 Two memory policies: Node Local [after Linux boot], Interleave [during kernel boot]
 cat /proc/<pid>/numa_maps
 numactl -s // show policy
 numactl --hardware
 numactl [ --interleave nodes ] [ --preferred node ] [ --membind nodes ] [ --cpunodebind nodes ] [ --physcpubind cpus ] [ --localalloc ] [--] command {arguments ...}
Ref : http://www.redhat.com/summit/2012/pdf/2012-DevDay-Lab-NUMA-Hacker.pdf
NUMA MANAGEMENT
 numactl --physcpubind=0,1,2,3 example_process
 numactl --physcpubind=0-3 example_process
 numactl --cpunodebind=2 example_process // run on the CPUs of node 2
 numactl --physcpubind=0 --localalloc example_process
 numactl --membind=4 example_process
 numactl --cpunodebind=0 example_process // only execute the command on the CPUs of node 0
 numactl --cpubind=0 --membind=0,1 process // run process on node 0 with memory allocated on
nodes 0 and 1
 numactl --hardware
 cat /sys/devices/system/node/node*/numastat
 Allocation: $ watch -n1 numastat
KERNEL MEMORY ALLOCATORS
 Low-level page allocator :
 Buddy system for contiguous multi-page allocations
 Provides pages for
 in-kernel allocations (slab cache)
 vmalloc areas (kernel modules, multi-page data areas)
 page cache, anonymous user pages
 misc. other users
 Slab cache :
 Manages allocations of objects of the same type
 Large-scale users: inodes, dentries, block I/O, network ...
 kmalloc (generic allocator) implemented on top
 Tool : slabtop
PAGE FAULT HANDLING
 Hardware support :
 Accessing invalid pages causes 'page translation' check
 Writing to protected pages causes 'protection exception'
 Translation-exception identification provides address
 'Suppression on protection' facility essential!
 Linux kernel page fault handler :
 Determine address/access validity according to VMA
 Invalid accesses cause SIGSEGV delivery
 Valid accesses trigger: page-in, swap-in, copy-on-write
 Extra support for stack VMA: grows automatically
 Out-of-memory if overcommitted causes SIGBUS
TOOLS TO CHECK MEMORY USAGE
 Report paging statistics: sar -B
 Report memory utilization statistics: sar -r
 Report memory statistics: sar -R
 Report swap space utilization statistics: sar -S
 Current memory usage:
 free -m | -k | -g
 cat /proc/meminfo
 Memory allocation:
 cat /proc/buddyinfo
 VM memory allocation:
 pmap -x <PID>
 cat /proc/<pid>/status
 Display kernel slab cache & memory information in real time:
 slabtop
 vmstat
 ps
 top
 cat /proc/meminfo
 strace, gcore
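As an example of working with /proc/meminfo output, the sketch below computes memory in use while excluding buffers and page cache, which matches the "-/+ buffers/cache" view of older `free` versions. `used_kb` is a hypothetical helper that reads meminfo-formatted text on standard input; the sample numbers are made up for illustration.

```sh
# Used memory in kB, excluding buffers and page cache.
# Reads /proc/meminfo-formatted text from standard input.
used_kb() {
    awk '$1 == "MemTotal:" { total = $2 }
         $1 == "MemFree:"  { free = $2 }
         $1 == "Buffers:"  { buffers = $2 }
         $1 == "Cached:"   { cached = $2 }
         END { print total - free - buffers - cached }'
}

# Illustrative sample input (not real measurements).
used_kb <<'EOF'
MemTotal:       1000 kB
MemFree:         200 kB
Buffers:         100 kB
Cached:          300 kB
EOF
# prints 400
```

On a live system the same function can be run as `used_kb < /proc/meminfo`.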
MEMORY LEAK CHECK
 Usage check: historical sar report
 mtrace: a glibc function for tracing memory allocations.
 Valgrind:
 valgrind --tool=memcheck --leak-check=full --show-reachable=yes snmpd -f -Lo
ISSUES RELATED TO MEMORY
 TCP/IP communication delay: RH cluster broken
 High cache usage: slows down the application/system
 Memory pressure: memory leak, or the application is not tuned properly
 Memory fragmentation: hugepages not used
 OOM killer kills an application: under memory pressure; the OOM killer is enabled
by default and picks its victim based on the badness value.
 Segmentation fault: the kernel reclaims in the normal/low memory
region; with no room left for the kernel, a segmentation
fault is encountered.
 Faulty memory: hardware failure or circuit failure in a chip; needs
a diagnosis and a chip replacement
TROUBLESHOOTING MEMORY ISSUE
 Memory & swap usage test:
swap_tendency = mapped_ratio/2 + distress + vm_swappiness
mapped_ratio = % of physical memory in use
distress = how much trouble the kernel has freeing memory
vm_swappiness = default 60
swap_tendency >= 100: eligible for swap
swap_tendency < 100: reclaim from page cache
 Sysrq:
echo 1 > /proc/sys/kernel/sysrq // enable SysRq
echo m > /proc/sysrq-trigger // dump memory information to the kernel log
 Backtrace analysis
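The swap_tendency rule from this slide can be tried with sample numbers. This is a sketch of the slide's formula only, not of current kernel behavior; the function names and the input values are illustrative.

```sh
# swap_tendency = mapped_ratio/2 + distress + vm_swappiness
# Usage: swap_tendency MAPPED_RATIO DISTRESS [SWAPPINESS]
swap_tendency() {
    echo $(( $1 / 2 + $2 + ${3:-60} ))
}

# Apply the threshold of 100 from the slide.
swap_decision() {
    if [ "$(swap_tendency "$@")" -ge 100 ]; then
        echo swap
    else
        echo reclaim-page-cache
    fi
}

swap_tendency 40 0 60    # 80
swap_decision 40 0 60    # reclaim-page-cache
swap_decision 80 10 60   # swap (tendency 110)
```

Lowering vm_swappiness shifts the result toward reclaiming page cache for the same mapped_ratio and distress.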
TROUBLESHOOTING
 OOM messages investigation :
Messages :
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461595] [] __alloc_pages_nodemask+0x72c/0x740
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461599] [] __get_free_pages+0x1c/0x30
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461602] [] get_zeroed_page+0x12/0x20
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461606] [] fill_read_buffer.isra.8+0xaa/0xd0
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461609] [] sysfs_read_file+0x7d/0x90
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461613] [] vfs_read+0x8c/0x160
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461624] [] sysenter_do_call+0x12/0x28
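When reading a trace like the one above, it often helps to reduce it to the chain of function names. The sketch below does this with awk by taking every `symbol+0xoffset/0xsize` field and keeping only the symbol. `trace_functions` is a hypothetical helper, and the sample host name in the usage is made up.

```sh
# Print the function names from kernel call-trace lines read on stdin.
trace_functions() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /\+0x/) {          # field looks like symbol+0xoff/0xsize
                sub(/\+0x.*/, "", $i)   # keep only the symbol name
                print $i
            }
    }'
}

printf '%s\n' \
    'Oct 25 07:28:34 host kernel: [87976.461588] [] oom_kill_process+0x5c/0x80' \
    'Oct 25 07:28:34 host kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0' |
    trace_functions
# oom_kill_process
# out_of_memory
```

Read from the bottom up, the trace on the slide shows a read() system call on a sysfs file whose page allocation failed and ended in the OOM killer.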
Q/A
Ref :
https://www.kernel.org/
https://www.redhat.com/en
http://www.tldp.org/LDP/tlk/mm/memory.html
https://en.wikipedia.org/wiki/Virtual_memory
https://lwn.net/

Linux memory-management-kamal

  • 1. Linux Memory Management Kamal Maiti Sr. Linux System Engineer Amdocs DVCI, Pune, India
  • 2. AGENDA  Basic concept of computer  Hardware, firmware, driver, software, application  CPU, RAM, How RAM used  Moving Information within Computer  Primary & Other Memory,  Segment of RAM  Memory Mapping, Process Address Space  Page, Frame, Hugepage, MMU etc.  Virtual Memory, PageCache  Memory nodes, zones, lowmem  NUMA  Kernel Memory allocator  Pagefault Handling, Tools, Memory leak, Memory related issues  Hands-on Troubleshooting : sysrq, backtrace analysis, OOM messages investigation etc
  • 3. BASIC CONCEPTS OF COMPUTER HARDWARE  This model of the typical digital computer is often called the von Neumann computer.  Programs and data are stored in the same memory: primary memory CPU (Central Processing Unit) Input Units Output Units Primary Memory
  • 4. HARDWARE, FIRMWARE, DRIVER, SOFTWARE, APPLICATION Hardware : All computer devices like - Input, Output devices, Motherboard, mouse, keyboard Firmware : Vendor provided low level codes that interacts with hardware to get the output of instructions passed to device. Driver : On top of firmware, driver is used to interacts with firmware or hardware directly. Software/Application: which interacts with system calls to call kernel and kernel interacts with driver to get the output.
  • 5. CPU  The three major components of the CPU are: 1. Arithmetic Unit (Computations performed) Accumulator (Results of computations kept here) 2. Control Unit (Has two locations where numbers are kept) Instruction Register (Instruction placed here for analysis) Program Counter (Which instruction will be performed next?) 3. Instruction Decoding Unit (Decodes the instruction)  Motherboard: The place where most of the electronics including the CPU are mounted.
  • 6. RAM  Commonly known as random access memory, or just RAM  Holds instructions and data needed for programs that are currently running  RAM is usually a volatile type of memory  Contents of RAM are lost when power is turned off
  • 7. HOW RAM USED ? Memory is used to store:  i) instructions - > to execute a program  ii) data -> When the computer is doing any job, the data that have to be processed are stored in the primary memory. This data may come from an input device like keyboard or from a secondary storage device like a floppy disk.
  • 8. MOVING INFORMATION WITHIN THE COMPUTER  How do binary numerals move into, out of, and within the computer?  Information is moved about in bytes, or multiple bytes called words.  Words are the fundamental units of information.  The number of bits per word may vary per computer.  A word length for most large IBM computers is 32 bits:
  • 9. MOVING INFORMATION WITHIN THE COMPUTER …  Bits that compose a word are passed in parallel from place to place.  Ribbon cables:  Consist of several wires, molded together.  One wire for each bit of the word or byte.  Additional wires coordinate the activity of moving information.  Each wire sends information in the form of a voltage pulse.
  • 10. MOVING INFORMATION WITHIN THE COMPUTER …
      • Example of sending the word WOW over the ribbon cable
      • Voltage pulses corresponding to the ASCII codes would pass through the cable.
  • 11. PRIMARY MEMORY
      • Primary storage or memory: where the data and programs that are currently in operation or being accessed are stored during use.
      • Consists of electronic circuits: extremely fast and expensive.
      • Two types:
          • RAM (non-permanent)
              • Programs and data can be stored here for the computer's use.
              • Volatile: all information is lost once the computer shuts down.
          • ROM (permanent)
              • Contents do not change.
  • 12. OTHER MEMORY
      • ROM: read-only memory built from transistors [used for storing video game software, in electronic musical instruments]. ROM mostly holds firmware.
      • EPROM: Erasable Programmable Read-Only Memory
      • EEPROM: Electrically Erasable Programmable Read-Only Memory
      • Cache: location in RAM where data is stored for a certain amount of time so that it can be reused.
      • Registers: various flip-flop registers [RS, D, JK, shift, etc.] that hold information
      • Swap: disk space used to accommodate the demand for more RAM.
  • 13. SEGMENT OF RAM
      • Low mem, high mem, Normal mem, DMA, DMA32
      • On a 32-bit architecture [DMA, Normal & HighMem]:
          The address range for addressing RAM is 0x00000000 - 0xffffffff (highest address 4,294,967,295, i.e. 4 GB in total).
          The user space range: 0x00000000 - 0xbfffffff, or 3 GB
          The kernel space range: 0xc0000000 - 0xffffffff, or 1 GB
          Linux splits the 1 GB kernel space into 2 pieces: LOWMEM and HIGHMEM.
      • On a 64-bit machine [DMA, DMA32 & Normal]: Normal memory is available beyond 4 GB
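The 3 GB / 1 GB split quoted on this slide can be checked with a little arithmetic. The sketch below uses only the address ranges given above; the helper names `span_gib` and `owner` are made up for illustration.

```python
# Sketch of the 32-bit user/kernel split, using the ranges from the slide.
GIB = 1 << 30

USER_START, USER_END = 0x00000000, 0xBFFFFFFF
KERNEL_START, KERNEL_END = 0xC0000000, 0xFFFFFFFF


def span_gib(start, end):
    """Size of an inclusive address range, in GiB."""
    return (end - start + 1) / GIB


def owner(addr):
    """Tell which half of the split a 32-bit virtual address falls in."""
    if not 0 <= addr <= KERNEL_END:
        raise ValueError("not a 32-bit address")
    return "kernel" if addr >= KERNEL_START else "user"


print(span_gib(USER_START, USER_END))      # 3.0
print(span_gib(KERNEL_START, KERNEL_END))  # 1.0
print(owner(0xC0100000))                   # kernel
```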
  • 14. MEMORY MAPPING
      • Linux uses only 4 segments on the 32-bit architecture:
          • 2 segments (code and data/stack) for KERNEL SPACE, from [0xC000 0000] (3 GB) to [0xFFFF FFFF] (4 GB)
          • 2 segments (code and data/stack) for USER SPACE, from [0] (0 GB) to [0xBFFF FFFF] (3 GB)
          See the virtual map: $ pmap <PID>; see the stack: $ pstack <PID>
      • Segmentation, Paging [to overcome the flaw in segmentation]:
          • Paging allocates small virtual pages to each process so that they fit in RAM without wasting it.
  • 15. PROCESS ADDRESS SPACE – 32 BIT ARCH
      • [Figure: process address space layout. Kernel space lies between 4 GB and 3 GB (0xC0000000); user space lies between 3 GB and 0 GB. Labels in the figure: Kernel, File name, Environment, Arguments, Stack, Unused Memory, Heap, Bss [Block Started by Symbol], _end, _bss_start, Data, _edata, _etext, Text/code, Header, 0x84000000, Shared Libs.]
      • Text/Code segment: contains the actual code
      • Data: contains initialized global variables
      • BSS: contains uninitialized global variables
      • Heap: dynamic memory
      • Stack: collection of frames/functions
  • 16. PAGE & FRAME
      • Paging, Demand Paging, Swapping
      • Page tables [4 levels on 64-bit, 2 levels on 32-bit]: Page Global Directory, Page Upper Directory, Page Middle Directory, Page Table
      • Min page size: getconf -a | grep -i page
      • Life cycle of a page: active --> inactive list --> dirty --> clean
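To make the four page-table levels concrete, here is a minimal sketch of how a virtual address could be cut into per-level indexes. It assumes the common x86-64 layout (4 KiB pages, 9 index bits per level), which the slide does not state; the function name `split_address` is hypothetical.

```python
# Assumed layout: 4 KiB pages and 512-entry tables, as on x86-64.
PAGE_SHIFT = 12
BITS_PER_LEVEL = 9
LEVELS = ("pgd", "pud", "pmd", "pte")  # top level first


def split_address(vaddr):
    """Return ({level: index}, offset_in_page) for a virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    mask = (1 << BITS_PER_LEVEL) - 1
    indexes = {}
    for depth, name in enumerate(LEVELS):
        # The top level uses the highest 9-bit group, the last level the lowest.
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (len(LEVELS) - 1 - depth)
        indexes[name] = (vaddr >> shift) & mask
    return indexes, offset


# Build an address from known indexes and take it apart again.
example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_address(example))
```

The MMU walks the tables in this order: each index selects an entry in one table, and that entry points to the next table until the final entry yields the page frame.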
  • 17. SWAP, HUGE PAGE, MMU, TLB
      • SWAP: not all pages fit in RAM, so data has to be moved to and from the storage disk
      • Hugepage: the default page size is 4 KB (on x86), but large programs use big chunks of memory, so larger pages are allowed. [sysctl -a | grep -i huge]
      • MMU/TLB: the MMU is responsible for translating logical addresses to physical addresses. The TLB is a buffer used by the MMU.
      • Active/Inactive regions [cat /proc/meminfo]
      • Shmem: shared memory area [ipcs -m]
      • Buddyinfo: view memory fragmentation/allocation [cat /proc/buddyinfo]
      • Cache: used for speeding up access. sync flushes the cache and forces a write to disk; bdflush does this in the background [flush-253:0 in RHEL 6].
          The buffer policy is first-in, first-out.
          The cache policy is Least Recently Used [LRU].
          [$ vmstat -S M 1]
  • 18. VIRTUAL MEMORY, HOW PROGRAM MAPS?
      • Executable text
      • Executable data
      • Heap space
      • Stack
      • Get the exact memory required by a process:
          • $ pmap -x <pid>
          • $ cat /proc/<pid>/status
  • 19. PAGE CACHE MEMORY CONTROL
      • vm.dirty_expire_centisecs=2000 // how old dirty data must be before it is eligible for writeback
      • vm.dirty_writeback_centisecs=400 // how long the flusher threads wait between wakeups
      • vm.dirty_background_ratio=5 // when dirty pages reach this percentage of available memory (free plus reclaimable pages), the pdflush/flush daemon starts writing dirty data to disk
      • vm.dirty_ratio=20 // when dirty pages reach this percentage of available memory, the writing process itself starts writing data to disk
      • vfs_cache_pressure [100]: controls the tendency of the kernel to reclaim the memory used for caching directory and inode objects
      • Swappiness [60]: controls how the kernel will use swap space.
      • To free pagecache: echo 1 > /proc/sys/vm/drop_caches
      • To free dentries and inodes: echo 2 > /proc/sys/vm/drop_caches
      • To free pagecache, dentries and inodes: echo 3 > /proc/sys/vm/drop_caches
      • Cache writes are done by the kernel thread pdflush/bdflush; in RHEL 6 it is flush.
      • Life cycle of pages: active --> inactive list --> dirty --> clean
      Link: https://www.kernel.org/doc/Documentation/sysctl/vm.txt
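As a rough illustration of what the two ratios mean in absolute terms, the sketch below turns them into thresholds. The 8000 MiB figure is an invented example, and the real kernel computes the base from free plus reclaimable pages at run time, so treat the numbers as approximations.

```python
def dirty_thresholds(dirtyable_mib, background_ratio=5, dirty_ratio=20):
    """Return (background, blocking) dirty-page thresholds in MiB.

    dirtyable_mib is an assumed amount of free plus reclaimable memory.
    The defaults are the example values from the slide.
    """
    background = dirtyable_mib * background_ratio / 100
    blocking = dirtyable_mib * dirty_ratio / 100
    return background, blocking


# With 8000 MiB of dirtyable memory, background writeback would start
# near 400 MiB of dirty data and writers would be throttled near 1600 MiB.
print(dirty_thresholds(8000))  # (400.0, 1600.0)
```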
  • 20. PHYSICAL MEMORY ALLOCATION LIMIT
      • CommitLimit: total memory that can be allocated, based on overcommit_ratio
      • Committed_AS: memory currently allocated
      • overcommit_memory: takes a value from 0 to 2
          0 = heuristic overcommit: available memory may be overcommitted, but obviously excessive requests are refused // default
          1 = always overcommit; no overcommit checking is done
          2 = strict accounting: allocations are limited to swap plus overcommit_ratio percent of RAM
      • overcommit_ratio: percentage of RAM counted toward CommitLimit when overcommit_memory is set to 2; default value 50
      • Example: 4 GB RAM, 2 GB swap, overcommit_memory=2, overcommit_ratio=50, so CommitLimit = 2 + (4*50/100) = 2 + 2 = 4 GB
      • Issue: Application failed to start due to shortage of memory, Needed to disable
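The CommitLimit example from this slide can be reproduced with a one-line formula. This is a simplified sketch: it ignores huge pages, which the kernel subtracts from RAM before applying the ratio, and the function name `commit_limit_gb` is made up for illustration.

```python
def commit_limit_gb(ram_gb, swap_gb, overcommit_ratio=50):
    """CommitLimit under overcommit_memory=2, ignoring huge pages."""
    return swap_gb + ram_gb * overcommit_ratio / 100


# The slide's example: 4 GB RAM, 2 GB swap, ratio 50.
print(commit_limit_gb(4, 2))  # 4.0
```

On a running system the resulting figure can be compared with the CommitLimit and Committed_AS lines in /proc/meminfo.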
  • 21. WHY MEMORY CACHE IS REALLY REQUIRED
      Speeds up processing:
      • $ cat > XYZ
      • $ echo 3 > /proc/sys/vm/drop_caches
      • $ time cat XYZ // takes longer: the file is read from disk
      • $ time cat XYZ // takes less time: the file is served from the page cache
  • 22. MEMORY NODES, ZONES IN 32 BIT & 64 BIT
      • Zones on 32-bit:
          • ZONE_DMA (0-16 MB)
          • ZONE_NORMAL (16 MB-896 MB)
          • ZONE_HIGHMEM (896 MB and above)
          HIGHMEM's lower zones are NORMAL+DMA; NORMAL's lower zone is DMA.
      • Zones on 64-bit:
          • DMA: up to 16 MB
          • DMA32: up to 4 GB
          • Normal: beyond 4 GB
      • $ cat /proc/zoneinfo
      • $ cat /proc/pagetypeinfo
      • $ cat /proc/<pid>/numa_maps
      • $ cat /proc/buddyinfo
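The zone boundaries listed on this slide can be written down as a small lookup table. The sketch below classifies a physical address by zone using only those boundaries; the names `ZONES_32`, `ZONES_64` and `zone_for` are hypothetical, and real systems may place the boundaries differently.

```python
MIB = 1 << 20
GIB = 1 << 30

# (zone name, exclusive upper bound); None means "everything above".
ZONES_32 = (("DMA", 16 * MIB), ("Normal", 896 * MIB), ("HighMem", None))
ZONES_64 = (("DMA", 16 * MIB), ("DMA32", 4 * GIB), ("Normal", None))


def zone_for(phys_addr, zones):
    """Return the name of the first zone whose range contains phys_addr."""
    for name, upper in zones:
        if upper is None or phys_addr < upper:
            return name


print(zone_for(1 * GIB, ZONES_32))  # HighMem
print(zone_for(1 * GIB, ZONES_64))  # DMA32
```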
  • 23. LOW MEMORY, ZONE_RECLAIM
      • "lowmem" often means NORMAL+DMA
      • "lowmem" is not present in RHEL 6, 64-bit
      • Reservation is controlled by: lowmem_reserve_ratio [DMA NORMAL HIGHMEM]
      • cat /proc/sys/vm/lowmem_reserve_ratio
          256 256 32 // (1/256)*100 % = 0.39% of the nearest zone is reserved
      • zone_reclaim_mode: how aggressively memory is reclaimed when a zone runs out of memory
          1 = Zone reclaim on
          2 = Zone reclaim writes dirty pages out
          4 = Zone reclaim swaps pages
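The 0.39% figure follows directly from the ratio. The sketch below repeats that arithmetic and, as a simplification of what the kernel does, applies a ratio to a page count; the page count used is an invented example.

```python
def reserve_percent(ratio):
    """Share of the higher zone protected by a lowmem_reserve_ratio entry."""
    return 100 / ratio


def reserved_pages(higher_zone_pages, ratio):
    """Pages kept back from allocations that could use the higher zone."""
    return higher_zone_pages // ratio


print(round(reserve_percent(256), 2))  # 0.39
print(reserve_percent(32))             # 3.125
print(reserved_pages(262144, 256))     # 1024 (262144 pages is 1 GiB of 4 KiB pages)
```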
  • 24. NON-UNIFORM MEMORY ACCESS (NUMA)
      • NUMA concept: NUMA placement (placement of processor and memory); manual placement by the application; MPI (Message Passing Interface)
      • Place the application on the correct node
      • Two memory policies: Node Local [after Linux boot], Interleave [during kernel boot]
      • cat /proc/<pid>/numa_maps
      • numactl -s // show policy
      • numactl --hardware
      • numactl [ --interleave nodes ] [ --preferred node ] [ --membind nodes ] [ --cpunodebind nodes ] [ --physcpubind cpus ] [ --localalloc ] [--] command {arguments ...}
      Ref: http://www.redhat.com/summit/2012/pdf/2012-DevDay-Lab-NUMA-Hacker.pdf
  • 25. NUMA MANAGEMENT
      • numactl --physcpubind=0,1,2,3 example_process
      • numactl --physcpubind=0-3 example_process
      • numactl --cpunodebind=2 example_process // run on the CPUs of node 2
      • numactl --physcpubind=0 --localalloc example_process
      • numactl --membind=4 example_process
      • numactl --cpunodebind=0 example_process // only execute the command on the CPUs of node 0
      • numactl --cpubind=0 --membind=0,1 process // run the process on node 0 with memory allocated on nodes 0 and 1
      • numactl --hardware
      • cat /sys/devices/system/node/node*/numastat
      • Allocation: $ watch -n1 numastat
  • 26. KERNEL MEMORY ALLOCATORS
      • Low-level page allocator:
          • Buddy system for contiguous multi-page allocations
          • Provides pages for:
              • in-kernel allocations (slab cache)
              • vmalloc areas (kernel modules, multi-page data areas)
              • page cache, anonymous user pages
              • misc. other users
      • Slab cache:
          • Manages allocations of objects of the same type
          • Large-scale users: inodes, dentries, block I/O, network ...
          • kmalloc (generic allocator) implemented on top
      • Tool: slabtop
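The buddy system tracks free memory as blocks of 2^order contiguous pages, which is what /proc/buddyinfo reports per zone. The sketch below totals the free pages in one such line; the sample line is invented for illustration, and `free_pages` is a hypothetical helper.

```python
def free_pages(buddyinfo_line):
    """Return (zone, free_page_count) for one /proc/buddyinfo line.

    Column k after the zone name counts free blocks of 2**k pages.
    """
    _, rest = buddyinfo_line.split("zone", 1)
    fields = rest.split()
    zone = fields[0]
    blocks = [int(n) for n in fields[1:]]
    pages = sum(count << order for order, count in enumerate(blocks))
    return zone, pages


# Invented sample: 3 single pages, 2 pairs, 1 block of 4, 1 block of 1024.
sample = "Node 0, zone   Normal      3      2      1      0      0      0      0      0      0      0      1"
print(free_pages(sample))  # ('Normal', 1035)
```

Many free pages concentrated in the low-order columns, with zeros in the high-order ones, is a sign of fragmentation: large contiguous allocations may fail even though memory is free.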
  • 27. PAGE FAULT HANDLING
      • Hardware support:
          • Accessing invalid pages causes a 'page translation' check
          • Writing to protected pages causes a 'protection exception'
          • Translation-exception identification provides the address
          • 'Suppression on protection' facility essential!
      • Linux kernel page fault handler:
          • Determines address/access validity according to the VMA
          • Invalid accesses cause SIGSEGV delivery
          • Valid accesses trigger: page-in, swap-in, copy-on-write
          • Extra support for the stack VMA: grows automatically
          • Out-of-memory if overcommitted causes SIGBUS
  • 28. TOOLS TO CHECK MEMORY USAGE
      • Report paging statistics: sar -B
      • Report memory utilization statistics: sar -r
      • Report memory statistics: sar -R
      • Report swap space utilization statistics: sar -S
      • Current memory usage:
          • free -m (or -k, -g)
          • cat /proc/meminfo
      • Memory allocation:
          • cat /proc/buddyinfo
      • VM memory allocation:
          • pmap -x <PID>
          • cat /proc/<pid>/status
      • Display kernel slab cache and memory information in real time:
          • slabtop
          • vmstat
          • ps
          • top
          • cat /proc/meminfo
          • strace, gcore
  • 29. MEMORY LEAK CHECK
      • Usage check: historical sar report
      • mtrace: built-in C library function.
      • Valgrind:
          • valgrind --tool=memcheck --leak-check=full --show-reachable=yes snmpd -f -Lo
  • 30. ISSUES RELATED TO MEMORY
      • TCP/IP communication delay: RH cluster broken
      • High cache usage: slows down the application / system
      • Memory pressure: memory leak, or the application is not tuned properly
      • Memory fragmentation: hugepages not used
      • OOM killer kills an application: caused by memory pressure; OOM is enabled by default and kills based on the badness value.
      • Segmentation fault: the kernel reclaims in the normal/low memory region, so there is no room for the kernel and it encounters a segmentation fault.
      • Faulty memory: hardware failure or circuit failure in a chip; needs diagnosis and replacement of the chip
  • 31. TROUBLESHOOTING MEMORY ISSUE
      • Memory & swap usage test:
          swap_tendency = mapped_ratio/2 + distress + vm_swappiness
          mapped_ratio = % of physical memory in use
          distress = how much trouble the kernel has freeing memory
          vm_swappiness = default 60
          swap_tendency >= 100: eligible for swap
          swap_tendency < 100: reclaim from page cache
      • Sysrq:
          echo 1 > /proc/sys/kernel/sysrq
          echo m > /proc/sysrq-trigger
      • Backtrace analysis
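The swap_tendency rule above is easy to try with numbers. The sketch below implements the formula exactly as written on the slide; the input values are invented, and the heuristic itself belongs to older kernels, so it is meant as a reasoning aid rather than a description of current reclaim behavior.

```python
def swap_tendency(mapped_ratio, distress, swappiness=60):
    """swap_tendency = mapped_ratio/2 + distress + vm_swappiness."""
    return mapped_ratio / 2 + distress + swappiness


def reclaim_target(mapped_ratio, distress, swappiness=60):
    """Apply the slide's threshold of 100 to pick where reclaim happens."""
    if swap_tendency(mapped_ratio, distress, swappiness) >= 100:
        return "swap"
    return "page cache"


print(reclaim_target(80, 0))      # swap        (40 + 0 + 60 = 100)
print(reclaim_target(50, 0))      # page cache  (25 + 0 + 60 = 85)
print(reclaim_target(50, 0, 10))  # page cache  (25 + 0 + 10 = 35)
```

Lowering swappiness raises the amount of mapped memory or distress needed before swapping starts, which is why it is a common tuning knob.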
  • 32. TROUBLESHOOTING
      • OOM messages investigation:
      Messages:
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461595] [] __alloc_pages_nodemask+0x72c/0x740
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461599] [] __get_free_pages+0x1c/0x30
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461602] [] get_zeroed_page+0x12/0x20
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461606] [] fill_read_buffer.isra.8+0xaa/0xd0
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461609] [] sysfs_read_file+0x7d/0x90
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461613] [] vfs_read+0x8c/0x160
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70
      Oct 25 07:28:34 nldedip4k031 kernel: [87976.461624] [] sysenter_do_call+0x12/0x28
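When reading a backtrace like the one above, it helps to reduce each line to its function name and read the call chain from bottom to top. The sketch below does that with a regular expression written for the line format shown on this slide; other kernels or log formats may need a different pattern, and `frames` is a hypothetical helper.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE", optionally preceded by "? ",
# which the kernel prints for frames it is not sure about.
FRAME = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(log_text):
    """Return a list of (function_name, is_unreliable) from a call trace."""
    found = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match:
            found.append((match.group(2), match.group(1) is not None))
    return found


log = """\
Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80
Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0
Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70
"""
for name, unreliable in frames(log):
    print(name, "(unreliable)" if unreliable else "")
```

Read bottom-up, the full trace on the slide shows a read() system call on a sysfs file requesting a zeroed page, the page allocator failing, and the OOM killer being invoked.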