1. VM and I/O Topics in Linux
Page Replacement, Swap and I/O
Jiannan Ouyang
Ph.D. Student
Computer Science Department
University of Pittsburgh
05/05/2011
2. Outline
• Overview of Linux Memory Management
• Page Reclamation
• Swap & I/O
Jiannan Ouyang, CS PhD@PITT 2
3. Describing Physical Memory
Node: a NUMA memory region
Zone: a range of memory of one type within a node
struct page: descriptor for one page frame
4. Physical Page Allocation
Binary Buddy Allocator:
• If a block of the desired size is not available, a larger block is split in half, and the
two halves are buddies of each other. One half is used for the allocation, and the other
stays free. Blocks are halved repeatedly until a block of the desired size is
available.
• When a block is later freed, its buddy is examined; if the buddy is also free, the two are
coalesced into one larger block.
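The split and coalesce steps above can be sketched as a toy model. This is not the kernel's implementation: the class name `BuddyAllocator`, the `free_lists` layout, and the use of page-frame indices are all assumptions made for illustration.

```python
class BuddyAllocator:
    """Toy binary buddy allocator over 2**max_order page frames."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[k] holds the start frames of free blocks of 2**k pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)

    def alloc(self, order):
        # Find the smallest free block that is large enough.
        k = order
        while k <= self.max_order and not self.free_lists[k]:
            k += 1
        if k > self.max_order:
            return None  # nothing large enough is free
        start = min(self.free_lists[k])
        self.free_lists[k].remove(start)
        # Halve repeatedly; each time the upper half becomes a free buddy.
        while k > order:
            k -= 1
            self.free_lists[k].add(start + (1 << k))
        return start

    def free(self, start, order):
        # Coalesce with the buddy for as long as the buddy is also free.
        while order < self.max_order:
            buddy = start ^ (1 << order)
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)
```

The buddy of a block is found by flipping one bit of its start frame (`start ^ (1 << order)`), which is why block sizes are restricted to powers of two.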
7. User Memory Mapping
[Figure: one process's virtual memory (kernel space above a 3-GB user space that holds stack, mappings, data and text) mapped onto physical memory.]
8. User Memory Mapping
[Figure: two processes' virtual memories (each with kernel space and a user space that holds stack, data and text) mapped onto the same physical memory.]
9. Outline
• Overview of Linux Memory Management
• Page Reclamation
• Swap & I/O
10. Memory Customers
[Figure: the Buddy System, with Request and Reclaim arrows to its memory customers: Kernel Code & Data, Slab Cache, Icache & Dcache, User Code & Data, Page Cache.]
• All memory except “User Code & Data” is used by the kernel
• “User Code & Data” is managed in user space (e.g. with malloc/free);
the kernel can only swap out user pages
11. Slab Cache
• A cache of commonly used objects, kept in an initialized state and
available for use by the kernel.
• Saves the time of repeatedly allocating, initializing and freeing the same kind of object.
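The idea of keeping freed objects initialized for reuse can be sketched as follows. This is a toy model only: `ObjectCache` and its fields are hypothetical names, and the real slab allocator also manages the underlying pages, which is not shown here.

```python
class ObjectCache:
    """Toy slab-style cache: freed objects stay constructed for reuse."""

    def __init__(self, constructor):
        self.constructor = constructor
        self.free_objects = []
        self.constructed = 0  # how many times the constructor actually ran

    def alloc(self):
        # Reuse an already-initialized object when one is available.
        if self.free_objects:
            return self.free_objects.pop()
        self.constructed += 1
        return self.constructor()

    def free(self, obj):
        # Assumption: the caller hands the object back in its initialized state.
        self.free_objects.append(obj)
```

Allocating, freeing, and allocating again returns the same object and runs the constructor only once, which is where the time saving comes from.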
12. Disk related caches
• Dcache (metadata): dentry objects
representing filesystem pathnames.
• Icache (metadata): inode objects
representing disk inodes.
• Page Cache (data): data pages from disk;
the main disk cache used by the kernel
13. Memory Customers Review
[Figure: the Buddy System and its memory customers, repeated from slide 10.]
Next we'll see when the kernel starts reclaiming pages, which pages it
reclaims, and the replacement policy.
14. Reclamation: When?
Zone Watermarks
• Pages Low: when free pages in the zone fall to this level, the buddy
allocator wakes kswapd to start freeing pages. By default the value is
twice pages min.
• Pages Min: when free pages fall to this level, the allocator does
kswapd's work itself, synchronously; this is sometimes referred to as the
direct-reclaim path.
• Pages High: once free pages reach this level, kswapd goes back to sleep.
By default the value is three times pages min.
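A minimal sketch of how the three watermarks drive reclamation, using the default ratios stated above. The function names and the exact boundary comparisons (`<=`, `>=`) are assumptions for illustration, not kernel code.

```python
def reclaim_action(free_pages, pages_min):
    """Return what the allocator does at this free-page level."""
    pages_low = 2 * pages_min  # default ratio from the slide
    if free_pages <= pages_min:
        return "direct reclaim"  # allocator frees pages synchronously
    if free_pages <= pages_low:
        return "wake kswapd"     # background reclaim starts
    return "none"


def kswapd_should_sleep(free_pages, pages_min):
    """kswapd stops once the zone is back at the high watermark."""
    pages_high = 3 * pages_min  # default ratio from the slide
    return free_pages >= pages_high
```

With `pages_min = 32`, kswapd is woken at 64 free pages or fewer and keeps working until 96 pages are free, so background reclaim has a margin before the allocator is forced into direct reclaim.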
17. Reclamation: Which? (Cont.)
• Mapped & Anonymous Pages
– Mapped: backed by a file
– Anonymous: belongs to an anonymous memory region of a
process
• Shared & Non-shared Pages
– A shared page must be unmapped from all page table entries
that reference it at once: reverse mapping makes this possible,
an important improvement in the Linux 2.6 kernel
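The purpose of reverse mapping can be sketched with a toy index from page frame back to the entries that map it. This is only an illustration of the lookup direction; `ReverseMap` and its dictionaries are hypothetical and do not mirror the kernel's actual data structures.

```python
from collections import defaultdict


class ReverseMap:
    """Toy model: find every page table entry that maps a given frame."""

    def __init__(self):
        self.page_tables = defaultdict(dict)  # pid -> {vaddr: frame}
        self.rmap = defaultdict(set)          # frame -> {(pid, vaddr)}

    def map_page(self, pid, vaddr, frame):
        self.page_tables[pid][vaddr] = frame
        self.rmap[frame].add((pid, vaddr))

    def unmap_all(self, frame):
        """Remove every page table entry that points at `frame`."""
        mappers = self.rmap.pop(frame, set())
        for pid, vaddr in mappers:
            del self.page_tables[pid][vaddr]
        return len(mappers)
```

Without the `rmap` index, unmapping a shared frame would require scanning the page tables of every process.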
18. Reclamation: Which? (Cont.)
shrink_caches reclaims until the given target number of pages is met, in this order:
1. slab cache (kmem_cache_reap)
2. User pages & page cache (refill & shrink_cache)
3. dcache and icache
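The "reclaim in order until the target is met" loop can be sketched as below. The function signature, the `(name, shrink)` pairs, and the `make_pool` helper are all hypothetical; only the ordering and the stop condition come from the slide.

```python
def shrink_caches(target, shrinkers):
    """Call each shrinker in order until `target` pages have been freed."""
    freed = 0
    for _name, shrink in shrinkers:
        if freed >= target:
            break
        freed += shrink(target - freed)
    return freed


def make_pool(pages):
    """Hypothetical stand-in for a cache that holds `pages` reclaimable pages."""
    state = {"pages": pages}

    def shrink(nr_wanted):
        freed = min(nr_wanted, state["pages"])
        state["pages"] -= freed
        return freed

    return shrink
```

Later caches in the list are touched only when the earlier ones could not free enough pages.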
20. Moving pages across the list
mark_page_accessed():
on each access, increment the two-bit (active, ref) counter;
if active becomes 1, move the page inactive -> active;
refill_inactive_zone():
if (ref == 1) { ref = 0; move to head of active list; }
else { move active -> inactive; }
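The pseudocode above can be turned into a small runnable model. It is a sketch under stated assumptions: the `Page` and `Zone` classes, the saturation of the counter at (1, 1), and scanning the active list from its tail are choices made here for illustration, not taken from the kernel source.

```python
from collections import deque


class Page:
    def __init__(self, name):
        self.name = name
        self.active = 0
        self.ref = 0


class Zone:
    def __init__(self):
        self.active_list = deque()    # index 0 is the head
        self.inactive_list = deque()

    def add_page(self, page):
        self.inactive_list.appendleft(page)

    def mark_page_accessed(self, page):
        # (active, ref) counts (0,0) -> (0,1) -> (1,0) -> (1,1), then saturates.
        if page.ref == 0:
            page.ref = 1
        elif page.active == 0:
            page.active, page.ref = 1, 0
            self.inactive_list.remove(page)
            self.active_list.appendleft(page)

    def refill_inactive_zone(self, nr_scan):
        # Assumption: scan from the tail, the oldest end of the active list.
        for _ in range(min(nr_scan, len(self.active_list))):
            page = self.active_list.pop()
            if page.ref == 1:
                page.ref = 0
                self.active_list.appendleft(page)   # second chance
            else:
                page.active = 0
                self.inactive_list.appendleft(page)
```

A page therefore needs two accesses to be promoted to the active list, and a referenced active page survives one scan before it can be demoted.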
21. Outline
• Overview of Linux Memory Management
• Page Reclamation
• Swap & I/O
22. Swap
• Lets the kernel reclaim all the page frames
obtained by a process, not only those that
have an image on disk:
– Anonymous pages (user stack or heap)
– Dirty pages that belong to a private memory
mapping of a process
– IPC shared pages
23. Swap (Cont.)
• Set up “swap areas” on disk
• Allocate and free “page slots” in swap
areas
• Provide functions both to “swap out” pages
from RAM into a swap area and to “swap in”
pages from a swap area into RAM
• Mark page table entries to keep track of the
positions of data in the swap areas
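The four responsibilities above can be sketched together in a toy model. Everything here is hypothetical: `SwapArea`, the `(present, value)` page table entries, and the dictionary standing in for RAM are illustration only, and freeing the slot on swap-in is a simplification.

```python
class SwapArea:
    """Toy swap area: a fixed array of page slots on 'disk'."""

    def __init__(self, nr_slots):
        self.slots = [None] * nr_slots  # None marks a free slot

    def swap_out(self, data):
        """Store a page's data in the first free slot; return its index."""
        for index, content in enumerate(self.slots):
            if content is None:
                self.slots[index] = data
                return index
        raise MemoryError("swap area is full")

    def swap_in(self, index):
        """Read a page back and free its slot (a simplification)."""
        data = self.slots[index]
        self.slots[index] = None
        return data


def swap_out_page(page_table, ram, swap, vaddr):
    present, frame = page_table[vaddr]
    assert present
    slot = swap.swap_out(ram.pop(frame))
    # Not-present entry: the value field now records the swap slot.
    page_table[vaddr] = (False, slot)


def swap_in_page(page_table, ram, swap, vaddr, free_frame):
    present, slot = page_table[vaddr]
    assert not present
    ram[free_frame] = swap.swap_in(slot)
    page_table[vaddr] = (True, free_frame)
```

The key point is that a non-present page table entry is reused to remember where the page lives in the swap area, so a later fault on that address can find the data.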