This document discusses Linux huge pages, including:
- What huge pages are and how they can reduce memory management overhead by allocating larger blocks of memory
- How to configure huge pages on Linux, including installing required packages, mounting the huge page filesystem, and setting kernel parameters
- When to configure huge pages: they suit data-intensive or latency-sensitive applications such as databases, but testing is required because of disadvantages such as reduced swappability
2. Agenda
• What are you talking about?
• Linux kernel map
• Memory Allocation
• Paging Model
• Page Fault
• Swapping
• Why Huge Pages
• How to configure
• When to configure
• Summary
3. Premises
• This is mainly about x86-64 (Intel and AMD CPUs produced after 2004)
• Huge pages differ across hardware architectures; those differences are out of our scope
• We will not explore the MMU, the TLB, or the other internals of virtual memory management
• Some images are outdated (e.g., they show Linux kernel 2.6, while the current version is 5.5), but they still illustrate the aspects discussed in this presentation very well
10. Why Huge Pages
• As we can see, memory management is a complicated process involving many ‘round-trips’
• Huge pages are about allocating larger blocks of memory at once, thus cutting the ‘round-trips’ associated with small pages
• Huge pages cannot be swapped out
• A set of 4 KB pages can turn into a single 2 MB (with PAE), 4 MB or even 1 GB huge page

Number of Pages (4 KB)   Number of Huge Pages   Huge Page Equivalence
512                      1                      2 MB (2,048 KB)
1,024                    1                      4 MB (4,096 KB)
262,144                  1                      1 GB (1,024 MB or 1,048,576 KB)
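The equivalences in the table come from dividing the huge page size by the 4 KB base page size. A minimal sketch of that arithmetic follows; the function name `pages_per_huge_page` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: how many base pages fit in one huge page.
# Usage: pages_per_huge_page <huge page size in KB> [base page size in KB, default 4]
pages_per_huge_page() {
    huge_kb=$1
    base_kb=${2:-4}
    echo $(( huge_kb / base_kb ))
}

pages_per_huge_page 2048      # 2 MB huge page -> 512
pages_per_huge_page 4096      # 4 MB huge page -> 1024
pages_per_huge_page 1048576   # 1 GB huge page -> 262144
```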
11. Why Huge Pages
• There are 2 huge page variants
• HugeTLB File System
  • Works as a pseudo filesystem where you need to define the allocation manually
  • We will use this approach
• Transparent Huge Pages
  • Works transparently: the Linux kernel decides on its own whether or not the application requires huge pages, but it is not recommended for latency-sensitive applications
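To see which Transparent Huge Pages mode a machine is using, the kernel's sysfs file can be read. The sketch below assumes the kernel exposes /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the word in square brackets; the helper name `thp_active_mode` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the active mode (the bracketed word) from a line
# in the format of /sys/kernel/mm/transparent_hugepage/enabled,
# e.g. "always [madvise] never".
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

thp_file=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
    echo "THP mode: $(thp_active_mode "$(cat "$thp_file")")"
else
    # Assumption: a missing file means the kernel was built without THP support.
    echo "THP interface not found"
fi
```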
12. How to Configure
• Checking if it is possible to enable huge pages
netto@bella:~$ getconf PAGESIZE
4096
netto@bella:~$ cat /proc/cpuinfo | grep -E 'pse|pdpe' | tail -1
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss
ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2
smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln
pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
getconf returns the standard page size, in bytes, for a given CPU architecture
/proc/cpuinfo contains all the data related to the CPU
pse => supports 2 MB huge pages
pdpe1gb => supports 1 GB huge pages
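A plain substring search for `pse` also matches `pse36`, so a script should match whole flag names. The following is a minimal sketch under that assumption; `has_cpu_flag` is a hypothetical helper, not an existing command.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the flag ($1) appears as a whole word in a
# space-separated flags string ($2), as in the "flags" line of /proc/cpuinfo.
has_cpu_flag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Take the first "flags" line; the result is empty if the file is missing.
flags=$(grep '^flags' /proc/cpuinfo 2>/dev/null | head -n 1 | cut -d: -f2)
flags=$(echo $flags)   # unquoted on purpose: collapses tabs/newlines into single spaces

for f in pse pdpe1gb; do
    if has_cpu_flag "$f" "$flags"; then
        echo "$f: supported"
    else
        echo "$f: not reported"
    fi
done
```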
13. How to Configure
• Installing the required packages to configure huge pages as root
• WARNING: your distribution might require a slightly different setup (e.g., a different package manager or package names, fewer steps)

Red Hat / CentOS:
root@bella:~$ yum -y install libhugetlbfs libhugetlbfs-utils

Debian / Ubuntu:
root@bella:~$ apt-get -y install hugepages
14. How to Configure
• In the following setup, you can select whichever huge page size is more convenient for your application

# this is the pseudo directory where huge pages will be mapped; it needs to be an existing directory
# the Red Hat configuration differs a little
root@bella:~$ mkdir -p /dev/hugepages
# this can be converted to an /etc/fstab entry
root@bella:~$ mount -t hugetlbfs -o gid=<group id>,pagesize=<2M or 1G>,... none /dev/hugepages

• Add to sysctl.conf
# formula: (size required for your scenario / 2 MB) or (size required for your scenario / 1 GB)
# there are situations, like Oracle DB, where it is recommended to allocate huge pages only for the SGA
vm.nr_hugepages = <number of pages>
# the same gid used on mount; it must be the group your application runs under
vm.hugetlb_shm_group = <group id>

• Reboot
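Since vm.nr_hugepages counts huge pages rather than bytes, its value is the memory to reserve divided by the huge page size. The sketch below rounds up so the pool is never smaller than the requirement. The helper name `hugepages_needed` and the 8 GB figure are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: number of huge pages needed to cover a memory requirement.
# Usage: hugepages_needed <required size in MB> <huge page size in MB>
hugepages_needed() {
    required_mb=$1
    page_mb=$2
    # Integer division rounded up.
    echo $(( (required_mb + page_mb - 1) / page_mb ))
}

# Example: an assumed 8 GB (8192 MB) shared memory area.
echo "2 MB pages: vm.nr_hugepages = $(hugepages_needed 8192 2)"
echo "1 GB pages: vm.nr_hugepages = $(hugepages_needed 8192 1024)"
```

With these inputs the sketch prints 4096 for 2 MB pages and 8 for 1 GB pages.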
15. How to Configure
# if huge pages are set up correctly, at least one pool will be displayed
netto@bella:~$ hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152        0        0        0        *
1073741824        1        1        1
# huge pages are enabled if HugePages_Total is > 0
netto@bella:~$ cat /proc/meminfo
...
HugePages_Total: <huge pages pool size>
HugePages_Free: <number of huge pages that are not allocated>
HugePages_Rsvd: <number of huge pages that are reserved but not allocated>
HugePages_Surp: <number of surplus huge pages above the configured pool size>
Hugepagesize: 2048 kB
Hugetlb: 1048576 kB
DirectMap4k: 572400 kB
DirectMap2M: 12943360 kB
DirectMap1G: 19922944 kB
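The HugePages_Total check can be scripted instead of read by eye. This is a minimal sketch, assuming the usual "Name: value" layout of /proc/meminfo; `meminfo_field` is a hypothetical helper. Note that HugePages_Total describes only the pool of the default huge page size.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field from meminfo-formatted
# text read on standard input (e.g. "HugePages_Total:     512").
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

if [ -r /proc/meminfo ]; then
    total=$(meminfo_field HugePages_Total < /proc/meminfo)
    free=$(meminfo_field HugePages_Free < /proc/meminfo)
    # An empty value (field absent) is treated as 0.
    if [ "${total:-0}" -gt 0 ]; then
        echo "huge pages enabled: ${free:-0} of $total free"
    else
        echo "no huge pages in the default pool"
    fi
fi
```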
16. How to Configure
Application          Where                               Syntax
Oracle JDK/OpenJDK   Command line argument               -XX:+UseLargePages
MySQL                my.cnf, inside the [mysqld] block   large_pages=ON
PHP                  php.ini, opcache block              opcache.huge_code_pages=1
Python               Using the mmap module               MADV_HUGEPAGE
PostgreSQL           postgresql.conf                     huge_pages=ON
Docker               Command line argument               --device=/dev/hugepages:/dev/hugepages
17. When to Configure
Advantages
• Huge pages can reduce pressure on the TLB/MMU
• Huge pages are not swappable
• Any data-intensive application that properly uses mmap(), madvise(), shmget(), shmat() and some other calls can benefit from them
• Any memory-bound application can benefit from them
• They help when latency/response time is critical

Disadvantages
• Internal and external memory fragmentation will be amplified if huge pages are not configured properly
• “Swappability” avoids quick memory starvation, at some performance cost
• It is an extension to POSIX, not part of it: other Unix-like systems such as Solaris and FreeBSD, and even Windows, have a similar feature with a totally different setup
• NUMA (non-uniform memory access) systems may not see all the benefits that a UMA system (hardware with uniform memory access) does
• Transparent Huge Pages is not recommended in general (it has very specific use cases)

• Many other advantages and disadvantages can come up but, most importantly: test!
• It might be necessary to increase the memory limits in /etc/security/limits.conf