Linux Huge Pages
Why? How? When?
Agenda
• What are you talking about?
• Linux kernel map
• Memory Allocation
• Paging Model
• Page Fault
• Swapping
• Why Huge Pages
• How to configure
• When to configure
• Summary
Premises
• This is mainly about x86-64 (Intel and AMD CPUs produced after 2004)
• Huge pages differ among hardware architectures; those differences are out of our scope
• We will not explore the MMU, the TLB and all the internals of virtual memory management
• Some images are outdated (e.g. they show Linux kernel 2.6 while the current version is 5.5), but they still illustrate the aspects discussed in this presentation very well
What are you talking about?
This is the Linux kernel map for version 2.6.36. Although it is 10 years old, it gives us the big picture.
[Diagrams: Memory Allocation, Paging Model, Page Fault, Swapping]
Why Huge Pages
• As we can see, memory management is a complicated process involving many round trips
• Huge pages are about allocating larger blocks of memory at once, thus cutting the round trips associated with small pages
• Huge pages cannot be swapped out
• A set of 4 KB pages can turn into a single 2 MB page (with PAE or in 64-bit mode), a 4 MB page (32-bit mode without PAE) or even a 1 GB page

Number of Pages (4 KB)   Number of Huge Pages   Huge Page Equivalence
512                      1                      2 MB (2,048 KB)
1,024                    1                      4 MB (4,096 KB)
262,144                  1                      1 GB (1,024 MB or 1,048,576 KB)
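The equivalences in the table follow from dividing the huge page size by the 4 KB base page size. A minimal Python sketch of that arithmetic, assuming binary units (1 KB = 1,024 bytes); the function name `base_pages_per_huge_page` is made up for illustration:

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Standard x86-64 page size, the value `getconf PAGESIZE` reports.
BASE_PAGE_SIZE = 4 * KIB


def base_pages_per_huge_page(huge_page_size: int,
                             base_page_size: int = BASE_PAGE_SIZE) -> int:
    """Return how many base pages a single huge page replaces."""
    if huge_page_size % base_page_size != 0:
        raise ValueError("huge page size must be a multiple of the base page size")
    return huge_page_size // base_page_size


for label, size in (("2 MB", 2 * MIB), ("4 MB", 4 * MIB), ("1 GB", GIB)):
    print(f"{label}: {base_pages_per_huge_page(size)} pages of 4 KB")
```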
• There are 2 huge page variants
  • HugeTLB File System (hugetlbfs)
    • Works as a pseudo filesystem where you need to define the allocation manually
    • We will use this approach
  • Transparent Huge Pages (THP)
    • Works transparently: the Linux kernel decides on its own whether the application needs huge pages, but it is not recommended for latency-sensitive applications
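Before choosing between the two variants it can help to know how Transparent Huge Pages is currently set. On kernels built with THP support, the file `/sys/kernel/mm/transparent_hugepage/enabled` lists the available modes with the active one in square brackets (for example `always [madvise] never`). The sketch below is one possible way to read it from Python; the helper names are hypothetical, and the path may be missing on kernels without THP.

```python
import re
from pathlib import Path
from typing import Optional

# Path exposed by kernels built with Transparent Huge Pages support.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> Optional[str]:
    """Return the bracketed (active) mode, or None if no mode is marked."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(path: Path = THP_ENABLED_PATH) -> Optional[str]:
    """Read the active THP mode, or None when the file is not available."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


print("THP mode:", current_thp_mode())
```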
How to Configure
• Checking if it is possible to enable huge pages
netto@bella:~$ getconf PAGESIZE
4096
netto@bella:~$ grep -E 'pse|pdpe' /proc/cpuinfo | tail -1
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss
ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2
smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln
pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
• getconf returns the standard page size for a given CPU architecture, in bytes
• /proc/cpuinfo contains all the data related to the CPU
• pse => supports huge pages of 2 MB
• pdpe1gb => supports huge pages of 1 GB
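The same check can be scripted. The sketch below, with hypothetical helper names, looks for the `pse` and `pdpe1gb` flags as whole words in a `flags` line, so that a flag such as `pse36` is not mistaken for `pse`. It assumes the `/proc/cpuinfo` layout shown above and returns an empty result where that file does not exist.

```python
from pathlib import Path


def huge_page_support(flags_line: str) -> dict:
    """Map huge page sizes to whether the matching CPU flag is present."""
    # Drop the "flags :" label if present, then compare whole tokens only.
    _, sep, flag_text = flags_line.partition(":")
    flags = set((flag_text if sep else flags_line).split())
    return {"2MB": "pse" in flags, "1GB": "pdpe1gb" in flags}


def cpu_flags_line(path: str = "/proc/cpuinfo") -> str:
    """Return the first 'flags' line of /proc/cpuinfo, or '' if unavailable."""
    try:
        for line in Path(path).read_text().splitlines():
            if line.startswith("flags"):
                return line
    except OSError:
        pass
    return ""


print(huge_page_support(cpu_flags_line()))
```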
• Installing the required packages to configure huge pages, as root
• WARNING: your distribution might require a slightly different setup (e.g. a different package manager or package names, fewer steps)

Red Hat / CentOS:
root@bella:~$ yum -y install libhugetlbfs libhugetlbfs-utils

Debian / Ubuntu:
root@bella:~$ apt-get -y install hugepages
• In the following case, you can select whichever huge page size is more convenient for your application
# this is the pseudo directory where huge pages will be mapped; it needs to be an existing directory
# the Red Hat configuration differs a little
root@bella:~$ mkdir -p /dev/hugepages
# this can be converted to an /etc/fstab entry
root@bella:~$ mount -t hugetlbfs -o gid=<group id>,pagesize=<2M or 1G>,... none /dev/hugepages
• Add to sysctl.conf
# formula: size required for your scenario / huge page size (2 MB or 1 GB), rounded up
# there are situations, like Oracle DB, where it is recommended to allocate huge pages only for the SGA
vm.nr_hugepages = <number of pages>
# the same gid used on mount; it must be the group your application runs under
vm.hugetlb_shm_group = <group id>
• Reboot
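As a rough illustration of sizing the pool: `vm.nr_hugepages` counts huge pages, so one way to derive it is to divide the memory the application needs by the huge page size and round up. The Python sketch below does that and prints the two sysctl.conf lines. The 8 GB requirement and group id 1001 are made-up example values, and the function names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def nr_hugepages(required_bytes: int, huge_page_bytes: int = 2 * MIB) -> int:
    """Number of huge pages needed to cover required_bytes, rounded up."""
    if required_bytes < 0 or huge_page_bytes <= 0:
        raise ValueError("sizes must be positive")
    return (required_bytes + huge_page_bytes - 1) // huge_page_bytes


def sysctl_lines(pages: int, group_id: int) -> str:
    """Render the two settings in sysctl.conf syntax."""
    return f"vm.nr_hugepages = {pages}\nvm.hugetlb_shm_group = {group_id}"


# Example values only: 8 GB backed by 2 MB pages, application group 1001.
print(sysctl_lines(nr_hugepages(8 * GIB), group_id=1001))
```

With 1 GB pages the same 8 GB would need `nr_hugepages(8 * GIB, GIB)`, that is, 8 pages.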
# if huge pages are correctly set up, at least one pool will be displayed
netto@bella:~$ hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152        0        0        0        *
1073741824        1        1        1
# huge pages are enabled if HugePages_Total is > 0
netto@bella:~$ cat /proc/meminfo
...
HugePages_Total: <huge pages pool size>
HugePages_Free: <number of huge pages that are not allocated>
HugePages_Rsvd: <number of huge pages that are reserved but not allocated>
HugePages_Surp: <number of surplus huge pages, i.e. pages in the pool above vm.nr_hugepages>
Hugepagesize: 2048 kB
Hugetlb: 1048576 kB
DirectMap4k: 572400 kB
DirectMap2M: 12943360 kB
DirectMap1G: 19922944 kB
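The `HugePages_Total > 0` check can also be automated. The Python sketch below extracts the huge page counters from `/proc/meminfo`-style text; the sample numbers are invented for the example, and `parse_hugepage_counters` is a hypothetical helper name.

```python
def parse_hugepage_counters(meminfo_text: str) -> dict:
    """Extract the huge page related fields of /proc/meminfo as integers."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue
        key = key.strip()
        if key.startswith("HugePages_") or key in ("Hugepagesize", "Hugetlb"):
            # Sizes are reported in kB; the counters are plain page counts.
            counters[key] = int(fields[0])
    return counters


# Invented sample; on a real system read /proc/meminfo instead.
SAMPLE_MEMINFO = """\
MemTotal:       32780292 kB
HugePages_Total:     512
HugePages_Free:      500
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         1048576 kB
"""

counters = parse_hugepage_counters(SAMPLE_MEMINFO)
print("huge pages enabled:", counters.get("HugePages_Total", 0) > 0)
```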
Application          Where                               Syntax
Oracle JDK/OpenJDK   Command line argument               -XX:+UseLargePages
MySQL                my.cnf, inside the block [mysqld]   large_pages=ON
PHP                  php.ini, opcache block              opcache.huge_code_pages=1
Python               Using the mmap module               MADV_HUGEPAGE
PostgreSQL           postgresql.conf                     huge_pages=ON
Docker               Command line argument               --device=/dev/hugepages:/dev/hugepages
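For the Python row, a minimal sketch of what the hint might look like. It assumes Python 3.8 or later on Linux, where the `mmap` module exposes `madvise()` and `MADV_HUGEPAGE`. This flag is a hint for Transparent Huge Pages rather than a hugetlbfs allocation, and the kernel is free to ignore or reject it, so the code treats failure as non-fatal. The function name is hypothetical.

```python
import mmap

HUGE_PAGE_SIZE = 2 * 1024 * 1024  # 2 MB, the usual x86-64 huge page size


def map_with_huge_page_hint(length: int = HUGE_PAGE_SIZE):
    """Create an anonymous mapping and ask the kernel to back it with huge pages.

    Returns (mapping, advised); advised is False when the hint is unavailable
    or the kernel rejects it.
    """
    region = mmap.mmap(-1, length)  # anonymous memory, not backed by a file
    advised = False
    option = getattr(mmap, "MADV_HUGEPAGE", None)  # missing on some platforms
    if option is not None:
        try:
            region.madvise(option)
            advised = True
        except OSError:
            advised = False
    return region, advised


region, advised = map_with_huge_page_hint()
region[:5] = b"hello"  # the mapping is usable whether or not the hint was taken
print("hint accepted:", advised)
```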
When to Configure

Advantages
• Huge pages can reduce pressure on the TLB/MMU
• Huge pages are not swappable
• Any data-intensive application that properly uses mmap(), madvise(), shmget(), shmat() and some other calls can benefit from them
• Any memory-bound application can benefit from them
• Useful when latency/response time is critical

Disadvantages
• Internal and external memory fragmentation gets worse if huge pages are not configured properly
• “Swappability” avoids quick memory starvation, at some performance cost; huge pages give that up
• It’s a POSIX extension: other Unix-like systems such as Solaris and FreeBSD, and even Windows, have a similar feature with a totally different setup
• NUMA (non-uniform memory access) systems may not get all the benefits that a UMA system does (hardware with uniform/unified memory management)
• Transparent Huge Pages is not recommended in general (it has very specific use cases)

• Many other advantages and disadvantages can come up, but most importantly: test!
• It might be necessary to increase the memory limits in /etc/security/limits.conf
Thank you!
Geraldo Netto
geraldo.netto@gmail.com

Linux Huge Pages

  • 1. Linux Huge Pages Why? How? When? 1
  • 2. • What are you talking about? • Linux kernel map • Memory Allocation • Paging Model • Page Fault • Swapping • Why Huge Pages • How to configure • When to configure • Summary 2 Agenda
  • 3. • This is mainly about X86-64 (Intel and AMD CPUs produced after 2004) • There are some differences on huge pages among different hardware architectures that are out of our scope • We will not explore MMU, TLB and all the internals of virtual memory management • Some images are outdated (e.g.: Linux kernel 2.6 while current version is 5.5) but it illustrates very well the aspects discussed in this presentation 3 Premises
  • 4. What are you talking about?
  • 5. This is the Linux kernel map for version 2.6.36. Although it is 10 years old, it gives us the big picture.
  • 10. Why Huge Pages
      • As we can see, memory management is a complicated process involving many ‘round-trips’
      • Huge pages are about allocating larger blocks of memory at once, thus cutting the ‘round-trips’ associated with small pages
      • Huge pages cannot be swapped out
      • A set of 4 KB pages can turn into a single page of 2 MB (with PAE), 4 MB or even 1 GB

      Number of Pages (4 KB)   Number of Huge Pages   Huge Page Equivalence
      512                      1                      2 MB (2,048 KB)
      1,024                    1                      4 MB (4,096 KB)
      262,144                  1                      1 GB (1,024 MB or 1,048,576 KB)
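The equivalences in the table on this slide follow from a single division. A minimal Python sketch, assuming the 4 KB base page size used throughout this deck; the function name is made up for illustration:

```python
BASE_PAGE_KB = 4  # standard x86-64 page size, in KB (assumed, as in the slides)


def base_pages_per_huge_page(huge_page_kb: int) -> int:
    """Return how many 4 KB pages fit in one huge page of the given size."""
    return huge_page_kb // BASE_PAGE_KB


# The three huge page sizes mentioned on the slide, expressed in KB.
for label, size_kb in [("2 MB", 2 * 1024), ("4 MB", 4 * 1024), ("1 GB", 1024 * 1024)]:
    print(f"{label}: {base_pages_per_huge_page(size_kb)} pages of 4 KB")
```

Each huge page therefore stands in for hundreds or hundreds of thousands of small pages, which is where the saved ‘round-trips’ come from.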
  • 11. Why Huge Pages
      • There are 2 huge page variants
          • HugeTLB File System
              • Works as a pseudo filesystem where you need to define the allocation manually
              • We will use this approach
          • Transparent Huge Pages
              • Works transparently: the Linux kernel decides on its own whether or not the application requires huge pages, but it is not recommended for latency-sensitive applications
  • 12. How to Configure
      • Checking if it is possible to enable huge pages

      netto@bella:~$ getconf PAGESIZE
      4096
      netto@bella:~$ cat /proc/cpuinfo | grep -E 'pse|pdpe' | tail -1
      flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

      • getconf returns the standard page size for a given CPU architecture, in bytes
      • /proc/cpuinfo contains all data related to the CPU
      • pse => supports huge pages of 2 MB
      • pdpe1gb => supports huge pages of 1 GB
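The same check can be scripted. The sketch below is only an illustration: the function name and the returned dictionary keys are made up, and it follows the slide's reading of the flags (pse for 2 MB, pdpe1gb for 1 GB). It matches whole flag names, so pse36 is not mistaken for pse.

```python
def huge_page_support(cpuinfo_text: str) -> dict:
    """Report which huge page sizes the CPU flags advertise (per the slide's mapping)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # A flags line looks like "flags : fpu vme de pse ..."
            _, _, value = line.partition(":")
            flags = set(value.split())
            break
    return {"2MB": "pse" in flags, "1GB": "pdpe1gb" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(huge_page_support(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

Passing the text in as a string keeps the function usable on a saved copy of /proc/cpuinfo from another machine.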
  • 13. How to Configure
      • Installing the required packages to configure huge pages, as root
      • WARNING: your distribution might require a slightly different setup (e.g. a different package manager or package names, fewer steps)

      Red Hat / CentOS
      root@bella:~$ yum -y install libhugetlbfs libhugetlbfs-utils

      Debian / Ubuntu
      root@bella:~$ apt-get -y install hugepages
  • 14. How to Configure
      • In the following case, we can select whichever huge page size is more convenient for the application

      # this is the pseudo directory where huge pages will be mapped; it needs to be an existing directory
      # Red Hat configuration differs a little
      root@bella:~$ mkdir -p /dev/hugepages
      # this can be converted to an /etc/fstab entry
      root@bella:~$ mount -t hugetlbfs -o gid=<group id>,pagesize=<2M or 1G>,... none /dev/hugepages

      • Add to sysctl.conf

      # formula: memory required for your scenario / huge page size (2 MB or 1 GB), rounded up
      # there are situations, like Oracle DB, where it is recommended to allocate huge pages only for the SGA
      vm.nr_hugepages = <number of pages>
      # the same group id (gid) used on mount; it must be the group under which your application runs
      vm.hugetlb_shm_group = <group id>

      • Reboot
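As a rough aid for choosing the vm.nr_hugepages value, the sketch below divides the memory an application needs by the huge page size and rounds up. The function name and the 8 GB figure are assumptions made for illustration only; they do not come from the slides.

```python
def nr_hugepages(required_mb: int, huge_page_mb: int = 2) -> int:
    """Return the number of huge pages needed to cover required_mb, rounded up."""
    if required_mb < 0 or huge_page_mb <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division: a partially used huge page still counts as one page.
    return -(-required_mb // huge_page_mb)


# Hypothetical scenario: 8 GB (8192 MB) to be backed by huge pages.
print("vm.nr_hugepages =", nr_hugepages(8192))        # with 2 MB pages: 4096
print("vm.nr_hugepages =", nr_hugepages(8192, 1024))  # with 1 GB pages: 8
```

Rounding up matters: a requirement of 8193 MB needs 4097 pages of 2 MB, not 4096.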
  • 15. How to Configure

      # if huge pages are correctly set up, at least one pool will be displayed
      netto@bella:~$ hugeadm --pool-list
            Size  Minimum  Current  Maximum  Default
         2097152        0        0        0        *
      1073741824        1        1        1

      # huge pages are enabled if HugePages_Total is > 0
      netto@bella:~$ cat /proc/meminfo
      ...
      HugePages_Total: <huge pages pool size>
      HugePages_Free: <number of huge pages that are not allocated>
      HugePages_Rsvd: <number of huge pages that are reserved but not allocated>
      HugePages_Surp: <number of surplus huge pages, i.e. pages in the pool above vm.nr_hugepages>
      Hugepagesize: 2048 kB
      Hugetlb: 1048576 kB
      DirectMap4k: 572400 kB
      DirectMap2M: 12943360 kB
      DirectMap1G: 19922944 kB
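The /proc/meminfo check can also be automated. This is a minimal sketch with made-up function names; the sample text used in the usage part contains illustrative numbers, not output captured from a real machine.

```python
def parse_hugepage_info(meminfo_text: str) -> dict:
    """Extract the huge page related counters from /proc/meminfo style text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.startswith(("HugePages_", "Hugepagesize", "Hugetlb")):
            fields = value.split()
            # Keep only numeric values; sizes are reported in kB, counters are unitless.
            if fields and fields[0].isdigit():
                info[key] = int(fields[0])
    return info


def hugepages_enabled(meminfo_text: str) -> bool:
    """Huge pages are considered enabled when HugePages_Total is greater than zero."""
    return parse_hugepage_info(meminfo_text).get("HugePages_Total", 0) > 0


# Illustrative sample: a pool of 512 pages of 2048 kB each.
SAMPLE = """\
MemTotal:       32768000 kB
HugePages_Total:     512
HugePages_Free:      512
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         1048576 kB
"""

print(parse_hugepage_info(SAMPLE))
print("huge pages enabled:", hugepages_enabled(SAMPLE))
```

On a live system the same functions can be fed the contents of /proc/meminfo read with open("/proc/meminfo").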
  • 16. How to Configure

      Application          Where                               Syntax
      Oracle JDK/OpenJDK   Command line argument               -XX:+UseLargePages
      MySQL                my.cnf, inside the [mysqld] block   large_pages=ON
      PHP                  php.ini, opcache block              opcache.huge_code_pages=1
      Python               Using the mmap module               MADV_HUGEPAGE
      PostgreSQL           postgresql.conf                     huge_pages=ON
      Docker               Command line argument               --device=/dev/hugepages:/dev/hugepages
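For the Python row, a minimal sketch of what using MADV_HUGEPAGE through the mmap module can look like. It assumes Python 3.8 or later on Linux. Note that MADV_HUGEPAGE is only a hint for Transparent Huge Pages, not an allocation from the HugeTLB pool configured earlier, and the kernel is free to ignore or reject it. The function name is made up for illustration.

```python
import mmap


def map_with_huge_page_hint(length: int):
    """Create an anonymous mapping and hint that huge pages should back it.

    Returns (mapping, hinted), where hinted is False if the platform does not
    expose MADV_HUGEPAGE or the kernel rejected the advice.
    """
    region = mmap.mmap(-1, length)  # anonymous, private memory
    hinted = False
    if hasattr(mmap, "MADV_HUGEPAGE"):
        try:
            region.madvise(mmap.MADV_HUGEPAGE)
            hinted = True
        except OSError:
            # For example, a kernel built without transparent huge page support.
            pass
    return region, hinted


if __name__ == "__main__":
    region, hinted = map_with_huge_page_hint(4 * 1024 * 1024)  # two 2 MB pages
    region[:5] = b"hello"
    print("huge page hint accepted:", hinted)
    region.close()
```

A mapping whose length is a multiple of the huge page size gives the kernel the best chance of actually using huge pages for it.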
  • 17. When to Configure
      • Advantages
          • Huge pages can reduce pressure on the TLB/MMU
          • Huge pages are not swappable
          • Any data-intensive application that properly uses mmap(), madvise(), shmget(), shmat() and some other calls can benefit from them
          • Any memory-bound application can benefit from them
          • They help when latency/response time is critical
      • Disadvantages
          • Internal and external memory fragmentation will be aggravated if huge pages are not configured properly
          • “Swappability” avoids quick memory starvation, at some performance cost
          • It is an extension beyond POSIX: other Unix systems like Solaris and FreeBSD, and even Windows, have a similar feature with a totally different setup
          • NUMA (non-uniform memory access) systems may not get all the benefits that a UMA system (hardware with uniform/unified memory management) gets
          • Transparent Huge Pages is not recommended in general (it has very specific use cases)
      • Many other advantages and disadvantages can come up but, most importantly: test!
      • It might be required to increase the memory allocation limits in /etc/security/limits.conf