SlideShare une entreprise Scribd logo
1  sur  11
Télécharger pour lire hors ligne
Transparent hugepages.
Huge zero page.
Kirill A. Shutemov
29 December 2012
Minsk, Belarus
Virtual to physical address translation
TLBVirtual address Physical addressTLB hit
Page tables walk
TLB missTLB write
TLB
• TLB is Fast:
– Three virtual to physical address translations every cycle (2 load, 1 store);
– TLB miss, STLB hit penalty is 7 cycles; can usually be hidden by OOO execution;
• Page table walk is Slow:
– Hundreds cycles;
– In the worst case a page table can be in swap;
Page table walk, 4k pages
PGD Table PUD Table PMD Table PTE Table
213948
Page
0123063
offsetptepmdpudpgd
mm->pgd
pte
Virtual address
Huge pages
• Pros:
– Reduce TLB pressure:
• 1 TLB entry instead of 512 for every 2M;
• 4k and 2M pages have separate TLB entires;
– Faster page tables walk: 3 data access instead of 4;
– Reduce cache pressure;
• Cons:
– Memory footprint;
– clear_page/copy_page is not cache friendly;
Page table walk, 2M pages
02130394863
offsetpmdpudpgd
mm->pgd
PGD Table PUD Table PMD Table Page
Virtual address
Hugetlbfs
• Cannot be swapped out;
• Need to be reserved on boot time;
• Requires to root access;
• Explicit application support is required;
• No fallback to 4k pages;
Transparent Hugepages
• No special enabling is required;
• Graceful fallback to 4k pages;
• Hugepage can be splitted to 4k pages at any time;
• Hugepages can be swapped out as 4k pages;
• MADV_HUGEPAGE and MADV_NOHUGEPAGE for tuning;
Zero page
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#define MB (1024*1024)
int main(int argc, char **argv)
{
char *p;
int i;
posix_memalign((void **)&p, 2 * MB, 200 * MB);
for (i = 0; i < 200 * MB; i += 4096)
assert(p[i] == 0);
pause();
return 0;
}
0
Huge zero page
• Two designs have been evaluated:
• “physical” huge zero page;
• “virtual” huge zero page;
• Lockless refcounting:
• free huge zero page in shrinker callback;
• ~1% slowdown on 40 parallel read page faulting processes;
• Merged in v3.8;
• LWN: http://lwn.net/Articles/517465/
Questions?

Contenu connexe

Tendances

Memory: The New Disk
Memory: The New DiskMemory: The New Disk
Memory: The New Disk
Tim Lossen
 
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChangerZero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
MongoDB
 
Mongo db in 3 minutes BoilerMake
Mongo db in 3 minutes   BoilerMakeMongo db in 3 minutes   BoilerMake
Mongo db in 3 minutes BoilerMake
Valeri Karpov
 
How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...
How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...
How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...
JAXLondon2014
 
Cassandra vs. Redis
Cassandra vs. RedisCassandra vs. Redis
Cassandra vs. Redis
Tim Lossen
 
Mysql cluster
Mysql clusterMysql cluster
Mysql cluster
JS Lee
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
Steven Francia
 
Speeding up Page Load Times by Using Starling
Speeding up Page Load Times by Using StarlingSpeeding up Page Load Times by Using Starling
Speeding up Page Load Times by Using Starling
Erik Osterman
 

Tendances (20)

Memory: The New Disk
Memory: The New DiskMemory: The New Disk
Memory: The New Disk
 
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChangerZero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
 
NoSQL solutions
NoSQL solutionsNoSQL solutions
NoSQL solutions
 
MongoUK - Approaching 1 billion documents with MongoDB1 Billion Documents
MongoUK - Approaching 1 billion documents with MongoDB1 Billion DocumentsMongoUK - Approaching 1 billion documents with MongoDB1 Billion Documents
MongoUK - Approaching 1 billion documents with MongoDB1 Billion Documents
 
Mongo db in 3 minutes BoilerMake
Mongo db in 3 minutes   BoilerMakeMongo db in 3 minutes   BoilerMake
Mongo db in 3 minutes BoilerMake
 
Webinar - Approaching 1 billion documents with MongoDB
Webinar - Approaching 1 billion documents with MongoDBWebinar - Approaching 1 billion documents with MongoDB
Webinar - Approaching 1 billion documents with MongoDB
 
Lessons Learned Migrating 2+ Billion Documents at Craigslist
Lessons Learned Migrating 2+ Billion Documents at CraigslistLessons Learned Migrating 2+ Billion Documents at Craigslist
Lessons Learned Migrating 2+ Billion Documents at Craigslist
 
How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...
How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...
How to randomly access data in close-to-RAM speeds but a lower cost with SSD’...
 
SSDs, IMDGs and All the Rest - Jax London
SSDs, IMDGs and All the Rest - Jax LondonSSDs, IMDGs and All the Rest - Jax London
SSDs, IMDGs and All the Rest - Jax London
 
WebRender (MadRust)
WebRender (MadRust)WebRender (MadRust)
WebRender (MadRust)
 
Edge performance with in memory nosql
Edge performance with in memory nosqlEdge performance with in memory nosql
Edge performance with in memory nosql
 
Cassandra vs. Redis
Cassandra vs. RedisCassandra vs. Redis
Cassandra vs. Redis
 
What to know about Amazon Elastic Block Store (EBS)
What to know about Amazon Elastic Block Store (EBS)What to know about Amazon Elastic Block Store (EBS)
What to know about Amazon Elastic Block Store (EBS)
 
Mysql cluster
Mysql clusterMysql cluster
Mysql cluster
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
 
Speeding up Page Load Times by Using Starling
Speeding up Page Load Times by Using StarlingSpeeding up Page Load Times by Using Starling
Speeding up Page Load Times by Using Starling
 
Sharding
ShardingSharding
Sharding
 
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
Ceph BlueStore - новый тип хранилища в Ceph / Максим Воронцов, (Redsys)
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
 

Similaire à Кирилл Шутемов aka “kas” - О Transparent Hugepages и Huge zero page

Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014
marvin herrera
 
Tales from the Field
Tales from the FieldTales from the Field
Tales from the Field
MongoDB
 
UKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningUKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL Tuning
FromDual GmbH
 
MASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptx
MASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptxMASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptx
MASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptx
AhmedMamdouh829955
 

Similaire à Кирилл Шутемов aka “kas” - О Transparent Hugepages и Huge zero page (20)

Practical MySQL
Practical MySQLPractical MySQL
Practical MySQL
 
LECTURE13nvjlfdihbkzbjvzbfmdnmzbxckbn.ppt
LECTURE13nvjlfdihbkzbjvzbfmdnmzbxckbn.pptLECTURE13nvjlfdihbkzbjvzbfmdnmzbxckbn.ppt
LECTURE13nvjlfdihbkzbjvzbfmdnmzbxckbn.ppt
 
cache memory management
cache memory managementcache memory management
cache memory management
 
M|18 Battle of the Online Schema Change Methods
M|18 Battle of the Online Schema Change MethodsM|18 Battle of the Online Schema Change Methods
M|18 Battle of the Online Schema Change Methods
 
An Elastic Metadata Store for eBay’s Media Platform
An Elastic Metadata Store for eBay’s Media PlatformAn Elastic Metadata Store for eBay’s Media Platform
An Elastic Metadata Store for eBay’s Media Platform
 
Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014
 
FlashSQL 소개 & TechTalk
FlashSQL 소개 & TechTalkFlashSQL 소개 & TechTalk
FlashSQL 소개 & TechTalk
 
Tales from the Field
Tales from the FieldTales from the Field
Tales from the Field
 
Hadoop Hardware @Twitter: Size does matter.
Hadoop Hardware @Twitter: Size does matter.Hadoop Hardware @Twitter: Size does matter.
Hadoop Hardware @Twitter: Size does matter.
 
Migrating from InnoDB and HBase to MyRocks at Facebook
Migrating from InnoDB and HBase to MyRocks at FacebookMigrating from InnoDB and HBase to MyRocks at Facebook
Migrating from InnoDB and HBase to MyRocks at Facebook
 
UKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL TuningUKOUG 2011: Practical MySQL Tuning
UKOUG 2011: Practical MySQL Tuning
 
MASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptx
MASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptxMASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptx
MASK-address-space-aware-gpu-memory-hierarchy_asplos18-talk-nobackup.pptx
 
Shootout at the PAAS Corral
Shootout at the PAAS CorralShootout at the PAAS Corral
Shootout at the PAAS Corral
 
Resolving Firebird performance problems
Resolving Firebird performance problemsResolving Firebird performance problems
Resolving Firebird performance problems
 
Making the case for write-optimized database algorithms / Mark Callaghan (Fac...
Making the case for write-optimized database algorithms / Mark Callaghan (Fac...Making the case for write-optimized database algorithms / Mark Callaghan (Fac...
Making the case for write-optimized database algorithms / Mark Callaghan (Fac...
 
The Expert Guide to Fast Data
The Expert Guide to Fast Data The Expert Guide to Fast Data
The Expert Guide to Fast Data
 
LDAP at Lightning Speed
 LDAP at Lightning Speed LDAP at Lightning Speed
LDAP at Lightning Speed
 
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp0220140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
 
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!
 

Plus de Minsk Linux User Group

Plus de Minsk Linux User Group (20)

Vladimir ’mend0za’ Shakhov — Linux firmware for iRMC controller on Fujitsu P...
 Vladimir ’mend0za’ Shakhov — Linux firmware for iRMC controller on Fujitsu P... Vladimir ’mend0za’ Shakhov — Linux firmware for iRMC controller on Fujitsu P...
Vladimir ’mend0za’ Shakhov — Linux firmware for iRMC controller on Fujitsu P...
 
Андрэй Захарэвіч — Hack the Hackpad: Першая спроба публічнага кіравання задач...
Андрэй Захарэвіч — Hack the Hackpad: Першая спроба публічнага кіравання задач...Андрэй Захарэвіч — Hack the Hackpad: Першая спроба публічнага кіравання задач...
Андрэй Захарэвіч — Hack the Hackpad: Першая спроба публічнага кіравання задач...
 
Святлана Ермаковіч — Вікі-дапаможнік. Як узмацніць беларускую вікі-супольнасць
Святлана Ермаковіч — Вікі-дапаможнік. Як узмацніць беларускую вікі-супольнасцьСвятлана Ермаковіч — Вікі-дапаможнік. Як узмацніць беларускую вікі-супольнасць
Святлана Ермаковіч — Вікі-дапаможнік. Як узмацніць беларускую вікі-супольнасць
 
Тимофей Титовец — Elastic+Logstash+Kibana: Архитектура и опыт использования
Тимофей Титовец — Elastic+Logstash+Kibana: Архитектура и опыт использованияТимофей Титовец — Elastic+Logstash+Kibana: Архитектура и опыт использования
Тимофей Титовец — Elastic+Logstash+Kibana: Архитектура и опыт использования
 
Андрэй Захарэвіч - Як мы ставілі KDE пад FreeBSD
Андрэй Захарэвіч - Як мы ставілі KDE пад FreeBSDАндрэй Захарэвіч - Як мы ставілі KDE пад FreeBSD
Андрэй Захарэвіч - Як мы ставілі KDE пад FreeBSD
 
Vitaly ̈_Vi ̈ Shukela - My FOSS projects
Vitaly  ̈_Vi ̈ Shukela - My FOSS projectsVitaly  ̈_Vi ̈ Shukela - My FOSS projects
Vitaly ̈_Vi ̈ Shukela - My FOSS projects
 
Vitaly ̈_Vi ̈ Shukela - Dive
Vitaly  ̈_Vi ̈ Shukela - DiveVitaly  ̈_Vi ̈ Shukela - Dive
Vitaly ̈_Vi ̈ Shukela - Dive
 
Alexander Lomov - Cloud Foundry и BOSH: истории из жизни
Alexander Lomov - Cloud Foundry и BOSH: истории из жизниAlexander Lomov - Cloud Foundry и BOSH: истории из жизни
Alexander Lomov - Cloud Foundry и BOSH: истории из жизни
 
Vikentsi Lapa — How does software testing become software development?
Vikentsi Lapa — How does software testing  become software development?Vikentsi Lapa — How does software testing  become software development?
Vikentsi Lapa — How does software testing become software development?
 
Михаил Волчек — Свободные лицензии. быть или не быть? Продолжение
Михаил Волчек — Свободные лицензии. быть или не быть? ПродолжениеМихаил Волчек — Свободные лицензии. быть или не быть? Продолжение
Михаил Волчек — Свободные лицензии. быть или не быть? Продолжение
 
Максим Мельников — IPv6 at Home: NAT64, DNS64, OpenVPN
Максим Мельников — IPv6 at Home: NAT64, DNS64, OpenVPNМаксим Мельников — IPv6 at Home: NAT64, DNS64, OpenVPN
Максим Мельников — IPv6 at Home: NAT64, DNS64, OpenVPN
 
Слава Машканов — “Wubuntu”: Построение гетерогенной среды Windows+Linux на н...
Слава Машканов — “Wubuntu”: Построение гетерогенной среды  Windows+Linux на н...Слава Машканов — “Wubuntu”: Построение гетерогенной среды  Windows+Linux на н...
Слава Машканов — “Wubuntu”: Построение гетерогенной среды Windows+Linux на н...
 
MajorDoMo: Открытая платформа Умного Дома
MajorDoMo: Открытая платформа Умного ДомаMajorDoMo: Открытая платформа Умного Дома
MajorDoMo: Открытая платформа Умного Дома
 
Максим Салов - Отладочный монитор
Максим Салов - Отладочный мониторМаксим Салов - Отладочный монитор
Максим Салов - Отладочный монитор
 
Максим Мельников - FOSDEM 2014 overview
Максим Мельников - FOSDEM 2014 overviewМаксим Мельников - FOSDEM 2014 overview
Максим Мельников - FOSDEM 2014 overview
 
Константин Шевцов - Пара слов о Jenkins
Константин Шевцов - Пара слов о JenkinsКонстантин Шевцов - Пара слов о Jenkins
Константин Шевцов - Пара слов о Jenkins
 
Ермакович Света - Операция «Пингвин»
Ермакович Света - Операция «Пингвин»Ермакович Света - Операция «Пингвин»
Ермакович Света - Операция «Пингвин»
 
Михаил Волчек - Смогут ли беларусы вкусить плоды Творческих Общин? Creative C...
Михаил Волчек - Смогут ли беларусы вкусить плоды Творческих Общин? Creative C...Михаил Волчек - Смогут ли беларусы вкусить плоды Творческих Общин? Creative C...
Михаил Волчек - Смогут ли беларусы вкусить плоды Творческих Общин? Creative C...
 
Vikentsi Lapa - Tools for testing
Vikentsi Lapa - Tools for testingVikentsi Lapa - Tools for testing
Vikentsi Lapa - Tools for testing
 
Алексей Туля - А нужен ли вам erlang?
Алексей Туля - А нужен ли вам erlang?Алексей Туля - А нужен ли вам erlang?
Алексей Туля - А нужен ли вам erlang?
 

Dernier

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Dernier (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

Кирилл Шутемов aka “kas” - О Transparent Hugepages и Huge zero page

  • 1. Transparent hugepages. Huge zero page. Kirill A. Shutemov 29 December 2012 Minsk, Belarus
  • 2. Virtual to physical address translation TLBVirtual address Physical addressTLB hit Page tables walk TLB missTLB write
  • 3. TLB • TLB is Fast: – Three virtual to physical address translations every cycle (2 load, 1 store); – TLB miss, STLB hit penalty is 7 cycles; can usually be hidden by OOO execution; • Page table walk is Slow: – Hundreds cycles; – In the worst case a page table can be in swap;
  • 4. Page table walk, 4k pages PGD Table PUD Table PMD Table PTE Table 213948 Page 0123063 offsetptepmdpudpgd mm->pgd pte Virtual address
  • 5. Huge pages • Pros: – Reduce TLB pressure: • 1 TLB entry instead of 512 for every 2M; • 4k and 2M pages have separate TLB entires; – Faster page tables walk: 3 data access instead of 4; – Reduce cache pressure; • Cons: – Memory footprint; – clear_page/copy_page is not cache friendly;
  • 6. Page table walk, 2M pages 02130394863 offsetpmdpudpgd mm->pgd PGD Table PUD Table PMD Table Page Virtual address
  • 7. Hugetlbfs • Cannot be swapped out; • Need to be reserved on boot time; • Requires to root access; • Explicit application support is required; • No fallback to 4k pages;
  • 8. Transparent Hugepages • No special enabling is required; • Graceful fallback to 4k pages; • Hugepage can be splitted to 4k pages at any time; • Hugepages can be swapped out as 4k pages; • MADV_HUGEPAGE and MADV_NOHUGEPAGE for tuning;
  • 9. Zero page #include <assert.h> #include <stdlib.h> #include <unistd.h> #define MB (1024*1024) int main(int argc, char **argv) { char *p; int i; posix_memalign((void **)&p, 2 * MB, 200 * MB); for (i = 0; i < 200 * MB; i += 4096) assert(p[i] == 0); pause(); return 0; }
  • 10. 0 Huge zero page • Two designs have been evaluated: • “physical” huge zero page; • “virtual” huge zero page; • Lockless refcounting: • free huge zero page in shrinker callback; • ~1% slowdown on 40 parallel read page faulting processes; • Merged in v3.8; • LWN: http://lwn.net/Articles/517465/