Outline
• Non-Uniform Cache Architecture (NUCA)
• Cache Coherence
• Implementation of directories in multicore architectures
Non-Uniform Cache Architecture [1]
• Uniform Cache Architecture
▫ Multi-level cache hierarchies
 Organized into a few discrete levels
 Each level reduces accesses to the level below it
 Inclusion overhead
 Internal wire delays
 Restricted number of ports
▫ Large on-chip cache
 Single and discrete hit latency
 Undesirable due to increasing wire delays
Non-Uniform Cache Architecture [1]
• Non-uniform cache architecture (NUCA)
▫ Exploit non-uniformity
 Data in the part of a large cache closer to the processor is accessed faster than data residing physically farther away
Level-2 cache architectures, 16 MB at 50 nm technology (taken from [1])
Non-Uniform Cache Architecture [1]
• Static NUCA
▫ Each bank can be accessed at different speeds
 Proportional to the distance from the controller
 Lower latency when closer to controller
▫ Mapping of data into banks based on block index
▫ Banks are independently addressable
▫ Access to banks may proceed in parallel
▫ Banks have private channels
▫ Large number of wires
▫ Access time and routing delay increase as feature sizes shrink
 Best organization at smaller technologies uses larger banks
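The static mapping above can be sketched in a few lines. This is a minimal illustration, assuming 64-byte blocks, 32 banks numbered by increasing distance from the controller, and a made-up latency model (fixed bank access cost plus a per-bank wire cost). None of these constants are taken from [1].

```python
# Illustrative Static NUCA model. All constants are assumptions for this
# sketch, not parameters taken from [1].
BLOCK_BYTES = 64     # assumed cache block size
NUM_BANKS = 32       # assumed number of independently addressable banks
BANK_ACCESS = 3      # assumed cycles to access any single bank
WIRE_PER_BANK = 1    # assumed extra cycles per bank of distance


def bank_of(address: int) -> int:
    """Static mapping: the low-order bits of the block index select the bank."""
    block_index = address // BLOCK_BYTES
    return block_index % NUM_BANKS


def hit_latency(address: int) -> int:
    """Latency grows with the bank's distance from the cache controller.

    Banks are assumed to be numbered by increasing distance (bank 0 is closest).
    """
    return BANK_ACCESS + WIRE_PER_BANK * bank_of(address)


if __name__ == "__main__":
    for addr in (0x0000, 0x0040, 0x07C0):
        print(hex(addr), "-> bank", bank_of(addr), "latency", hit_latency(addr))
```

Because the bank is a pure function of the address, a block can never move to a faster bank; that limitation is what the dynamic variant addresses.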
Non-Uniform Cache Architecture [1]
Static NUCA design (taken from [1])
Non-Uniform Cache Architecture [1]
• Switched Static NUCA
▫ 2D Mesh, point-to-point links
▫ Removes most of the large number of wires
▫ Allows a large number of faster, smaller banks
• Dynamic NUCA
▫ Allows data to be mapped to many banks
▫ Allows data to migrate among the banks
▫ Frequently used data can be promoted to faster
banks
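Migration and promotion in Dynamic NUCA can be illustrated with a toy bank set. The sketch below assumes one block slot per bank, insertion of new blocks into the farthest bank, and promotion by one bank per hit; the class and method names are hypothetical and the policy choices are assumptions, not the configuration evaluated in [1].

```python
# Illustrative Dynamic NUCA bank set. Index 0 is the bank closest to the
# controller. Tail insertion and one-bank-per-hit promotion are assumptions.
from typing import List, Optional


class BankSet:
    def __init__(self, num_banks: int) -> None:
        # One block slot per bank keeps the sketch small.
        self.banks: List[Optional[int]] = [None] * num_banks

    def find(self, block: int) -> Optional[int]:
        """Return the position of the bank holding the block, if any."""
        for position, resident in enumerate(self.banks):
            if resident == block:
                return position
        return None

    def access(self, block: int) -> int:
        """Return the bank position that served the access, then promote."""
        position = self.find(block)
        if position is None:
            # Miss: place the block in the farthest bank, evicting its occupant.
            position = len(self.banks) - 1
            self.banks[position] = block
            return position
        if position > 0:
            # Hit: swap with the block one bank closer to the controller.
            closer = position - 1
            self.banks[closer], self.banks[position] = (
                self.banks[position],
                self.banks[closer],
            )
        return position
```

Repeated accesses to the same block are served from progressively closer banks, which is the promotion effect the slide describes.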
Non-Uniform Cache Architecture [1]
Switched NUCA design (taken from [1])
Non-Uniform Cache Architecture [2]
• Policies
▫ Bank placement policy
 Determines where data is placed in the NUCA cache
▫ Bank access policy
 Determines bank-searching algorithm
▫ Bank migration policy
 Determines if a data element is allowed to change its placement from one bank to another
 Regulates migration of data
▫ Bank replacement policy
 Determines how the NUCA cache behaves when data is evicted from one of the banks
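As an illustration of what a bank access policy decides, the sketch below contrasts two search strategies: probing banks one at a time from closest to farthest, and probing all banks at once. The function names and the list-of-banks representation are assumptions made for this example; the trade-off shown is fewer bank accesses versus lower lookup delay.

```python
from typing import List, Optional, Tuple


def incremental_search(banks: List[int], block: int) -> Tuple[Optional[int], int]:
    """Probe banks closest-first; return (hit position or None, banks probed)."""
    for probes, resident in enumerate(banks, start=1):
        if resident == block:
            return probes - 1, probes
    return None, len(banks)


def multicast_search(banks: List[int], block: int) -> Tuple[Optional[int], int]:
    """Probe every bank at once; all banks are accessed wherever the hit is."""
    position = banks.index(block) if block in banks else None
    return position, len(banks)


if __name__ == "__main__":
    banks = [5, 8, 3, 9]  # banks[0] is closest to the controller
    print(incremental_search(banks, 8))
    print(multicast_search(banks, 8))
```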
Taken from [2]
Non-Uniform Cache Architecture [2]
Cache Coherence
• Cache-coherence problem
• Support for large number of processors
▫ Need for high bandwidth
▫ Bus architecture insufficient
• Point-to-Point networks
▫ No broadcast mechanism
▫ Snooping protocol unusable
• Directory
▫ Solution for point-to-point networks
▫ Stores location of cache copies of blocks of data
▫ Centralized or distributed
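A directory replaces broadcast with targeted messages: because it records which caches hold a block, a write only needs to contact the actual sharers. The following is a minimal sketch under simplifying assumptions (a single centralized directory, a set of sharer IDs per block, and message strings instead of a real network); the class names and message wording are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class DirectoryEntry:
    sharers: Set[int] = field(default_factory=set)  # cores holding a copy
    owner: Optional[int] = None                      # core with write permission


class Directory:
    def __init__(self) -> None:
        self.entries: Dict[int, DirectoryEntry] = {}

    def read(self, core: int, block: int) -> List[str]:
        """Record a new sharer; return the point-to-point messages sent."""
        entry = self.entries.setdefault(block, DirectoryEntry())
        messages: List[str] = []
        if entry.owner is not None and entry.owner != core:
            # The current writer must give up exclusive access first.
            messages.append(f"downgrade -> core {entry.owner}")
            entry.owner = None
        entry.sharers.add(core)
        return messages

    def write(self, core: int, block: int) -> List[str]:
        """Invalidate every other copy, then grant exclusive ownership."""
        entry = self.entries.setdefault(block, DirectoryEntry())
        messages = [
            f"invalidate -> core {c}" for c in sorted(entry.sharers - {core})
        ]
        entry.sharers = {core}
        entry.owner = core
        return messages
```

Only cores listed in the entry receive messages, so no broadcast medium is required.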
Implementation of directories in
multicore architectures [3]
• DRAM (off-chip) directory
▫ Stores directory information in DRAM
 Ex: full-map protocol
▫ Does not exploit distance locality
▫ Treats each tile as a potential sharer of data
▫ Directory can be cached in on-chip SRAM
 No need to access off-chip memory on every lookup
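Treating every tile as a potential sharer means a full-map entry carries one presence bit per tile. The sketch below estimates the resulting storage cost; the 64-byte block size and the two state bits per entry are assumptions for illustration, not figures from [3].

```python
def full_map_bits_per_block(num_tiles: int, state_bits: int = 2) -> int:
    """One presence bit per tile plus a few state bits (assumed: 2)."""
    return num_tiles + state_bits


def directory_overhead(num_tiles: int, block_bytes: int = 64,
                       state_bits: int = 2) -> float:
    """Directory bits as a fraction of the data bits they describe."""
    return full_map_bits_per_block(num_tiles, state_bits) / (block_bytes * 8)


if __name__ == "__main__":
    for tiles in (16, 64, 256):
        print(tiles, "tiles:", f"{directory_overhead(tiles):.1%}", "overhead")
```

The overhead grows linearly with the number of tiles, which is one reason to keep the full map in DRAM and cache only the active entries on chip.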
Implementation of directories in
multicore architectures [3]
Taken from [3]
Implementation of directories in
multicore architecture [4]
• DRAM (off-chip) directory with directory caches
▫ Private cache
▫ Directory is cached in each tile
 No need to access off-chip memory on every lookup
 Non-coherent caches
 Home node for any given cache line
 Different range of memory address for each tile
▫ Directory controller in each tile
 Controls coherency between private caches
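Assigning a different range of memory addresses to each tile gives every cache line exactly one home node. A minimal sketch, assuming physical memory is split into equal contiguous ranges; the function name and the sizes in the example are hypothetical.

```python
def home_tile_by_range(address: int, num_tiles: int, memory_bytes: int) -> int:
    """Return the tile whose contiguous address range contains the address."""
    if not 0 <= address < memory_bytes:
        raise ValueError("address outside physical memory")
    range_size = memory_bytes // num_tiles
    # Clamp so that any remainder at the top of memory maps to the last tile.
    return min(address // range_size, num_tiles - 1)


if __name__ == "__main__":
    four_gib = 1 << 32
    for addr in (0x0, 0x1000_0000, 0xFFFF_FFFF):
        print(hex(addr), "-> home tile", home_tile_by_range(addr, 16, four_gib))
```

Every request for a line is sent to the directory controller on its home tile, which then coordinates the private caches.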
Implementation of directories in
multicore architecture [4]
Taken from [4]
Implementation of directories in
multicore architectures [3]
• Duplicate tag directory
▫ Directory centrally located in SRAM
▫ Connected to individual cores
▫ Exact duplicate tag store
 Directory state for a block is determined by examining the copied tags of every cache that could hold the block
 Keep copied tags up-to-date
▫ No need to read directory state from DRAM
▫ Challenging as the number of cores increases
 64 cores with 16-way associative caches = 1024-way aggregate associativity across all tiles
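The scaling problem follows directly from how a duplicate tag lookup works: every way of the indexed set in every core's tag copy has to be compared. The sketch below reproduces the arithmetic from the slide and shows a lookup over a toy tag store; the nested-list layout and function names are assumptions for illustration.

```python
from typing import List


def aggregate_associativity(num_cores: int, ways: int) -> int:
    """Number of tags compared per lookup across all duplicated tag arrays."""
    return num_cores * ways


def sharers(duplicate_tags: List[List[List[int]]], set_index: int,
            tag: int) -> List[int]:
    """Return the cores whose copied tags show the block as present.

    duplicate_tags[core][set_index] is the list of tags in that core's set.
    """
    return [
        core
        for core, sets in enumerate(duplicate_tags)
        if tag in sets[set_index]
    ]


if __name__ == "__main__":
    print(aggregate_associativity(64, 16))  # tags compared per lookup
    toy = [[[1, 2], [3]], [[2], [4]], [[5], [3]]]  # 3 cores, 2 sets each
    print(sharers(toy, 0, 2))
```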
Implementation of directories in
multicore architectures [3]
Taken from [3]
Implementation of directories in
multicore architecture [5]
Directory memory, 4-way associative caches (taken from [5])
Implementation of directories in
multicore architectures [3]
• Static cache bank directory
▫ Distributed directory among the tiles
 Mapping block address to a tile (called the home tile)
 Home tiles selected by simple interleaving
 Location can be sub-optimal (see next slide)
 Tile’s cache extended to contain directory information
 Integrates directory states with cache tags
 Avoids a separate SRAM or DRAM directory
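Simple interleaving picks the home tile from the block address alone, so the home can land far from the tiles that actually use the block. The sketch below assumes 64-byte blocks, a 4x4 mesh with tiles numbered row by row, and hop count as the distance measure; all names and sizes are hypothetical.

```python
BLOCK_BYTES = 64   # assumed block size
MESH_WIDTH = 4     # assumed 4x4 mesh, tiles numbered row by row


def home_tile(address: int, num_tiles: int = MESH_WIDTH * MESH_WIDTH) -> int:
    """Simple interleaving: consecutive blocks map to consecutive tiles."""
    return (address // BLOCK_BYTES) % num_tiles


def hops(tile_a: int, tile_b: int, width: int = MESH_WIDTH) -> int:
    """Manhattan distance between two tiles in a 2D mesh."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)


if __name__ == "__main__":
    requester = 0
    address = 15 * BLOCK_BYTES
    home = home_tile(address)
    print("home tile", home, "is", hops(requester, home), "hops from the requester")
```

In this example the requester in one corner must cross the whole mesh to reach the home tile, even if no other tile ever touches the block.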
Implementation of directories in
multicore architectures [3,6]
Taken from [3]
Taken from [6]
Implementation of directories in
multicore architecture [7]
• SGI Origin2000 multiprocessor system
▫ Directory memory connected to on-chip memory
 Shared L2 cache
 Directory memory distributed over multiple tiles
 Cache coherence controller
 Home tile sends appropriate messages to cores
Implementation of directories in
multicore architecture [7]
SGI Origin2000 multiprocessor system (taken from [7])
Implementation of directories in
multicore architecture [8]
• Tilera Tile64 architecture
▫ 2D mesh network (8×8)
▫ Provides coherent shared-memory environment
▫ Uses neighborhood caching
 Provides on-chip distributed shared cache
▫ Coherency is maintained at the home tile
 Data is not cached at non-home tiles
▫ Communication over a Tile Dynamic Network
Implementation of directories in
multicore architecture [9]
Tilera Tile64 (taken from)
References
• [1] C. Kim, D. Burger, S.W. Keckler, “An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches”, in Proc. 10th Int. Conf. ASPLOS, San Jose, CA, 2002, pp. 1-12
• [2] J. Lira, C. Molina, A. Gonzalez, “Analysis of Non-Uniform Cache Architecture Policies for Chip-Multiprocessors Using the Parsec Benchmark Suite”, MMCS’09, Mar. 2009, pp. 1-8
• [3] M.R. Marty, M.D. Hill, “Virtual Hierarchies to Support Server Consolidation”, ISCA’07, June 2007, pp. 1-11
• [4] J.A. Brown, R. Kumar, D. Tullsen, “Proximity-Aware Directory-based Coherence for Multi-core Processor Architectures”, SPAA’07, June 2007, pp. 1-9
• [5] J. Chang, G.S. Sohi, “Cooperative Caching for Chip Multiprocessors”, Computer Architecture, ISCA '06. 33rd International Symposium on, 2006, pp. 264-276
• [6] S. Cho, L. Jin, “Managing Distributed, Shared L2 Caches through OS-Level Page Allocation”, Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on, Dec. 2006, pp. 455-468
• [7] H. Lee, S. Cho, B.R. Childers, “PERFECTORY: A Fault-Tolerant Directory Memory Architecture”, Computers, IEEE Transactions on, vol. 59, no. 5, May 2010, pp. 638-650
• [8] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C.C. Miao, J.F. Brown, A. Agarwal, “On-Chip Interconnection Architecture of the Tile Processor”, Micro, IEEE, vol. 27, no. 5, Sept.-Oct. 2007, pp. 15-31
• [9] Linux Devices, “4-way chip gains Linux IDE, dev cards, design wins” [online], Linux Devices, Apr. 2008 [cited Oct. 21, 2010], available from World Wide Web: < http://thing1.linuxdevices.com/news/NS4811855366.html >
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Directory based cache coherence

  • 1. Outline
    • Non-Uniform Cache Architecture (NUCA)
    • Cache Coherence
    • Implementation of directories in multicore architecture
  • 2. Non-Uniform Cache Architecture [1]
    • Uniform Cache Architecture
      ▫ Multi-level cache hierarchies
        ▪ Organized into a few discrete levels
        ▪ Each level reduces access to the lower level
        ▪ Inclusion overhead
        ▪ Internal wire delays
        ▪ Restricted number of ports
      ▫ Large on-chip cache
        ▪ Single and discrete hit latency
        ▪ Undesirable due to increasing wire delays
  • 3. Non-Uniform Cache Architecture [1]
    • Non-uniform cache architecture (NUCA)
      ▫ Exploit non-uniformity
        ▪ In a large cache, data closer to the processor is accessed faster than data residing physically farther away
    Figure: Level 2 cache architectures, 16 MB with 50 nm technology (taken from [1])
  • 4. Non-Uniform Cache Architecture [1]
    • Static NUCA
      ▫ Each bank can be accessed at different speeds
        ▪ Proportional to the distance from the controller
        ▪ Lower latency when closer to controller
      ▫ Mapping of data into banks based on block index
      ▫ Banks are independently addressable
      ▫ Access to banks may proceed in parallel
    • Banks have private channels
      ▫ Large number of wires
      ▫ Access time and routing delay increase with time
        ▪ Best organization at smaller technologies uses larger banks
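The static mapping on slide 4 can be sketched roughly as follows. This is a minimal illustration, assuming that the low-order bits of the block index select the bank and that latency grows linearly with the bank's distance from the controller. The block size, bank count, and cycle counts are made-up values, not figures from [1].

```python
BLOCK_BYTES = 64        # assumed cache block size
NUM_BANKS = 16          # assumed number of banks
BASE_LATENCY = 3        # assumed cycles to access the bank array itself
CYCLES_PER_HOP = 1      # assumed wire delay per bank of distance


def bank_of(address: int) -> int:
    """Static mapping: the low-order bits of the block index pick the bank."""
    block_index = address // BLOCK_BYTES
    return block_index % NUM_BANKS


def access_latency(address: int) -> int:
    """Latency grows with the bank's distance from the controller.

    Banks are assumed to lie in a row, with bank 0 next to the controller.
    """
    return BASE_LATENCY + CYCLES_PER_HOP * bank_of(address)
```

Because the bank is a pure function of the address, a block can never move to a faster bank; that is the limitation dynamic NUCA addresses.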
  • 5. Non-Uniform Cache Architecture [1]
    Figure: Static NUCA design (taken from [1])
  • 6. Non-Uniform Cache Architecture [1]
    • Switched Static NUCA
      ▫ 2D mesh, point-to-point links
      ▫ Removes most of the large number of wires
      ▫ Allows a large number of faster, smaller banks
    • Dynamic NUCA
      ▫ Allows data to be mapped to many banks
      ▫ Allows data to migrate among the banks
      ▫ Frequently used data can be promoted to faster banks
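One way to picture promotion in dynamic NUCA is the sketch below. It assumes a hypothetical policy in which a hit swaps the block with the one in the next-closer bank, and new blocks enter at the slowest bank. The class name `BankSet` and both policies are illustrative assumptions, not necessarily the exact scheme of [1].

```python
from typing import List, Optional


class BankSet:
    """One set spread over several banks; index 0 is the fastest (closest) bank."""

    def __init__(self, num_banks: int) -> None:
        # One block tag per bank; None marks an empty slot.
        self.banks: List[Optional[int]] = [None] * num_banks

    def access(self, tag: int) -> Optional[int]:
        """Return the bank index that hit, or None on a miss.

        On a hit the block swaps places with the block one bank closer,
        so repeated hits move it step by step toward the controller.
        """
        if tag not in self.banks:
            return None
        i = self.banks.index(tag)
        if i > 0:
            self.banks[i - 1], self.banks[i] = self.banks[i], self.banks[i - 1]
        return i

    def insert(self, tag: int) -> Optional[int]:
        """Place a new block in the slowest bank; return the evicted tag, if any."""
        victim = self.banks[-1]
        self.banks[-1] = tag
        return victim
```

With four banks, a block inserted at the tail reaches the fastest bank after three hits, while blocks that are never reused stay in the slow banks and are evicted first.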
  • 7. Non-Uniform Cache Architecture [1]
    Figure: Switched NUCA design (taken from [1])
  • 8. Non-Uniform Cache Architecture [2]
    • Policies
      ▫ Bank placement policy
        ▪ Where data is placed in the NUCA cache memory
      ▫ Bank access policy
        ▪ Determines the bank-searching algorithm
      ▫ Bank migration policy
        ▪ Determines whether a data element is allowed to move from one bank to another
        ▪ Regulates migration of data
      ▫ Bank replacement policy
        ▪ How NUCA behaves when data is evicted from one of the banks
  • 9. Non-Uniform Cache Architecture [2]
    Figure (taken from [2])
  • 10. Cache Coherence
    • Cache-coherence problem
    • Support for large number of processors
      ▫ Need for high bandwidth
      ▫ Bus architecture insufficient
    • Point-to-point networks
      ▫ No broadcast mechanism
      ▫ Snooping protocol unusable
    • Directory
      ▫ Solution for point-to-point networks
      ▫ Stores location of cache copies of blocks of data
      ▫ Centralized or distributed
  • 11. Implementation of directories in multicore architectures [3]
    • DRAM (off-chip) directory
      ▫ Stores directory information in DRAM
        ▪ Ex: full-map protocol
      ▫ Does not exploit distance locality
      ▫ Treats each tile as a potential sharer of data
      ▫ Directory can be cached in on-chip SRAM
        ▪ No need to access off-chip memory each time
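A full-map directory entry, as mentioned on slide 11, keeps one presence bit per tile for each block. The sketch below is a simplified illustration of that idea. The tile count, the class name `FullMapDirectory`, and the two-operation interface are assumptions, and real protocols track more states than shown here.

```python
NUM_TILES = 16  # assumed tile count


class FullMapDirectory:
    """Maps a block address to a bit vector with one presence bit per tile."""

    def __init__(self) -> None:
        self.sharers = {}  # block address -> int used as a presence bit vector

    def record_read(self, block: int, tile: int) -> None:
        """Set the presence bit of the tile that fetched a copy."""
        self.sharers[block] = self.sharers.get(block, 0) | (1 << tile)

    def record_write(self, block: int, tile: int) -> list:
        """Make `tile` the only holder; return the tiles to invalidate."""
        vector = self.sharers.get(block, 0)
        to_invalidate = [t for t in range(NUM_TILES)
                         if vector & (1 << t) and t != tile]
        self.sharers[block] = 1 << tile
        return to_invalidate
```

Since every tile gets a bit regardless of where it sits on the chip, the entry treats each tile as a potential sharer, which is why this scheme does not exploit distance locality.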
  • 12. Implementation of directories in multicore architectures [3]
    Figure (taken from [3])
  • 13. Implementation of directories in multicore architecture [4]
    • DRAM (off-chip) directory with directory caches
      ▫ Private cache
      ▫ Directory is cached in each tile
        ▪ No need to access off-chip memory each time
        ▪ Non-coherent caches
        ▪ Home node for any given cache line
        ▪ Different range of memory addresses for each tile
      ▫ Directory controller in each tile
        ▪ Controls coherency between private caches
  • 14. Implementation of directories in multicore architecture [4]
    Figure (taken from [4])
  • 15. Implementation of directories in multicore architectures [3]
    • Duplicate tag directory
      ▫ Directory centrally located in SRAM
      ▫ Connected to individual cores
      ▫ Exact duplicate tag store
        ▪ Directory state for a block is determined by examining the copied tags of every cache that can hold the block
        ▪ Copied tags must be kept up to date
      ▫ No more need to read states from DRAM memory
      ▫ Challenging as the number of cores increases
        ▪ 64 cores × 16-way associative caches = 1024 aggregate associativity across all tiles
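The scaling concern on slide 15 can be made concrete with a small sketch. It assumes the duplicate tag store is modeled as a nested list `dup_tags[core][set_index]` holding the tags mirrored from each core's cache. Both function names and the data layout are hypothetical.

```python
def aggregate_associativity(num_cores: int, ways: int) -> int:
    """Tags that must be compared per lookup: every way of every core's set."""
    return num_cores * ways


def find_sharers(dup_tags, set_index: int, tag: int):
    """Return the cores whose mirrored set contains `tag`.

    dup_tags[core][set_index] is the list of tags copied from that core's cache.
    """
    return [core for core, sets in enumerate(dup_tags)
            if tag in sets[set_index]]
```

For the slide's example, `aggregate_associativity(64, 16)` gives 1024 tag comparisons for a single directory lookup, which is what makes the centralized structure hard to scale.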
  • 16. Implementation of directories in multicore architectures [3]
    Figure (taken from [3])
  • 17. Implementation of directories in multicore architecture [5]
    Figure: Directory memory, 4-way associative caches (taken from [5])
  • 18. Implementation of directories in multicore architectures [3]
    • Static cache bank directory
      ▫ Distributed directory among the tiles
        ▪ Mapping block address to a tile (called the home tile)
        ▪ Home tiles selected by simple interleaving
        ▪ Location can be sub-optimal (see next slide)
        ▪ Tile’s cache extended to contain directory information
        ▪ Integrates directory states with cache tags
        ▪ Avoids a separate SRAM or DRAM directory
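Home-tile selection by simple interleaving, and why the resulting location can be sub-optimal, can be sketched as follows. The block size, the 4×4 mesh, and the row-major tile numbering are assumptions made for illustration.

```python
BLOCK_BYTES = 64              # assumed cache block size
MESH_WIDTH = 4                # assumed 4x4 mesh of tiles
NUM_TILES = MESH_WIDTH * MESH_WIDTH


def home_tile(address: int) -> int:
    """Interleave consecutive blocks across tiles in round-robin order."""
    return (address // BLOCK_BYTES) % NUM_TILES


def hops(tile_a: int, tile_b: int) -> int:
    """Manhattan distance between two tiles numbered in row-major order."""
    ax, ay = tile_a % MESH_WIDTH, tile_a // MESH_WIDTH
    bx, by = tile_b % MESH_WIDTH, tile_b // MESH_WIDTH
    return abs(ax - bx) + abs(ay - by)
```

A block touched only by tile 0 may still have tile 15 as its home, so each directory access crosses `hops(0, 15)` = 6 links even though no other tile uses the data.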
  • 19. Implementation of directories in multicore architectures [3,6]
    Figures (taken from [3] and [6])
  • 20. Implementation of directories in multicore architecture [7]
    • SGI Origin2000 multiprocessor system
      ▫ Directory memory connected to on-chip memory
        ▪ Shared L2 cache
        ▪ Directory memory distributed over multiple tiles
        ▪ Cache coherence controller
        ▪ Home tile sends appropriate messages to cores
  • 21. Implementation of directories in multicore architecture [7]
    Figure: SGI Origin2000 multiprocessor system (taken from [7])
  • 22. Implementation of directories in multicore architecture [8]
    • Tilera Tile64 architecture
      ▫ 2D mesh network (8×8)
      ▫ Provides coherent shared-memory environment
      ▫ Uses neighborhood caching
        ▪ Provides on-chip distributed shared cache
      ▫ Coherency is maintained at the home tile
        ▪ Data is not cached at non-home tiles
      ▫ Communication over a Tile Dynamic Network
  • 23. Implementation of directories in multicore architecture [9]
    Figure: Tilera Tile64 (taken from [9])
  • 24. References
    • [1] C. Kim, D. Burger, S.W. Keckler, “An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches”, in Proc. 10th Int. Conf. ASPLOS, San Jose, CA, 2002, pp. 1-12
    • [2] J. Lira, C. Molina, A. Gonzalez, “Analysis of Non-Uniform Cache Architecture Policies for Chip-Multiprocessors Using the Parsec Benchmark Suite”, MMCS’09, Mar. 2009, pp. 1-8
    • [3] M.R. Marty, M.D. Hill, “Virtual Hierarchies to Support Server Consolidation”, ISCA’07, June 2007, pp. 1-11
    • [4] J.A. Brown, R. Kumar, D. Tullsen, “Proximity-Aware Directory-based Coherence for Multi-core Processor Architectures”, SPAA’07, June 2007, pp. 1-9
    • [5] J. Chang, G.S. Sohi, “Cooperative Caching for Chip Multiprocessors”, Computer Architecture, ISCA '06. 33rd International Symposium on, 2006, pp. 264-276
    • [6] S. Cho, L. Jin, “Managing Distributed, Shared L2 Caches through OS-Level Page Allocation”, Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on, Dec. 2006, pp. 455-468
    • [7] H. Lee, S. Cho, B.R. Childers, “PERFECTORY: A Fault-Tolerant Directory Memory Architecture”, Computers, IEEE Transactions on, vol. 59, no. 5, May 2010, pp. 638-650
    • [8] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C.C. Miao, J.F. Brown, A. Agarwal, “On-Chip Interconnection Architecture of the Tile Processor”, Micro, IEEE, vol. 27, no. 5, Sept.-Oct. 2007, pp. 15-31
    • [9] Linux Devices, “4-way chip gains Linux IDE, dev cards, design wins” [online], Linux Devices, Apr. 2008 [cited Oct. 21 2010], available from World Wide Web: <http://thing1.linuxdevices.com/news/NS4811855366.html>

Editor's notes

  1. [1] ftp://ftp.cs.utexas.edu/pub/dburger/papers/ASPLOS02.pdf
  2. [2] http://www.cercs.gatech.edu/mmcs09/papers/lira.pdf
  3. [3] http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  4. http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  5. http://cseweb.ucsd.edu/users/tullsen/spaa07.pdf
  6. [4] http://cseweb.ucsd.edu/users/tullsen/spaa07.pdf
  7. http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  8. http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  9. [5] http://pages.cs.wisc.edu/~mscalar/papers/2006/isca2006-coop-caching.pdf
  10. [3] http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  11. 1- http://www.cs.pitt.edu/cast/papers/cho-micro06.pdf 2- http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  12. http://www.cs.pitt.edu/cast/papers/lee-tc10.pdf
  13. http://www.cs.pitt.edu/cast/papers/lee-tc10.pdf
  14. [8] http://www.ieeexplore.ieee.org.proxy.bib.uottawa.ca/stamp/stamp.jsp?tp=&arnumber=4378780
  15. [9] http://www.linuxfordevices.com/files/misc/tilera_tile64_arch_diag2.gif