GOOGLE FILE SYSTEM 
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung 
Presented By – Ankit Thiranh
OVERVIEW 
• Introduction 
• Architecture 
• Characteristics 
• System Interaction 
• Master Operation, Fault Tolerance, and Diagnosis 
• Measurements 
• Real-world clusters and their performance
INTRODUCTION 
• Google handles very large amounts of data 
• It needs a good distributed file system to store and process that data 
• Solution: the Google File System 
• GFS is: 
• Large 
• Distributed 
• Highly fault tolerant
ASSUMPTIONS 
• The system is built from many inexpensive commodity components that often fail. 
• The system stores a modest number of large files. 
• Primarily two kinds of reads: large streaming reads and small random reads. 
• Many large sequential writes append data to files. 
• The system must efficiently implement well-defined semantics for multiple clients that 
concurrently append to the same file. 
• High sustained bandwidth is more important than low latency.
ARCHITECTURE 
(Figure 1: GFS architecture – a single master, multiple chunkservers, and multiple clients)
CHARACTERISTICS 
• Single master 
• Chunk size 
• Metadata 
• In-memory data structures 
• Chunk locations 
• Operation log 
• Consistency Model (see table below) 
• Guarantees by GFS 
• Implications for applications 

                       Write                      Record Append 
Serial success         defined                    defined interspersed with inconsistent 
Concurrent successes   consistent but undefined   defined interspersed with inconsistent 
Failure                inconsistent               inconsistent 
File Region State After Mutation
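
To make the single-master and 64 MB chunk-size points concrete, here is a minimal Python sketch, not taken from the slides, of how a client translates a byte offset into a chunk index and caches the master's answer; this one-round-trip-per-chunk pattern is what keeps client-master interaction low. The class and method names (Client, find_location) are hypothetical.

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed chunk size, per the paper


def chunk_index(byte_offset: int) -> int:
    """Translate a file byte offset into the index of the chunk holding it."""
    return byte_offset // CHUNK_SIZE


class Client:
    def __init__(self, master):
        self.master = master
        self.cache = {}  # (filename, chunk index) -> (chunk handle, replica locations)

    def locate(self, filename: str, offset: int):
        """One master round trip per chunk; subsequent reads hit the cache."""
        key = (filename, chunk_index(offset))
        if key not in self.cache:
            # Hypothetical master RPC returning the immutable 64-bit chunk
            # handle plus the chunkservers currently holding replicas.
            self.cache[key] = self.master.find_location(*key)
        return self.cache[key]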
SYSTEM INTERACTION 
• Leases and Mutation Order 
• Data flow 
• Atomic Record appends 
• Snapshot 
Figure 2: Write Control and Data Flow
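
As a rough illustration of atomic record append, the Python sketch below shows the decision the primary replica makes: if the record would overflow the current chunk, pad to the chunk boundary (secondaries pad identically) and make the client retry on the next chunk; otherwise the primary picks the offset and applies the same mutation order to all replicas. The padding rule and retry behavior follow the paper; the in-memory chunks and the pad/write_at method names are illustrative assumptions.

from typing import Optional

CHUNK_SIZE = 64 * 1024 * 1024


def record_append(chunk: bytearray, record: bytes, secondaries) -> Optional[int]:
    """Return the offset the record was written at, or None => client retries."""
    if len(chunk) + len(record) > CHUNK_SIZE:
        pad = CHUNK_SIZE - len(chunk)
        chunk.extend(b"\0" * pad)         # pad current chunk to its boundary
        for s in secondaries:
            s.pad(pad)                    # secondaries pad identically
        return None                       # caller retries on a fresh chunk
    offset = len(chunk)                   # primary chooses the offset
    chunk.extend(record)
    for s in secondaries:
        s.write_at(offset, record)        # same offset, same order everywhere
    return offset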
MASTER OPERATION 
• Namespace Management and Locking 
• Replica Placement 
• Creation, Re-replication, Rebalancing 
• Garbage Collection 
• Mechanism 
• Discussion 
• Stale Replica Detection
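
The garbage-collection mechanism (expanded in the editor's notes below) lends itself to a short sketch: deletion is just a rename to a hidden, timestamped name, and a background namespace scan reclaims anything older than the grace period, which the paper puts at three days. This is a minimal sketch assuming a dict-backed namespace; all names are illustrative.

import time

GRACE_PERIOD = 3 * 24 * 3600  # three days, per the paper

namespace = {}  # full pathname -> file metadata


def delete(path: str) -> None:
    """Logical deletion: rename to a hidden name carrying the deletion time."""
    meta = namespace.pop(path)
    namespace[f"{path}.deleted.{int(time.time())}"] = meta


def namespace_scan(now: float) -> None:
    """Background pass: reclaim hidden files past the grace period."""
    for path in list(namespace):
        if ".deleted." in path:
            deleted_at = int(path.rsplit(".", 1)[1])
            if now - deleted_at > GRACE_PERIOD:
                del namespace[path]  # metadata gone; chunks orphaned by this
                                     # are reclaimed via chunkserver heartbeats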
FAULT TOLERANCE AND DIAGNOSIS 
• High Availability 
• Fast Recovery 
• Chunk Replication 
• Master Replication 
• Data Integrity 
• Diagnostic tools
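
The Data Integrity point can be sketched briefly. The paper breaks each chunk into 64 KB blocks, each guarded by a 32-bit checksum that the chunkserver verifies before returning data to a client. The Python sketch below uses CRC32 as a stand-in checksum and keeps everything in memory; it is an illustration, not the actual implementation.

import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, per the paper


def checksum_blocks(chunk: bytes) -> list:
    """One 32-bit checksum per 64 KB block of the chunk."""
    return [zlib.crc32(chunk[i:i + BLOCK_SIZE])
            for i in range(0, len(chunk), BLOCK_SIZE)]


def verified_read(chunk: bytes, checksums: list, block: int) -> bytes:
    """Return one block, refusing to propagate corrupt data to the client."""
    data = chunk[block * BLOCK_SIZE:(block + 1) * BLOCK_SIZE]
    if zlib.crc32(data) != checksums[block]:
        # In GFS the chunkserver reports the mismatch to the master, which
        # re-replicates from a good replica; here we just fail loudly.
        raise IOError(f"checksum mismatch in block {block}")
    return data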
MEASUREMENTS 
Aggregate Throughputs. Top curves show theoretical limits imposed by the network topology. Bottom curves 
show measured throughputs. They have error bars that show 95% confidence intervals, which are illegible in 
some cases because of low variance in measurements.
REAL WORLD CLUSTERS 
• Two clusters were examined: 
• Cluster A is used for research and development by over a hundred users. 
• Cluster B is used for production data processing with occasional human 
intervention. 
• Storage 
• Metadata 
Cluster                    A        B 
Chunkservers               342      227 
Available disk space       72 TB    180 TB 
Used disk space            55 TB    155 TB 
Number of files            735 k    737 k 
Number of dead files       22 k     232 k 
Number of chunks           992 k    1550 k 
Metadata at chunkservers   13 GB    21 GB 
Metadata at master         48 MB    60 MB 
Characteristics of two GFS clusters
PERFORMANCE EVALUATION OF TWO CLUSTERS 
• Read and write rates and Master load 
Cluster A B 
Read Rate (last minute) 583 MB/s 380 MB/s 
Read Rate (last hour) 562 MB/s 384 MB/s 
Read Rate (since start) 589 MB/s 49 MB/s 
Write Rate (last minute) 1 MB/s 101 MB/s 
Write Rate (last hour) 2 MB/s 117 MB/s 
Write Rate (since start) 25 MB/s 13 MB/s 
Master ops (last minute) 325 Ops/s 533 Ops/s 
Master ops (last hour) 381 Ops/s 518 Ops/s 
Master ops (since start) 202 Ops/s 347 Ops/s 
Performance Metrics for Two GFS Clusters
WORKLOAD BREAKDOWN 
• Chunkserver Workload 
Operation      Read           Write          Record Append 
Cluster        X      Y       X      Y       X      Y 
0K             0.4    2.6     0      0       0      0 
1B..1K         0.1    4.1     6.6    4.9     0.2    9.2 
1K..8K         65.2   38.5    0.4    1.0     18.9   15.2 
8K..64K        29.9   45.1    17.8   43.0    78.0   2.8 
64K..128K      0.1    0.7     2.3    1.9     < 0.1  4.3 
128K..256K     0.2    0.3     31.6   0.4     < 0.1  10.6 
256K..512K     0.1    0.1     4.2    7.7     < 0.1  31.2 
512K..1M       3.9    6.9     35.5   28.7    2.2    25.5 
1M..inf        0.1    1.8     1.5    12.3    0.7    2.2 
Operations Breakdown by Size (%) 

Operation      Read           Write          Record Append 
Cluster        X      Y       X      Y       X      Y 
1B..1K         < 0.1  < 0.1   < 0.1  < 0.1   < 0.1  < 0.1 
1K..8K         13.8   3.9     < 0.1  < 0.1   < 0.1  0.1 
8K..64K        11.4   9.3     2.4    5.9     78.0   0.3 
64K..128K      0.3    0.7     0.3    0.3     < 0.1  1.2 
128K..256K     0.8    0.6     16.5   0.2     < 0.1  5.8 
256K..512K     1.4    0.3     3.4    7.7     < 0.1  38.4 
512K..1M       65.9   55.1    74.1   58.0    0.1    46.8 
1M..inf        6.4    28.0    3.3    28.0    53.9   7.4 
Bytes Transferred Breakdown by Operation Size (%)
WORKLOAD BREAKDOWN 
• Master Workload 
Cluster X Y 
Open 26.1 16.3 
Delete 0.7 1.5 
FindLocation 64.3 65.8 
FindLeaseHolder 7.8 13.4 
FindMatchingFiles 0.6 2.2 
All other combined 0.5 0.8 
Master Requests Breakdown by Type (%)
Editor's Notes

  1. GFS has a single master, multiple chunkservers, and multiple clients. Files are divided into chunks; each chunk is identified by an immutable and globally unique 64-bit chunk handle and stored on multiple chunkservers. The master holds the metadata: the namespace, access control information, the mapping from files to chunks, and the current locations of chunks.
  2. Single master: it can make sophisticated chunk placement and replication decisions using global knowledge (the read example shows the resulting client-master interaction). Chunk size is 64 MB; its advantages are that it reduces client-master interaction, a client is more likely to perform many operations on a given chunk, and it reduces metadata size. Metadata: the master stores the file and chunk namespaces, the mapping from files to chunks, and the locations of each chunk's replicas; metadata is kept in memory for fast operations. Chunk locations: the master does not keep a persistent record; it polls chunkservers at startup and monitors them with heartbeat messages. Operation log: contains a history of critical metadata changes. Guarantees: mutations are applied in the same order on all of a chunk's replicas, and chunk version numbers detect stale replicas. Consistent: all replicas have the same data; defined: consistent, and clients see what the mutation has written.
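  A minimal sketch of the point above that chunk locations need no persistent record: the master rebuilds the location map from chunkserver reports at startup and keeps it fresh through heartbeats, so a dead server's replicas simply drop out. All names here are illustrative.

  locations = {}  # chunk handle -> set of chunkserver ids

  def on_heartbeat(server_id: str, reported_chunks: list) -> None:
      """Each heartbeat re-asserts which chunks a server currently holds."""
      for handle in reported_chunks:
          locations.setdefault(handle, set()).add(server_id)

  def on_server_lost(server_id: str) -> None:
      """No persistent state to repair: the authoritative record of what a
      chunkserver holds always lives on the chunkserver itself."""
      for holders in locations.values():
          holders.discard(server_id)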
  3. Mutation: an operation that changes the contents or metadata of a chunk. Data flow: to use network bandwidth fully, data is pushed linearly along a chain of chunkservers; to avoid bottlenecks and high-latency links, each machine forwards the data to the closest machine that has not yet received it; latency is minimized by pipelining the data transfer over TCP connections. Record append: the client specifies only the data and GFS appends it atomically at an offset of its own choosing, following the same control flow as a write. Snapshot: makes a copy of a file or directory tree while minimizing any interruption to ongoing mutations.
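  For the data-flow point above, the paper gives a worked figure: with pipelining, the ideal elapsed time to push B bytes to R replicas is B/T + R·L, where T is the network throughput and L the latency between two machines. With 100 Mbps links (T ≈ 12.5 MB/s) and L well under 1 ms, 1 MB can ideally be distributed in about 80 ms.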
  4. Master: executes all namespace operations and manages chunk replicas. Namespace: GFS logically represents its namespace as a lookup table mapping full pathnames to metadata. Replica placement aims to 1) maximize data reliability and availability, and 2) maximize network bandwidth utilization. Creation, re-replication, rebalancing: place new replicas on chunkservers with below-average disk utilization, limit the number of recent creations on each chunkserver, and spread replicas of a chunk across racks. Garbage collection: after deletion the file is renamed to a hidden file and removed after three days; orphaned chunks are reclaimed as well. Stale replica detection: a chunkserver may miss mutations while it is down, so the master assigns chunk version numbers to distinguish up-to-date replicas from stale ones.
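  Because the namespace is just a table of full pathnames, locking works per path rather than per directory: an operation on /d1/d2/leaf takes read locks on each ancestor and a read or write lock on the leaf. The helper below is a minimal sketch that only computes the lock sets; real lock objects and acquisition order are omitted.

  def locks_for(path: str, write: bool):
      """Return (read-locked paths, write-locked paths) for one operation."""
      parts = path.strip("/").split("/")
      ancestors = ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]
      leaf = "/" + "/".join(parts)
      if write:
          return ancestors, [leaf]      # e.g. file creation or snapshot target
      return ancestors + [leaf], []

  # Example from the paper: creating /home/user/foo read-locks /home and
  # /home/user and write-locks /home/user/foo, so it conflicts with a
  # snapshot of /home/user, which write-locks /home/user.
  print(locks_for("/home/user/foo", write=True))
  # (['/home', '/home/user'], ['/home/user/foo'])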
  5. Fast recovery: the master and chunkservers are designed to restore their state and start in seconds. Chunk replication: discussed earlier. Master replication: the operation log and checkpoints are replicated on multiple machines, and shadow masters provide read-only access. Data integrity: each chunkserver uses checksumming to detect corruption of stored data; corruption can be repaired from other replicas, but detecting it by comparing replicas across chunkservers would be impractical. Diagnostic tools: generate diagnostic logs that record many significant events; the RPC logs include the exact requests and responses sent on the wire, except for the file data being read or written.
  6. The two clusters have similar numbers of files, though B has a larger proportion of dead files, namely files which were deleted or replaced by a new version but whose storage has not yet been reclaimed. It also has more chunks because its files tend to be larger.
  7. Reads returning no data are common in Y because applications in the production system use files as producer-consumer queues. Cluster Y also sees a much higher percentage of large record appends than cluster X does because the production systems, which use cluster Y, are more aggressively tuned for GFS.