THE GOOGLE FILE SYSTEM
By Sanjay Ghemawat, Howard Gobioff, and
Shun-Tak Leung
INTRODUCTION
• Google's applications process very large volumes of data
• They need a file system built for that workload
• Solution: the Google File System (GFS)
A large, distributed, highly fault-tolerant file system.
DESIGN MOTIVATIONS
1. Fault tolerance and automatic recovery need to be built into the system.
2. Standard I/O assumptions (e.g. block size) have to be re-examined.
3. Record appends are the prevalent form of writing.
4. Google applications and GFS should be co-designed.
INTERFACE
• Create
• Delete
• Open
• Close
• Read
• Write
• Snapshot
• Record Append
GFS ARCHITECTURE
On a single-machine FS:
• An upper layer maintains the metadata.
• A lower layer (i.e. the disk) stores the data in units called “blocks”.
In GFS:
• A master process maintains the metadata.
• A lower layer (i.e. a set of chunk servers) stores the data in units called “chunks”.
GFS ARCHITECTURE
CHUNK
• Analogous to a block, except larger.
• Size: 64 MB
• Stored on a chunk server as a file
• A chunk handle (analogous to a file name) is used to reference a chunk.
• Replicated across multiple chunk servers
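Because chunks have a fixed size, a byte offset in a file maps to a chunk by plain integer division. The sketch below illustrates that translation; the function names `locate` and `chunks_touched` are hypothetical and not part of any GFS API.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size stated on the slide


def locate(byte_offset: int) -> tuple[int, int]:
    """Translate a file byte offset into (chunk index, offset inside that chunk)."""
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    return divmod(byte_offset, CHUNK_SIZE)


def chunks_touched(byte_offset: int, length: int) -> range:
    """Return the chunk indices covered by an access of `length` bytes."""
    if length < 1:
        raise ValueError("length must be at least 1")
    first, _ = locate(byte_offset)
    last, _ = locate(byte_offset + length - 1)
    return range(first, last + 1)
```

An access that straddles a chunk boundary touches two chunks, so a client would need location information for both.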
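CHUNK SIZE
• Advantages
o Reduces client-master interaction
o Reduces the size of the metadata
• Disadvantages
o Hot spots: a small file that occupies a single chunk can overload its chunk servers when many clients access it
Solution: a higher replication factor for such files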
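A rough calculation shows why a large chunk size keeps the metadata small. It assumes the bound reported in the GFS paper (not on these slides) of less than 64 bytes of master metadata per 64 MB chunk; the 4 KB block size used for comparison is also an assumption, chosen as a typical local file system value.

```python
CHUNK_SIZE = 64 * 2**20     # 64 MB chunk
BLOCK_SIZE = 4 * 2**10      # 4 KB block (assumed, for comparison only)
METADATA_PER_CHUNK = 64     # bytes; upper bound taken from the GFS paper


def units_needed(data_bytes: int, unit_size: int) -> int:
    """Number of fixed-size units needed to hold data_bytes (ceiling division)."""
    return -(-data_bytes // unit_size)


one_pib = 2**50
chunks = units_needed(one_pib, CHUNK_SIZE)
blocks = units_needed(one_pib, BLOCK_SIZE)
metadata_bytes = chunks * METADATA_PER_CHUNK

print(chunks)                  # 16777216 chunks for 1 PiB of data
print(blocks // chunks)        # 16384 times fewer units to track than 4 KB blocks
print(metadata_bytes / 2**30)  # 1.0, i.e. at most about 1 GiB of chunk metadata
```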
MASTER
• A single, centralized master
• Stores all metadata:
o File namespace
o File-to-chunk mappings
o Chunk location information
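The three kinds of metadata can be pictured as two in-memory tables. This is only a sketch: the class `MasterMetadata` and its methods are hypothetical names, and the `report` method reflects the GFS paper's statement that chunk locations are learned from the chunk servers rather than stored persistently by the master.

```python
from dataclasses import dataclass, field


@dataclass
class MasterMetadata:
    """Hypothetical in-memory tables; all names are illustrative."""
    # File namespace plus file-to-chunk mapping: path -> ordered chunk handles.
    files: dict[str, list[int]] = field(default_factory=dict)
    # Chunk location information: chunk handle -> servers holding a replica.
    locations: dict[int, set[str]] = field(default_factory=dict)
    next_handle: int = 0

    def create(self, path: str) -> None:
        if path in self.files:
            raise FileExistsError(path)
        self.files[path] = []

    def add_chunk(self, path: str) -> int:
        """Allocate a new chunk handle and append it to the file."""
        handle = self.next_handle
        self.next_handle += 1
        self.files[path].append(handle)
        self.locations[handle] = set()
        return handle

    def report(self, server: str, handles: list[int]) -> None:
        """A chunk server reports which chunks it currently holds."""
        for handle in handles:
            self.locations.setdefault(handle, set()).add(server)

    def lookup(self, path: str, chunk_index: int) -> tuple[int, set[str]]:
        """Answer a client request: which chunk, and where are its replicas?"""
        handle = self.files[path][chunk_index]
        return handle, self.locations[handle]
```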
GFS ARCHITECTURE
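SYSTEM INTERACTIONS
Write control and data flow:
1. The client asks the master which chunk server holds the current lease for the chunk.
2. The master replies with the identity of the primary and the locations of the other replicas (cached by the client).
3. The client pushes the data to all replicas (3a, 3b, 3c).
4. The client sends a write request to the primary. The primary assigns serial numbers to the mutations and applies them.
5. The primary forwards the write request to the secondary replicas.
6. The secondaries reply to the primary: operation completed.
7. The primary replies to the client: operation completed, or an error report.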
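The write path, in which data is pushed to every replica first and the primary then fixes the order of mutations, can be simulated in a few lines. This is a single-process sketch under simplifying assumptions (one chunk per server, no failures, no lease expiry); every class and function name here is hypothetical.

```python
import itertools


class ChunkServer:
    def __init__(self, name):
        self.name = name
        self.pending = {}         # data id -> bytes pushed but not yet applied
        self.chunk = bytearray()  # contents of the single chunk in this sketch
        self.applied = []         # serial numbers, in the order they were applied

    def push(self, data_id, data):
        """Data flow: buffer the data until the write request arrives."""
        self.pending[data_id] = data

    def apply(self, serial, data_id, offset):
        """Control flow: apply a buffered mutation at the given offset."""
        data = self.pending.pop(data_id)
        end = offset + len(data)
        if len(self.chunk) < end:
            self.chunk.extend(bytes(end - len(self.chunk)))
        self.chunk[offset:end] = data
        self.applied.append(serial)
        return True


class Primary(ChunkServer):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries
        self.serials = itertools.count()

    def write(self, data_id, offset):
        serial = next(self.serials)          # assign a serial number
        self.apply(serial, data_id, offset)  # apply locally
        acks = [s.apply(serial, data_id, offset) for s in self.secondaries]
        return "completed" if all(acks) else "error"


class Master:
    def __init__(self, leases):
        self.leases = leases  # chunk handle -> Primary holding the lease

    def lookup(self, handle):
        primary = self.leases[handle]
        return primary, primary.secondaries


_data_ids = itertools.count()


def client_write(master, handle, data, offset):
    primary, secondaries = master.lookup(handle)  # ask master, get replicas
    data_id = next(_data_ids)
    for replica in (primary, *secondaries):       # push data to all replicas
        replica.push(data_id, data)
    return primary.write(data_id, offset)         # write request to primary
```

Because every replica applies mutations in the serial order chosen by the primary, all replicas end up with identical chunk contents.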
SYSTEM INTERACTIONS
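• Record append
- The client specifies only the data; GFS chooses the offset, appends the record atomically at least once, and returns the offset to the client
• Snapshot
- Makes a copy of a file or a directory tree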
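How the system picks the offset for a record append can be sketched as follows. The padding rule and the limit of one quarter of the chunk size come from the GFS paper, not from these slides; in the paper the client retries on the next chunk, whereas this sketch folds the retry into one call. The class name `AppendOnlyFile` is hypothetical.

```python
CHUNK_SIZE = 64 * 2**20
MAX_RECORD = CHUNK_SIZE // 4  # record size limit described in the GFS paper


class AppendOnlyFile:
    def __init__(self):
        self.chunk_fill = [0]  # bytes used in each chunk; the last is current

    def record_append(self, record_len: int) -> tuple[int, int]:
        """Return the (chunk index, offset in chunk) chosen for the record."""
        if not 0 < record_len <= MAX_RECORD:
            raise ValueError("record length out of range")
        used = self.chunk_fill[-1]
        if used + record_len > CHUNK_SIZE:
            # Record does not fit: pad the current chunk and move to a new one.
            self.chunk_fill[-1] = CHUNK_SIZE
            self.chunk_fill.append(0)
            used = 0
        self.chunk_fill[-1] = used + record_len
        return len(self.chunk_fill) - 1, used
```

The caller never supplies an offset; it only learns where the record landed, which is what lets many clients append to the same file without coordinating among themselves.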
OPERATION LOG
• Historical record of critical metadata changes
• Defines the order of concurrent operations
• Critical, so it is:
- Replicated on multiple remote machines
- Flushed to disk locally and remotely before the master responds to a client
• Fast recovery by using checkpoints
- Checkpoints use a compact B-tree-like form that can be mapped directly into memory
- The master switches to a new log file and creates the new checkpoint in a separate thread
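Recovery from a checkpoint plus the operation log amounts to loading the checkpointed state and replaying only the log records written after it. The record format and the function names below are assumptions made for illustration; the real on-disk format is not described on the slides.

```python
def apply_record(state: dict, record: tuple) -> None:
    """Apply one logged metadata change to the in-memory state."""
    op, path = record[0], record[1]
    if op == "create":
        state[path] = []
    elif op == "add_chunk":
        state[path].append(record[2])
    elif op == "delete":
        state.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def recover(checkpoint: tuple, log: list) -> dict:
    """Rebuild state from (last_seq, saved_state) and a list of (seq, record)."""
    last_seq, saved = checkpoint
    # Copy the checkpointed state so replay does not mutate the checkpoint.
    state = {path: list(chunks) for path, chunks in saved.items()}
    for seq, record in log:
        if seq > last_seq:  # older records are already in the checkpoint
            apply_record(state, record)
    return state
```

The more recent the checkpoint, the fewer records need replaying, which is why checkpointing keeps recovery fast.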
MASTER OPERATIONS
• Namespace Management and Locking
• Chunk Creation
• Chunk Re-replication
• Chunk Rebalancing
• Garbage Collection
FAULT TOLERANCE AND DIAGNOSIS
1. High Availability
The overall system is kept highly available with two simple yet effective strategies: fast recovery and replication.
1.1 Fast Recovery: The master and the chunk servers are designed to restore their state and restart in a few seconds.
1.2 Chunk Replication: Each chunk is replicated on multiple chunk servers on different racks.
1.3 Master Replication:
• A log of all changes made to the metadata is kept.
• The log is replicated on multiple machines.
• “Shadow” masters provide read-only access when the “real” master is down.
2. Data Integrity
Each chunk has associated checksums: one 32-bit checksum per 64 KB block.
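A minimal sketch of block-level checksumming follows. It assumes the layout described in the GFS paper (a chunk split into 64 KB blocks, each with a 32-bit checksum) and uses CRC-32 from Python's `zlib` as a stand-in, since the source does not name the checksum algorithm. The function names are hypothetical.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB checksum block, per the GFS paper


def block_checksums(chunk: bytes) -> list[int]:
    """Compute one 32-bit checksum per 64 KB block of the chunk."""
    return [
        zlib.crc32(chunk[i:i + BLOCK_SIZE])
        for i in range(0, len(chunk), BLOCK_SIZE)
    ]


def verify_range(chunk: bytes, checksums: list[int], offset: int, length: int) -> bool:
    """Check every block that overlaps the byte range before serving a read."""
    if length < 1:
        raise ValueError("length must be at least 1")
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    for b in range(first, last + 1):
        block = chunk[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE]
        if zlib.crc32(block) != checksums[b]:
            return False
    return True
```

Only the blocks that overlap the requested range are verified, so corruption elsewhere in the chunk does not block an unrelated read.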
3. Diagnostic Logging
Logs record the details of interactions between machines (the exact requests and responses sent on the wire, except for the file data being transferred).
MEASUREMENTS
Performance was measured on a GFS cluster consisting of one master, two master replicas, 16 chunk servers, and 16 clients.
All machines are configured with:
1. Dual 1.4 GHz PIII processors
2. 2 GB of memory
3. Two 80 GB 5400 rpm disks
4. A 100 Mbps full-duplex Ethernet connection to an HP 2524 switch
The record append rate also drops as the number of clients increases to 16, mostly due to congestion and variance in the network transfer rates seen by different clients.
REAL WORLD CLUSTERS
Table 1: Characteristics of two GFS clusters
Table 2: Performance metrics for clusters A and B
RESULTS
1. Read and Write Rates
• The average write rate was less than 30 MB/s.
• When the measurements were taken, B was in the middle of a burst of write activity.
• Read rates were much higher; both clusters were in the middle of heavy read activity.
• A was using its resources more efficiently than B.
2. Master Load
The master can easily keep up with the observed rate of about 200 to 500 operations per second.
3. Recovery Time
• A single chunk server (15,000 chunks containing 600 GB of data) was killed in cluster B.
• All chunks were restored in 23.2 minutes, at an effective replication rate of 440 MB/s.
• In a second experiment, two chunk servers were killed (each with roughly 16,000 chunks and 660 GB of data).
• The failure reduced 266 chunks to a single replica.
• These 266 chunks were cloned at a higher priority and all restored to at least two replicas within 2 minutes, putting the cluster in a state where it could tolerate another chunk server failure without data loss.
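The 440 MB/s figure quoted for the single-server experiment can be sanity-checked from the other two numbers on that slide. The calculation assumes 1 GB = 1024 MB and treats 600 GB and 23.2 minutes as exact, although both are rounded in the source.

```python
def effective_rate_mb_per_s(data_gb: float, minutes: float) -> float:
    """Average replication rate in MB/s, assuming 1 GB = 1024 MB."""
    return data_gb * 1024 / (minutes * 60)


rate = effective_rate_mb_per_s(600, 23.2)
print(round(rate))  # 441, consistent with the quoted 440 MB/s
```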
WORKLOAD BREAKDOWN
Clusters X and Y illustrate the workload breakdown on two GFS clusters. Cluster X is used for research and development, while cluster Y is used for production data processing.
Operations Breakdown by Size
Table 3: Operations breakdown by size (%)
Bytes Transferred Breakdown by Operation Size
Table 4: Bytes transferred breakdown by operation size (%)
Master Requests Breakdown by Type
Table 5: Master requests breakdown by type (%)
CONCLUSIONS
• GFS demonstrates the qualities essential for supporting large-scale data processing workloads on commodity hardware.
• It provides fault tolerance through constant monitoring, replication of crucial data, and fast, automatic recovery.
• It delivers high aggregate throughput to many concurrent readers and writers by separating file system control from data transfer.
Thank You.
Q and A

More Related Content

What's hot

The Google File System (GFS)
The Google File System (GFS)The Google File System (GFS)
The Google File System (GFS)Romain Jacotin
 
Introduction to distributed file systems
Introduction to distributed file systemsIntroduction to distributed file systems
Introduction to distributed file systemsViet-Trung TRAN
 
Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS  Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS Dr Neelesh Jain
 
4.file service architecture
4.file service architecture4.file service architecture
4.file service architectureAbDul ThaYyal
 
Data storage in cloud computing
Data storage in cloud computingData storage in cloud computing
Data storage in cloud computingjamunaashok
 
Ntfs and computer forensics
Ntfs and computer forensicsNtfs and computer forensics
Ntfs and computer forensicsGaurav Ragtah
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File Systemtutchiio
 
Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems confluent
 
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersMapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersAshraf Uddin
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 

What's hot (20)

The Google File System (GFS)
The Google File System (GFS)The Google File System (GFS)
The Google File System (GFS)
 
Google file system
Google file systemGoogle file system
Google file system
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
Google file system
Google file systemGoogle file system
Google file system
 
Introduction to distributed file systems
Introduction to distributed file systemsIntroduction to distributed file systems
Introduction to distributed file systems
 
Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS  Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS
 
Gfs vs hdfs
Gfs vs hdfsGfs vs hdfs
Gfs vs hdfs
 
GFS
GFSGFS
GFS
 
4.file service architecture
4.file service architecture4.file service architecture
4.file service architecture
 
HDFS Architecture
HDFS ArchitectureHDFS Architecture
HDFS Architecture
 
GFS & HDFS Introduction
GFS & HDFS IntroductionGFS & HDFS Introduction
GFS & HDFS Introduction
 
Data storage in cloud computing
Data storage in cloud computingData storage in cloud computing
Data storage in cloud computing
 
HDFS Federation
HDFS FederationHDFS Federation
HDFS Federation
 
Ntfs and computer forensics
Ntfs and computer forensicsNtfs and computer forensics
Ntfs and computer forensics
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File System
 
Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems
 
GOOGLE BIGTABLE
GOOGLE BIGTABLEGOOGLE BIGTABLE
GOOGLE BIGTABLE
 
Bigtable and Dynamo
Bigtable and DynamoBigtable and Dynamo
Bigtable and Dynamo
 
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large ClustersMapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 

Viewers also liked

Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Antonio Cesarano
 
Research Papers Recommender based on Digital Repositories Metadata
Research Papers Recommender based on Digital Repositories MetadataResearch Papers Recommender based on Digital Repositories Metadata
Research Papers Recommender based on Digital Repositories MetadataRicard de la Vega
 
Data mining paper survey for Health Care Support System
Data mining paper survey for Health Care Support SystemData mining paper survey for Health Care Support System
Data mining paper survey for Health Care Support System鴻鈞 王
 
Teaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data miningTeaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data miningNathan Rinne
 
Google file system
Google file systemGoogle file system
Google file systemDhan V Sagar
 
Preservaçao digital de tese e dissertaçoes
Preservaçao digital de tese e dissertaçoesPreservaçao digital de tese e dissertaçoes
Preservaçao digital de tese e dissertaçoesRicard de la Vega
 
Research in data mining
Research in data miningResearch in data mining
Research in data miningHouw Liong The
 
Cloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
Cloud infrastructure. Google File System and MapReduce - Andrii VozniukCloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
Cloud infrastructure. Google File System and MapReduce - Andrii VozniukAndrii Vozniuk
 
Social Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & AnalysisSocial Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & AnalysisInfini Graph
 
Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...
Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...
Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...shibbirtanvin
 
Mining the Social Web to Analyze the Impact of Social Media on Socialization,...
Mining the Social Web to Analyze the Impact of Social Media on Socialization,...Mining the Social Web to Analyze the Impact of Social Media on Socialization,...
Mining the Social Web to Analyze the Impact of Social Media on Socialization,...shibbirtanvin
 
Medical Informatics: Computational Analytics in Healthcare
Medical Informatics: Computational Analytics in HealthcareMedical Informatics: Computational Analytics in Healthcare
Medical Informatics: Computational Analytics in HealthcareNUS-ISS
 
Emotion detection from text using data mining and text mining
Emotion detection from text using data mining and text miningEmotion detection from text using data mining and text mining
Emotion detection from text using data mining and text miningSakthi Dasans
 
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...Sunil Nair
 
Recommender Systems - A Review and Recent Research Trends
Recommender Systems  -  A Review and Recent Research TrendsRecommender Systems  -  A Review and Recent Research Trends
Recommender Systems - A Review and Recent Research TrendsSujoy Bag
 
Slide-show on Biometrics
Slide-show on BiometricsSlide-show on Biometrics
Slide-show on BiometricsPathik504
 

Viewers also liked (17)

Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...
 
Panel at AMIA 2013 Conference on big data - The Exposome and the quantified s...
Panel at AMIA 2013 Conference on big data - The Exposome and the quantified s...Panel at AMIA 2013 Conference on big data - The Exposome and the quantified s...
Panel at AMIA 2013 Conference on big data - The Exposome and the quantified s...
 
Research Papers Recommender based on Digital Repositories Metadata
Research Papers Recommender based on Digital Repositories MetadataResearch Papers Recommender based on Digital Repositories Metadata
Research Papers Recommender based on Digital Repositories Metadata
 
Data mining paper survey for Health Care Support System
Data mining paper survey for Health Care Support SystemData mining paper survey for Health Care Support System
Data mining paper survey for Health Care Support System
 
Teaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data miningTeaching with Google Books: research, copyright, and data mining
Teaching with Google Books: research, copyright, and data mining
 
Google file system
Google file systemGoogle file system
Google file system
 
Preservaçao digital de tese e dissertaçoes
Preservaçao digital de tese e dissertaçoesPreservaçao digital de tese e dissertaçoes
Preservaçao digital de tese e dissertaçoes
 
Research in data mining
Research in data miningResearch in data mining
Research in data mining
 
Cloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
Cloud infrastructure. Google File System and MapReduce - Andrii VozniukCloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
Cloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
 
Social Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & AnalysisSocial Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & Analysis
 
Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...
Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...
Preprocessing of Academic Data for Mining Association Rule, Presentation @WAD...
 
Mining the Social Web to Analyze the Impact of Social Media on Socialization,...
Mining the Social Web to Analyze the Impact of Social Media on Socialization,...Mining the Social Web to Analyze the Impact of Social Media on Socialization,...
Mining the Social Web to Analyze the Impact of Social Media on Socialization,...
 
Medical Informatics: Computational Analytics in Healthcare
Medical Informatics: Computational Analytics in HealthcareMedical Informatics: Computational Analytics in Healthcare
Medical Informatics: Computational Analytics in Healthcare
 
Emotion detection from text using data mining and text mining
Emotion detection from text using data mining and text miningEmotion detection from text using data mining and text mining
Emotion detection from text using data mining and text mining
 
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
 
Recommender Systems - A Review and Recent Research Trends
Recommender Systems  -  A Review and Recent Research TrendsRecommender Systems  -  A Review and Recent Research Trends
Recommender Systems - A Review and Recent Research Trends
 
Slide-show on Biometrics
Slide-show on BiometricsSlide-show on Biometrics
Slide-show on Biometrics
 

Similar to Google File System

storage-systems.pptx
storage-systems.pptxstorage-systems.pptx
storage-systems.pptxShimoFcis
 
advanced Google file System
advanced Google file Systemadvanced Google file System
advanced Google file Systemdiptipan
 
Google File System
Google File SystemGoogle File System
Google File SystemDreamJobs1
 
Gfs and map redusing
Gfs and map redusingGfs and map redusing
Gfs and map redusingilashanawaz
 
Google - Bigtable
Google - BigtableGoogle - Bigtable
Google - Bigtable영원 서
 
Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0Norberto Leite
 
Advance google file system
Advance google file systemAdvance google file system
Advance google file systemLalit Rastogi
 
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...areej qasrawi
 
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...Imperva Incapsula
 
MapReduce presentation
MapReduce presentationMapReduce presentation
MapReduce presentationVu Thi Trang
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia DatabasesJaime Crespo
 

Similar to Google File System (20)

The Google file system
The Google file systemThe Google file system
The Google file system
 
storage-systems.pptx
storage-systems.pptxstorage-systems.pptx
storage-systems.pptx
 
advanced Google file System
advanced Google file Systemadvanced Google file System
advanced Google file System
 
Google File System
Google File SystemGoogle File System
Google File System
 
Dba tuning
Dba tuningDba tuning
Dba tuning
 
os
osos
os
 
Gfs and map redusing
Gfs and map redusingGfs and map redusing
Gfs and map redusing
 
Lalit
LalitLalit
Lalit
 
Gfs sosp2003
Gfs sosp2003Gfs sosp2003
Gfs sosp2003
 
Gfs
GfsGfs
Gfs
 
Google - Bigtable
Google - BigtableGoogle - Bigtable
Google - Bigtable
 
Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0
 
Advance google file system
Advance google file systemAdvance google file system
Advance google file system
 
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...
 
UNIT-2 OS.pptx
UNIT-2 OS.pptxUNIT-2 OS.pptx
UNIT-2 OS.pptx
 
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
 
MapReduce presentation
MapReduce presentationMapReduce presentation
MapReduce presentation
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 

Recently uploaded

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?

Google File System

  • 1. THE GOOGLE FILE SYSTEM. By Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
  • 2. INTRODUCTION: Google applications process lots of data and need a good file system. Solution: the Google File System, a large, distributed, highly fault-tolerant file system.
  • 3. DESIGN MOTIVATIONS: 1. Fault tolerance and auto-recovery need to be built into the system. 2. Standard I/O assumptions (e.g. block size) have to be re-examined. 3. Record appends are the prevalent form of writing. 4. Google applications and GFS should be co-designed.
  • 4. INTERFACE: Create, Delete, Open, Close, Read, Write, Snapshot, Record Append
  • 5. GFS ARCHITECTURE: On a single-machine FS, an upper layer maintains the metadata and a lower layer (i.e. disk) stores the data in units called “blocks”. In GFS, a master process maintains the metadata and a lower layer (i.e. a set of chunk servers) stores the data in units called “chunks”.
  • 7. CHUNK: Analogous to a block, except larger. Size: 64 MB. Stored on a chunk server as a file. The chunk handle (the chunk's file name) is used to reference the chunk. Replicated across multiple chunk servers.
  • 8. CHUNK SIZE: Advantages: reduces client-master interaction; reduces the size of the metadata. Disadvantage: hot spots. Solution: a higher replication factor.
  • 9. MASTER: A single, centralized master stores all metadata: the file namespace, file-to-chunk mappings, and chunk location information.
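  • 11. SYSTEM INTERACTIONS (write control and data flow): 1. The client asks the master for the current lease holder. 2. The master replies with the identity of the primary and the location of the replicas (cached by the client). 3a-3c. The client pushes the data to the replicas. 4. The client sends the write request to the primary, which assigns an order to the mutations and applies them. 5. The primary forwards the write request to the secondary replicas. 6. The secondaries reply “operation completed”. 7. The primary replies to the client with “operation completed” or an error report.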
  • 12. SYSTEM INTERACTIONS: Record append: the client specifies only the data. Snapshot: makes a copy of a file or a directory tree.
  • 13. OPERATION LOG: A historical record of critical metadata changes. Defines the order of concurrent operations. Critical, so it is replicated on multiple remote machines. The master responds to a client only after the log record is written both locally and remotely. Fast recovery by using checkpoints, which use a compact B-tree-like form that maps directly into memory. The master switches to a new log and creates new checkpoints in a separate thread.
  • 14. MASTER OPERATIONS: Namespace Management and Locking, Chunk Creation, Chunk Re-replication, Chunk Rebalancing, Garbage Collection
  • 15. FAULT TOLERANCE AND DIAGNOSIS: 1. High Availability. The overall system is kept highly available with two simple yet effective strategies: fast recovery and replication.
  • 16. 1.1 Fast Recovery: the master and chunk servers are designed to restart and restore their state in a few seconds. 1.2 Chunk Replication: across multiple machines and across multiple racks.
  • 17. 1.3 Master Replication: A log of all changes made to metadata. The log is replicated on multiple machines. “Shadow” masters serve reads if the “real” master is down.
  • 19. 2. Data Integrity: each chunk has an associated checksum. 3. Diagnostic Logging: logs record the details of interactions between machines (the exact requests and responses sent on the wire, except the data being transferred).
  • 20. MEASUREMENTS: Performance was measured on a GFS cluster consisting of one master, two master replicas, 16 chunk servers, and 16 clients.
  • 21. All machines are configured with: 1. Dual 1.4 GHz PIII processors. 2. 2 GB of memory. 3. Two 80 GB 5400 rpm disks. 4. A 100 Mbps full-duplex Ethernet connection to an HP 2524 switch.
  • 24. Here too, the rate drops as the number of clients increases to 16: the append rate drops due to congestion and variance in the network transfer rates seen by different clients.
  • 25. REAL WORLD CLUSTERS: Table 1: Characteristics of two GFS clusters
  • 26. Table 2: Performance metrics for clusters A and B
  • 27. RESULTS: 1. Read and Write Rates. The average write rate was 30 MB/s. When the measurements were taken, B was in the middle of a write. Read rates were high; both clusters were in the middle of heavy read activity. A is using its resources more efficiently than B.
  • 28. 2. Master Load: the master can easily keep up with 200 to 500 operations per second.
  • 29. 3. Recovery Time: Killed a single chunk server (15,000 chunks containing 600 GB of data) in cluster B. All chunks were re-replicated in 23.2 minutes, at an effective replication rate of 440 MB/s.
  • 30. Killed two chunk servers (each with roughly 16,000 chunks and 660 GB of data). The failure reduced 266 chunks to a single replica.
  • 31. These 266 chunks were cloned at a higher priority and were all restored within 2 minutes, putting the cluster in a state where it could tolerate another chunk server failure.
  • 32. WORKLOAD BREAKDOWN: Clusters X and Y are used to show the breakdown of the workloads on two GFS clusters. Cluster X is for research and development, while Y is for production data processing.
  • 33. Table 3: Operations breakdown by size (%)
  • 34. Table 4: Bytes transferred breakdown by operation size (%)
  • 35. Table 5: Master requests breakdown by type (%)
  • 36. CONCLUSIONS: GFS demonstrates the qualities essential for supporting large-scale data processing workloads on commodity hardware. It provides fault tolerance through constant monitoring, replication of crucial data, and fast, automatic recovery. It delivers high aggregate throughput to many concurrent readers and writers by separating file system control from data transfer.
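
Two of the numbers in the slides can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not code from GFS: the function names `chunk_index` and `replication_rate_mb_per_s` are hypothetical, and it assumes 1 GB = 1024 MB. It shows how a byte offset within a file maps to a 64 MB chunk, and that re-replicating 600 GB in 23.2 minutes is consistent with the quoted rate of roughly 440 MB/s.

```python
# Illustrative sketch only; names are hypothetical, not part of GFS.
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunk size, as stated on the CHUNK slide


def chunk_index(byte_offset: int) -> int:
    """Return the index of the chunk that holds the given byte offset of a file."""
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    return byte_offset // CHUNK_SIZE


def replication_rate_mb_per_s(data_gb: float, minutes: float) -> float:
    """Effective rate in MB/s, assuming 1 GB = 1024 MB (an assumption)."""
    return data_gb * 1024 / (minutes * 60)


if __name__ == "__main__":
    # An offset of 200 MB falls in the fourth chunk (index 3).
    print(chunk_index(200 * 1024 * 1024))
    # 600 GB restored in 23.2 minutes, close to the 440 MB/s on slide 29.
    print(round(replication_rate_mb_per_s(600, 23.2), 1))
```

With decimal units (1 GB = 1000 MB) the same calculation gives about 431 MB/s, so the slide's figure appears to use binary units.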

Editor's Notes

  1. Component failures are accepted as normal in a system this large. In regular operation, files ranging from KB-sized to TB-sized are all supported by the system. Most files are mutated by appending rather than by overwriting existing data. It helps to have a file system that does not impose a burden on the applications.
  2. 1. The system supports the usual operations; GFS also has snapshot and record append operations. 2. Snapshot creates a copy of a file or directory tree at low cost. 3. Record append allows multiple clients to append data to the same file concurrently.
  3. 1. Files are divided into fixed-size chunks. 2. The chunk handle is an immutable, globally unique 64-bit identifier assigned at the time of chunk creation. 3. By default, each chunk is stored on three chunk servers.
  4. Advantages of a larger chunk size: fewer read and write interactions between the master and the client; a client is more likely to perform many operations on a given chunk.
  5. To keep itself informed, a shadow master reads a replica of the growing operation log and applies the same changes to its data structures, exactly as the primary does. It exchanges handshake messages with chunkservers to monitor their status. It depends on the primary master only for replica location updates that result from the primary's decisions to create and delete replicas.
  6. Logs can be used to reconstruct the entire interaction history to diagnose a problem. They also serve as traces for load testing and performance analysis.
  7. For one client, the read rate is 10 MB/s, 80% of the estimated value. For 16 clients, it is 94 MB/s, i.e. 6 MB/s per client, 75% of the estimated value.
  8. The write rate for one client is 6.3 MB/s, about half of the estimated value (12.5 MB/s). The aggregate write rate for 16 clients is 35 MB/s, or 2.2 MB/s per client, about half of the estimated value.
  9. The record append rate is 6.0 MB/s for one client and 4.8 MB/s for 16 clients.
  10. A is used for research and development. It reads through a few MBs to TBs of data, analyzes or processes them, and writes the results. B is used for production data processing; its tasks last much longer. Metadata at chunkservers: checksums and chunk version numbers. Metadata at the master is small and does not limit the system's capacity: file names in compressed form, ownership and permissions, the mapping from files to chunks, each chunk's current version, replica locations, etc. Recovery is fast.
  11. A can support up to 750 MB/s and is using 580 MB/s. B can support 1300 MB/s but is using only 380 MB/s.
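
The percentages quoted in notes 7, 8, and 11 can be reproduced with a short calculation. This is a rough sketch under stated assumptions, not part of the original deck: the helper `utilization` is hypothetical, the per-client limit of 12.5 MB/s is derived from the 100 Mbps link (it matches the figure given in note 8), and the 125 MB/s aggregate read limit is an assumption that the notes do not state explicitly.

```python
# Illustrative sketch only; the helper name and the aggregate limit are assumptions.
LINK_MBPS = 100                      # per-machine Ethernet link, from slide 21
PER_CLIENT_LIMIT = LINK_MBPS / 8     # 12.5 MB/s, the estimate quoted in note 8
AGGREGATE_READ_LIMIT = 125.0         # MB/s; assumed, not stated in the notes


def utilization(measured_mb_s: float, limit_mb_s: float) -> float:
    """Return measured throughput as a percentage of the estimated limit."""
    return 100 * measured_mb_s / limit_mb_s


if __name__ == "__main__":
    print(round(utilization(10, PER_CLIENT_LIMIT)))      # one reader: 80
    print(round(utilization(94, AGGREGATE_READ_LIMIT)))  # 16 readers: 75
    print(round(utilization(6.3, PER_CLIENT_LIMIT)))     # one writer: 50
    print(round(94 / 16, 1))                             # MB/s per reader: 5.9
    print(round(utilization(580, 750)))                  # cluster A: 77
    print(round(utilization(380, 1300)))                 # cluster B: 29
```

The last two lines back the claim on slide 27: cluster A runs at about 77% of what it can support, while cluster B runs at about 29%.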