SlideShare a Scribd company logo
1 of 46
Lecture 2 ā€“ Distributed Filesystems 922EU3870 ā€“ Cloud Computing and Mobile Platforms, Autumn 2009 2009/9/21 Ping Yeh ( č‘‰å¹³ ), Google, Inc.
Outline ,[object Object]
Filesystems overview
Distributed file systems ,[object Object]
Shared storage (example: Global FS)
Wide-area (example: AFS)
Fault-tolerant (example: Coda)
Parallel (example: Lustre)
Fault-tolerant and Parallel (example: dCache) ,[object Object]
Homework
Numbers real world engineers should know L1 cache reference 0.5 ns Branch mispredict 5  ns L2 cache reference 7  ns Mutex lock/unlock 100  ns Main memory reference 100  ns Compress 1 KB with Zippy 10,000  ns Send 2 KB through 1 Gbps network 20,000  ns Read 1 MB sequentially from memory 250,000  ns Round trip within the same data center 500,000  ns Disk seek 10,000,000  ns Read 1 MB sequentially from network 10,000,000  ns Read 1 MB sequentially from disk 30,000,000  ns Round trip between California and Netherlands 150,000,000  ns
The Joys of Real Hardware Typical first year for a new cluster: ~0.5  overheating  (power down most machines in <5 mins, ~1-2 days to recover) ~1  PDU failure  (~500-1000 machines suddenly disappear, ~6 hours to come back) ~1  rack-move  (plenty of warning, ~500-1000 machines powered down, ~6 hours) ~1  network rewiring  (rolling ~5% of machines down over 2-day span) ~20  rack failures  (40-80 machines instantly disappear, 1-6 hours to get back) ~5  racks go wonky  (40-80 machines see 50% packetloss) ~8  network maintenances  (4 might cause ~30-minute random connectivity losses) ~12  router reloads  (takes out DNS and external vips for a couple minutes) ~3  router failures  (have to immediately pull traffic for an hour) ~dozens of minor  30-second blips for dns ~1000  individual machine failures ~thousands of  hard drive failures slow disks, bad memory, misconfigured machines, flaky machines,  etc.
File Systems Overview ,[object Object]
Usually layered on top of a lower-level physical storage medium
Divided into logical units called ā€œfilesā€ ,[object Object],[object Object],[object Object]
A path is the expression that joins directories and filename to form a unique ā€œfull nameā€ for a file. ,[object Object]
The set of valid paths form the  namespace  of the file system.
What Gets Stored ,[object Object]
Also includes meta-data on a volume-wide and per-file basis: Volume-wide: Available space Formatting info character set ... Per-file: name owner modification date physical layout...
High-Level Organization ,[object Object]
One directory acts as the ā€œrootā€
ā€œlinksā€ (symlinks, shortcuts, etc) provide simple means of providing multiple access paths to one file
Other file systems can be ā€œmountedā€ and dropped in as sub-hierarchies (other drives, network shares)
Typical operations on a file: create, delete, rename, open, close, read, write, append. ,[object Object]
Low-Level Organization (1/2) ,[object Object]
File descriptors + meta-data stored in inodes (Un*x) ,[object Object]
Tells how to look up file contents ,[object Object]
Low-Level Organization (2/2) ,[object Object]
Viewed as a sequential array of blocks
Usually address ~1 KB chunk at a time
Tree structure is ā€œflattenedā€ into blocks
Overlapping writes/deletes can cause fragmentation: files are often not stored with a linear layout ,[object Object]
Fragmentation
Filesystem Design Considerations ,[object Object]
Consistency: what to do when more than one user reads/writes on the same file?
Security: who can do what to a file? Authentication/ACL
Reliability: can files not be damaged at power outage or other hardware failures?
Local Filesystems on Unix-like Systems ,[object Object]
Namespace: root directory ā€œ/ā€, followed by directories and files.
Consistency: ā€œsequential consistencyā€, newly written data are immediately visible to open reads (if...)
Security: ,[object Object]
kerberos: tickets ,[object Object],[object Object],computer
Namespace ,[object Object]
/mnt/disk1, /mnt/disk2, ā€¦ when you have multiple disks ,[object Object],[object Object]
dynamical resizing by adding/removing disks without reboot
splitting/merging volumes as long as no data spans the split

More Related Content

What's hot

4.file service architecture
4.file service architecture4.file service architecture
4.file service architecture
AbDul ThaYyal
Ā 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
Ashish Kumar
Ā 
Market oriented Cloud Computing
Market oriented Cloud ComputingMarket oriented Cloud Computing
Market oriented Cloud Computing
Jithin Parakka
Ā 
MapReduce in Cloud Computing
MapReduce in Cloud ComputingMapReduce in Cloud Computing
MapReduce in Cloud Computing
Mohammad Mustaqeem
Ā 

What's hot (20)

File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing models
Ā 
DDBMS Paper with Solution
DDBMS Paper with SolutionDDBMS Paper with Solution
DDBMS Paper with Solution
Ā 
Parallel Database
Parallel DatabaseParallel Database
Parallel Database
Ā 
Directory services
Directory servicesDirectory services
Directory services
Ā 
4.file service architecture
4.file service architecture4.file service architecture
4.file service architecture
Ā 
Multiversion Concurrency Control Techniques
Multiversion Concurrency Control TechniquesMultiversion Concurrency Control Techniques
Multiversion Concurrency Control Techniques
Ā 
Distributed concurrency control
Distributed concurrency controlDistributed concurrency control
Distributed concurrency control
Ā 
2. Distributed Systems Hardware & Software concepts
2. Distributed Systems Hardware & Software concepts2. Distributed Systems Hardware & Software concepts
2. Distributed Systems Hardware & Software concepts
Ā 
Database System Architectures
Database System ArchitecturesDatabase System Architectures
Database System Architectures
Ā 
Database replication
Database replicationDatabase replication
Database replication
Ā 
Distributed Systems Naming
Distributed Systems NamingDistributed Systems Naming
Distributed Systems Naming
Ā 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
Ā 
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Ā 
Market oriented Cloud Computing
Market oriented Cloud ComputingMarket oriented Cloud Computing
Market oriented Cloud Computing
Ā 
Distributed system architecture
Distributed system architectureDistributed system architecture
Distributed system architecture
Ā 
MapReduce in Cloud Computing
MapReduce in Cloud ComputingMapReduce in Cloud Computing
MapReduce in Cloud Computing
Ā 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed System
Ā 
Replication in Distributed Database
Replication in Distributed DatabaseReplication in Distributed Database
Replication in Distributed Database
Ā 
Distributed Operating System_1
Distributed Operating System_1Distributed Operating System_1
Distributed Operating System_1
Ā 
Sun NFS , Case study
Sun NFS , Case study Sun NFS , Case study
Sun NFS , Case study
Ā 

Similar to Distributed File System

Distributed file systems
Distributed file systemsDistributed file systems
Distributed file systems
Sri Prasanna
Ā 
Distributed file systems (from Google)
Distributed file systems (from Google)Distributed file systems (from Google)
Distributed file systems (from Google)
Sri Prasanna
Ā 
Distributed computing seminar lecture 3 - distributed file systems
Distributed computing seminar   lecture 3 - distributed file systemsDistributed computing seminar   lecture 3 - distributed file systems
Distributed computing seminar lecture 3 - distributed file systems
tugrulh
Ā 
Dfs (Distributed computing)
Dfs (Distributed computing)Dfs (Distributed computing)
Dfs (Distributed computing)
Sri Prasanna
Ā 

Similar to Distributed File System (20)

009709863.pdf
009709863.pdf009709863.pdf
009709863.pdf
Ā 
Distributed file systems
Distributed file systemsDistributed file systems
Distributed file systems
Ā 
Hadoop Distributed File System for Big Data Analytics
Hadoop Distributed File System for Big Data AnalyticsHadoop Distributed File System for Big Data Analytics
Hadoop Distributed File System for Big Data Analytics
Ā 
Distributed file systems (from Google)
Distributed file systems (from Google)Distributed file systems (from Google)
Distributed file systems (from Google)
Ā 
Lec3 Dfs
Lec3 DfsLec3 Dfs
Lec3 Dfs
Ā 
Distributed computing seminar lecture 3 - distributed file systems
Distributed computing seminar   lecture 3 - distributed file systemsDistributed computing seminar   lecture 3 - distributed file systems
Distributed computing seminar lecture 3 - distributed file systems
Ā 
Dfs (Distributed computing)
Dfs (Distributed computing)Dfs (Distributed computing)
Dfs (Distributed computing)
Ā 
Magnetic disk - Krishna Geetha.ppt
Magnetic disk  - Krishna Geetha.pptMagnetic disk  - Krishna Geetha.ppt
Magnetic disk - Krishna Geetha.ppt
Ā 
Distributed File Systems
Distributed File SystemsDistributed File Systems
Distributed File Systems
Ā 
Xen server storage Overview
Xen server storage OverviewXen server storage Overview
Xen server storage Overview
Ā 
Posscon2013
Posscon2013Posscon2013
Posscon2013
Ā 
ch11
ch11ch11
ch11
Ā 
Hadoop
HadoopHadoop
Hadoop
Ā 
Presentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPresentation on nfs,afs,vfs
Presentation on nfs,afs,vfs
Ā 
I/O System and Case study
I/O System and Case studyI/O System and Case study
I/O System and Case study
Ā 
File system Os
File system OsFile system Os
File system Os
Ā 
FILE STRUCTURE IN DBMS
FILE STRUCTURE IN DBMSFILE STRUCTURE IN DBMS
FILE STRUCTURE IN DBMS
Ā 
SAN BASICS..Why we will go for SAN?
SAN BASICS..Why we will go for SAN?SAN BASICS..Why we will go for SAN?
SAN BASICS..Why we will go for SAN?
Ā 
Files and directories in Linux 6
Files and directories  in Linux 6Files and directories  in Linux 6
Files and directories in Linux 6
Ā 
linux file sysytem& input and output
linux file sysytem& input and outputlinux file sysytem& input and output
linux file sysytem& input and output
Ā 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
Ā 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Ā 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
Ā 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
Ā 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
Ā 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Ā 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organization
Ā 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
Ā 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
Ā 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
Ā 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
Ā 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Ā 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Ā 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Ā 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
Ā 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
Ā 

Distributed File System

Editor's Notes

  1. memory: 1 GHz bus * 32 bit * 1B/8b = 4 GB/s 1MB / 4 GBps = 1/4 ms = 250 micro-seconds network: 1 Gbps = 100 MB/s 1MB / 100 MBps = 0.01 s = 10 ms
  2. 1.NAL-Network Abstraction Layers 2.Drivers: ext2 OBD, OBD filter makes other file systems recognizable such as XFS, JFS, and Ext3 3.NIO portal API provides for interoperation with a variety of network transports through NAL 4.When a Lustre inode represents a file, the metadata merely holds reference to the file data obj. stored on the OSTā€™s