Giraffa 
A highly available, scalable, distributed file system 
PLAMEN JELIAZKOV & MILAN DESAI
Quick Introduction 
• Giraffa is a new file system. 
• Distributes its namespace by utilizing features of HDFS 
and HBase. 
• Open source project in experimental stage.
Design Principles 
• Linear scalability – more nodes can do more work within the same 
time. Scale data size and compute resources. 
• Reliability and availability – 1/1000 probability that a drive will fail 
today; on a large cluster with thousands of drives there can be 
several failures. 
• Move computation to data – minimize expensive data transfers. 
• Sequential data processing – avoid random reads. [Use HBase for 
random access].
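The reliability bullet above can be checked with a little arithmetic. The sketch below assumes drives fail independently, which the slide does not state; the function names are made up for illustration.

```python
def expected_daily_failures(drives: int, p_fail: float = 1 / 1000) -> float:
    """Expected number of drive failures per day (independence assumed)."""
    return drives * p_fail


def prob_at_least_one_failure(drives: int, p_fail: float = 1 / 1000) -> float:
    """Probability that at least one drive in the cluster fails today."""
    return 1.0 - (1.0 - p_fail) ** drives


# Cluster sizes are illustrative, not taken from the slides.
for drives in (1_000, 5_000, 10_000):
    print(drives,
          expected_daily_failures(drives),
          round(prob_at_least_one_failure(drives), 3))
```

With a few thousand drives the expected count is several failures per day, which is the point of the bullet.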
Scalability Limits 
• Single-master architecture: a constraining resource 
• Single NameNode limits linear performance growth – a few 
bad clients / jobs can saturate the NameNode. 
• Single point of failure – a NameNode outage takes the entire file 
system out of service. 
• NameNode space limit: 
-- 100 million files and 200 million blocks with 64GB RAM 
-- Restricts storage capacity to about 20 PB 
-- Small file problem: block-to-file ratio is shrinking as people 
store more small files in HDFS. 
These findings come from Konstantin Shvachko's 
“HDFS Scalability: The Limits to Growth”, USENIX ;login:, 2010.
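A back-of-the-envelope reading of the NameNode figures above. This only divides the slide's own numbers; it assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes), and the slide does not say whether the 20 PB is raw or post-replication capacity.

```python
FILES = 100_000_000            # from the slide
BLOCKS = 200_000_000           # from the slide
HEAP_BYTES = 64 * 10**9        # 64 GB, decimal units assumed
CAPACITY_BYTES = 20 * 10**15   # 20 PB, decimal units assumed

objects = FILES + BLOCKS                    # namespace objects held in RAM
bytes_per_object = HEAP_BYTES / objects     # heap budget per file or block
blocks_per_file = BLOCKS / FILES            # the ratio the small-file problem shrinks
avg_block_bytes = CAPACITY_BYTES / BLOCKS   # implied average data per block

print(f"{bytes_per_object:.0f} bytes of heap per namespace object")
print(f"{blocks_per_file:.1f} blocks per file")
print(f"{avg_block_bytes / 10**6:.0f} MB of data per block")
```

As the block-to-file ratio drops toward 1, the same heap addresses less data, which is why small files hurt.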
The Goals for Giraffa 
• Support millions of concurrent clients 
- More servers -> more concurrent connections can be accepted. 
• Store hundreds of billions of objects 
- More servers -> higher total memory. 
• Maintain Exabyte total storage capacity 
- More servers -> host more slaves -> higher total storage. 
Sharding the namespace achieves all three goals.
What About Federation? 
1. HDFS Federation allows independent NameNodes to share a 
common pool of DataNodes. 
2. In Federation, a user sees NameNodes as volumes, or as isolated 
file systems. 
Federation is a static approach to Namespace partitioning. 
We call it static because sub-trees are statically assigned to disjoint 
volumes. 
Relocating sub-trees to a new volume requires copying between file 
systems. 
A dynamic Namespace partitioning could move sub-trees 
automatically based on utilization or load-balancing requirements. 
In some cases, sub-trees could be relocated without copying data 
blocks.
VS
Giraffa Requirements 
Availability – the primary goal 
- Region splitting leads to load balancing of metadata traffic. 
- Same data streaming speed to / from DataNodes. 
- No SPOF. Continuous availability. 
Scalability 
- Each RegionServer stores a part of the namespace. 
Cluster operability 
- The cost of running a larger cluster is the same as a smaller one. 
- But, running multiple clusters is more expensive.
The Big Picture 
1. Use HBase to store HDFS Namespace metadata. 
2. DataNodes continue to store HDFS blocks. 
3. Introduce coprocessors to act as a communication layer between 
HBase, HDFS, and the file system. 
4. Store files and directories as rows in HBase. 
A Giraffa “shard” consists of: 
HBase RegionServer 
HDFS NameNode – to be replaced with Giraffa BlockManager. 
HDFS DataNode(s) 
*HBase Master 
*ZooKeeper(s) 
* == Not required per shard, but necessary within the network.
Giraffa File System 
• fs.defaultFS = grfa:/// 
• fs.grfa.impl = org.apache.giraffa.GiraffaFileSystem 
• Namespace is cached in RegionServer RAM. 
• Regions lead to dynamic Namespace partitioning. 
• Block management is handled by a specialized RegionObserver 
coprocessor that communicates with DataNodes -> performs 
block allocation, replication, deletion, heartbeats, and block 
reports. 
• Namespace manipulation is handled by a specialized coprocessor -> 
performs all NameNode RPC Server calls.
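The two properties at the top of this slide would normally live in a Hadoop client's core-site.xml. A minimal sketch that renders them in the standard Hadoop configuration layout; the helper name is hypothetical, and any other settings a real Giraffa deployment needs are not covered here.

```python
import xml.etree.ElementTree as ET

# The two properties named on the slide.
GIRAFFA_PROPS = {
    "fs.defaultFS": "grfa:///",
    "fs.grfa.impl": "org.apache.giraffa.GiraffaFileSystem",
}


def render_core_site(props: dict) -> str:
    """Render properties as a Hadoop <configuration> document (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


print(render_core_site(GIRAFFA_PROPS))
```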
NamespaceAgent 
Quick run-through of this class: 
1. Implements ClientProtocol. Not a coprocessor. 
2. Replaces NameNode RPC channel for GiraffaClient 
(which extends DFSClient and is the client used by 
GiraffaFileSystem class). 
3. Has an HBaseClient member that communicates RPC 
requests to the NamespaceProcessor coprocessor of a 
RegionServer.
Namespace Table 
Single HBase table called “Namespace” stores: 
1. A RowKey: the bytes that identify the row and therefore 
the file / directory. 
2. File attributes: name, owner, group, permissions, access-time, 
modification-time, block size, replication, length. 
3. List of blocks for the file. 
4. List of block locations. 
5. State of the file: under construction, closed.
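One way to picture a row of the Namespace table is as a record with the five parts listed above. This is an illustrative model only: the class and field names are invented, the real column layout is not shown on the slide, and nesting locations under each block is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class FileState(Enum):
    UNDER_CONSTRUCTION = "under construction"
    CLOSED = "closed"


@dataclass
class BlockInfo:
    block_id: int
    # DataNode addresses holding a replica (assumed to be kept per block).
    locations: List[str] = field(default_factory=list)


@dataclass
class NamespaceRow:
    row_key: bytes              # identifies the row, and so the file / directory
    name: str
    owner: str
    group: str
    permissions: int
    access_time: int
    modification_time: int
    block_size: int
    replication: int
    length: int
    blocks: List[BlockInfo] = field(default_factory=list)
    state: FileState = FileState.CLOSED


row = NamespaceRow(
    row_key=b"/user/alice/report.txt", name="report.txt",
    owner="alice", group="staff", permissions=0o644,
    access_time=0, modification_time=0,
    block_size=128 * 1024 * 1024, replication=3, length=0,
    state=FileState.UNDER_CONSTRUCTION,
)
row.blocks.append(BlockInfo(block_id=1, locations=["dn1:50010", "dn2:50010"]))
print(row.name, row.state.value, len(row.blocks))
```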
Row Keys 
• Files and directories are stored as rows in HBase. 
• The key bytes of a row determine its sorting in the Namespace 
table. 
• Different RowKey definitions change locality of files and 
directories within the HBase region. 
• FullPathRowKey is the default implementation. The key bytes 
of the row are the full source path to the file or directory. 
-- Problem: Renames may cause row to move to another Region. 
• Another idea is NumberedRowKey. The key bytes are an 
assigned number. 
-- Problem: You lose locality within HBase Namespace table.
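The two problems above can be seen in a toy model. The sketch treats a region as a contiguous range of byte-sorted keys; the split keys, paths, and helper names are invented, and the numbering scheme (creation order) is only one possible reading of NumberedRowKey.

```python
import bisect

# Hypothetical split points: region 0 holds keys below b"/logs",
# region 1 holds [b"/logs", b"/user"), region 2 holds the rest.
SPLIT_KEYS = [b"/logs", b"/user"]


def full_path_key(path: str) -> bytes:
    """Row key made of the full path bytes, as FullPathRowKey is described."""
    return path.encode("utf-8")


def region_for(key: bytes) -> int:
    """Index of the region whose key range contains the given row key."""
    return bisect.bisect_right(SPLIT_KEYS, key)


# Problem 1: a rename changes the key, so the row may land in another region.
before = full_path_key("/user/alice/report.txt")
after = full_path_key("/archive/report.txt")
print(region_for(before), region_for(after))  # prints "2 0"

# Problem 2: keys numbered in creation order ignore the directory tree,
# so files from the same directory are no longer adjacent in the table.
creation_order = ["/user/alice/a.txt", "/logs/app.log", "/user/alice/b.txt"]
numbered = {i.to_bytes(8, "big"): path for i, path in enumerate(creation_order)}
print([numbered[key] for key in sorted(numbered)])
```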
Locality of Reference 
• The traditional tree-structured namespace is flattened into 
a linear array. 
• Ordered list of files is self-partitioned into regions. 
• RowKey implementations define sorting of files and 
directories in the table. 
• Files in the same directory will belong to the same region 
(most of the time). 
-- This leads to an efficient “ls” implementation by purely 
scanning across a Region.
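A sketch of "ls" as a scan over sorted keys, assuming plain full-path keys held in one sorted list. It is not Giraffa's scanner: the function name is invented, and with plain full-path keys deeper descendants sit inside the scanned range and must be filtered out.

```python
import bisect
from typing import List


def list_directory(sorted_keys: List[str], directory: str) -> List[str]:
    """Return the direct children of `directory` by scanning a key range."""
    prefix = directory.rstrip("/") + "/"
    start = bisect.bisect_left(sorted_keys, prefix)  # first key >= prefix
    children = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break                      # left the directory's contiguous range
        if "/" not in key[len(prefix):]:
            children.append(key)       # direct child; deeper paths are skipped
    return children


keys = sorted(["/a", "/a/x", "/a/x/deep", "/a/y", "/b", "/b/z"])
print(list_directory(keys, "/a"))
print(list_directory(keys, "/"))
```

Because the children of one directory are contiguous in sort order, the scan touches one range and usually one region.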
Giraffa Today 
The current team has done a lot of work. The most recent 
additions are: 
• Introduction of custom Giraffa WebUI. 
• Atomic in-place rename, non-atomic moves, and non-atomic 
move failure recovery. 
• Serializing Exceptions over RPC. 
• Support for YARN. 
• (Coming soon) Introduction of Lease management.
Neat Futures 
• Full Hadoop compatibility / HDFS replacement. We are 96% 
compliant with the hadoop/hdfs shell today, as shown by passing 
the bulk of TestHDFSCLI. The dfsadmin commands are still missing. 
• Since file system metadata lives in the same pool as 
regular data, it is possible to deploy analytics and obtain 
detailed analysis of your own file system. 
• Snapshot implementation becomes a matter of increasing the 
number of versions of a row allowed in HBase. 
• Implementing extended attributes just means adding a new 
column to the file row.
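The snapshot idea above relies on HBase keeping several timestamped versions of a cell. The toy model below shows how retaining more versions lets an older state of a file row be read back; it is plain Python, not the HBase API, and the class and method names are invented.

```python
from typing import List, Optional, Tuple


class VersionedRow:
    """Keeps the newest `max_versions` timestamped copies of a row's attributes."""

    def __init__(self, max_versions: int) -> None:
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self._versions: List[Tuple[int, dict]] = []  # ascending by timestamp

    def put(self, timestamp: int, attrs: dict) -> None:
        self._versions.append((timestamp, dict(attrs)))
        self._versions.sort(key=lambda version: version[0])
        del self._versions[:-self.max_versions]  # drop versions beyond the limit

    def get(self, as_of: Optional[int] = None) -> Optional[dict]:
        """Newest version, or the newest one at or before `as_of`."""
        for timestamp, attrs in reversed(self._versions):
            if as_of is None or timestamp <= as_of:
                return attrs
        return None


row = VersionedRow(max_versions=3)
row.put(1, {"length": 10})
row.put(2, {"length": 20})
print(row.get(), row.get(as_of=1))
```

With `max_versions=1` the earlier state is gone after the second write, which is why a snapshot needs the limit raised.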
History 
2009 – Study on scalability limits. 
2010 – Konstantin Shvachko works on design with Michael Stack; 
presentation at HDFS contributors meeting. 
2011 – Plamen Jeliazkov implements first POC. 
2012 – Presented at Hadoop Summit. Open sourced as an Apache 
Extras project. 
2013 – Milan Desai and Konstantin Pelykh added as committers. 
Konstantin Boudnik as a contributor. 
2014 – Giraffa Scalability tested – ~46,300 mkdirs / second with 64 
RegionServer nodes and 64 client nodes.
Questions?
DEMO TIME! 
LINKS TO PROJECT WEBSITE BELOW 
http://apache-extras.org/p/giraffa/ 
https://code.google.com/a/apache-extras.org/p/giraffa/

Giraffa - November 2014
