1 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone: scaling HDFS to trillions of objects
Xiaoyu Yao & Anu Engineer
Apache Committers & PMC members
HDFS
• Apache HDFS is designed for large objects – not for many small objects.
• Small files create memory pressure on the Namenode.
• Keeping metadata in memory is the strength of the original GFS and HDFS design, but also the weakness in scaling to large numbers of files and blocks.
• Each file adds metadata to the Namenode.
• Datanodes send all block information to the Namenode in BlockReports.
• Both of these create scalability issues on the Namenode.
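The memory pressure above can be made concrete with a back-of-envelope calculation. The ~150 bytes of Namenode heap per namespace object is a commonly cited rule of thumb, not an exact measurement; the numbers are illustrative only.

```python
# Back-of-envelope: Namenode heap consumed by namespace metadata.
# The ~150 bytes/object figure is a rough rule of thumb, not a measurement.

BYTES_PER_OBJECT = 150  # approximate heap cost per file or block object

def namenode_heap_gb(num_files: int, blocks_per_file: float) -> float:
    """Estimate Namenode heap (GiB) for a namespace of this shape."""
    objects = num_files * (1 + blocks_per_file)  # file objects + block objects
    return objects * BYTES_PER_OBJECT / (1024 ** 3)

# 100M large files (several blocks each) vs 1B tiny files (one block each):
large = namenode_heap_gb(100_000_000, 3.0)    # roughly 56 GiB
small = namenode_heap_gb(1_000_000_000, 1.0)  # roughly 280 GiB
```

Ten times as many files, each a tenth the size, still triples the heap: object count, not data volume, is what the Namenode pays for.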
Ozone and HDFS - past
• HDFS-5477 – Separate Block Management.
• HDFS-8286 – Partial Namespace In Memory.
• HDFS-1052 – Federation aims to address namespace and block-space scalability issues.
• Federation deployment and planning adds complexity.
• Requires changes to other components in the Hadoop stack.
• HDFS-10467 – Router-based federation.
• HDFS-7240 – Ozone borrows many ideas learned from these efforts and is a superset of these approaches.
Ozone and HDFS - now
• HDFS-7240 merged into Apache Hadoop trunk.
• New sub-project named HDDS (Hadoop Distributed Data Store).
• Loadable module running in or out of process with the HDFS datanode.
• HDDS is released independently of Hadoop releases.
• A Maven profile (-Phdds) disables the HDDS build by default.
• Thanks to all the contributors from the community that made Ozone real!
Ozone Overview
• Ozone is an S3-like object store, which exposes REST APIs and Hadoop RPC interfaces.
• OzoneFS is a Hadoop-compatible file system.
• Applications like Hive, Spark, YARN and MapReduce run natively on OzoneFS without any modifications.
Ozone and Block layer
• The most basic abstraction in Ozone is a block layer called the Hadoop Distributed Data Store, or HDDS.
• This talk is really about HDDS: how it works and how it will be used by other namespace services.
• For example, Ozone, HDFS and Quadra (CBlock) are simple namespace services that can be built using HDDS.
• HDDS is a distributed, highly available block storage framework. It is designed to be used by a system that provides its own namespace management.
• HDDS consists of a block manager called the Storage Container Manager (SCM) and a set of nodes that act as the data stores. The following architectural diagram shows the components that make up HDDS.
Hadoop Distributed Data Store Architecture
[Diagram: a Storage Container Manager coordinating three Datanodes, each replicating via Apache Ratis.]
Hadoop Distributed Data Store Services
HDDS: Cluster Identity | Blocks | Replication | Security
Booting a Cluster
Booting up a Cluster
• HDDS acts as the central identity manager for the cluster.
• HDDS has a complete certificate infrastructure built in.
• The identity management service of HDDS provides the following identities:
• name service identities.
• datanode identities.
• When SCM boots up, it creates its own Certificate Authority.
• Yes, you can provision it to be an intermediate authority – that is, part of an existing CA infrastructure.
• When datanodes join the cluster, SCM verifies the datanode identity and issues a datanode certificate.
• Other namespace services like Ozone Manager, NameNode’ or the Quadra server will obtain service identity certificates from SCM.
11 © Hortonworks Inc. 2011–2018. All rights reserved
2. Generate Self-Signed
Certificates
1. Generate Cluster ID &
SCM ID
3. Wait for Data nodes
to register
Booting up a Cluster – Initializing SCM
SCM
Booting up a Cluster – Issuing certificates to Datanodes
1. Datanodes make a certificate request to SCM.
2. SCM issues a certificate after verifying the identity.
3. Datanodes are functional.
• Kerberos is not required on datanodes for a secure cluster.
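The issuance flow above can be modelled as a toy: SCM acts as the cluster CA and hands a "certificate" to any datanode whose identity it can verify. Real HDDS issues X.509 certificates; the HMAC signature and the proof-string identity check here are only stand-ins, and all names are illustrative.

```python
import hmac
import hashlib

# Toy SCM-as-CA. Real HDDS uses X.509; HMAC is a stand-in signature here.
class ToySCM:
    def __init__(self, cluster_id: str):
        self.cluster_id = cluster_id
        self.ca_key = b"scm-private-key"  # self-signed root in this sketch
        self.registered = {}              # datanode id -> issued certificate

    def issue_certificate(self, dn_id: str, proof: str) -> dict:
        # Placeholder identity check; real SCM verifies a CSR / identity.
        if proof != f"proof-for-{dn_id}":
            raise PermissionError("identity verification failed")
        sig = hmac.new(self.ca_key, dn_id.encode(), hashlib.sha256).hexdigest()
        cert = {"subject": dn_id, "issuer": self.cluster_id, "sig": sig}
        self.registered[dn_id] = cert     # datanode is now functional
        return cert

scm = ToySCM("cluster-1")
cert = scm.issue_certificate("dn-01", "proof-for-dn-01")
```

A datanode with a bad proof gets no certificate and never becomes functional, which is the point of making SCM the identity gatekeeper.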
Booting up a Cluster – Registering a name service
• Admins add a name service (e.g. Ozone Manager, NameNode’) to the HDDS cluster.
• Name services also get identity certificates from SCM.
[Diagram: Ozone Manager and NameNode’ connecting to the Storage Container Manager, which manages Datanodes replicating via Apache Ratis.]
Writing Data
Writing Data – Understanding Storage Containers
• The HDDS datanode is a plug-in service running inside Datanodes.
• A container is the basic unit of replication (2–16 GB).
• Block metadata is fully distributed, kept in an LSM-based key-value store per container.
• There is no centralized block map in memory, unlike HDFS.
Writing Data – Understanding Open/Closed Containers
• Open container: keys and their blocks are mutable.
• Writes go via Apache Ratis.
• Closed container: keys and their blocks are immutable.
• Writes are not allowed.
• A container is closed when it is close to full or on failure.
Writing Data – Understanding Pipelines
• A pipeline is a replicated stream.
• HDDS is designed to have different kinds of streams, which allows support for different replication strategies.
• For example, if we are using HDFS-style replication, a pipeline would look like this: DN 1 → DN 2 → DN 3.
Writing Data – Understanding Pipelines
• HDDS writes to containers using pipelines.
• HDDS uses an Apache Ratis based pipeline, which still appears as a stream.
• Apache Ratis is a highly customizable Raft consensus protocol library.
• It supports high-throughput data replication use cases like Ozone.
• For example, if we are using HDDS with Apache Ratis based replication, a pipeline would look like this: a leader DN replicating to DN 1 and DN 2.
Writing Data – Blocks
1. The client makes a request to Ozone Manager to write a key.
2. Ozone Manager requests SCM to allocate a block.
3. SCM allocates a block.
Writing Data – SCM Allocating a Container/Pipeline for Blocks
• Container found? Yes → return it.
• No → pipeline found?
• Yes → create a container on that pipeline.
• No → create a pipeline and a container.
• Return.
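The flowchart above reads naturally as code: SCM first reuses an existing open container; failing that, it reuses an open pipeline or creates a new one, then creates a container on it. Function and variable names here are illustrative, not the real SCM API.

```python
# The SCM allocation flowchart as a function. Names are illustrative.

def allocate_container(open_containers, open_pipelines,
                       create_pipeline, create_container):
    if open_containers:                # container found? -> return it
        return open_containers[0]
    if open_pipelines:                 # pipeline found? -> reuse it
        pipeline = open_pipelines[0]
    else:                              # neither -> create a pipeline first
        pipeline = create_pipeline()
    return create_container(pipeline)  # then a container on that pipeline

# Toy usage: cold cluster, no containers or pipelines yet.
pipelines = []

def new_pipeline():
    pipelines.append("pipe-1")
    return "pipe-1"

def new_container(pipeline):
    return f"container-on-{pipeline}"

result = allocate_container([], [], new_pipeline, new_container)
```

On a warm cluster the first branch almost always hits, so block allocation is a cheap lookup rather than a pipeline-creation round trip.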
Writing Data – SCM records the Container state
Storage Container Manager persists the container states to an LSM-based key-value store, mapping Container ID (CID) → Container Info.
Writing Data
SCM returns a Block ID and a Pipeline to Ozone Manager.
• Block ID = { Container ID, Local ID }
• Pipeline = { DN 1, DN 2, DN 3 }
• The client needs the Block ID and the Pipeline to write data.
• This is similar to an HDFS block and its datanode locations.
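The two identifiers SCM hands back can be written down as sketched data types: a Block ID is the pair (container ID, local ID), and a Pipeline is the list of datanodes to write to. Field names are illustrative, not the real protobuf schema.

```python
from dataclasses import dataclass
from typing import List

# Sketched shapes of what SCM returns; field names are illustrative.

@dataclass(frozen=True)
class BlockID:
    container_id: int   # which container holds the block
    local_id: int       # the block's id within that container

@dataclass
class Pipeline:
    nodes: List[str]    # datanodes to write to, e.g. ["dn1", "dn2", "dn3"]

blk = BlockID(container_id=42, local_id=7)
pipe = Pipeline(nodes=["dn1", "dn2", "dn3"])
```

Making BlockID frozen (hashable) mirrors its role as a stable key: once issued, the (container ID, local ID) pair identifies the block for its whole lifetime.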
Block Commit Protocol
1. PutKey: the client asks Ozone Manager to create the key; Ozone Manager allocates a block from SCM.
2. The client writes the data to the datanode as chunks and updates the block's metadata (Block → { List of Chunks }).
3. CommitKey: the client commits the key back to Ozone Manager.
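The three steps above can be walked through with in-memory stand-ins for Ozone Manager, SCM and a datanode. Class and method names are illustrative, not the real Ozone APIs; the point is the ordering: a key only becomes visible in OM's key table after the commit.

```python
# Toy walkthrough of the block commit protocol. Names are illustrative.

class ToySCM:
    def __init__(self):
        self.next_id = 0

    def allocate_block(self):
        self.next_id += 1
        return ("container-1", self.next_id)  # (container id, local id)

class ToyOM:
    def __init__(self, scm):
        self.scm = scm
        self.open_keys = {}  # uncommitted keys -> allocated block
        self.keys = {}       # committed keys -> (block, length)

    def put_key(self, key):              # step 1: PutKey -> allocate block
        block = self.scm.allocate_block()
        self.open_keys[key] = block
        return block

    def commit_key(self, key, length):   # step 3: CommitKey -> durable
        self.keys[key] = (self.open_keys.pop(key), length)

class ToyDN:
    def __init__(self):
        self.blocks = {}

    def write_chunks(self, block, chunks):  # step 2: write data as chunks
        self.blocks[block] = list(chunks)

om, dn = ToyOM(ToySCM()), ToyDN()
block = om.put_key("vol/bucket/key1")
dn.write_chunks(block, ["chunk_0", "chunk_1"])
om.commit_key("vol/bucket/key1", length=2 * 4096)
```

Until commit_key runs, readers asking OM for the key see nothing, even though the chunks already sit on the datanode; that gap is exactly the "open key" state described on the next slide.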
Writing Data – Ozone Manager records the Key info
Ozone Manager saves the key info to an LSM-based key-value metadata store: Key → [ { Container ID, Local ID, Commit length }, … ].
Writing Data – Ozone Key States
• Open key: an uncommitted key.
• Its commit length is subject to change.
• Closed key: the commit length will not change.
• Expired open key: an open key is garbage-collected after a timeout.
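The expired-open-key rule is a simple timeout sweep, sketched below. The one-day expiry and the field names are assumptions for illustration; the real timeout is a configuration value.

```python
# Sketch of expired-open-key garbage collection. The 1-day timeout is an
# assumed value for this sketch, not the real Ozone default.

OPEN_KEY_TIMEOUT_S = 24 * 3600

def expired_open_keys(open_keys: dict, now: float) -> list:
    """open_keys maps key name -> creation timestamp (seconds).
    Returns the keys whose uncommitted write has timed out."""
    return [k for k, created in open_keys.items()
            if now - created > OPEN_KEY_TIMEOUT_S]

open_keys = {"k1": 0.0, "k2": 90_000.0}
stale = expired_open_keys(open_keys, now=100_000.0)  # only k1 has timed out
```

This is what keeps abandoned writers (a client that called PutKey, wrote some chunks, and crashed before CommitKey) from leaking blocks forever.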
Reading Data
Reading Data
1. The client makes a request to Ozone Manager to read a key.
2. Ozone Manager returns the latest key location info, i.e. the list of blocks.
3. The client connects to the datanodes and reads the blocks.
Deleting Data
Deleting Data – Open/Closed Containers
• Closed container: keys and their blocks are immutable.
• Keys and blocks are deleted via the SCM delete log.
• Each delete log entry has a unique ID.
• SCM sorts delete log entries by datanode and container.
• SCM sends delete log entries to datanodes via the heartbeat response.
• SCM clears a delete log entry only when its processing is marked done on all the datanodes.
• Open container: keys and their blocks are mutable.
• Keys and blocks are deleted via the SCM delete log.
• SCM holds off sending delete log entries to datanodes for open containers until the container is closed.
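The delete log described above can be sketched as a small bookkeeping structure: each entry gets a unique transaction ID, is shipped to datanodes piggybacked on heartbeat responses, and is cleared only once every replica has acknowledged it. All names are illustrative.

```python
# Sketch of the SCM delete log. Names are illustrative, not the real API.

class DeleteLog:
    def __init__(self):
        self.entries = {}   # txn id -> (container, block, pending datanodes)
        self.next_txn = 0

    def add(self, container, block, datanodes) -> int:
        self.next_txn += 1              # each entry gets a unique ID
        self.entries[self.next_txn] = (container, block, set(datanodes))
        return self.next_txn

    def for_datanode(self, dn):
        """Entries to piggyback on this datanode's heartbeat response."""
        return [t for t, (_, _, pending) in self.entries.items()
                if dn in pending]

    def ack(self, txn, dn) -> None:
        _, _, pending = self.entries[txn]
        pending.discard(dn)
        if not pending:                 # done on all replicas -> clear entry
            del self.entries[txn]

log = DeleteLog()
txn = log.add("container-1", "block-9", ["dn1", "dn2", "dn3"])
log.ack(txn, "dn1")
log.ack(txn, "dn2")
log.ack(txn, "dn3")                     # last ack clears the entry
```

Keeping the entry until the last replica acknowledges means a datanode that was down during the heartbeat still receives the deletion later, so no replica is left holding deleted blocks.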
Scaling Ozone
• Single Ozone Manager + Single SCM
• Multiple Ozone Managers + Single SCM
• Single Ozone Manager + Multiple SCMs
• Multiple Ozone Managers + Multiple SCMs
Ozone Security
Ozone Security
• Authentication
• Identity
• Kerberos (OM/SCM/Client)
• Certificates (SCM/OM/Datanodes)
• Tokens
• Delegation tokens
• Block tokens
Certificate-based Delegation Token
• OM token ↔ HDFS Namenode token
• For MR jobs to access OM

               Ozone                          HDFS
Issued by      OM                             Namenode
Validated by   OM certificate (public key)    Namenode
Signed by      HMAC-SHA256 (OM private key)   HMAC-SHA1 (shared key)
Protocol       Hadoop RPC / REST              Hadoop RPC
Certificate-based Block Token
• OM block token ↔ HDFS block token
• To authenticate block access on datanodes from Ozone clients.

                              Ozone                           HDFS
Issued by                     OM                              Namenode
Validated by                  HDDS Datanode (OM public key)   Datanode (shared key)
Signed by                     HMAC-SHA256 (OM private key)    HMAC-SHA1 (shared key)
Protocol                      gRPC                            Hadoop RPC
Secure block token transfer   DN public key                   Hadoop RPC encryption
Ozone HDDS in progress
• High Availability
• Apache Ratis-based OM/SCM metadata replication
• HDDS-151 branch
• Pluggable Container Storage I/O
• HDFS-48 branch
• Security
• Kerberos / Delegation Tokens / Audit
• HDDS-4 branch
Questions?
• Visit the Ozone website: http://ozone.hadoop.apache.org
• Visit the Ozone booth @ Ask Me Anything
• Ozone demo
• Q & A
Thank you
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 

Dernier (20)

Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 

Ozone: scaling HDFS to trillions of objects

Hadoop Distributed Data Store, or HDDS.
• This talk is really about HDDS: how it works and how it will be used by other namespace services.
• For example, Ozone, HDFS and Quadra (CBlock) are simple namespace services that can be built using HDDS.
• HDDS is a distributed, highly available block storage framework. It is designed to be used by a system that provides its own namespace management.
• HDDS consists of a block manager called Storage Container Manager (SCM) and a set of nodes that act as the data stores. The following architectural diagram shows the components that make up HDDS.
7 © Hortonworks Inc. 2011–2018. All rights reserved
Hadoop Distributed Data Store Architecture
[Diagram: Storage Container Manager coordinating three Datanodes, each running Apache Ratis]
8 © Hortonworks Inc. 2011–2018. All rights reserved
Hadoop Distributed Data Store Services
[Diagram: HDDS cluster services – Identity, Blocks, Replication, Security]
9 © Hortonworks Inc. 2011–2018. All rights reserved
Booting a Cluster
10 © Hortonworks Inc. 2011–2018. All rights reserved
Booting up a Cluster
• HDDS acts as the central identity manager for the cluster.
• HDDS has a complete certificate infrastructure built in.
• The identity management service of HDDS provides the following identities:
• name service identities.
• datanode identities.
• When SCM boots up, it creates its own Certificate Authority.
• Yes, you can provision it to be an intermediate authority – that is, be part of an existing CA infrastructure.
• When datanodes join the cluster, SCM verifies the datanode identity and issues a datanode certificate.
• Other namespace services like Ozone Manager, Namenode’ or Quadra Server will obtain service identity certificates from SCM.
11 © Hortonworks Inc. 2011–2018. All rights reserved
Booting up a Cluster – Initializing SCM
1. Generate Cluster ID & SCM ID
2. Generate self-signed certificates
3. Wait for datanodes to register
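The SCM initialization steps above can be sketched as follows. This is a minimal illustration, not the real implementation: `init_scm` and the certificate record are hypothetical stand-ins, and the actual SCM issues real X.509 certificates rather than the placeholder dict shown here.

```python
import uuid

def init_scm():
    # 1. Generate cluster-wide identifiers.
    cluster_id = str(uuid.uuid4())
    scm_id = str(uuid.uuid4())
    # 2. Generate a self-signed certificate for the cluster CA.
    #    Self-signed means subject and issuer are the same entity.
    subject = "scm-" + scm_id
    ca_cert = {"subject": subject, "issuer": subject, "self_signed": True}
    # 3. SCM would now wait for datanodes to register (not modeled here).
    return cluster_id, scm_id, ca_cert

cluster_id, scm_id, ca_cert = init_scm()
```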
12 © Hortonworks Inc. 2011–2018. All rights reserved
Booting up a Cluster – Issuing certificates to Datanodes
1. Datanodes make a certificate request
2. SCM issues a certificate after verifying the identity
3. Datanodes are functional
• Kerberos is not required on datanodes for a secure cluster.
13 © Hortonworks Inc. 2011–2018. All rights reserved
Booting up a Cluster – Registering a name service
[Diagram: Ozone Managers and NameNode’ instances registering with the Storage Container Manager and Ratis-backed Datanodes]
• Admins add a Name Service to the HDDS cluster
• Name Services also get identity certificates from SCM
14 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data
15 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Understanding Storage Containers
• The HDDS datanode is a plug-in service running in datanodes.
• A container is the basic unit of replication (2–16 GB).
• Fully distributed block metadata in an LSM-based key-value store.
• No centralized block map in memory like HDFS.
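The point about fully distributed block metadata can be sketched with a toy container: each container carries its own block map, so no central service needs it in memory. A plain dict stands in for the container-local LSM key-value store; the class and chunk-descriptor shape are illustrative assumptions, not HDDS's actual data model.

```python
class Container:
    """Toy container holding its own block metadata locally."""

    def __init__(self, container_id, capacity_bytes=5 * 2**30):
        self.container_id = container_id
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Local block ID -> list of chunk descriptors.
        # In real HDDS this lives in an on-disk LSM K-V store, not RAM.
        self.block_map = {}

    def put_block(self, local_id, chunks):
        # Block metadata is recorded with the container itself,
        # never in a centralized namenode-style block map.
        self.block_map[local_id] = chunks
        self.used_bytes += sum(c["len"] for c in chunks)

c = Container(container_id=1)
c.put_block(local_id=101,
            chunks=[{"name": "chunk_0", "offset": 0, "len": 4 * 2**20}])
```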
16 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Understanding Open/Closed Containers
• Closed container: keys and their blocks are immutable.
• Writes are not allowed.
• A container is closed when it is close to full or on failure.
• Open container: keys and their blocks are mutable.
• Writes go via Apache Ratis.
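The open/closed lifecycle above amounts to a small one-way state machine, sketched below. The 90% close watermark and the class names are assumptions for illustration; the real close trigger and thresholds are HDDS implementation details.

```python
class ClosedContainerError(Exception):
    """Raised when writing to an immutable (closed) container."""

class StatefulContainer:
    def __init__(self, capacity_bytes, close_watermark=0.9):
        self.capacity_bytes = capacity_bytes
        self.close_watermark = close_watermark
        self.used_bytes = 0
        self.state = "OPEN"

    def write(self, nbytes):
        if self.state == "CLOSED":
            # Closed containers never reopen; their contents are immutable.
            raise ClosedContainerError("closed containers are immutable")
        self.used_bytes += nbytes
        # Close once the container is close to full (failure-driven
        # closing is not modeled here).
        if self.used_bytes >= self.close_watermark * self.capacity_bytes:
            self.state = "CLOSED"

c = StatefulContainer(capacity_bytes=100)
c.write(50)   # still OPEN
c.write(45)   # crosses the watermark -> CLOSED
```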
17 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Understanding Pipelines
• A pipeline is a replicated stream.
• HDDS is designed to have different kinds of streams – that allows support for different replication strategies.
• For example, if we are using HDFS replication, a pipeline would look like this: DN 1 → DN 2 → DN 3
18 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Understanding Pipelines
• HDDS writes to containers using pipelines.
• HDDS uses an Apache Ratis based pipeline, which still appears as a stream.
• Apache Ratis is a highly customizable Raft consensus protocol library.
• It supports high-throughput data replication use cases like Ozone.
• For example, if we are using HDDS and Apache Ratis based replication, a pipeline would look like this: Leader DN ↔ DN 1, DN 2
19 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Blocks
1. Client makes a request to Ozone Manager to write a key.
2. Ozone Manager requests SCM to allocate a block.
3. SCM allocates a block.
20 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – SCM Allocating Container/Pipeline for Blocks
• Container found? Yes → return it.
• Container found? No → is a pipeline found?
• Pipeline found? Yes → create a container.
• Pipeline found? No → create a pipeline and a container.
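The allocation decision flow above can be sketched in a few lines. `ToySCM`, its method names, and the dict-shaped pipelines/containers are all illustrative assumptions; only the branching logic mirrors the slide.

```python
import itertools

class ToySCM:
    """Minimal stand-in for SCM's allocation state (illustrative only)."""

    def __init__(self):
        self.pipelines = []
        self.open_containers = []
        self._ids = itertools.count(1)

    def find_open_container(self):
        return self.open_containers[0] if self.open_containers else None

    def find_pipeline(self):
        return self.pipelines[0] if self.pipelines else None

    def create_pipeline(self):
        p = {"id": next(self._ids), "nodes": ["dn1", "dn2", "dn3"]}
        self.pipelines.append(p)
        return p

    def create_container(self, pipeline):
        c = {"id": next(self._ids), "pipeline": pipeline,
             "next_local": itertools.count(1)}
        self.open_containers.append(c)
        return c

def allocate_block(scm):
    # Container found? -> return a block from it.
    # Otherwise: pipeline found? -> create container (and pipeline if needed).
    container = scm.find_open_container()
    if container is None:
        pipeline = scm.find_pipeline() or scm.create_pipeline()
        container = scm.create_container(pipeline)
    return {"container_id": container["id"],
            "local_id": next(container["next_local"])}

scm = ToySCM()
first = allocate_block(scm)    # first call creates pipeline + container
second = allocate_block(scm)   # second call reuses the open container
```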
21 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – SCM records the Container state
• Storage Container Manager persists the container states (CID → Container Info) to an LSM-based key-value store.
22 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data
• SCM returns a Block ID and a Pipeline to Ozone Manager.
• A Block ID is a {Container ID, Local ID} pair; the Pipeline is the set of datanodes (DN 1, DN 2, DN 3).
• The client needs the Block ID and the Pipeline to write data.
• This is similar to an HDFS block and its datanode locations.
23 © Hortonworks Inc. 2011–2018. All rights reserved
Block Commit Protocol
1. Client sends PutKey to Ozone Manager; Ozone Manager allocates a block from SCM.
2. Client writes data as chunks to the Datanode and updates the metadata of a block (Block 001 → { List of Chunks }, Block 002 → { List of Chunks }).
3. Client sends CommitKey to Ozone Manager.
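The three-step write path can be sketched as below. Everything here is a simplification: the function names, the hard-coded block, and the dict-based state are hypothetical, and the SCM allocation and datanode I/O are collapsed into in-memory updates.

```python
# 1. put_key: OM opens the key and records a block (pretend SCM allocated it).
# 2. write_chunk: client streams chunks; block metadata grows as chunks land.
# 3. commit_key: OM freezes the commit length and makes the key visible.

def put_key(om, key):
    block = {"container_id": 1, "local_id": 7}   # stand-in for SCM's answer
    om["open_keys"][key] = {"block": block, "chunks": []}
    return block

def write_chunk(om, key, data):
    # Each chunk write also updates the block's metadata (chunk list).
    om["open_keys"][key]["chunks"].append(len(data))

def commit_key(om, key):
    info = om["open_keys"].pop(key)
    info["commit_length"] = sum(info["chunks"])  # length is now final
    om["keys"][key] = info
    return info["commit_length"]

om = {"open_keys": {}, "keys": {}}
put_key(om, "/vol/bucket/key1")
write_chunk(om, "/vol/bucket/key1", b"abcd")
write_chunk(om, "/vol/bucket/key1", b"efgh")
length = commit_key(om, "/vol/bucket/key1")
```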
24 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Ozone Manager records the Key info
• Ozone Manager saves the Key Info (Key → [{Container ID, Local ID}, …], Commit length) to an LSM-based key-value metadata store.
25 © Hortonworks Inc. 2011–2018. All rights reserved
Writing Data – Ozone Key State
• Open key: uncommitted key
• Commit length subject to change
• Closed key: commit length won’t change
• Expired open key: an open key is garbage collected after a timeout
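The key states above can be expressed as a small classifier. The one-hour timeout, the `committed`/`opened_at` field names, and the function itself are illustrative assumptions; only the open/closed/expired distinction comes from the slide.

```python
def key_state(key, now, timeout=3600):
    """Classify a key per the slide's state machine (timeout is made up)."""
    if key.get("committed"):
        return "CLOSED"    # commit length is frozen
    if now - key["opened_at"] > timeout:
        return "EXPIRED"   # stale open key, eligible for garbage collection
    return "OPEN"          # commit length may still change

k = {"opened_at": 0}
```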
26 © Hortonworks Inc. 2011–2018. All rights reserved
Reading Data
27 © Hortonworks Inc. 2011–2018. All rights reserved
Reading Data
1. Client makes a request to Ozone Manager to read a key.
2. Ozone Manager returns key location info, i.e. the list of blocks (GetLatestKeyLocations).
3. Client connects to the datanode and reads blocks.
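The read path can be sketched as below: OM answers with block locations, and the client then talks to the datanodes directly. The dict shapes and the `read_key` name are illustrative stand-ins for the real RPC calls.

```python
def read_key(om, datanodes, key):
    # Step 2 from the slide: OM returns the list of blocks for the key.
    blocks = om["keys"][key]
    # Step 3: the client reads each block from its datanode and
    # concatenates the data.
    data = b""
    for b in blocks:
        dn = datanodes[b["dn"]]
        data += dn[(b["container_id"], b["local_id"])]
    return data

om = {"keys": {"/vol/b/k": [{"dn": "dn1", "container_id": 1, "local_id": 7}]}}
datanodes = {"dn1": {(1, 7): b"hello"}}
```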
28 © Hortonworks Inc. 2011–2018. All rights reserved
Deleting Data
29 © Hortonworks Inc. 2011–2018. All rights reserved
Deleting Data – Open/Closed Containers
• Closed container: keys and their blocks are immutable.
• Keys/blocks are deleted via the SCM delete log.
• Each delete log entry has a unique ID.
• SCM sorts delete log entries by datanodes and containers.
• SCM sends delete log entries to datanodes via the heartbeat response.
• SCM clears a delete log entry only when its processing is marked done on all the datanodes.
• Open container: keys and their blocks are mutable.
• Keys/blocks are deleted via the SCM delete log.
• SCM holds off sending delete log entries to datanodes for open containers until the container is closed.
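The delete-log bookkeeping described above can be modeled roughly as follows. The class, its method names, and the acknowledgement scheme are all assumptions made for illustration; the slide only specifies unique entry IDs, holding entries for open containers, and clearing an entry once every datanode has processed it.

```python
import itertools

class DeleteLog:
    """Toy model of SCM's delete log (illustrative only)."""

    def __init__(self):
        self._ids = itertools.count(1)
        # txn_id -> entry; each entry has a unique ID.
        self.entries = {}

    def add(self, container_id, blocks):
        txn = next(self._ids)
        self.entries[txn] = {"container": container_id,
                             "blocks": blocks, "acked": set()}
        return txn

    def pending_for(self, datanode, container_is_closed):
        # Entries for open containers are held back until the
        # container closes; already-acked entries are not resent.
        return [t for t, e in self.entries.items()
                if container_is_closed(e["container"])
                and datanode not in e["acked"]]

    def ack(self, txn, datanode, replicas):
        e = self.entries[txn]
        e["acked"].add(datanode)
        # Clear the entry only once all replicas have processed it.
        if e["acked"] >= replicas:
            del self.entries[txn]

log = DeleteLog()
txn = log.add(container_id=1, blocks=["b1"])
```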
30 © Hortonworks Inc. 2011–2018. All rights reserved
Scaling Ozone
31 © Hortonworks Inc. 2011–2018. All rights reserved
Single Ozone Manager + Single SCM
32 © Hortonworks Inc. 2011–2018. All rights reserved
Multiple Ozone Managers + Single SCM
33 © Hortonworks Inc. 2011–2018. All rights reserved
Single Ozone Manager + Multiple SCMs
34 © Hortonworks Inc. 2011–2018. All rights reserved
Multiple Ozone Managers + Multiple SCMs
35 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone Security
36 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone Security
• Authentication
• Identity
• Kerberos (OM/SCM/Client)
• Certificate (SCM/OM/Datanodes)
• Token
• Delegation Token
• Block Token
37 © Hortonworks Inc. 2011–2018. All rights reserved
Certificate-based Delegation Token
• OM token <-> HDFS namenode token
• For MR jobs to access OM

                Ozone                          HDFS
  Issued by:    OM                             Namenode
  Validated by: OM Certificate (Public Key)    Namenode
  Signed by:    HMAC-SHA256 (OM Private Key)   HMAC-SHA1 (Shared Key)
  Protocol:     Hadoop RPC/REST                Hadoop RPC
38 © Hortonworks Inc. 2011–2018. All rights reserved
Certificate-based Block Token
• OM block token <-> HDFS block token
• To authenticate block access on datanodes from Ozone clients.

                               Ozone                            HDFS
  Issued by:                   OM                               Namenode
  Validated by:                HDDS Datanode, OM (Public Key)   Datanode (Shared Key)
  Signed by:                   HMAC-SHA256, OM (Private Key)    HMAC-SHA1 (Shared Key)
  Protocol:                    gRPC                             Hadoop RPC
  Secure Block Token Transfer: DN Public Key                    Hadoop RPC encryption
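The shared-key MAC scheme in the table can be illustrated with Python's standard `hmac` module. The token string below is made up for illustration (real tokens are serialized token identifiers), and this shows only the keyed-hash step, not the certificate-based validation path.

```python
import hashlib
import hmac

def sign_token(token_bytes, shared_key):
    # Keyed hash over the token identifier; the table lists HMAC-SHA1
    # for HDFS and HMAC-SHA256 on the Ozone side.
    return hmac.new(shared_key, token_bytes, hashlib.sha256).digest()

def verify_token(token_bytes, mac, shared_key):
    # compare_digest avoids timing side channels when checking MACs.
    return hmac.compare_digest(sign_token(token_bytes, shared_key), mac)

key = b"cluster-shared-secret"
tok = b"blockToken:container=1,local=7,user=alice"
mac = sign_token(tok, key)
```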
39 © Hortonworks Inc. 2011–2018. All rights reserved
Ozone HDDS in progress
• High Availability
• Apache Ratis-based OM/SCM metadata replication
• HDDS-151 branch
• Pluggable Container Storage I/O
• HDFS-48 branch
• Security
• Kerberos/Delegation Token/Audit
• HDDS-4 branch
40 © Hortonworks Inc. 2011–2018. All rights reserved
Questions?
• Visit the Ozone website: http://ozone.hadoop.apache.org
• Visit the Ozone Booth @ Ask Me Anything
• Ozone Demo
• Q & A
41 © Hortonworks Inc. 2011–2018. All rights reserved
Thank you

Editor's notes

  1. TALK TRACK Hortonworks Powers the Future of Data: data-in-motion, data-at-rest, and Modern Data Applications. [NEXT SLIDE]