Hadoop Storage in the Cloud Native Era


Hadoop was born long before the Cloud Native era, but the question remains the same: what can it offer in the time of Kubernetes, containerization, and hybrid clouds? Apache Hadoop Ozone is a new subproject of Hadoop. It has a generic low-level binary layer, the Hadoop Distributed Data Storage (HDDS), and an S3-compatible object store implementation on top of it. The HDDS layer is not just for the object store, however. It could be used for multiple purposes: to enhance the scalability of HDFS, or to provide block-level access to the managed storage space. With this approach, the same Hadoop Ozone cluster could provide Hadoop file system based storage, object store space, and block-level storage.

Storage is still a hot topic with Kubernetes and in Cloud Native environments. The Container Storage Interface (CSI) specification is a vendor-neutral standard for providing storage plugins to multiple container orchestration systems. Quadra provides block-level access on top of the Hadoop Distributed Data Storage layer, and it is a first-class citizen of the containerized world: it implements the Container Storage Interface and can work as a Kubernetes dynamic volume provisioner. In this talk we will demonstrate how Hadoop Ozone storage can be used from containers. We will explain the basic storage types of Kubernetes clusters and show how Hadoop Ozone and Quadra could help solve the storage problem in an industry-standard way.


© Cloudera, Inc. All rights reserved.
HADOOP STORAGE IN THE CLOUD NATIVE ERA
Nandakumar Vadivelu
nanda@apache.org

Hadoop Storage
Container Orchestrator
Application (Container)
Application (Container)
Application (Container)
Application (Container)
Application (Container)

HDDS
Ozone
Quadra
CSI

HADOOP DISTRIBUTED DATA STORE

Namespace
+
Blockspace
HDFS - NameNode

Namespace
File -> B1, B2, B3
Block Management Layer
B1 -> Dn1, Dn2, Dn3
Namenode
Layering
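The layering on this slide can be sketched as two independent lookup tables. This is a minimal illustration, not HDFS code: the class names `Namespace` and `BlockManager`, the helper `resolve`, the path `/data/file`, and the replica placements for B2 and B3 are all hypothetical; only the `File -> B1, B2, B3` and `B1 -> Dn1, Dn2, Dn3` mappings come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Namespace:
    """Namespace layer: file path -> ordered list of block IDs."""
    files: Dict[str, List[str]] = field(default_factory=dict)

    def blocks_of(self, path: str) -> List[str]:
        return self.files[path]


@dataclass
class BlockManager:
    """Block management layer: block ID -> datanodes holding a replica."""
    locations: Dict[str, List[str]] = field(default_factory=dict)

    def locate(self, block_id: str) -> List[str]:
        return self.locations[block_id]


def resolve(namespace: Namespace, blocks: BlockManager,
            path: str) -> List[Tuple[str, List[str]]]:
    """Resolve a file to (block, datanodes) pairs by consulting both layers."""
    return [(b, blocks.locate(b)) for b in namespace.blocks_of(path)]


namespace = Namespace({"/data/file": ["B1", "B2", "B3"]})
block_manager = BlockManager({
    "B1": ["Dn1", "Dn2", "Dn3"],
    "B2": ["Dn2", "Dn3", "Dn4"],  # illustrative placement
    "B3": ["Dn1", "Dn3", "Dn4"],  # illustrative placement
})
layout = resolve(namespace, block_manager, "/data/file")
```

Because the namespace only ever hands out block IDs, the block layer could in principle be swapped for a separate service without touching the file-to-block table, which is the separation the following slides build on.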

Namespace
HDDS
Block Storage

Blocks, Metadata
STORAGE CONTAINER

ARCHITECTURE
Storage Container Manager
Datanode Datanode Datanode

Container Protocol
● Create Container
● Get Container
● List Container
● Close Container
● Delete Container
Container Operations
● Read Block
● Write Block
● Delete Block
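The two groups of operations listed above can be modeled with a small in-memory sketch. This is not the HDDS implementation or its wire protocol: the class and method names are hypothetical, data lives in plain dictionaries, and the rule that a closed container rejects further writes is an assumption made for illustration.

```python
class StorageContainer:
    """A storage container: a set of blocks plus an open/closed state."""

    def __init__(self, container_id):
        self.container_id = container_id
        self.is_open = True
        self.blocks = {}  # local block ID -> bytes

    def write_block(self, local_id, data):
        # Assumption: a closed container no longer accepts writes.
        if not self.is_open:
            raise RuntimeError(f"container {self.container_id} is closed")
        self.blocks[local_id] = data

    def read_block(self, local_id):
        return self.blocks[local_id]

    def delete_block(self, local_id):
        del self.blocks[local_id]


class ContainerService:
    """Container-level operations from the slide, backed by a dict."""

    def __init__(self):
        self.containers = {}

    def create_container(self, container_id):
        if container_id in self.containers:
            raise ValueError(f"container {container_id} already exists")
        container = StorageContainer(container_id)
        self.containers[container_id] = container
        return container

    def get_container(self, container_id):
        return self.containers[container_id]

    def list_containers(self):
        return sorted(self.containers)

    def close_container(self, container_id):
        self.containers[container_id].is_open = False

    def delete_container(self, container_id):
        del self.containers[container_id]


service = ContainerService()
container = service.create_container(1)
container.write_block(7, b"hello")
payload = container.read_block(7)
service.close_container(1)
```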

OZONE

Ozone Manager
Key 1 -> List <block Id>
Key 2 -> List <block Id>
Key 3 -> List <block Id>
Key 4 -> List <block Id>
Key 5 -> List <block Id>
…
Key n -> List <block Id>
Block Id -> [Container Id + Local Id]
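The key table above can be pictured as a dictionary from key name to a list of block IDs, where each block ID is a (container ID, local ID) pair. The sketch below is illustrative only: the names `BlockId`, `key_table`, and `containers_for`, and the sample keys and numbers, are hypothetical and not taken from the Ozone code base.

```python
from typing import Dict, List, NamedTuple


class BlockId(NamedTuple):
    """Block Id -> [Container Id + Local Id], as on the slide."""
    container_id: int
    local_id: int


# Hypothetical contents of the Ozone Manager key table.
key_table: Dict[str, List[BlockId]] = {
    "key1": [BlockId(1, 100), BlockId(1, 101)],
    "key2": [BlockId(2, 100)],
}


def containers_for(key: str) -> List[int]:
    """Distinct container IDs that hold a key's blocks, in first-seen order."""
    seen: List[int] = []
    for block in key_table[key]:
        if block.container_id not in seen:
            seen.append(block.container_id)
    return seen
```

Resolving a key therefore yields container IDs rather than datanode addresses; finding where a container lives is left to the layer below.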

Storage
Container Manager
Datanode Datanode Datanode
HDDS
Ozone Manager
ARCHITECTURE

INTERFACES
Object Store API
(RPC)
OzoneFileSystem (HCFS)
Connector
S3 Connector

NAMENODE’
HDFS-10419

Storage
Container Manager
Datanode Datanode Datanode
HDDS
NameNode’
ARCHITECTURE

QUADRA
HDFS-11118

QUADRA
• LUN like Raw-Block Storage
• Backed by HDDS
• Mountable disk FS volume
• Volume: A raw-block device that can be used to create mountable disk
• Can create filesystems like ext4 or XFS on the volumes
• POSIX semantics

Storage
Container Manager
Datanode Datanode Datanode
HDDS
Quadra Manager
iSCSI Server
ARCHITECTURE

USAGE
• Create a Volume
  • quadra -c foo datavolume 4TB
• Mount the volume
  • iscsiadm -m node -o new -T foo:datavolume -p localhost:3260
• Format the Volume
  • mkfs.ext4 -b 4096 /dev/sdb
• Mount the filesystem
  • mkdir datavol; mount /dev/sdb datavol
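The four steps above can be captured as an ordered command plan. The sketch below only builds argument lists and never executes anything; the function name `quadra_usage_plan` and its parameter names are hypothetical, and the default values are copied from the slide. The `quadra` CLI is shown exactly as it appears on the slide, with no claim about which other options it accepts.

```python
import shlex


def quadra_usage_plan(prefix="foo", volume="datavolume", size="4TB",
                      portal="localhost:3260", device="/dev/sdb",
                      mount_point="datavol"):
    """Return the slide's commands, in order, as argv lists (dry run only)."""
    commands = [
        # 1. Create a volume.
        f"quadra -c {prefix} {volume} {size}",
        # 2. Register the iSCSI target so the volume shows up as a block device.
        f"iscsiadm -m node -o new -T {prefix}:{volume} -p {portal}",
        # 3. Format the volume.
        f"mkfs.ext4 -b 4096 {device}",
        # 4. Mount the filesystem.
        f"mkdir {mount_point}",
        f"mount {device} {mount_point}",
    ]
    return [shlex.split(command) for command in commands]


plan = quadra_usage_plan()
for argv in plan:
    print(" ".join(argv))
```

Keeping the plan as data makes the ordering explicit: the device name passed to `mkfs.ext4` and `mount` only exists once the iSCSI step has attached the volume, so it would normally be discovered rather than hard-coded.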

Storage Container Manager
HDDS
Datanode Datanode Datanode
Quadra Volume Manager
Quadra Plugin
JSCSI
Kernel
User
SCSI Initiator
Volume API
Data Path
HOST

Storage
Container Manager
Datanode Datanode Datanode
HDDS
Quadra Manager Ozone Manager NameNode’
Block Store Object Store File Store
HADOOP STORAGE ECOSYSTEM

CONTAINER STORAGE INTERFACE

Pluggable Storage Interface
Pluggable Storage Interface
Pluggable Storage Interface
Storage Provider has to write a plugin for each container orchestrator
WHY?

CONTAINER STORAGE INTERFACE
• Specification
• Interoperable
• Vendor neutral
• Control plane only

Pluggable Storage Interface
Storage Provider
CSI

PLUGINS
Control Plugin
Node Plugin
Container Orchestrator
Storage Provider

Control Plugin
• It can run anywhere
• Handles storage volume creation and deletion
Node Plugin
• Runs on all the nodes
• Handles storage volume mounting and unmounting
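The division of labor between the two plugins can be sketched with two small classes. This is an in-memory illustration with hypothetical class and method names, not a CSI driver: a real plugin exposes gRPC services, where volume creation and deletion correspond to the Controller service RPCs `CreateVolume` and `DeleteVolume`, and mounting and unmounting to the Node service RPCs `NodePublishVolume` and `NodeUnpublishVolume`. The node name, mount path, and size are made up for the example.

```python
class ControlPlugin:
    """Can run anywhere; handles volume creation and deletion."""

    def __init__(self):
        self.volumes = {}  # volume name -> size in bytes

    def create_volume(self, name, size_bytes):
        self.volumes[name] = size_bytes
        return name

    def delete_volume(self, name):
        self.volumes.pop(name, None)


class NodePlugin:
    """Runs on every node; handles mounting and unmounting on that node."""

    def __init__(self, node_name):
        self.node_name = node_name
        self.mounts = {}  # target path -> volume name

    def mount(self, volume, target_path):
        self.mounts[target_path] = volume

    def unmount(self, target_path):
        self.mounts.pop(target_path, None)


# Lifecycle as a container orchestrator might drive it (illustrative values).
control = ControlPlugin()
node = NodePlugin("node-1")
volume = control.create_volume("datavolume", 4 * 1024**4)
node.mount(volume, "/mnt/datavol")
mounted = dict(node.mounts)  # snapshot while the volume is mounted
node.unmount("/mnt/datavol")
control.delete_volume(volume)
```

Only the node-side object needs to live on the machine where the application container runs, which is why the control side can be deployed once while the node side runs everywhere.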

CSI DRIVER FOR HADOOP STORAGE
HDDS-1382

Datanode Datanode Datanode
HDDS
Quadra Manager
iSCSI Server
Storage Container Manager
Hadoop Storage
Container Orchestrator
Application Application Application Application Application
CSI Driver

Control Plugin
Node Plugin
Hadoop CSI Driver
Storage Container Manager
HDDS
Datanode Datanode Datanode
Quadra Volume Manager
Quadra Plugin
JSCSI
Volume API
Data Path

DEMO

CURRENT STATUS
● Apache Hadoop Ozone 0.4.0-alpha – Released on May 7
● Implementing Namenode on top of HDDS (HDFS-10419) – Design Discussion
● Quadra (HDFS-11118) – Design Discussion (POC)
● CSI Server for Ozone (HDDS-1382) – In development

Q & A

THANK YOU
