1
HDFS Tiered Storage
Virajith Jalaparti, Chris Douglas
Ewan Higgs, Thomas Demoor
• Tiered Storage [issues.apache.org]
– HDFS-9806
– HDFS-12090
Microsoft – Western Digital – Apache Community
2
Virajith Jalaparti
Chris Douglas
…
Ewan Higgs
Kasper Janssens
Thomas Demoor
…
• Hadoop Compatible FS [1]: s3a://, wasb://, adl://, …
• Direct IO between Hadoop apps and Object Store
• Disaggregated compute & storage
• HDFS NameNode functions taken up by Object Store
Hadoop already plays nicely with Object Stores
3
[Figure: APP in the HADOOP CLUSTER reads from and writes to the REMOTE STORE directly]
[1]: https://s.apache.org/Hadoop3FSspec
• Pain points:
– Not really a FileSystem: rename, append, directories, ...
• Even with correct semantics, performance unlike HDFS
• HDFS features unavailable (e.g., hedged reads, snapshots)
– No locality
• Higher latency than attached storage
• Higher variance in both latency and throughput
– No HDFS integration
• Policies for users, permissions, quota, security, …
• Storage Plugins (e.g. Ranger, Sentry)
• External Storage Tier for HDFS
– HDFS Storage Policy: DISK, SSD, RAM, ARCHIVE, PROVIDED
• Share namespace, not only data!
– Keep 1-to-1 mapping: HDFS file ↔ external object
• No change to existing HDFS workflows
– Hadoop Apps interact with HDFS as before (fully transparent)
– Data Tiering happens async in background
– Native support for all HDFS features / admin tools
• Data Tiering controlled by admin
– On directory / file level
– Through Storage Policy (e.g. <HDD, HDD, HDD> → <PROVIDED>)
• HDFS NameNode scalability not a bottleneck
– HDFS manages the working set/compatibility
– Object store manages larger data lake, ingest, etc.
Goal: let HDFS play nicely with Object Stores
4
[Figure: APP in the HADOOP CLUSTER reads from and writes to HDFS; HDFS writes back to the REMOTE STORE and loads data from it on demand]
“Mount” remote storage in HDFS
5
• Use HDFS to manage remote storage
– HDFS blocks correspond to fixed range of bytes in remote
– AliasMap (DWS17: youtu.be/kpNDZNp-Nlw)
– HDFS coordinates reads/writes to remote store
– Mount remote store as a PROVIDED tier in HDFS
– Set StoragePolicy to move data into HDFS
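The block-to-range mapping above can be sketched as follows. This is a minimal illustration, not the Hadoop AliasMap API: the `RemoteRange` class, the `build_alias_map` function, and the example object path are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteRange:
    """Hypothetical alias-map entry: where one HDFS block lives in the remote store."""
    block_id: int
    remote_path: str
    offset: int
    length: int


def build_alias_map(remote_path, file_length, block_size, first_block_id=0):
    """Split one remote object into fixed byte ranges, one per HDFS block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    ranges = []
    offset = 0
    block_id = first_block_id
    while offset < file_length:
        # The last block may be shorter than block_size.
        length = min(block_size, file_length - offset)
        ranges.append(RemoteRange(block_id, remote_path, offset, length))
        offset += length
        block_id += 1
    return ranges


# Hypothetical object of 300 bytes mapped onto 128-byte blocks.
ranges = build_alias_map("s3a://example-bucket/d/part-0", file_length=300, block_size=128)
for r in ranges:
    print(r)
```

Each entry is enough for a DataNode to fetch exactly the bytes of one block from the remote object, which is why a 1-to-1 file-to-object mapping can be kept.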
[Figure: the Remote Namespace (/ with d, e, f) is mounted into the HDFS Namespace (/ with a, b) at mount point c. APP in the HADOOP CLUSTER reads and writes through HDFS, which writes through to the REMOTE STORE and loads on demand. Alias Map: HDFS Block -> Remote location]
PROVIDED storage on the READ path
6
[Figure labels: /foo/bar, /foo/baz, /foo/bazt, /foo/bazz; IaaS, (De)Hydration, Delegation]
PROVIDED storage on the READ path
7
[Figure labels: /foo/bar, /foo/baz, /foo/bazt, /foo/bazz; /foo with bar, baz, bazt, bazz; setrep=2; IaaS, (De)Hydration, Delegation]
PROVIDED storage on the READ path
8
[Figure labels: /foo/bar, /foo/baz, /foo/bazt, /foo/bazz; /foo with bar, baz, bazt, bazz; [Router-Based] Federation; /cloud; ?; IaaS, (De)Hydration, Delegation]
Apache Hadoop 3.1.0
• Generate FSImage from a FileSystem
– Start a NameNode serving remote data
– Serve from (a subset of) DataNodes in the cluster
• Backported and deployed in production at Microsoft
• Static: namespace changes are not reflected in HDFS NameNode
9
• Prototype code [2] with the PROVIDED abstraction
– Read-through caching of blocks (demand paging)
– Scheduled, metered prefetch for recurring pipelines with SLOs
– Write-through to remote (participant in the HDFS write pipeline)
– Wire FSImage to a running NameNode
• Per-application NameNodes; with isolation
• Bidirectional synchronization out of scope
[2]: https://github.com/Microsoft-CISL/hadoop/tree/tieredStore-sig16
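Read-through caching of blocks (demand paging) can be illustrated with a small sketch. This is not the prototype's code: the `ReadThroughBlockCache` class and the dictionary standing in for the remote store are assumptions made for illustration.

```python
class ReadThroughBlockCache:
    """Hypothetical local block cache that pages blocks in from a remote store on a miss."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: block_id -> bytes
        self._local = {}                   # blocks already paged in
        self.remote_reads = 0              # how often the remote store was contacted

    def read(self, block_id):
        if block_id not in self._local:
            # Miss: fetch from the remote store and keep a local copy.
            self._local[block_id] = self._fetch_remote(block_id)
            self.remote_reads += 1
        return self._local[block_id]


# A dict stands in for the remote store in this sketch.
remote_store = {1: b"alpha", 2: b"beta"}
cache = ReadThroughBlockCache(remote_store.__getitem__)

first = cache.read(1)   # miss: goes to the remote store
second = cache.read(1)  # hit: served locally
print(first, second, cache.remote_reads)
```

Only the first read of a block pays the remote latency; later reads are served from local storage.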
Running Apache Hadoop in the cloud
10
• HDInsight/Elastic MapReduce (EMR)/etc.
• Disaggregation introduces not only latency, but also variance
• “Lift and shift” workloads
– Rely on HDFS plugins
– May need to use attached storage to meet SLOs
– Would otherwise require spending more for capacity to the remote store
stddev / mean (coefficient of variation)
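The stddev / mean ratio can be computed as in the sketch below. The latency samples are made-up values chosen only to show the calculation; they are not measurements from the talk, and the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(samples):
    """Return population stddev divided by mean for a list of samples."""
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return statistics.pstdev(samples) / mean


# Made-up latency samples (ms), for illustration only.
attached = [10.0, 10.0, 10.0, 10.0]
remote = [10.0, 30.0, 10.0, 30.0]

print(coefficient_of_variation(attached))
print(coefficient_of_variation(remote))
```

Because the ratio is normalized by the mean, it allows variance to be compared between stores whose average latency differs.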
• HDFS can be used as a cache for Object Storage
• Similar to $my_favorite_caching_FS (CFS)?
– These are all caching systems that dispatch between storage systems horizontally
– We want to tier the storage systems vertically
• Support HDFS, not just Hadoop ecosystem around FileSystem
Notes on Caching
11
[Figure: horizontal caching (Compute nodes each using $CFS to dispatch between the cloud store and HDFS) versus vertical tiering (Compute nodes on HDFS, with HDFS in front of the cloud store)]
[Figure: an Object Store namespace (/, bucket1, carlhadoop, fileA, fileB, dir) beside an HDFS cluster namespace (/, reports, sales, fileA, fileB, dir); a Hadoop Client talks to the NameNode and to DataNode 1 … DataNode N, each marked P]
External Storage for HDFS
12
• “DropBox for Hadoop”
– Hadoop cluster has complete namespace but only “data in working set” is stored locally
– Dynamically page in missing data from object store on read
– Asynchronously write back data to object store
• Storage Policies + Replication count offer rich placement options
– E.g.: hot data: <SSD, PROVIDED> / cold data: <PROVIDED>
• Dedicated object storage system more efficient ($$$)
– Similar goal as ARCHIVE storage policy
– Object storage features (erasure coding, multi-geo replication, …)
• Data sharing with non-Hadoop apps
– File-object mapping means objects can be accessed in remote store with REST API / SDKs
Use case: External Storage for HDFS
13
Community feedback at last year’s Summit
14
WD Activescale Object Storage
• Western Digital moving up the stack (Data Center Systems)
• Scale-out object storage system for Private & Public Cloud
• Key features:
– Compatible with Amazon S3 API
– Strong consistency (not eventual!)
– Erasure coding for efficient storage
• Scale:
– Petabytes per rack
– Billions of objects per rack
– Linear scalability in # of racks
• More info at http://www.hgst.com/products/systems
15
• AS AN Administrator
• I CAN configure HDFS with an object storage backend
hdfs storagepolicies -setStoragePolicy -policy PROVIDED -path /var/log
hdfs syncservice -create -backupOnly -name activescale /var/logs s3a://hadoop-logs/
• SO THAT when a user copies files to HDFS they are asynchronously copied
to the synchronization endpoint
Demo time
16
Another example
• AS AN Administrator
• I CAN set the Storage Policy to be PROVIDED_ONLY
hdfs storagepolicies -setStoragePolicy -policy PROVIDED_ONLY -path /var/log
• SO THAT data is no longer in the Datanode but is transparently read
through from the synchronization endpoint on access.
17
• Preserve file-object mapping
– AliasMap (last year’s talk – HDFS-9806): synchronize namespaces
– Datanodes collaborate to move blocks which together form an object in destination system
• Minimize impact on frontend traffic / efficient data transfer
– Obvious: Read all blocks into a single Datanode to reconstruct a file before transferring
– Efficient: Transfer directly copies block per block outside of cluster using
• S3: multipart upload
• WASB: append blobs
• HDFS: tmpdir + concat
• Flexible deployment: could run in NameNode OR as External service
– In Namenode is easy to deploy but adds resource pressure
– External service is more difficult to deploy for some sites but reduces resource pressure
– Ongoing community discussion; start with external, include internal option as required
Requirements
18
• MountManager manages all the local mount points
– Mount point can be configured to sync with external store
• Periodically create a diff by comparing snapshots of the mountpoint
– NEW SyncService (in/out NameNode)
• Generate a “phased plan” for ordering the operations in the diff
– Multiple ordered phases
• RENAMES_TO_TEMP, DELETES, RENAMES_TO_FINAL, CREATE_DIRS, CREATE_FILES
• e.g. dir creation before file creation
– Parallel operations within a phase
• Leverage multiple datanodes and connections to external store
• e.g. Upload multiple new files in parallel
• Execute plan and track work
– Namespace (metadata) operations originate from SyncService
– Data operations originate from DataNodes
– Tracking: admin can query mountpoint for progress
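The phased plan can be sketched as follows: phases run strictly in order, and the operations inside one phase run in parallel. The phase names come from the slide; the function names, the `(phase, path)` tuple format, and the thread pool are assumptions for illustration and are not the SyncService implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Phase order as listed on the slide.
PHASES = ["RENAMES_TO_TEMP", "DELETES", "RENAMES_TO_FINAL", "CREATE_DIRS", "CREATE_FILES"]


def build_phased_plan(operations):
    """Group (phase, path) operations by phase; unknown phases raise KeyError."""
    plan = {phase: [] for phase in PHASES}
    for phase, path in operations:
        plan[phase].append(path)
    return plan


def execute_plan(plan, apply_op, max_workers=4):
    """Run phases sequentially; run the operations of one phase in parallel."""
    completed = []
    for phase in PHASES:
        ops = plan[phase]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # list() waits for every operation in this phase before moving on.
            list(pool.map(lambda path: apply_op(phase, path), ops))
        completed.append((phase, len(ops)))
    return completed


# Hypothetical diff, given out of order on purpose.
log = []
ops = [("CREATE_FILES", "/a/f1.bin"), ("CREATE_DIRS", "/a"), ("DELETES", "/old")]
plan = build_phased_plan(ops)
progress = execute_plan(plan, lambda phase, path: log.append((phase, path)))
print(log)
```

The returned per-phase counts are one simple way to expose progress for a mount point. This sketch does not order operations inside a phase, so nested directories created in the same phase would need extra ordering.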
Deep Dive: Synchronization
19
• Snapshot diff:
– Reflect point-in-time 100% accurate state of HDFS in external store
– Snapshot ensures data remains referenceable: retains blocks of data
– Does not track create + delete in between consecutive snapshots (cf. file B in the figure)
• EditLog post-processing:
– To parallelize
• Read batch from log and track lineage between overlapping operations
– HDFS operations might have altered reality: no point-in-time view
• Data not part of log: would require postponing block garbage collection
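The snapshot-diff limitation can be shown with a toy model. Here a snapshot is assumed to be a dictionary from path to a content version, and the `snapshot_diff` function and the scenario are hypothetical; this is not the HDFS SnapshotDiffReport implementation.

```python
def snapshot_diff(older, newer):
    """Compare two snapshots (dict: path -> content version) and list the changes."""
    report = []
    for path in sorted(newer.keys() - older.keys()):
        report.append(("+", path))  # created
    for path in sorted(older.keys() - newer.keys()):
        report.append(("-", path))  # deleted
    for path in sorted(older.keys() & newer.keys()):
        if older[path] != newer[path]:
            report.append(("M", path))  # modified
    return report


# Hypothetical scenario: between the two snapshots /A is modified,
# and /B is created and then deleted again before the second snapshot.
ss5 = {"/A": "v1"}
ss6 = {"/A": "v2"}

report = snapshot_diff(ss5, ss6)
print(report)
```

File /B never appears in the report because it exists in neither snapshot, so it is never copied to the external store. An EditLog-based approach would see both operations.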
Tracking changes: Snapshot diff vs. EditLog
20
[Figure: timeline with snapshots ss-5 and ss-6 and files A and B]
SnapshotDiffReport
M d .
+ d ./a
+ f ./f1.bin
Example Diff – Simple Case
21
Commands
#given /basic-test
mkdir -p /basic-test/a/b/c/d
touch /basic-test/a/b/c/d/f1.bin
touch /basic-test/f1.bin
Simple Case - New dirs; new files
PhasedPlan
+ d ./a/b/c/d/
+ f ./a/b/c/d/f1.bin
+ f ./f1.bin
SnapshotDiffReport
M d .
R f ./a.bin -> ./b.bin
R f ./b.bin -> ./a.bin
Example Diff – Harder Case
22
Commands
#given /swap-test/a.bin
#given /swap-test/b.bin
mv /swap-test/a.bin /swap-test/tmp
mv /swap-test/b.bin /swap-test/a.bin
mv /swap-test/tmp /swap-test/b.bin
Harder Case - Cycle
PhasedPlan
R f ./a.bin -> ./tmp/b.bin
R f ./b.bin -> ./tmp/a.bin
R f ./tmp/b.bin -> ./b.bin
R f ./tmp/a.bin -> ./a.bin
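The cycle case can be sketched as a two-step rename through a temporary directory, matching the RENAMES_TO_TEMP and RENAMES_TO_FINAL phases. The function names, the dictionary standing in for the external namespace, and staging by base name are assumptions for illustration.

```python
def plan_renames(renames, temp_dir="./tmp"):
    """Turn (src, dst) renames into a to-temp phase and a to-final phase."""
    to_temp, to_final = [], []
    for src, dst in renames:
        # Stage each file under its destination base name (assumes unique base names).
        staged = temp_dir + "/" + dst.rsplit("/", 1)[-1]
        to_temp.append((src, staged))
        to_final.append((staged, dst))
    return to_temp, to_final


def apply_renames(namespace, steps):
    """Apply renames in order to a dict that maps path -> contents."""
    for src, dst in steps:
        namespace[dst] = namespace.pop(src)


# The swap from the diff report: a.bin and b.bin exchange names.
ns = {"./a.bin": "contents-of-a", "./b.bin": "contents-of-b"}
to_temp, to_final = plan_renames([("./a.bin", "./b.bin"), ("./b.bin", "./a.bin")])

apply_renames(ns, to_temp)   # phase RENAMES_TO_TEMP
apply_renames(ns, to_final)  # phase RENAMES_TO_FINAL
print(ns)
```

Applying the two renames directly would overwrite one file with the other. Moving every source out of the way first breaks the cycle, and within each phase the renames do not depend on each other.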
• Tiered Storage HDFS-12090 [issues.apache.org]
– Design documentation
– List of subtasks, lots of linked tickets – take one!
– Discussion of scope, implementation, and feedback
• Bert Verslyppe, Hendrik Depauw, Íñigo Goiri, Rakesh Radhakrishnan, Uma
Gangumalla, Daryn Sharp, Steve Loughran, Sanjay Radia, Anu Engineer,
Jitendra Pandey, Andrew Wang, Zhe Zhang, Allen Wittenauer, and many
others …
Thanks to the community for feedback & help!
23
Multipart Extra Slides
24
• Applications write to HDFS
– First to DISK, then SyncService asynchronously copies to synchronization endpoint
– When files have been copied, the extraneous disk replicas can be removed
Deep Dive: MultiPart Upload
25
[Figure: a Client writes a File of three blocks (Block1, Block2, Block3) to three Datanodes; the SyncService issues Multipart Init and Multipart Complete to the External Store, and the Datanodes issue Multipart PutPart for their blocks]
• Common concept in Object Storage
– Supported by S3, WASB
• Usage in Hadoop
– S3A uses it – see Steve Loughran’s talk
– New to HDFS – HDFS-13186
• Three phases
– UploadHandle initMultipart(Path filePath)
– PartHandle putPart(Path filePath, InputStream inputStream, int partNumber, UploadHandle uploadId, long lengthInBytes)
– void complete(Path filePath, List<Pair<Integer, PartHandle>> handles, UploadHandle multipartUploadId)
• Benefits:
– Object/File Isolation – you only see the results when it’s done
– Can be written in parallel across multiple nodes
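The three phases and the isolation property can be sketched with an in-memory stand-in. This is not the Hadoop MultipartUploader: the class, its snake_case methods, and the handle formats are hypothetical and only mirror the shape of the signatures above.

```python
import uuid


class InMemoryMultipartUploader:
    """Hypothetical in-memory model of init / put part / complete."""

    def __init__(self):
        self._pending = {}  # upload_id -> (path, {part_number: bytes})
        self.objects = {}   # path -> bytes, visible only after complete()

    def init_multipart(self, path):
        upload_id = uuid.uuid4().hex
        self._pending[upload_id] = (path, {})
        return upload_id

    def put_part(self, path, data, part_number, upload_id):
        pending_path, parts = self._pending[upload_id]
        if pending_path != path:
            raise ValueError("upload id does not belong to this path")
        parts[part_number] = bytes(data)
        return (upload_id, part_number)  # stands in for a part handle

    def complete(self, path, handles, upload_id):
        pending_path, parts = self._pending[upload_id]
        if pending_path != path:
            raise ValueError("upload id does not belong to this path")
        # Assemble the parts in part-number order, whatever order they arrived in.
        ordered = sorted(handles, key=lambda h: h[0])
        self.objects[path] = b"".join(parts[number] for number, _ in ordered)
        del self._pending[upload_id]


uploader = InMemoryMultipartUploader()
path = "/logs/file"
upload_id = uploader.init_multipart(path)
handles = [
    (2, uploader.put_part(path, b"block2", 2, upload_id)),  # parts may arrive out of order
    (1, uploader.put_part(path, b"block1", 1, upload_id)),
]
visible_before = path in uploader.objects
uploader.complete(path, handles, upload_id)
print(visible_before, uploader.objects[path])
```

Since parts are keyed by part number, different nodes can upload their blocks independently, and the object only becomes visible once `complete` is called.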
MultipartUploader
26
MultipartUploader in SyncService
27
Sync Extra Slides
28
Create Directory
29
Delete Directory
30
Rename Directory
31
Create File
32
Delete File
33
Rename File
34
Modify File
35

Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Dernier

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 

Dernier (20)

Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 

HDFS tiered storage

  • 1. HDFS Tiered Storage. Virajith Jalaparti, Chris Douglas, Ewan Higgs, Thomas Demoor
  • 2. Microsoft – Western Digital – Apache Community
      • Tiered Storage [issues.apache.org]
        – HDFS-9806
        – HDFS-12090
      • Virajith Jalaparti, Chris Douglas, …
      • Ewan Higgs, Kasper Janssens, Thomas Demoor, …
  • 3. Hadoop already plays nicely with Object Stores
      • Hadoop Compatible FS [1]: s3a://, wasb://, adl://, …
      • Direct IO between Hadoop apps and Object Store
      • Disaggregated compute & storage
      • HDFS NameNode functions taken up by Object Store
      • Pain points:
        – Not really a FileSystem: rename, append, directories, ...
          • Even with correct semantics, performance unlike HDFS
          • HDFS features unavailable (e.g., hedged reads, snapshots, etc.)
        – No locality
          • Higher latency than attached storage
          • Higher variance in both latency and throughput
        – No HDFS integration
          • Policies for users, permissions, quota, security, …
          • Storage Plugins (e.g. Ranger, Sentry)
      [Diagram: APP in the HADOOP CLUSTER reads/writes the REMOTE STORE directly]
      [1]: https://s.apache.org/Hadoop3FSspec
  • 4. Goal: let HDFS play nicely with Object Stores
      • External Storage Tier for HDFS
        – HDFS Storage Policy: DISK, SSD, RAM, ARCHIVE, PROVIDED
      • Share namespace, not only data!
        – Keep 1-to-1 mapping: HDFS file ↔ external object
      • No change to existing HDFS workflows
        – Hadoop Apps interact with HDFS as before (fully transparent)
        – Data Tiering happens async in background
        – Native support for all HDFS features / admin tools
      • Data Tiering controlled by admin
        – On directory / file level
        – Through Storage Policy (e.g. <HDD, HDD, HDD> → <PROVIDED>)
      • HDFS NameNode scalability not a bottleneck
        – HDFS manages the working set/compatibility
        – Object store manages larger data lake, ingest, etc.
      [Diagram: APP reads/writes HDFS inside the HADOOP CLUSTER; HDFS writes back to, and loads on demand from, the REMOTE STORE]
  • 5. “Mount” remote storage in HDFS
      • Use HDFS to manage remote storage
        – HDFS blocks correspond to fixed range of bytes in remote
        – AliasMap (DWS17: youtu.be/kpNDZNp-Nlw)
        – HDFS coordinates reads/writes to remote store
        – Mount remote store as a PROVIDED tier in HDFS
        – Set StoragePolicy to move data into HDFS
      [Diagram: the Remote Namespace (/, d, e, f) is mounted at mount point c of the HDFS Namespace (/, a, b, c). APP reads/writes HDFS in the HADOOP CLUSTER; HDFS writes through to, and loads on demand from, the REMOTE STORE. Alias Map: HDFS Block -> Remote location]
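The block-to-remote mapping described on this slide can be illustrated with a minimal sketch. It assumes each HDFS block covers one contiguous, fixed-size byte range of a single remote object; `RemoteRegion`, `build_alias_map`, and the block IDs are hypothetical names for illustration, not the actual AliasMap classes in Hadoop.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RemoteRegion:
    """Hypothetical record: where one HDFS block lives in the remote store."""
    uri: str      # remote object, e.g. an s3a:// URI
    offset: int   # first byte of the block within the object
    length: int   # number of bytes covered by the block


def build_alias_map(uri: str, file_length: int, block_size: int,
                    first_block_id: int) -> Dict[int, RemoteRegion]:
    """Split a remote object into fixed-size ranges, one per block ID.

    Illustrative only: real block IDs are allocated by the NameNode.
    """
    alias_map: Dict[int, RemoteRegion] = {}
    offset = 0
    block_id = first_block_id
    while offset < file_length:
        # The last block may be shorter than block_size.
        length = min(block_size, file_length - offset)
        alias_map[block_id] = RemoteRegion(uri, offset, length)
        offset += length
        block_id += 1
    return alias_map
```

A reader that holds a block ID can then look up the object and byte range to fetch, which is the role the slide assigns to the Alias Map.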
  • 6. PROVIDED storage on the READ path [Diagram: remote files /foo/bar, /foo/baz, /foo/bazt, /foo/bazz; labels: IaaS, (De)Hydration, Delegation]
  • 7. PROVIDED storage on the READ path [Diagram: remote files /foo/bar, /foo/baz, /foo/bazt, /foo/bazz mapped to HDFS directory /foo (bar, baz, bazt, bazz) with setrep=2; labels: IaaS, (De)Hydration, Delegation]
  • 8. PROVIDED storage on the READ path [Diagram: remote files /foo/bar, /foo/baz, /foo/bazt, /foo/bazz mapped to HDFS directory /foo (bar, baz, bazt, bazz); [Router-Based] Federation; /cloud ?; labels: IaaS, (De)Hydration, Delegation]
  • 9. Apache Hadoop 3.1.0
      • Generate FSImage from a FileSystem
        – Start a NameNode serving remote data
        – Serve from (a subset of) DataNodes in the cluster
      • Backported and deployed in production at Microsoft
      • Static: namespace changes are not reflected in HDFS NameNode
      • Prototype code [2] with the PROVIDED abstraction
        – Read-through caching of blocks (demand paging)
        – Scheduled, metered prefetch for recurring pipelines with SLOs
        – Write-through to remote (participant in the HDFS write pipeline)
        – Wire FSImage to a running NameNode
      • Per-application NameNodes; with isolation
      • Bidirectional synchronization out of scope
      [2]: https://github.com/Microsoft-CISL/hadoop/tree/tieredStore-sig16
  • 10. Running Apache Hadoop in the cloud
      • HDInsight/Elastic MapReduce (EMR)/etc.
      • Disaggregation introduces not only latency, but also variance
      • “Lift and shift” workloads
        – Rely on HDFS plugins
        – May need to use attached storage to meet SLOs
        – Would otherwise require spending more for capacity to the remote store
      [Chart axis: stddev / mean]
  • 11. Notes on Caching
      • HDFS can be used as a cache for Object Storage
      • Similar to $my_favorite_caching_FS (CFS)?
        – These are all caching systems that dispatch between storage systems horizontally
        – We want to tier the storage systems vertically
      • Support HDFS, not just Hadoop ecosystem around FileSystem
      [Diagram: horizontal layout, where each Compute node uses $CFS to dispatch between HDFS and the cloud store, versus vertical layout, where Compute talks to HDFS and HDFS tiers to the cloud store]
  • 12. External Storage for HDFS [Diagram labels: Object Store namespace (/, bucket1, carlhadoop, …); HDFS cluster namespace (/, reports, sales, dir, fileA, fileB, …); NameNode; DataNode 1 … DataNode N, each with PROVIDED (P) storage; Hadoop Client]
  • 13. Use case: External Storage for HDFS
      • “DropBox for Hadoop”
        – Hadoop cluster has complete namespace but only “data in working set” is stored locally
        – Dynamically page in missing data from object store on read
        – Asynchronously write back data to object store
      • Storage Policies + Replication count offer rich placement options
        – E.g.: hot data: <SSD, PROVIDED> / cold data: <PROVIDED>
      • Dedicated object storage system more efficient ($$$)
        – Similar goal as ARCHIVE storage policy
        – Object storage features (erasure coding, multi-geo replication, …)
      • Data sharing with non-Hadoop apps
        – File-object mapping means objects can be accessed in remote store with REST API / SDKs
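The "page in missing data on read" behaviour can be sketched as a read-through cache. This is a toy model under stated assumptions: blocks are plain byte strings, the remote store is any callable that returns a block by ID, and `ReadThroughBlockCache` is a hypothetical name, not an HDFS class.

```python
from typing import Callable, Dict


class ReadThroughBlockCache:
    """Toy model of demand paging: local replicas are filled on first read."""

    def __init__(self, fetch_remote: Callable[[int], bytes]):
        self._fetch_remote = fetch_remote   # assumed remote-read function
        self._local: Dict[int, bytes] = {}  # blocks currently held locally
        self.remote_reads = 0               # counts trips to the remote store

    def read(self, block_id: int) -> bytes:
        # Only go to the remote store when the block is not in the working set.
        if block_id not in self._local:
            self._local[block_id] = self._fetch_remote(block_id)
            self.remote_reads += 1
        return self._local[block_id]

    def evict(self, block_id: int) -> None:
        # Dropping the local copy is safe because the remote copy remains.
        self._local.pop(block_id, None)
```

Repeated reads of the same block hit the local copy; only the first read, or a read after eviction, pays the remote-store latency.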
  • 14. Community feedback at last year’s Summit
  • 15. WD Activescale Object Storage
      • Western Digital moving up the stack (Data Center Systems)
      • Scale-out object storage system for Private & Public Cloud
      • Key features:
        – Compatible with Amazon S3 API
        – Strong consistency (not eventual!)
        – Erasure coding for efficient storage
      • Scale:
        – Petabytes per rack
        – Billions of objects per rack
        – Linear scalability in # of racks
      • More info at http://www.hgst.com/products/systems
  • 16. Demo time
      • AS AN Administrator
      • I CAN configure HDFS with an object storage backend
          hdfs storagepolicies -setStoragePolicy -policy PROVIDED -path /var/log
          hdfs syncservice -create -backupOnly -name activescale /var/logs s3a://hadoop-logs/
      • SO THAT when a user copies files to HDFS they are asynchronously copied to the synchronization endpoint
  • 17. Another example
      • AS AN Administrator
      • I CAN set the Storage Policy to be PROVIDED_ONLY
          hdfs storagepolicies -setStoragePolicy -policy PROVIDED_ONLY -path /var/log
      • SO THAT data is no longer in the Datanode but is transparently read through from the synchronization endpoint on access.
  • 18. Requirements
      • Preserve file-object mapping
        – AliasMap (last year’s talk – HDFS-9806): synchronize namespaces
        – Datanodes collaborate to move blocks which together form an object in destination system
      • Minimize impact on frontend traffic / efficient data transfer
        – Obvious: read all blocks into a single Datanode to reconstruct a file before transferring
        – Efficient: transfer block by block directly out of the cluster, using
          • S3: multipart upload
          • WASB: append blobs
          • HDFS: tmpdir + concat
      • Flexible deployment: could run in NameNode OR as external service
        – Running in the NameNode is easy to deploy but adds resource pressure
        – An external service is more difficult to deploy for some sites but reduces resource pressure
        – Ongoing community discussion; start with external, include internal option as required
  • 19. Deep Dive: Synchronization
      • MountManager manages all the local mount points
        – Mount point can be configured to sync with external store
      • Periodically create a diff by comparing snapshots of the mountpoint
        – NEW SyncService (in/out NameNode)
      • Generate a “phased plan” for ordering the operations in the diff
        – Multiple ordered phases
          • RENAMES_TO_TEMP, DELETES, RENAMES_TO_FINAL, CREATE_DIRS, CREATE_FILES
          • e.g. dir creation before file creation
        – Parallel operations within a phase
          • Leverage multiple datanodes and connections to external store
          • e.g. upload multiple new files in parallel
      • Execute plan and track work
        – Namespace (metadata) operations originate from SyncService
        – Data operations originate from DataNodes
        – Tracking: admin can query mountpoint for progress
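The "ordered phases, parallel within a phase" idea can be sketched as follows. The phase names come from the slide; everything else (`build_phased_plan`, `execute_plan`, the tuple-based operation format, the thread pool) is an illustrative assumption rather than the SyncService implementation.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from enum import IntEnum


class Phase(IntEnum):
    # Order taken from the slide; earlier phases must finish first.
    RENAMES_TO_TEMP = 0
    DELETES = 1
    RENAMES_TO_FINAL = 2
    CREATE_DIRS = 3
    CREATE_FILES = 4


def build_phased_plan(ops):
    """Group (Phase, path) operations into phases, in execution order."""
    by_phase = defaultdict(list)
    for phase, path in ops:
        by_phase[phase].append(path)
    # Iterating over Phase yields members in definition order.
    return [(p, sorted(by_phase[p])) for p in Phase if p in by_phase]


def execute_plan(plan, apply_op, workers=4):
    """Run phases one after another; operations inside a phase run in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for phase, paths in plan:
            # list() blocks until every operation of this phase has finished.
            list(pool.map(lambda path, ph=phase: apply_op(ph, path), paths))
```

For example, a diff that creates a directory and two files yields a plan in which the `CREATE_DIRS` phase completes before any `CREATE_FILES` operation starts, while the two file uploads may overlap.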
  • 20. Tracking changes: Snapshot diff vs. EditLog
      • Snapshot diff:
        – Reflect point-in-time 100% accurate state of HDFS in external store
        – Snapshot ensures data remains referenceable: retains blocks of data
        – Does not track create + delete in between consecutive snapshots (cf. file B in Fig.)
      • EditLog post-processing:
        – To parallelize
          • Read batch from log and track lineage between overlapping operations
        – HDFS operations might have altered reality: no point-in-time
          • Data not part of log: would require postponing block garbage collection
      [Figure: timeline with snapshots ss-5 and ss-6; file A exists in both, file B is created and deleted between them]
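The blind spot of snapshot diffs (a file created and deleted between two snapshots) can be shown with a toy model. It assumes a snapshot is simply the set of paths that existed when it was taken; `snapshot_diff` is a hypothetical helper, not the HDFS SnapshotDiffReport API.

```python
def snapshot_diff(older: set, newer: set) -> dict:
    """Compare two snapshots, each modelled as a set of paths."""
    return {
        "created": sorted(newer - older),
        "deleted": sorted(older - newer),
    }


# Timeline modelled on the slide's figure: A exists throughout,
# B is created and deleted between ss-5 and ss-6.
namespace = {"/A"}
ss5 = set(namespace)       # take snapshot ss-5
namespace.add("/B")        # create B
namespace.discard("/B")    # delete B
ss6 = set(namespace)       # take snapshot ss-6

diff = snapshot_diff(ss5, ss6)
```

Here `diff` reports nothing created and nothing deleted, so B never reaches the external store. An EditLog would contain both operations, at the cost of losing the point-in-time view.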
  • 21. Example Diff – Simple Case (new dirs; new files)
      • Commands
          #given /basic-test
          mkdir -p /basic-test/a/b/c
          touch /basic-test/a/b/c/d/f1.bin
          touch /basic-test/f1.bin
      • SnapshotDiffReport
          M d .
          + d ./a
          + f ./f1.bin
      • PhasedPlan
          + d ./a/b/c/d/
          + f ./a/b/c/d/f1.bin
          + f ./f1.bin
  • 22. Example Diff – Harder Case (cycle)
      • Commands
          #given /swap-test/a.bin
          #given /swap-test/b.bin
          mv /swap-test/a.bin /swap-test/tmp
          mv /swap-test/b.bin /swap-test/a.bin
          mv /swap-test/tmp /swap-test/b.bin
      • SnapshotDiffReport
          M d .
          R f ./a.bin -> ./b.bin
          R f ./b.bin -> ./a.bin
      • PhasedPlan
          R f ./a.bin -> ./tmp/b.bin
          R f ./b.bin -> ./tmp/a.bin
          R f ./tmp/b.bin -> b.bin
          R f ./tmp/a.bin -> a.bin
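The cycle above is resolved by staging every rename in a temporary location before moving anything to its final name. A minimal sketch of that two-step approach follows; `plan_renames` and `apply_renames` are hypothetical helpers, the namespace is a plain dict, and staging by basename assumes no two destinations share a file name.

```python
def plan_renames(renames, tmp_dir="./tmp"):
    """Turn (src, dst) renames into a cycle-safe plan via a staging directory."""
    to_temp, to_final = [], []
    for src, dst in renames:
        # Assumption: destination basenames are unique, so staging cannot collide.
        staged = f"{tmp_dir}/{dst.rsplit('/', 1)[-1]}"
        to_temp.append((src, staged))    # RENAMES_TO_TEMP phase
        to_final.append((staged, dst))   # RENAMES_TO_FINAL phase
    return to_temp + to_final


def apply_renames(files, plan):
    """Apply renames to a dict of path -> content, refusing to overwrite."""
    for src, dst in plan:
        if dst in files:
            raise FileExistsError(dst)
        files[dst] = files.pop(src)
    return files


plan = plan_renames([("./a.bin", "./b.bin"), ("./b.bin", "./a.bin")])
swapped = apply_renames({"./a.bin": "A", "./b.bin": "B"}, plan)
```

Applying the two renames directly would fail on the first step because `./b.bin` still exists; with staging, no step ever targets an occupied path.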
  • 23. Thanks to the community for feedback & help!
      • Tiered Storage HDFS-12090 [issues.apache.org]
        – Design documentation
        – List of subtasks, lots of linked tickets – take one!
        – Discussion of scope, implementation, and feedback
      • Bert Verslyppe, Hendrik Depauw, Íñigo Goiri, Rakesh Radhakrishnan, Uma Gangumalla, Daryn Sharp, Steve Loughran, Sanjay Radia, Anu Engineer, Jitendra Pandey, Andrew Wang, Zhe Zhang, Allen Wittenauer, and many others …
  • 25. Deep Dive: MultiPart Upload
      • Applications write to HDFS
        – First to DISK, then SyncService asynchronously copies to synchronization endpoint
        – When files have been copied, the extraneous disk replicas can be removed
      [Diagram: the Client writes a File (Block1, Block2, Block3) to three Datanodes; the SyncService sends InitMultipart and Complete Multipart to the External Store, and each Datanode sends PutPart for its block]
  • 26. MultipartUploader
      • Common concept in Object Storage
        – Supported by S3, WASB
      • Usage in Hadoop
        – S3A uses it – see Steve Loughran’s talk
        – New to HDFS – HDFS-13186
      • Three phases
        – UploadHandle initMultipart(Path filePath)
        – PartHandle putPart(Path filePath, InputStream inputStream, int partNumber, UploadHandle uploadId, long lengthInBytes)
        – void complete(Path filePath, List<Pair<Integer, PartHandle>> handles, UploadHandle multipartUploadId)
      • Benefits:
        – Object/File Isolation – you only see the results when it’s done
        – Can be written in parallel across multiple nodes
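The three phases can be mimicked with a small in-memory model. This is a sketch only: `InMemoryMultipartStore` and its method names are hypothetical Python stand-ins that loosely mirror the Java signatures on the slide, and handles are plain strings rather than Hadoop's `UploadHandle`/`PartHandle` types.

```python
import uuid
from typing import Dict, List, Tuple


class InMemoryMultipartStore:
    """Toy object store with init / put-part / complete semantics."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}              # visible objects
        self._uploads: Dict[str, Dict[str, bytes]] = {}  # pending parts

    def init_multipart(self, path: str) -> str:
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = {}
        return upload_id

    def put_part(self, path: str, data: bytes, part_number: int,
                 upload_id: str) -> str:
        # Parts may arrive in any order and from any node.
        part_handle = f"{upload_id}-{part_number}"
        self._uploads[upload_id][part_handle] = data
        return part_handle

    def complete(self, path: str, handles: List[Tuple[int, str]],
                 upload_id: str) -> None:
        parts = self._uploads.pop(upload_id)
        # The object becomes visible only now, assembled in part-number order.
        self.objects[path] = b"".join(parts[h] for _, h in sorted(handles))


store = InMemoryMultipartStore()
upload = store.init_multipart("/var/log/app.log")
h2 = store.put_part("/var/log/app.log", b"world", 2, upload)
h1 = store.put_part("/var/log/app.log", b"hello ", 1, upload)
visible_before = "/var/log/app.log" in store.objects
store.complete("/var/log/app.log", [(2, h2), (1, h1)], upload)
```

The example uploads part 2 before part 1 to show that ordering is decided at completion time, and `visible_before` captures the isolation benefit: nothing is readable until `complete` runs.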