SlideShare une entreprise Scribd logo
1  sur  66
Télécharger pour lire hors ligne
Scaling HDFS to Manage Billions of Files
Haohui Mai, Jing Zhao

Hortonworks, Inc.
About the speakers
• Haohui Mai
• Active committers and PMC in Hadoop
• Ph.D. in Computer Science from UIUC in 2013
• Joined the HDFS team in Hortonworks
• 250+ commits in Hadoop
About the speakers
• Jing Zhao
• Active committers and PMC in Hadoop
• Ph.D. in Computer Science from USC in 2012
• HDFS team member in Hortonworks
• 250+ commits in Hadoop
Past: the scale of data
Past: the scale of data
• In 2007 (PC)
• ~500 GB hard drives
• thousands of files
Past: the scale of data
• In 2007 (PC)
• ~500 GB hard drives
• thousands of files
• In 2007 (Hadoop)
• several hundred nodes
• several hundred TBs
• millions of files
Past: the scale of data
• In 2007 (Hadoop)
• several hundred nodes
• several hundred TBs
• millions of files
Past: the scale of data
• In 2007 (Hadoop)
• several hundred nodes
• several hundred TBs
• millions of files
• In 2015
• 4,000+ nodes (10x)
• 150+ PBs (1000x)
• 400M+ files (100x)
Present: a generic storage system
Present: a generic storage system
• SQL-On-Hadoop
• Machine learning
• Real-time analytics
• Data streaming
• File archives, NFS…

Present: a generic storage system
• SQL-On-Hadoop
• Machine learning
• Real-time analytics
• Data streaming
• File archives, NFS…

• From MR-centric filesystem to a
generic distributed storage system
Future: Billions of files in HDFS
Future: Billions of files in HDFS
• HDFS clusters continue to grow
Future: Billions of files in HDFS
• HDFS clusters continue to grow
• New use cases emerge
• IoT, time series data…
Future: Billions of files in HDFS
• HDFS clusters continue to grow
• New use cases emerge
• IoT, time series data…
• Files are natural abstractions of
data
• Few big files → many small
files in HDFS
• Billions of files in a few years
NameNode limits the scale
NameNode limits the scale
• Master / slave architecture
• All metadata in NN, data
across multiple DNs
• Simple and robust
NN
DN DN DN
NameNode limits the scale
• Master / slave architecture
• All metadata in NN, data
across multiple DNs
• Simple and robust
• Does not scale beyond the size
of the NN heap
• 400M files ~ 128G heap
• GC pauses
NN
DN DN DN
Next-gen arch: HDFS on top of KV stores
Next-gen arch: HDFS on top of KV stores
• Namespace (NS) on top of Key-
Value (KV) stores
• Storing the NS into LevelDB
Next-gen arch: HDFS on top of KV stores
• Namespace (NS) on top of Key-
Value (KV) stores
• Storing the NS into LevelDB
• Working set fits in memory,
cold metadata on disks
• Match the usage patterns of
HDFS
Namespace
Next-gen arch: HDFS on top of KV stores
• Namespace (NS) on top of Key-
Value (KV) stores
• Storing the NS into LevelDB
• Working set fits in memory,
cold metadata on disks
• Match the usage patterns of
HDFS
• Low adoption cost: fully
compatible
Namespace
• Introduction
• Namespace on top of KV stores
• Evaluation
• Future work & conclusions
Encoding the NS as KV pairs
• inode_id → flat binary
representation of inode
• avoid serialization costs
• <pid,foo> → the inode id of
the child foo whose parent’s
inode id is pid
struct inode {
uint64_t inode_id;
uint64_t parent_id;
uint32_t flags;
…
};
Encoding directories
root
foo
bar
Encoding directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
root
foo
bar
Encoding directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
root
foo
bar
Encoding directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
root
foo
bar
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
/ foo / bar
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
/ foo / barID: 1
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
/ foo / barID: 1
<1,foo>
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
/ foo / bar
<1,foo>
ID: 2
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
/ foo / bar
ID: 2
<2,bar>
Resolving paths
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
/ foo / bar
<2,bar>
ID: 3
Listing directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
Listing directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
$ ls /
Listing directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
$ ls /
ID: 1
Listing directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
$ ls /
ID: 1
<1,*>
Listing directories
Key Value
1 root
2 foo
3 bar
<1,foo> 2
<2,bar> 3
$ ls /
foo
ID: 1
<1,*>
Integrate with existing HDFS features
Integrate with existing HDFS features
• HDFS snapshots
• Metadata only operations
• Append version ids for each key
• Map between snapshot ids and version ids
Integrate with existing HDFS features
• HDFS snapshots
• Metadata only operations
• Append version ids for each key
• Map between snapshot ids and version ids
• NameNode High Availability (HA)
• Use edit logs instead of the WAL of the KV stores to persist operations
• Minimal changes in the current HA mechanisms
Current status
• Phase I — NS on top of KV interfaces (HDFS-8286)
• NS on top of an in-memory KV store
• Under active development
• Phase II — Partial NS in the memory
• Working set of the NS in the memory, cold metadata on disks
• Scaling the NS beyond the size of heap
• Introduction
• Namespace on top of KV stores
• Evaluation
• Future work & conclusions
Evaluation
• 4 node clusters, connected with 10GbE networks
• 6-core Intel Xeon E5-2630 @ 2.3GHz * 2, 64G DDR3 @ 1333 MHz
• WDC 1TB disks @ 7200 RPM, 64MB cache
• OpenJDK 1.8.0_25
Evaluation (cont.)
Apache Hadoop
2.7.0
2.7.0 InMem LevelDB
NS on top of an in-
memory KV map
NS on top of
LevelDB
NNThroughput (write)
• 300,000 operations
• 8 threads
• Comparable performance
• Simpler implementation
on delete()
• Syncing edit logs is the
bottleneck
Throughput(ops/s) 0
125
250
375
500
create mkdirs delete rename
2.7.0 InMem LevelDB
NNThroughput (read)
• Read-only operations
Throughput(ops/s)
0
50000
100000
150000
200000
open fileStatus
2.7.0 InMem
LevelDB LevelDB-opt
NNThroughput (read)
• Read-only operations
• 1/3 throughput of vanilla
LevelDB v.s. 2.7.0
• Contentions of the global lock
in LevelDB during get()
Throughput(ops/s)
0
50000
100000
150000
200000
open fileStatus
2.7.0 InMem
LevelDB LevelDB-opt
NNThroughput (read)
• Read-only operations
• 1/3 throughput of vanilla
LevelDB v.s. 2.7.0
• Contentions of the global lock
in LevelDB during get()
• A lock-free fast path of get() to
recover the performance
(LevelDB-opt)
Throughput(ops/s)
0
50000
100000
150000
200000
open fileStatus
2.7.0 InMem
LevelDB LevelDB-opt
YCSB: Throughput
• YCSB against HBase 1.0.1.1
• Enabled short-circuit reads
• 100 threads, 10M records
Throughput(ops/s)
0
30000
60000
90000
120000
A B C F D E
2.7.0 InMem LevelDB
Latency(us)
0
2500
5000
7500
10000
A-Read
B-Read
C
-Read
F-RM
W
F-Read
E-Scan
2.7.0 InMem LevelDB
YCSB: Latency
Runtime(s)
0
350
700
1050
1400
TeraG
en
TeraSort
TeraValidate
2.7.0 InMem LevelDB
TeraSort
• 10G data per node
• Comparable performance
Throughput(op/s)
0
32500
65000
97500
130000
Working set
256M
128M
64M
32M
Impact on working set size
• Throughput of GetFileStatus()
under different sizes of working
sets
• Introduction
• Namespace on top of KV stores
• Evaluation
• Future work & conclusions
Future work
• Implementation and stabilization
• Operation concerns
• Compaction and fsck
• Failure recovery
• Cold startup
Conclusions
• HDFS needs to continue to scale
Conclusions
• HDFS needs to continue to scale
• Evolve HDFS towards KV-based architecture
Conclusions
• HDFS needs to continue to scale
• Evolve HDFS towards KV-based architecture
• Scaling beyond the size of the NN heap
Conclusions
• HDFS needs to continue to scale
• Evolve HDFS towards KV-based architecture
• Scaling beyond the size of the NN heap
• Preliminary evaluation looks promising
Acknowledgement
• Xiao Lin, interned with Hortonworks in 2013
• PoC implementation of LevelDB backed namespace
• Zhilei Xu, interned with Hortonworks in 2014
• Integration between various HDFS features and LevelDB
• Performance tuning
Thank you!
Integrating with LevelDB
Write operations in HDFS

• Updating LevelDB inside the
global lock
• New LevelDB::Write() w/
o blocking calls
• Write edit log
• logSync()
Integrating with LevelDB
Write operations in HDFS

• Updating LevelDB inside the
global lock
• New LevelDB::Write() w/
o blocking calls
• Write edit log
• logSync()
Pruning edit logs
• Dump memtable into the disks
• Update MANIFEST
• Prune edit logs

Contenu connexe

Tendances

Hadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and Beyond
Hadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and BeyondHadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and Beyond
Hadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and BeyondErik Krogen
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
Ceph and RocksDB
Ceph and RocksDBCeph and RocksDB
Ceph and RocksDBSage Weil
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyStuart Pook
 
RocksDB storage engine for MySQL and MongoDB
RocksDB storage engine for MySQL and MongoDBRocksDB storage engine for MySQL and MongoDB
RocksDB storage engine for MySQL and MongoDBIgor Canadi
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
Quantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSQuantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSbigdatagurus_meetup
 
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeHadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeErik Krogen
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitecturePatrick McGarry
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...leifwalsh
 
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaCloudera, Inc.
 
Tez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache HadoopTez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache HadoopDataWorks Summit
 
第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandraShun Nakamura
 
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisStorage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisSameer Tiwari
 
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter
 
Real-Time Data Loading from MySQL to Hadoop
Real-Time Data Loading from MySQL to HadoopReal-Time Data Loading from MySQL to Hadoop
Real-Time Data Loading from MySQL to HadoopContinuent
 

Tendances (20)

Hadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and Beyond
Hadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and BeyondHadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and Beyond
Hadoop Meetup Jan 2019 - TonY: TensorFlow on YARN and Beyond
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Ceph and RocksDB
Ceph and RocksDBCeph and RocksDB
Ceph and RocksDB
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
 
RocksDB meetup
RocksDB meetupRocksDB meetup
RocksDB meetup
 
RocksDB storage engine for MySQL and MongoDB
RocksDB storage engine for MySQL and MongoDBRocksDB storage engine for MySQL and MongoDB
RocksDB storage engine for MySQL and MongoDB
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
Quantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSQuantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFS
 
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeHadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
 
Apache Hadoop 0.22 and Other Versions
Apache Hadoop 0.22 and Other VersionsApache Hadoop 0.22 and Other Versions
Apache Hadoop 0.22 and Other Versions
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
A New MongoDB Sharding Architecture for Higher Availability and Better Resour...
 
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
 
Tez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache HadoopTez Shuffle Handler: Shuffling at Scale with Apache Hadoop
Tez Shuffle Handler: Shuffling at Scale with Apache Hadoop
 
第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra
 
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisStorage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
 
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
 
Real-Time Data Loading from MySQL to Hadoop
Real-Time Data Loading from MySQL to HadoopReal-Time Data Loading from MySQL to Hadoop
Real-Time Data Loading from MySQL to Hadoop
 

Similaire à Scaling HDFS to Manage Billions of Files

Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...SignalFx
 
PhegData X - High Performance EBS
PhegData X - High Performance EBSPhegData X - High Performance EBS
PhegData X - High Performance EBSHanson Dong
 
Gruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanNarayana B
 
MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL Bernd Ocklin
 
High-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and JavaHigh-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and Javasunnygleason
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMsGlobal Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMsMarco Obinu
 
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedis Labs
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataRoger Xia
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDBMongoDB
 
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoopbddmoscow
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectureshypertable
 
Red Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep DiveRed Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep DiveRed_Hat_Storage
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!DataWorks Summit
 

Similaire à Scaling HDFS to Manage Billions of Files (20)

Hadoop and friends
Hadoop and friendsHadoop and friends
Hadoop and friends
 
NoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBaseNoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBase
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...
 
PhegData X - High Performance EBS
PhegData X - High Performance EBSPhegData X - High Performance EBS
PhegData X - High Performance EBS
 
Gruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter TECHDAY 2014 Realtime Processing in Telco
Gruter TECHDAY 2014 Realtime Processing in Telco
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
 
MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL MySQL NDB Cluster 8.0 SQL faster than NoSQL
MySQL NDB Cluster 8.0 SQL faster than NoSQL
 
High-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and JavaHigh-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and Java
 
Drop acid
Drop acidDrop acid
Drop acid
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMsGlobal Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
Global Azure Virtual 2020 What's new on Azure IaaS for SQL VMs
 
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoop
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectures
 
Red Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep DiveRed Hat Storage Server Administration Deep Dive
Red Hat Storage Server Administration Deep Dive
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!
 

Dernier

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Dernier (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

Scaling HDFS to Manage Billions of Files