HBase Sizing Notes
Lars George
Director EMEA Services @ Cloudera
lars@cloudera.com




Competing Resources

• Reads and Writes compete for the same low-level resources
  ‣ Disk (HDFS) and Network I/O
  ‣ RPC Handlers and Threads
• Otherwise they exercise completely separate code paths


Memory Sharing
• By default every region server divides its memory (i.e. the given maximum heap) into
  ‣ 40% for in-memory stores (write ops)
  ‣ 20% for block caching (read ops)
  ‣ the remaining space (here 40%) goes towards the usual Java heap usage (objects etc.)
• The share of memory needs to be tweaked (see the sketch below)
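
The split above maps to two region server settings. A minimal sketch, assuming the 0.92/0.94-era key names (hbase.regionserver.global.memstore.upperLimit and hfile.block.cache.size); later releases renamed these keys, so verify against your version's defaults:

```java
// Minimal sketch of the default region server memory split, assuming the
// 0.92/0.94-era configuration keys; later versions renamed some of these
// (e.g. hbase.regionserver.global.memstore.size).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MemoryShares {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // ~40% of the heap for all memstores combined (write path)
    float memstoreShare = conf.getFloat("hbase.regionserver.global.memstore.upperLimit", 0.4f);
    // ~20% of the heap for the block cache (read path)
    float blockCacheShare = conf.getFloat("hfile.block.cache.size", 0.2f);

    long heapBytes = 10L * 1024 * 1024 * 1024; // example: -Xmx10g
    System.out.printf("memstore budget:    %.1f GB%n", heapBytes * memstoreShare / (1 << 30));
    System.out.printf("block cache budget: %.1f GB%n", heapBytes * blockCacheShare / (1 << 30));
    System.out.printf("remaining heap:     %.1f GB%n",
        heapBytes * (1 - memstoreShare - blockCacheShare) / (1 << 30));
  }
}
```
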
Reads
• Locate and route the request to the appropriate region server
  ‣ The client caches location information for faster lookups ➜ consider the prefetch option for fast warmups
• Eliminate store files if possible using time ranges or Bloom filters (see the sketch below)
• Try the block cache; if the block is missing, load it from disk

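
Both hints can be expressed from the client and schema side. A sketch, assuming the 0.94-era Java client API; the family name "d" is a placeholder, and the BloomType enum moved packages in later releases:

```java
// Sketch: two ways to help HBase skip store files on reads, assuming the
// 0.94-era client API. The family name "d" is a placeholder, and BloomType
// lived in StoreFile.BloomType back then (it moved to its own class later).
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.regionserver.StoreFile;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadHints {
  public static void main(String[] args) throws Exception {
    // Row-level Bloom filters let the server rule out store files per lookup;
    // pass this descriptor to HBaseAdmin when creating or altering the table.
    HColumnDescriptor family = new HColumnDescriptor(Bytes.toBytes("d"));
    family.setBloomFilterType(StoreFile.BloomType.ROW);

    // A time range lets the server skip store files whose data falls entirely
    // outside the requested window - a metadata check, no disk read required.
    long now = System.currentTimeMillis();
    Scan scan = new Scan();
    scan.setTimeRange(now - 3600 * 1000L, now);
    scan.addFamily(Bytes.toBytes("d"));
  }
}
```
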
Block Cache
• Use the exported metrics to see the effectiveness of the block cache
  ‣ Check fill and eviction rates, as well as hit ratios ➜ random reads are not ideal
• Tweak it up or down as needed, but watch overall heap usage (see the sketch below)
• You absolutely need the block cache
  ‣ Set it to at least 10% for short-term benefits
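
A rough back-of-envelope for what the cache can actually hold, assuming a 10GB heap, a 20% cache share, and the default 64KB HFile block size (all three are configurable):

```java
// Back-of-envelope sketch: how many HFile blocks fit into the block cache.
// Assumes the default 64KB block size and a 20% cache share.
public class BlockCacheBudget {
  public static void main(String[] args) {
    long heapBytes = 10L << 30;          // -Xmx10g
    double cacheShare = 0.20;            // hfile.block.cache.size
    long blockSize = 64L << 10;          // 64KB HFile blocks (default)

    long cacheBytes = (long) (heapBytes * cacheShare);
    System.out.printf("block cache: %d MB, roughly %d blocks%n",
        cacheBytes >> 20, cacheBytes / blockSize);
    // With purely random reads over a working set much larger than ~2GB,
    // most lookups miss and evict - which is what the hit-ratio metrics show.
  }
}
```
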
Writes
• The cluster size is often determined by the write performance
• Log-structured merge trees (as used by HBase) work as follows - see the toy sketch below:
  ‣ Store mutations in the in-memory store and the write-ahead log
  ‣ Flush out aggregated, sorted maps at a specified threshold - or - when under pressure
  ‣ Discard logs with no pending edits
  ‣ Perform regular compactions of store files

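
The following is a toy illustration of that write path, not HBase code: a sorted in-memory map plus an append-only log, flushed into immutable sorted files once a threshold is reached.

```java
// Toy illustration (not HBase code) of the log-structured merge write path
// described above: every mutation goes to an in-memory sorted map and an
// append-only log; once the map exceeds a threshold it is flushed as a
// sorted file and the log entries it covered can be discarded.
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ToyLsm {
  private final TreeMap<String, String> memstore = new TreeMap<>(); // sorted in memory
  private final List<String> wal = new ArrayList<>();               // append-only log
  private final List<TreeMap<String, String>> storeFiles = new ArrayList<>();
  private final int flushThreshold;

  public ToyLsm(int flushThreshold) { this.flushThreshold = flushThreshold; }

  public void put(String key, String value) {
    wal.add(key + "=" + value);     // 1. append to the write-ahead log (durability)
    memstore.put(key, value);       // 2. apply to the sorted in-memory store
    if (memstore.size() >= flushThreshold) {
      flush();                      // 3. flush when the threshold is reached
    }
  }

  private void flush() {
    storeFiles.add(new TreeMap<>(memstore)); // write an immutable, sorted "store file"
    memstore.clear();
    wal.clear();                             // edits are persisted, log can be discarded
    // A real system would now consider compacting storeFiles into fewer, larger files.
  }
}
```
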
Write Performance
• There are many factors in the overall write performance of a cluster
  ‣ Key distribution ➜ Avoid region hotspots (see the salting sketch below)
  ‣ Handlers ➜ Do not pile up too early
  ‣ Write-ahead log ➜ Bottleneck #1
  ‣ Compactions ➜ Badly tuned, they cause ever-increasing background noise

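
One common remedy for hotspots caused by monotonically increasing keys (e.g. timestamps) is salting the row key. The slides do not prescribe this, so treat the bucket count and key layout below as illustrative assumptions only:

```java
// Illustrative sketch of key salting to spread monotonically increasing keys
// (e.g. timestamps) across regions. The bucket count and key layout are
// arbitrary choices for this example, not an HBase API.
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKeys {
  private static final int BUCKETS = 16; // roughly match the number of regions

  public static byte[] saltedRowKey(String userId, long timestamp) {
    int bucket = (userId.hashCode() & 0x7fffffff) % BUCKETS;
    // "<bucket>|<userId>|<timestamp>" - reads for one user only need to look at
    // a single bucket, since the bucket is derived from the user id.
    return Bytes.toBytes(String.format("%02d|%s|%d", bucket, userId, timestamp));
  }
}
```
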
Write-Ahead Log
• Currently there is only one per region server
  ‣ Shared across all stores (i.e. column families)
  ‣ Synchronized on file append calls
• Work is being done on mitigating this
  ‣ WAL compression (see the config sketch below)
  ‣ Multiple WALs per region server ➜ Start more than one region server per node?

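
A sketch of enabling WAL compression, assuming the 0.94-era setting hbase.regionserver.wal.enablecompression (HBASE-4608); verify the key against your version, and note that it belongs in the region servers' hbase-site.xml:

```java
// Sketch: turning on WAL compression, assuming the 0.94-era setting
// hbase.regionserver.wal.enablecompression (added by HBASE-4608); verify the
// key against your HBase version before relying on it.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WalCompression {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    conf.setBoolean("hbase.regionserver.wal.enablecompression", true);
    // This setting must live in hbase-site.xml on the region servers; setting
    // it on the client side only, as done here, has no effect on the servers.
  }
}
```
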
Write-Ahead Log (cont.)
• Size is set to 95% of the default block size
  ‣ 64MB or 128MB, but check your config!
• Keep the number of logs low to reduce recovery time
  ‣ The limit is set to 32, but can be increased
• Increase the size of the logs - and/or - increase the number of logs before blocking
• Compute the number based on fill distribution and flush frequencies (see the sketch below)

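
One way to frame that computation: the retained WALs should together cover at least the global memstore budget, otherwise regions are force-flushed early. A sketch using the deck's example numbers (128MB blocks, 0.95 roll multiplier, 10GB heap, 40% memstore share):

```java
// Sketch of the log-count arithmetic: the retained WALs should cover at least
// the global memstore budget, otherwise regions get force-flushed prematurely.
public class WalBudget {
  public static void main(String[] args) {
    long walSize = (long) ((128L << 20) * 0.95); // 95% of a 128MB HDFS block, ~121MB
    long heapBytes = 10L << 30;                  // -Xmx10g
    double memstoreShare = 0.4;                  // global memstore limit
    long memstoreBudget = (long) (heapBytes * memstoreShare);

    double logsNeeded = memstoreBudget / (double) walSize;
    System.out.printf("WAL size ~%d MB; ~%.1f logs cover the %d MB memstore budget%n",
        walSize >> 20, logsNeeded, memstoreBudget >> 20);
    // Compare with hbase.regionserver.maxlogs (default 32): if that limit is
    // lower than the number printed here, old logs will force early flushes.
  }
}
```
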
Write-Ahead Log (cont.)
• Writes are synchronized across all stores
  ‣ A large cell in one family can stall all writes of another
  ‣ In this case the RPC handlers go binary, i.e. all work or all block
• The WAL can be bypassed on writes, but that means no real durability and no replication (see the sketch below)
  ‣ Maybe use a coprocessor to restore dependent data sets (preWALRestore)
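
A sketch of bypassing the WAL for a single put, assuming the 0.94-era client API (later versions use put.setDurability(Durability.SKIP_WAL) instead); the table and column names are placeholders:

```java
// Sketch: skipping the WAL for a single put, assuming the 0.94-era client API.
// Either way the trade-off is the same: no durability, no replication.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SkipWalPut {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "metrics"); // "metrics" is a placeholder table name
    try {
      Put put = new Put(Bytes.toBytes("row-1"));
      put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("value"));
      put.setWriteToWAL(false); // edit only lives in the memstore until flushed
      table.put(put);
    } finally {
      table.close();
    }
  }
}
```
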
Flushes

• Every mutation call (put, delete etc.) causes a check for a flush
• If the threshold is met, the memstore is flushed to disk and a compaction is scheduled
  ‣ Try to compact newly flushed files quickly
• The compaction returns - if necessary - the point at which a region should be split


Compaction Storms
• Premature flushing happens because of the number of logs or memory pressure
  ‣ Files will be smaller than the configured flush size
• The background compactions are then hard at work merging small flush files into the existing, larger store files
  ‣ Rewriting hundreds of MB over and over (see the tuning sketch below)

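
The knobs involved are sketched below, assuming 0.92/0.94-era key names and defaults; verify them against your version's hbase-default.xml before changing anything:

```java
// Sketch of the settings involved in flush/compaction pressure, assuming
// 0.92/0.94-era key names (check your version's hbase-default.xml).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionTuning {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Flush size per region (all stores combined)
    conf.setLong("hbase.hregion.memstore.flush.size", 128L << 20);
    // Minimum number of store files before a minor compaction is considered
    conf.setInt("hbase.hstore.compactionThreshold", 3);
    // Writes to a store block once it has accumulated this many store files
    conf.setInt("hbase.hstore.blockingStoreFiles", 7);
    // Memstore may grow to multiplier x flush size before writes block
    conf.setInt("hbase.hregion.memstore.block.multiplier", 2);
    // Number of WALs kept before the oldest regions are force-flushed
    conf.setInt("hbase.regionserver.maxlogs", 32);
  }
}
```
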
Dependencies

• Flushes happen across all stores/column families, even if just one triggers it
• The flush size is compared to the size of all stores combined
  ‣ Many column families dilute the size
  ‣ Example: 55MB + 5MB + 4MB (see the arithmetic below)


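
A small sketch of that dilution effect, assuming a 64MB flush size (which is what the 55MB + 5MB + 4MB example implies):

```java
// Sketch of the dilution effect from the example above: the flush size applies
// to the sum of all stores in a region, so the two small families ride along
// with the large one and produce tiny store files of their own.
public class FlushDilution {
  public static void main(String[] args) {
    long flushSize = 64L << 20;                           // assumed 64MB flush size
    long[] storeSizes = {55L << 20, 5L << 20, 4L << 20};  // 55MB + 5MB + 4MB

    long total = 0;
    for (long s : storeSizes) total += s;
    if (total >= flushSize) {            // the sum crosses the threshold
      System.out.println("flush all stores:");
      for (long s : storeSizes) {
        // the 5MB and 4MB families end up as very small files on disk
        System.out.printf("  store file of %d MB%n", s >> 20);
      }
    }
  }
}
```
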
Some Numbers
• Typical write performance of HDFS is 35-50MB/s

    Cell Size        OPS
    0.5MB            70-100
    100KB            350-500
    10KB             3500-5000 ??
    1KB              35000-50000 ????

  This is way too high in practice - contention! (see the arithmetic below)

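
The table is simply the raw HDFS throughput divided by the cell size, which is why the small-cell rows carry question marks: per-operation overhead and contention dominate long before the disks do. A sketch of the arithmetic:

```java
// Sketch of where the table above comes from: operations per second is the
// raw HDFS write throughput divided by the cell size.
public class WriteOps {
  public static void main(String[] args) {
    double[] throughputMBs = {35, 50};                            // typical HDFS write rate
    long[] cellSizes = {512 * 1024, 100 * 1024, 10 * 1024, 1024}; // 0.5MB .. 1KB

    for (long cell : cellSizes) {
      double low = throughputMBs[0] * 1024 * 1024 / cell;
      double high = throughputMBs[1] * 1024 * 1024 / cell;
      System.out.printf("%7d bytes/cell: %7.0f - %7.0f ops/s%n", cell, low, high);
    }
  }
}
```
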
Some More Numbers
• Under real-world conditions the rate is lower, more like 15MB/s or less
  ‣ Thread contention is the cause of the massive slowdown

    Cell Size        OPS
    0.5MB            10
    100KB            100
    10KB             800
    1KB              6000


Notes
• Compute memstore sizes based on number of regions x flush size
• Compute number of logs to keep based on fill and flush rate
• Ultimately the capacity is driven by
  ‣ Java Heap
  ‣ Region Count and Size
  ‣ Key Distribution
Cheat Sheet #1

• Ensure you have enough, or large enough, write-ahead logs
• Ensure you do not oversubscribe the available memstore space
• Ensure the flush size is set large enough, but not too large
• Check write-ahead log usage carefully
Cheat Sheet #2
• Enable compression to store more data per node (see the sketch below)
• Tweak the compaction algorithm to peg background I/O at some level
• Consider putting uneven column families in separate tables
• Check the metrics carefully for block cache, memstore, and all queues

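
A sketch of enabling compression on a column family, assuming the 0.92/0.94-era API (Compression.Algorithm later moved to the org.apache.hadoop.hbase.io.compress package) and Snappy as an example codec; the native libraries must be present on every region server:

```java
// Sketch: enabling Snappy compression on a column family, assuming the
// 0.92/0.94-era API. Table and family names are placeholders.
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.io.hfile.Compression;
import org.apache.hadoop.hbase.util.Bytes;

public class CompressedFamily {
  public static void main(String[] args) {
    HColumnDescriptor family = new HColumnDescriptor(Bytes.toBytes("d"));
    family.setCompressionType(Compression.Algorithm.SNAPPY);

    HTableDescriptor table = new HTableDescriptor("metrics"); // placeholder table name
    table.addFamily(family);
    // Pass the descriptor to HBaseAdmin.createTable(...) to create the table.
  }
}
```
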
Example
• Java Xmx heap at 10GB
• Memstore share at 40% (default)
  ‣ 10GB heap x 0.4 = 4GB
• Desired flush size at 128MB
  ‣ 4GB / 128MB = 32 regions max!
• For a WAL size of 128MB x 0.95
  ‣ 4GB / (128MB x 0.95) = ~33 partially uncommitted logs to keep around
• Region size at 20GB
  ‣ 20GB x 32 regions = 640GB raw storage used (see the sketch below)

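
A sketch reproducing the example's arithmetic end to end; the inputs mirror the slide and are assumptions, not recommendations:

```java
// Sketch reproducing the sizing example above: heap and memstore share bound
// the number of regions a server can carry at the desired flush size, which in
// turn bounds the raw storage per node at a given region size.
public class SizingExample {
  public static void main(String[] args) {
    long heap = 10L << 30;                       // -Xmx10g
    double memstoreShare = 0.4;                  // default global memstore limit
    long flushSize = 128L << 20;                 // desired flush size
    long walSize = (long) ((128L << 20) * 0.95); // 95% of a 128MB block
    long regionSize = 20L << 30;                 // target region size

    long memstoreBudget = (long) (heap * memstoreShare);    // 4GB
    long maxRegions = memstoreBudget / flushSize;           // 32 regions
    double walCount = memstoreBudget / (double) walSize;    // ~33 logs
    long rawStorage = maxRegions * regionSize;              // 640GB

    System.out.printf("memstore budget : %d GB%n", memstoreBudget >> 30);
    System.out.printf("max regions     : %d%n", maxRegions);
    System.out.printf("WALs to keep    : ~%.1f%n", walCount);
    System.out.printf("raw storage/node: %d GB%n", rawStorage >> 30);
  }
}
```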
