SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
Growing Your Inbox:
                      HBase at Tumblr

     HBaseCon 2012

Friday, May 11, 12
Monthly page views


               1 8 , 0 0 0, 0 0 0, 0 0 0
                                600M page views per day

                               60-80M new posts per day

                     Peak rate of 50k requests & 2k posts per second

                        Totals: 22B posts, 53M blogs, 45M users

                     23 in Engineering (7 ops, 4 net/dc, 4 SRE, 8 dev)

                       70% of traffic is destined for the dashboard

     HBaseCon 2012

Friday, May 11, 12
Posts and followers drive growth



                     Posts


                Average
               Followers




     HBaseCon 2012

Friday, May 11, 12
Traffic growth
                     Page
                     Views

                              Fun Month   Dashboard
                               Begins      Creaking




                     People




     HBaseCon 2012

Friday, May 11, 12
One small problem
               The dashboard is assembled using a time ordered
               essentially unpartitioned MySQL database without
               caching. Sharding and caching aren’t reasonable
               approaches. Write concurrency will soon introduce slave
               lag, rendering the dashboard unusable.




     HBaseCon 2012

Friday, May 11, 12
Motherboy Motivations and Goals
            Motivations                                       Goals
              ✦      Awesome Arrested Development reference   ✦   Durability and availability

              ✦      The dashboard is going to die            ✦   Multi-datacenter tenancy

              ✦      Time ordered MySQL sharding == bad       ✦   Features (read/unread, tags, etc)

              ✦      Inconsistent dashboard experience




     HBaseCon 2012

Friday, May 11, 12
Challenges
              ✦      Response time < 100ms

              ✦      Data volume

              ✦      Active legacy PHP code base

              ✦      Hardware provisioning, failures, etc.

              ✦      Small team, not much in house JVM/Hadoop experience




     HBaseCon 2012

Friday, May 11, 12
One small problem
            Assumptions                                                 Data Set Growth
              ✦      60M posts per day                                             Rows          Size
                                                                        Day        30 Billion    447 GB
              ✦      500 followers on average
                                                                        Week       210 Billion   3.1 TB
              ✦      40 bytes per row, 24 byte row key                  Month      840 Billion   12.2 TB
              ✦      Compression factor of 0.4                          Year       10 Trillion   146 TB

              ✦      Replication factor of 3.5 (3 plus scratch space)   Data Set Growth (Replicated)
                                                                                   Size
                                                                        Day        1.5 TB
                                                                        Week       10.7 TB
                                                                        Month      42.8 TB
                                                                        Year       513 TB

     HBaseCon 2012

Friday, May 11, 12
Motherboy Architecture
            Goals                                                       Failure Handling
              ✦      Dashboard posts are persistent for each follower   ✦   Reads can be fulfilled from any store
                     (their inbox)
                                                                        ✦   Writes can catch up when a store is back online
              ✦      Writes are asynchronous and distributed

              ✦      Users home to an inbox

              ✦      Inboxes are eventually consistent

              ✦      Selective materialization of feeds




     HBaseCon 2012

Friday, May 11, 12
Motherboy V0
            Implementation
              ✦      Goal: Get our feet wet!

              ✦      Single 6 node HBase cluster

              ✦      Writes to Motherboy via Finagle/Thrift RPC service

              ✦      User homed to Motherboy cluster




     HBaseCon 2012

Friday, May 11, 12
HBaseCon 2012

Friday, May 11, 12
Motherboy V0 learnings
              ✦      Conclusion: We needed more experience

              ✦      We had no clue what we were doing

              ✦      Poor automation ➜ Hard to recover in event of failure

              ✦      No test infrastructure ➜ Hard to evaluate effects of cluster performance tuning

              ✦      Very little insight into environment




     HBaseCon 2012

Friday, May 11, 12
Sidebar: OpenTSDB
              ✦      Goal: Get some hands on experience running a production cluster

              ✦      Build out fully automated cluster provisioning

              ✦      Build out a cluster monitoring infrastructure

              ✦      Build out cluster trending/visualization




     HBaseCon 2012

Friday, May 11, 12
Automated provisioning




     HBaseCon 2012

Friday, May 11, 12
Automated monitoring




     HBaseCon 2012

Friday, May 11, 12
Automated monitoring




     HBaseCon 2012

Friday, May 11, 12
OpenTSDB: Cluster 0
              ✦      10k data points per second, 500k at peak

              ✦      1200 servers reporting via collectd

              ✦      Application stats via collectd

              ✦      864M data points per day

              ✦      10 node cluster

              ✦      Making progress




     HBaseCon 2012

Friday, May 11, 12
Motherboy V1: Try again
            Implementation
              ✦      Goal: Into testing!

              ✦      Single 10 node HBase cluster

              ✦      No writes directly to Motherboy, consumed via Firehose

              ✦      Enable reads via feature flag for engineers

              ✦      Ignore discovery mechanism for now




     HBaseCon 2012

Friday, May 11, 12
HBaseCon 2012

Friday, May 11, 12
Motherboy V1 learnings
            The black-art of tuning                                 Challenges
              ✦      Disable major compactions                      ✦   Region sizing (# regions & size) is crucial and
                                                                        workload dependent
              ✦      Disable auto-splits, self manage
                                                                    ✦   YCSB was valuable for load testing but required
              ✦      regionserver.handler.count                         extensive modification
              ✦      hregion.max.filesize                           ✦   Too hard to test changes, versions, etc.
              ✦      hregion.memstore.mslab.enabled                 ✦   Failure recovery is manual and time consuming
              ✦      block.cache.size

              ✦      Table design is super important, a few bytes
                     matter



     HBaseCon 2012

Friday, May 11, 12
Sidebar:
            Cluster testing
              ✦      Goal: Make it easy!

              ✦      Fixture data

              ✦      Load test tools

              ✦      Historical data, look for regressions

              ✦      Create a baseline!

              ✦      Test different versions, patches,
                     schema, compression, split methods,
                     configurations




     HBaseCon 2012

Friday, May 11, 12
Testing setup
            Overview                                                  Motherboy Testing
              ✦      Standard test cluster: 6 RS/DN, 1HM, 1NN, 3ZK    ✦   60M posts, 24GB LZO compressed

              ✦      Drive load via separate 23 node Hadoop cluster   ✦   51 mappers

              ✦      Test dataset created for each application        ✦   Map posts into tuple of Long’s - 24 byte row key
                                                                          and 16 bytes of data in single CF
              ✦      Results recorded and annotated in OpenTSDB
                                                                      ✦   Write to HBase cluster as fast as possible




     HBaseCon 2012

Friday, May 11, 12
Sample test results
            Scenario                       Test Time              Posts per Second Avg. Request Time Avg. CPU Load
            Baseline                       2:49:46                5801.4           3.7ms                15%
            Handler Count = 100            2:39:57                6157.5           3.4ms                12%
            Disabled Auto Flush            1:58:53                8284.5           3.4ms                7%
            Pre-Created Regions            30:08                  32684.3          2.2ms                3%

              Conclusions
                                                                               Tests must be designed for intended workload
                     ✦   More region servers, better throughput

                     ✦   Less round-trips, better throughput

                     ✦   Not earth shattering

                     ✦   If experimenting is easy, people will do it

     HBaseCon 2012

Friday, May 11, 12
Motherboy V2: In progress
            Implementation
              ✦      Goal: Into production!

              ✦      Multiple 10 node HBase clusters

              ✦      Writes/reads mediated by reactor layer

              ✦      Supervisor managed migrations and region splits




     HBaseCon 2012

Friday, May 11, 12
HBaseCon 2012

Friday, May 11, 12
Lessons learned
              1. The value of automation can’t be understated

              2. Rolling upgrades and deploys are easy, but refer to #1

              3. Technology choice does play a role in how easily you can scale

              4. At scale, nothing works as advertised

              5. Operational tooling is the single most important thing you can do up front




     HBaseCon 2012

Friday, May 11, 12
Questions?
                     Blake Matheny: bmatheny@tumblr.com




     HBaseCon 2012

Friday, May 11, 12

Contenu connexe

En vedette

HBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on FlashHBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on FlashCloudera, Inc.
 
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!Cloudera, Inc.
 
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon
 
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUponHBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUponCloudera, Inc.
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera FieldHBaseCon
 
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics Cloudera, Inc.
 
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...Cloudera, Inc.
 
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...Cloudera, Inc.
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.Cloudera, Inc.
 
HBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseCloudera, Inc.
 
HBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsCloudera, Inc.
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseCloudera, Inc.
 
HBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 MinutesHBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 MinutesCloudera, Inc.
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCCloudera, Inc.
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataCloudera, Inc.
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestCloudera, Inc.
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to ContributeHBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to ContributeHBaseCon
 
Data Evolution in HBase
Data Evolution in HBaseData Evolution in HBase
Data Evolution in HBaseHBaseCon
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseCloudera, Inc.
 
HBaseCon 2015: Just the Basics
HBaseCon 2015: Just the BasicsHBaseCon 2015: Just the Basics
HBaseCon 2015: Just the BasicsHBaseCon
 

En vedette (20)

HBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on FlashHBaseCon 2013: Apache HBase on Flash
HBaseCon 2013: Apache HBase on Flash
 
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
HBaseCon 2012 | Relaxed Transactions for HBase - Francis Liu, Yahoo!
 
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBaseHBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
HBaseCon 2015: Trafodion - Integrating Operational SQL into HBase
 
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUponHBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
HBaseCon 2012 | Unique Sets on HBase and Hadoop - Elliot Clark, StumbleUpon
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera Field
 
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
HBaseCon 2013: Apache Hadoop and Apache HBase for Real-Time Video Analytics
 
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
HBaseCon 2012 | Content Addressable Storages for Fun and Profit - Berk Demir,...
 
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
 
HBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBaseHBaseCon 2012 | Building Mobile Infrastructure with HBase
HBaseCon 2012 | Building Mobile Infrastructure with HBase
 
HBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three ActsHBaseCon 2012 | Scaling GIS In Three Acts
HBaseCon 2012 | Scaling GIS In Three Acts
 
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBaseHBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
HBaseCon 2013: Project Valta - A Resource Management Layer over Apache HBase
 
HBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 MinutesHBaseCon 2013: 1500 JIRAs in 20 Minutes
HBaseCon 2013: 1500 JIRAs in 20 Minutes
 
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLCHBaseCon 2012 | HBase for the Worlds Libraries - OCLC
HBaseCon 2012 | HBase for the Worlds Libraries - OCLC
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
 
HBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at PinterestHBaseCon 2013: Apache HBase Operations at Pinterest
HBaseCon 2013: Apache HBase Operations at Pinterest
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to ContributeHBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to Contribute
 
Data Evolution in HBase
Data Evolution in HBaseData Evolution in HBase
Data Evolution in HBase
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBase
 
HBaseCon 2015: Just the Basics
HBaseCon 2015: Just the BasicsHBaseCon 2015: Just the Basics
HBaseCon 2015: Just the Basics
 

Plus de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Plus de Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Dernier

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 

Dernier (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 

HBaseCon 2012 | Growing Your Inbox, HBase at Tumblr

  • 1. Growing Your Inbox: HBase at Tumblr HBaseCon 2012 Friday, May 11, 12
  • 2. Monthly page views 1 8 , 0 0 0, 0 0 0, 0 0 0 600M page views per day 60-80M new posts per day Peak rate of 50k requests & 2k posts per second Totals: 22B posts, 53M blogs, 45M users 23 in Engineering (7 ops, 4 net/dc, 4 SRE, 8 dev) 70% of traffic is destined for the dashboard HBaseCon 2012 Friday, May 11, 12
  • 3. Posts and followers drive growth Posts Average Followers HBaseCon 2012 Friday, May 11, 12
  • 4. Traffic growth Page Views Fun Month Dashboard Begins Creaking People HBaseCon 2012 Friday, May 11, 12
  • 5. One small problem The dashboard is assembled using a time ordered essentially unpartitioned MySQL database without caching. Sharding and caching aren’t reasonable approaches. Write concurrency will soon introduce slave lag, rendering the dashboard unusable. HBaseCon 2012 Friday, May 11, 12
  • 6. Motherboy Motivations and Goals Motivations Goals ✦ Awesome Arrested Development reference ✦ Durability and availability ✦ The dashboard is going to die ✦ Multi-datacenter tenancy ✦ Time ordered MySQL sharding == bad ✦ Features (read/unread, tags, etc) ✦ Inconsistent dashboard experience HBaseCon 2012 Friday, May 11, 12
  • 7. Challenges ✦ Response time < 100ms ✦ Data volume ✦ Active legacy PHP code base ✦ Hardware provisioning, failures, etc. ✦ Small team, not much in house JVM/Hadoop experience HBaseCon 2012 Friday, May 11, 12
  • 8. One small problem Assumptions Data Set Growth ✦ 60M posts per day Rows Size Day 30 Billion 447 GB ✦ 500 followers on average Week 210 Billion 3.1 TB ✦ 40 bytes per row, 24 byte row key Month 840 Billion 12.2 TB ✦ Compression factor of 0.4 Year 10 Trillion 146 TB ✦ Replication factor of 3.5 (3 plus scratch space) Data Set Growth (Replicated) Size Day 1.5 TB Week 10.7 TB Month 42.8 TB Year 513 TB HBaseCon 2012 Friday, May 11, 12
  • 9. Motherboy Architecture Goals Failure Handling ✦ Dashboard posts are persistent for each follower ✦ Reads can be fulfilled from any store (their inbox) ✦ Writes can catch up when a store is back online ✦ Writes are asynchronous and distributed ✦ Users home to an inbox ✦ Inboxes are eventually consistent ✦ Selective materialization of feeds HBaseCon 2012 Friday, May 11, 12
  • 10. Motherboy V0 Implementation ✦ Goal: Get our feet wet! ✦ Single 6 node HBase cluster ✦ Writes to Motherboy via Finagle/Thrift RPC service ✦ User homed to Motherboy cluster HBaseCon 2012 Friday, May 11, 12
  • 12. Motherboy V0 learnings ✦ Conclusion: We needed more experience ✦ We had no clue what we were doing ✦ Poor automation ➜ Hard to recover in event of failure ✦ No test infrastructure ➜ Hard to evaluate effects of cluster performance tuning ✦ Very little insight into environment HBaseCon 2012 Friday, May 11, 12
  • 13. Sidebar: OpenTSDB ✦ Goal: Get some hands on experience running a production cluster ✦ Build out fully automated cluster provisioning ✦ Build out a cluster monitoring infrastructure ✦ Build out cluster trending/visualization HBaseCon 2012 Friday, May 11, 12
  • 14. Automated provisioning HBaseCon 2012 Friday, May 11, 12
  • 15. Automated monitoring HBaseCon 2012 Friday, May 11, 12
  • 16. Automated monitoring HBaseCon 2012 Friday, May 11, 12
  • 17. OpenTSDB: Cluster 0 ✦ 10k data points per second, 500k at peak ✦ 1200 servers reporting via collectd ✦ Application stats via collectd ✦ 864M data points per day ✦ 10 node cluster ✦ Making progress HBaseCon 2012 Friday, May 11, 12
  • 18. Motherboy V1: Try again Implementation ✦ Goal: Into testing! ✦ Single 10 node HBase cluster ✦ No writes directly to Motherboy, consumed via Firehose ✦ Enable reads via feature flag for engineers ✦ Ignore discovery mechanism for now HBaseCon 2012 Friday, May 11, 12
  • 20. Motherboy V1 learnings The black-art of tuning Challenges ✦ Disable major compactions ✦ Region sizing (# regions & size) is crucial and workload dependent ✦ Disable auto-splits, self manage ✦ YCSB was valuable for load testing but required ✦ regionserver.handler.count extensive modification ✦ hregion.max.filesize ✦ Too hard to test changes, versions, etc. ✦ hregion.memstore.mslab.enabled ✦ Failure recovery is manual and time consuming ✦ block.cache.size ✦ Table design is super important, a few bytes matter HBaseCon 2012 Friday, May 11, 12
  • 21. Sidebar: Cluster testing ✦ Goal: Make it easy! ✦ Fixture data ✦ Load test tools ✦ Historical data, look for regressions ✦ Create a baseline! ✦ Test different versions, patches, schema, compression, split methods, configurations HBaseCon 2012 Friday, May 11, 12
  • 22. Testing setup Overview Motherboy Testing ✦ Standard test cluster: 6 RS/DN, 1HM, 1NN, 3ZK ✦ 60M posts, 24GB LZO compressed ✦ Drive load via separate 23 node Hadoop cluster ✦ 51 mappers ✦ Test dataset created for each application ✦ Map posts into tuple of Long’s - 24 byte row key and 16 bytes of data in single CF ✦ Results recorded and annotated in OpenTSDB ✦ Write to HBase cluster as fast as possible HBaseCon 2012 Friday, May 11, 12
  • 23. Sample test results Scenario Test Time Posts per Second Avg. Request Time Avg. CPU Load Baseline 2:49:46 5801.4 3.7ms 15% Handler Count = 100 2:39:57 6157.5 3.4ms 12% Disabled Auto Flush 1:58:53 8284.5 3.4ms 7% Pre-Created Regions 30:08 32684.3 2.2ms 3% Conclusions Tests must be designed for intended workload ✦ More region servers, better throughput ✦ Less round-trips, better throughput ✦ Not earth shattering ✦ If experimenting is easy, people will do it HBaseCon 2012 Friday, May 11, 12
  • 24. Motherboy V2: In progress Implementation ✦ Goal: Into production! ✦ Multiple 10 node HBase clusters ✦ Writes/reads mediated by reactor layer ✦ Supervisor managed migrations and region splits HBaseCon 2012 Friday, May 11, 12
  • 26. Lessons learned 1. The value of automation can’t be understated 2. Rolling upgrades and deploys are easy, but refer to #1 3. Technology choice does play a role in how easily you can scale 4. At scale, nothing works as advertised 5. Operational tooling is the single most important thing you can do up front HBaseCon 2012 Friday, May 11, 12
  • 27. Questions? Blake Matheny: bmatheny@tumblr.com HBaseCon 2012 Friday, May 11, 12