SlideShare une entreprise Scribd logo
1  sur  15
Télécharger pour lire hors ligne
HBase
           Introduction to
           column oriented
           databases




        Luís Cipriani
        @lfcipriani (twitter, linkedin, github, ...)
        22o. GURU (2012-02-25) - Sao Paulo/Brazil

sexta-feira, 24 de fevereiro de 12
ME




sexta-feira, 24 de fevereiro de 12
intro




                                     “A BigTable HBase is a sparse,
                                         distributed, persistent
                                     multidimensional sorted map”




                                                     http://research.google.com/archive/bigtable.html
sexta-feira, 24 de fevereiro de 12
intro > data model
                           {                           <-- table
                                // ...
                                "aaaaa" : {            <--   row
                                   "A" : {             <--   column family
                                      "foo" : {        <--   column (qualifier)
                                        15: "y",       <--   timestamp, value
                                         4: "m"
                                      }
                                      "bar" : {...}
                                   },
                                   "B" : {
                                      "" : {...}
                                   }
                                },
                                "aaaab" : {
                                   "A" : {
                                      "foo" : {...},
                                      "bar" : {...},
                                      "joe" : {...}
                                   },
                                   "B" : {
                                      "" : {...}
                                   }
                                },
                                // ...
                           }
sexta-feira, 24 de fevereiro de 12
intro > data model




      (Table, RowKey, Family, Column, Timestamp) → Value




sexta-feira, 24 de fevereiro de 12
intro > hadoop stack




                          • hadoop HDFS (or not)
                          • hadoop MapReduce
                          • hadoop ZooKeeper
                          • hadoop HBase
                          • hadoop Hue, Whirr, etc...



sexta-feira, 24 de fevereiro de 12
architecture




sexta-feira, 24 de fevereiro de 12
key design > read/write model




               • randon reads (get)
               • sequential reads (scan)
                 • partial key scans
               • writes (put = update)



sexta-feira, 24 de fevereiro de 12
key design > storage model




                                     http://ofps.oreilly.com/titles/9781449396107/advanced.html
sexta-feira, 24 de fevereiro de 12
key design > strategies


                                 • tall-narrow vs flat-wide
                                 • partial key scans
                                 • pagination
                                 • time series
                                    • salting
                                    • field swap
                                    • randomization
                                 • secondary indexes

sexta-feira, 24 de fevereiro de 12
key design > example




sexta-feira, 24 de fevereiro de 12
development




            • installation modes
               • standalone, pseudo-distributed, distributed
            • JRuby console
            • Access
               • java/jruby API (more features)
               • entrypoints REST, Thrift, Avro, Protobuffers
               • there several other libs


sexta-feira, 24 de fevereiro de 12
cons




            • complex config and maintenance
            • hot regions
            • no secondary index built-in
            • no transactions built-in
            • complex schema design




sexta-feira, 24 de fevereiro de 12
pros




               • distributed
               • scalable (auto-sharding)
               • built on Hadoop stack
               • handles Big Data
               • high performance for write and read
               • no SPOF
               • fault tolerant, no data loss
               • active community



sexta-feira, 24 de fevereiro de 12
Reformulação Box de Login                                                       Abril ID
   http://engineering.abril.com.br/
   http://abr.io/hbase-intro
   https://pinboard.in/u:lfcipriani/t:hbase/
   http://hbase.apache.org/




                                     ?   http://shop.oreilly.com/product/0636920014348.do




sexta-feira, 24 de fevereiro de 12

Contenu connexe

Similaire à Hbase: Introduction to column oriented databases

Hive introduction 介绍
Hive  introduction 介绍Hive  introduction 介绍
Hive introduction 介绍ablozhou
 
Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars GeorgeJAX London
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
Brust hadoopecosystem
Brust hadoopecosystemBrust hadoopecosystem
Brust hadoopecosystemAndrew Brust
 
Generalized framework for using NoSQL Databases
Generalized framework for using NoSQL DatabasesGeneralized framework for using NoSQL Databases
Generalized framework for using NoSQL DatabasesKIRAN V
 
Hadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよHadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよNaoki Yanai
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) BigDataEverywhere
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtMichael Stack
 
HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017larsgeorge
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureRyan Hennig
 

Similaire à Hbase: Introduction to column oriented databases (14)

Hive introduction 介绍
Hive  introduction 介绍Hive  introduction 介绍
Hive introduction 介绍
 
Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars George
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Brust hadoopecosystem
Brust hadoopecosystemBrust hadoopecosystem
Brust hadoopecosystem
 
Generalized framework for using NoSQL Databases
Generalized framework for using NoSQL DatabasesGeneralized framework for using NoSQL Databases
Generalized framework for using NoSQL Databases
 
Hadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよHadoop/Mahout/HBaseで テキスト分類器を作ったよ
Hadoop/Mahout/HBaseで テキスト分類器を作ったよ
 
20100128ebay
20100128ebay20100128ebay
20100128ebay
 
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)
 
HBase, no trouble
HBase, no troubleHBase, no trouble
HBase, no trouble
 
Hadoop - Introduction to Hadoop
Hadoop - Introduction to HadoopHadoop - Introduction to Hadoop
Hadoop - Introduction to Hadoop
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
 
HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017
 
MongoDB at GUL
MongoDB at GULMongoDB at GUL
MongoDB at GUL
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and Future
 

Plus de Luis Cipriani

Adventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter APIAdventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter APILuis Cipriani
 
Capturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do TwitterCapturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do TwitterLuis Cipriani
 
Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7Luis Cipriani
 
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosSegurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosLuis Cipriani
 
API Caching, why your server needs some rest
API Caching, why your server needs some restAPI Caching, why your server needs some rest
API Caching, why your server needs some restLuis Cipriani
 
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Luis Cipriani
 
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Luis Cipriani
 
Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilComo um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilLuis Cipriani
 
Explaining Semantic Web
Explaining Semantic WebExplaining Semantic Web
Explaining Semantic WebLuis Cipriani
 
Fearless HTTP requests abuse
Fearless HTTP requests abuseFearless HTTP requests abuse
Fearless HTTP requests abuseLuis Cipriani
 

Plus de Luis Cipriani (10)

Adventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter APIAdventures with Raspberry Pi and Twitter API
Adventures with Raspberry Pi and Twitter API
 
Capturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do TwitterCapturando o pulso do planeta com as APIs de Streaming do Twitter
Capturando o pulso do planeta com as APIs de Streaming do Twitter
 
Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7Twitter e suas APIs de Streaming - Campus Party Brasil 7
Twitter e suas APIs de Streaming - Campus Party Brasil 7
 
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosSegurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados
 
API Caching, why your server needs some rest
API Caching, why your server needs some restAPI Caching, why your server needs some rest
API Caching, why your server needs some rest
 
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
Explaining A Programming Model for Context-Aware Applications in Large-Scale ...
 
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...
 
Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilComo um verdadeiro sistema REST funciona: arquitetura e performance na Abril
Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril
 
Explaining Semantic Web
Explaining Semantic WebExplaining Semantic Web
Explaining Semantic Web
 
Fearless HTTP requests abuse
Fearless HTTP requests abuseFearless HTTP requests abuse
Fearless HTTP requests abuse
 

Dernier

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Hbase: Introduction to column oriented databases

  • 1. HBase Introduction to column oriented databases Luís Cipriani @lfcipriani (twitter, linkedin, github, ...) 22o. GURU (2012-02-25) - Sao Paulo/Brazil sexta-feira, 24 de fevereiro de 12
  • 2. ME sexta-feira, 24 de fevereiro de 12
  • 3. intro “A BigTable HBase is a sparse, distributed, persistent multidimensional sorted map” http://research.google.com/archive/bigtable.html sexta-feira, 24 de fevereiro de 12
  • 4. intro > data model { <-- table // ... "aaaaa" : { <-- row "A" : { <-- column family "foo" : { <-- column (qualifier) 15: "y", <-- timestamp, value 4: "m" } "bar" : {...} }, "B" : { "" : {...} } }, "aaaab" : { "A" : { "foo" : {...}, "bar" : {...}, "joe" : {...} }, "B" : { "" : {...} } }, // ... } sexta-feira, 24 de fevereiro de 12
  • 5. intro > data model (Table, RowKey, Family, Column, Timestamp) → Value sexta-feira, 24 de fevereiro de 12
  • 6. intro > hadoop stack • hadoop HDFS (or not) • hadoop MapReduce • hadoop ZooKeeper • hadoop HBase • hadoop Hue, Whirr, etc... sexta-feira, 24 de fevereiro de 12
  • 8. key design > read/write model • randon reads (get) • sequential reads (scan) • partial key scans • writes (put = update) sexta-feira, 24 de fevereiro de 12
  • 9. key design > storage model http://ofps.oreilly.com/titles/9781449396107/advanced.html sexta-feira, 24 de fevereiro de 12
  • 10. key design > strategies • tall-narrow vs flat-wide • partial key scans • pagination • time series • salting • field swap • randomization • secondary indexes sexta-feira, 24 de fevereiro de 12
  • 11. key design > example sexta-feira, 24 de fevereiro de 12
  • 12. development • installation modes • standalone, pseudo-distributed, distributed • JRuby console • Access • java/jruby API (more features) • entrypoints REST, Thrift, Avro, Protobuffers • there several other libs sexta-feira, 24 de fevereiro de 12
  • 13. cons • complex config and maintenance • hot regions • no secondary index built-in • no transactions built-in • complex schema design sexta-feira, 24 de fevereiro de 12
  • 14. pros • distributed • scalable (auto-sharding) • built on Hadoop stack • handles Big Data • high performance for write and read • no SPOF • fault tolerant, no data loss • active community sexta-feira, 24 de fevereiro de 12
  • 15. Reformulação Box de Login Abril ID http://engineering.abril.com.br/ http://abr.io/hbase-intro https://pinboard.in/u:lfcipriani/t:hbase/ http://hbase.apache.org/ ? http://shop.oreilly.com/product/0636920014348.do sexta-feira, 24 de fevereiro de 12