Felipe Ferreira
Getting to Know Apache HBase
Natural Partner for Innovation
felipe.ferreira@indt.org.br
What Is HBase?
• NoSQL datastore built on top of HDFS (Hadoop)
• An Apache Top Level Project
• The goal is to host very large tables (billions of rows × millions of columns)
• Based on Google’s Bigtable paper
Why Use HBase?
• Storing large amounts of data (TB/PB)
• High throughput for a large number of requests
• Storing unstructured or variable column data
• Big Data with random reads and writes
When to Consider Not Using HBase?
• Your problem is not a Big Data problem (only use HBase with Big Data problems)
• You read straight through files
• You write all at once or append new files
– That is, no random reads or writes
• The access patterns of the data are ill-defined
HBase in production
• More complete list at http://wiki.apache.org/hadoop/Hbase/PoweredBy
HBase Architecture – How It Works
Meet the Daemons
• HBase Master
• RegionServer
• ZooKeeper
• HDFS
– NameNode/Standby NameNode
– DataNode
Daemon Locations
Tables and Column Families
Rows and Columns
Regions
Regions
Write Path
Read Path
HBase API – How to access the data
No SQL Means No SQL
• Data is not accessed through SQL
• You must:
– Create your own connections
– Keep track of the type of data in a column
– Give each row a key
– Access a row by its key
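HBase stores every value as an uninterpreted byte array, which is why the client has to keep track of the type of data in each column. The following is a minimal sketch in plain Python, not the HBase client API; the column names `info:age` and `info:name` and the `COLUMN_TYPES` registry are hypothetical and exist only for illustration.

```python
import struct

# Hypothetical per-column type registry kept by the application,
# because the store itself only ever sees raw bytes.
COLUMN_TYPES = {"info:age": "int64", "info:name": "utf8"}


def encode(column, value):
    """Turn a Python value into bytes according to the column's declared type."""
    kind = COLUMN_TYPES[column]
    if kind == "int64":
        return struct.pack(">q", value)  # 8-byte big-endian signed integer
    if kind == "utf8":
        return value.encode("utf-8")
    raise ValueError(f"unknown type {kind!r} for column {column!r}")


def decode(column, raw):
    """Inverse of encode: interpret raw bytes using the column's declared type."""
    kind = COLUMN_TYPES[column]
    if kind == "int64":
        return struct.unpack(">q", raw)[0]
    if kind == "utf8":
        return raw.decode("utf-8")
    raise ValueError(f"unknown type {kind!r} for column {column!r}")
```

If the writer and the reader disagree about a column's type, the bytes still decode to something, just not the value that was stored; this is the bookkeeping that SQL column types would otherwise do for you.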
Types of Access
• Gets
– Get a row’s data based on the row key
• Puts
– Update or insert a row with data based on the row key
• Scans
– Find all matching rows based on the row key
– Scans can be narrowed further by using filters
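The three access types can be illustrated with a toy in-memory table. This is a sketch of the semantics only, under the assumption that rows are kept sorted by row key; `ToyTable`, its method signatures, and the `row_filter` parameter are hypothetical and are not the HBase client API.

```python
import bisect


class ToyTable:
    """In-memory stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []  # sorted list of row keys (bytes)
        self._rows = {}  # row key -> {column name: value}

    def put(self, row_key, columns):
        """Insert the row if the key is new, otherwise update its columns."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def get(self, row_key):
        """Return one row's data by exact row key, or None if it is absent."""
        row = self._rows.get(row_key)
        return dict(row) if row is not None else None

    def scan(self, start=b"", stop=None, row_filter=None):
        """Yield (key, row) pairs with start <= key < stop, in key order.

        row_filter is an optional predicate taking (key, row); rows for
        which it returns False are skipped.
        """
        first = bisect.bisect_left(self._keys, start)
        for key in self._keys[first:]:
            if stop is not None and key >= stop:
                break
            row = self._rows[key]
            if row_filter is None or row_filter(key, row):
                yield key, dict(row)
```

A put on an existing key behaves as an update, a get needs the exact key, and a scan walks a contiguous key range; the filter only discards rows inside that range, so the choice of row key still determines how much data a scan touches.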
Gets
Puts
Puts
HBase Schema Design – How to design
No SQL Means No SQL
• Designing schemas for HBase requires in-depth knowledge of the data and how it is accessed
• Schema design is ‘data-centric’, not ‘relationship-centric’
• You design around how data is accessed
• Row keys are engineered
Row Keys
• A row key is more than the glue between two tables
• Engineering time is spent just on constructing a row key
– Contents of a row key vary by access pattern
– Often made up of several pieces of data
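As an illustration of a row key made up of several pieces of data, the sketch below combines a user id with a reversed timestamp so that a user's newest events sort first. It assumes rows are ordered by the raw bytes of the key; the key layout, the `0x00` separator, and the function names are hypothetical choices for this example, and user ids are assumed not to contain a zero byte.

```python
import struct

MAX_INT64 = 2**63 - 1


def make_row_key(user_id: str, timestamp_ms: int) -> bytes:
    """Compose a row key from a user id and a reversed timestamp.

    Hypothetical layout: <user id> 0x00 <8-byte big-endian (MAX - timestamp)>.
    Subtracting from MAX_INT64 makes later timestamps produce smaller bytes.
    """
    reversed_ts = MAX_INT64 - timestamp_ms
    return user_id.encode("utf-8") + b"\x00" + struct.pack(">q", reversed_ts)


def parse_row_key(key: bytes):
    """Split a key built by make_row_key back into (user id, timestamp)."""
    user_part, ts_part = key[:-9], key[-8:]  # drop separator + 8 timestamp bytes
    reversed_ts = struct.unpack(">q", ts_part)[0]
    return user_part.decode("utf-8"), MAX_INT64 - reversed_ts
```

A different access pattern, for example "all events in a time range across users", would call for a different key layout, which is the sense in which the contents of a row key vary by access pattern.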
Schema Design
• Schema design does not start with an ERD (entity-relationship diagram)
• Access patterns must be known up front
• Denormalize to improve performance
– Fewer, bigger tables
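Denormalizing into fewer, bigger tables can be sketched as follows. The customer and order data, the `info:` and `orders:` column prefixes, and the `denormalize` helper are all hypothetical; plain dictionaries stand in for tables.

```python
# Normalized, relational-style layout: listing a customer's orders needs a join.
customers = {"c1": {"name": "Ana"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "total": 30},
    {"order_id": "o2", "customer_id": "c1", "total": 45},
]


def denormalize(customers, orders):
    """Fold each customer's orders into that customer's row as extra columns."""
    wide = {}
    for cust_id, info in customers.items():
        wide[cust_id] = {f"info:{field}": value for field, value in info.items()}
    for order in orders:
        row = wide.setdefault(order["customer_id"], {})
        row[f"orders:{order['order_id']}"] = order["total"]
    return wide


# One wide row per customer: a single lookup by row key returns everything.
wide_table = denormalize(customers, orders)
```

The trade-off is that the same fact may be stored in more than one place, so writes become more involved in exchange for reads that need only one row key.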
HBase in production - examples
• Use of HBase to integrate SMS, chat, email and Facebook Messages into one inbox
• HydraBase – The evolution of HBase@Facebook
• HBase provides a distributed, read/write backup of all MySQL tables in Twitter's production
• A number of applications including people search rely on HBase internally for data generation
• Additionally, the operations team uses HBase as a timeseries database for cluster-wide monitoring/performance data
• Uses HBase as a foundation for cloud scale storage for a variety of applications
• Uses HBase to build a graph service for global web threat entities evaluation and reputation
Internal Use Only
• Non-profit R&D Center founded by Nokia in 2001 in Brazil
• Focused on projects delivering solutions and products in the mobile technology area
• Technical team of 200+
• Located in Brazil: Manaus | Brasilia | Recife | São Paulo
• 50+ invention reports accepted by Nokia/Microsoft to file patent application
• 500+ items of scientific production
• 300+ completed projects
OUR CERTIFICATIONS
OUR AWARDS
• Eco System Saving Tips (app), Mobile World Congress 2012
• Facelock, 1st prize, London Hackathon | Nokia World 2010
• Audio Aid, 1st prize | Forum Nokia Calling All Innovators 2009
• Microsoft Data Gathering, Tele.Síntese 2012 & 2013 award
INFOS + CONTACT
• About training in Big Data (Developer, Analyst, Admin):
http://www.indt.org/servicos/treinamentos/hadoop-developer
http://www.indt.org/servicos/treinamentos/hadoop-analyst
http://www.indt.org/servicos/treinamentos/hadoop-admin
• About HBase:
http://hbase.apache.org/
• About INDT:
http://www.indt.org
communications@indt.org.br
• About Hortonworks:
http://www.hortonworks.com
communications@indt.org.br
