Introduction to
HBase
Dr. Fabio Fumarola
Contents
• BigTable
• HBase
– Shell
– Admin
– Put
– Get
– Scan
• Coding Session
BigTable
Bigtable at Google
• "Bigtable is a distributed storage system for
managing structured data that is designed to scale to
a very large size: petabytes of data across thousands
of commodity servers. Many projects at Google store
data in Bigtable including web indexing, Google
Earth, and Google Finance."
Features
• Distributed
• Sparse
• Column-Oriented
• Versioned
1. The map is indexed by a triple:
– <row key, column key, timestamp>
2. Each value in the map is an uninterpreted array of
bytes.
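Because each value is an uninterpreted byte array, it is the client that decides how to encode and decode it. A minimal sketch in plain Java, with no HBase dependency; the class name ByteValues and its helper methods are hypothetical and only illustrate the idea:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the store only sees byte[]; the client picks the encoding.
final class ByteValues {
    // Encode a string as UTF-8 bytes.
    static byte[] encodeString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a string.
    static String decodeString(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Encode a long as 8 big-endian bytes (ByteBuffer's default byte order).
    static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Decode 8 big-endian bytes back into a long.
    static long decodeLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    public static void main(String[] args) {
        byte[] name = encodeString("mario");
        byte[] ts = encodeLong(1239124584398L);
        System.out.println(name.length + " bytes -> " + decodeString(name));
        System.out.println(ts.length + " bytes -> " + decodeLong(ts));
    }
}
```

The same bytes read back with a different decoder would yield a different value, which is why writer and reader must agree on the encoding.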
Key Concepts
(row key, column key, timestamp) => value
• row key => 20120407152657
• column family => "personal:"
• column key => "personal:givenName", "personal:surname"
• timestamp => 1239124584398
• column value => "mario", "rossi"
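The mapping (row key, column key, timestamp) => value can be pictured as nested sorted maps. The following is a sketch in plain Java, not how HBase or Bigtable store data internally; the class CellMap and its methods are hypothetical, and values are kept as strings for readability:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model: row key -> column key -> timestamp -> value.
final class CellMap {
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Store one version of a cell. Rows and columns are created on demand,
    // so a row only holds the columns that were actually written (sparse).
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Return the value with the highest timestamp, or null if the cell is absent.
    String getLatest(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) return null;
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null || versions.isEmpty()) return null;
        return versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        CellMap table = new CellMap();
        table.put("20120407152657", "personal:givenName", 1239124584398L, "mario");
        table.put("20120407152657", "personal:surname", 1239124584398L, "rossi");
        System.out.println(table.getLatest("20120407152657", "personal:givenName"));
        System.out.println(table.getLatest("20120407152657", "personal:surname"));
    }
}
```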
Example 1
Get row 20120407145045
HBase
• Use HBase when you need random, realtime read/write
access to your Big Data. This project's goal is
the hosting of very large tables -- billions of rows X
millions of columns -- atop clusters of commodity
hardware. HBase is an open-source, distributed,
versioned, column-oriented store modeled after
Google's Bigtable.
http://hbase.apache.org
HBase Shell
hbase(main):001:0> create 'blog', 'info', 'content'
0 row(s) in 4.3640 seconds
hbase(main):002:0> put 'blog', '20120320162535', 'info:title', 'Document-oriented storage using CouchDB'
0 row(s) in 0.0330 seconds
hbase(main):003:0> put 'blog', '20120320162535', 'info:author', 'Bob Smith'
0 row(s) in 0.0030 seconds
hbase(main):004:0> put 'blog', '20120320162535', 'content:', 'CouchDB is a document-oriented...'
0 row(s) in 0.0030 seconds
HBase Shell
hbase(main):005:0> put 'blog', '20120320162535', 'info:category', 'Persistence'
0 row(s) in 0.0030 seconds
hbase(main):006:0> get 'blog', '20120320162535'
COLUMN          CELL
content:        timestamp=1239135042862, value=CouchDB is a doc...
info:author     timestamp=1239135042755, value=Bob Smith
info:category   timestamp=1239135042982, value=Persistence
info:title      timestamp=1239135042623, value=Document-oriented...
4 row(s) in 0.0140 seconds
HBase shell
hbase(main):015:0> get 'blog', '20120407145045', {COLUMN=>'info:author', VERSIONS=>3 }
timestamp=1239135325074, value=John Doe
timestamp=1239135324741, value=John
2 row(s) in 0.0060 seconds
hbase(main):016:0> scan 'blog', { STARTROW => '20120300', STOPROW => '20120400' }
ROW              COLUMN+CELL
20120320162535   column=content:, timestamp=1239135042862, value=CouchDB is...
20120320162535   column=info:author, timestamp=1239135042755, value=Bob Smith
20120320162535   column=info:category, timestamp=1239135042982, value=Persistence
20120320162535   column=info:title, timestamp=1239135042623, value=Document...
4 row(s) in 0.0230 seconds
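The scan above returns only row 20120320162535 because rows are kept sorted by key, the start row is inclusive, and the stop row is exclusive. A rough plain-Java sketch of that range selection, using a TreeMap in place of the table; the class RowRangeScan is hypothetical, and Java string ordering is assumed to match byte ordering, which holds for the ASCII digit keys used here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a table: sorted row key -> some row content.
final class RowRangeScan {
    // Return the row keys in [startRow, stopRow), in sorted order.
    static List<String> scan(NavigableMap<String, String> table, String startRow, String stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> blog = new TreeMap<>();
        blog.put("20120320162535", "Document-oriented storage using CouchDB");
        blog.put("20120407145045", "(another post)");
        // Only the March row falls inside the requested key range.
        System.out.println(scan(blog, "20120300", "20120400")); // prints [20120320162535]
    }
}
```

Timestamp-like row keys such as these make "all rows in a month" a single contiguous range, which is why the key layout matters for scans.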
Java API
Admin API
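import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
// Create a new table (older HBaseAdmin-based client API)
Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tableName = "people";
HTableDescriptor desc = new HTableDescriptor(tableName);
// One column family per group of related columns
desc.addFamily(new HColumnDescriptor("personal"));
desc.addFamily(new HColumnDescriptor("contactinfo"));
desc.addFamily(new HColumnDescriptor("creditcard"));
admin.createTable(desc);
System.out.printf("%s is available? %b%n", tableName,
    admin.isTableAvailable(tableName));
admin.close();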
Client API
import static org.apache.hadoop.hbase.util.Bytes.toBytes;
// Add some data into the 'people' table
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "people");
// All cells of one Put share the same row key
Put put = new Put(toBytes("connor-john-m-43299"));
put.add(toBytes("personal"), toBytes("givenName"), toBytes("John"));
put.add(toBytes("personal"), toBytes("mi"), toBytes("M"));
put.add(toBytes("personal"), toBytes("surname"), toBytes("Connor"));
put.add(toBytes("contactinfo"), toBytes("email"),
    toBytes("john.connor@gmail.com"));
table.put(put);
table.flushCommits();
table.close();
Finding Data
• Get (by row key)
• Scan (by row key ranges, filtering)
Get
// Get a row. Ask for only the data you need.
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "people");
Get get = new Get(toBytes("connor-john-m-43299"));
get.setMaxVersions(2);
get.addFamily(toBytes("personal"));
get.addColumn(toBytes("contactinfo"),
toBytes("email"));
Result result = table.get(get);
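The setMaxVersions call caps how many versions of each cell come back, newest first. A plain-Java sketch of that selection, reusing the two info:author versions from the shell example; the class VersionedCell is hypothetical and does not reflect how the HBase client is implemented:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of a single cell: timestamp -> value.
final class VersionedCell {
    private final NavigableMap<Long, String> versions = new TreeMap<>();

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Return at most maxVersions values, most recent first.
    List<String> getVersions(int maxVersions) {
        List<String> out = new ArrayList<>();
        for (String value : versions.descendingMap().values()) {
            if (out.size() == maxVersions) break;
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell author = new VersionedCell();
        author.put(1239135324741L, "John");
        author.put(1239135325074L, "John Doe");
        // Asking for 3 versions returns the 2 that exist, newest first.
        System.out.println(author.getVersions(3)); // prints [John Doe, John]
        System.out.println(author.getVersions(1)); // prints [John Doe]
    }
}
```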
Update
// Update existing values, and add a new one
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "people");
Put put = new Put(toBytes("connor-john-m-43299"));
put.add(toBytes("personal"), toBytes("surname"),
toBytes("Smith"));
put.add(toBytes("contactinfo"), toBytes("email"),
toBytes("john.m.smith@gmail.com"));
put.add(toBytes("contactinfo"), toBytes("address"),
toBytes("San Diego, CA"));
table.put(put);
table.flushCommits();
table.close();
Scans
// Scan rows from a start row key, limited to one page of results
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "people");
long numRowsPerPage = 25; // example page size
Scan scan = new Scan(toBytes("john-"));
scan.addColumn(toBytes("personal"), toBytes("givenName"));
scan.addColumn(toBytes("contactinfo"), toBytes("email"));
scan.addColumn(toBytes("contactinfo"), toBytes("address"));
scan.setFilter(new PageFilter(numRowsPerPage));
ResultScanner scanner = table.getScanner(scan);
for (Result result : scanner) {
    // process result...
}
scanner.close();
table.close();
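Paging through a scan usually means remembering the last row key of one page and starting the next page just after it. The sketch below shows that idea in plain Java over a sorted set of row keys. It is not PageFilter's implementation; the class PagedScan, its page method, and every row key except connor-john-m-43299 are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical pager over sorted row keys.
final class PagedScan {
    // Return up to pageSize keys strictly after afterRow (or from the start if afterRow is null).
    static List<String> page(NavigableSet<String> rowKeys, String afterRow, int pageSize) {
        NavigableSet<String> remaining =
                (afterRow == null) ? rowKeys : rowKeys.tailSet(afterRow, false);
        List<String> out = new ArrayList<>();
        for (String key : remaining) {
            if (out.size() == pageSize) break;
            out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>(Arrays.asList(
                "connor-john-m-43299", "connor-sarah-j-11111", "reese-kyle-a-22222"));
        String last = null;
        List<String> rows;
        // Fetch pages of two rows until a page comes back empty.
        while (!(rows = page(keys, last, 2)).isEmpty()) {
            System.out.println(rows);
            last = rows.get(rows.size() - 1);
        }
    }
}
```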
Time to Code
This is when things start to get hard
Setup HBase Docker
• https://registry.hub.docker.com/u/banno/hbase-standalo
• https://registry.hub.docker.com/u/oddpoet/hbase-cdh5/
Steps
• Shell
• Java Project
– Maven
– Gradle
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

HBase: an introduction

  • 1. Introduction to HBase, by Dr. Fabio Fumarola
  • 2. Contents: BigTable; HBase (Shell, Admin, Put, Get, Scan); Coding Session
  • 4. Bigtable at Google: "Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable including web indexing, Google Earth, and Google Finance."
  • 5. Features: Distributed, Sparse, Column-Oriented, Versioned
  • 6. (1) The map is indexed by a row key, a column key, and a timestamp. (2) Each value in the map is an uninterpreted array of bytes. In short: (row key, column key, timestamp) => value
  • 7. Key Concepts: row key => 20120407152657; column family => "personal:"; column key => "personal:givenName", "personal:surname"; timestamp => 1239124584398; column value => "mario", "rossi"
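The mapping (row key, column key, timestamp) => value from slides 6 and 7 can be pictured as a nested sorted map. The following is only a toy in-memory sketch of that idea, not how HBase or Bigtable store data; the class name `SparseVersionedMap`, its methods, and the second "surname" version are hypothetical and introduced purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of the sparse, versioned map; hypothetical, not the real storage engine. */
public class SparseVersionedMap {
    // row key -> column key -> timestamp (newest first) -> uninterpreted bytes
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, byte[]>>> rows =
            new TreeMap<>();

    /** Stores one cell version; rows and columns that are never written take no space. */
    public void put(String row, String column, long timestamp, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, NavigableMap<Long, byte[]>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, byte[]>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns up to maxVersions values of one cell, newest first. */
    public List<byte[]> get(String row, String column, int maxVersions) {
        List<byte[]> out = new ArrayList<>();
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = rows.get(row);
        if (columns == null || !columns.containsKey(column)) {
            return out; // sparse: a missing cell is simply absent
        }
        for (byte[] value : columns.get(column).values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        SparseVersionedMap people = new SparseVersionedMap();
        String row = "20120407152657"; // row key from slide 7
        people.put(row, "personal:givenName", 1239124584398L, "mario".getBytes(StandardCharsets.UTF_8));
        people.put(row, "personal:surname", 1239124584398L, "rossi".getBytes(StandardCharsets.UTF_8));
        // Hypothetical later write, added only to show that versions accumulate.
        people.put(row, "personal:surname", 1239124584399L, "bianchi".getBytes(StandardCharsets.UTF_8));

        for (byte[] value : people.get(row, "personal:surname", 2)) {
            System.out.println(new String(value, StandardCharsets.UTF_8));
        }
    }
}
```

Sorting timestamps in reverse order means the newest version of a cell is always read first, which mirrors the order in which the shell prints versions on slide 13.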
  • 10. HBase: Use HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. http://hbase.apache.org
  • 11. HBase Shell
    hbase(main):001:0> create 'blog', 'info', 'content'
    0 row(s) in 4.3640 seconds
    hbase(main):002:0> put 'blog', '20120320162535', 'info:title', 'Document-oriented storage using CouchDB'
    0 row(s) in 0.0330 seconds
    hbase(main):003:0> put 'blog', '20120320162535', 'info:author', 'Bob Smith'
    0 row(s) in 0.0030 seconds
    hbase(main):004:0> put 'blog', '20120320162535', 'content:', 'CouchDB is a document-oriented...'
    0 row(s) in 0.0030 seconds
  • 12. HBase shell
    hbase(main):005:0> put 'blog', '20120320162535', 'info:category', 'Persistence'
    0 row(s) in 0.0030 seconds
    hbase(main):006:0> get 'blog', '20120320162535'
    COLUMN           CELL
    content:         timestamp=1239135042862, value=CouchDB is a doc...
    info:author      timestamp=1239135042755, value=Bob Smith
    info:category    timestamp=1239135042982, value=Persistence
    info:title       timestamp=1239135042623, value=Document-oriented...
    4 row(s) in 0.0140 seconds
  • 13. HBase shell
    hbase(main):015:0> get 'blog', '20120407145045', {COLUMN=>'info:author', VERSIONS=>3 }
    timestamp=1239135325074, value=John Doe
    timestamp=1239135324741, value=John
    2 row(s) in 0.0060 seconds
    hbase(main):016:0> scan 'blog', { STARTROW => '20120300', STOPROW => '20120400' }
    ROW               COLUMN+CELL
    20120320162535    column=content:, timestamp=1239135042862, value=CouchDB is...
    20120320162535    column=info:author, timestamp=1239135042755, value=Bob Smith
    20120320162535    column=info:category, timestamp=1239135042982, value=Persistence
    20120320162535    column=info:title, timestamp=1239135042623, value=Document...
    4 row(s) in 0.0230 seconds
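The scan on slide 13 returns row 20120320162535 but not row 20120407145045, because rows are kept in sorted key order and a scan covers the range from STARTROW (inclusive) up to STOPROW (exclusive). The sketch below imitates that behaviour with a plain `TreeMap`; it is an illustration under assumptions, not HBase code. The class `RowRangeScan`, the `scan` helper, the row 20120229101500, and the placeholder cell values are hypothetical. It also compares Java strings, which matches HBase's byte-wise ordering only for ASCII keys such as these.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy range scan over sorted row keys; hypothetical helper, not the HBase client. */
public class RowRangeScan {

    /** Returns the row keys in [startRow, stopRow): start inclusive, stop exclusive. */
    public static List<String> scan(NavigableMap<String, String> table,
                                    String startRow, String stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> blog = new TreeMap<>();
        blog.put("20120229101500", "(hypothetical earlier post)");
        blog.put("20120320162535", "Document-oriented storage using CouchDB");
        blog.put("20120407145045", "(title not shown in the slides)");

        // Same bounds as the shell command on slide 13.
        System.out.println(scan(blog, "20120300", "20120400"));
    }
}
```

Because the row keys are timestamps written as fixed-width digit strings, lexicographic order equals chronological order, so a date range becomes a simple key range.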
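  • 15. Admin API
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Create a new table with three column families
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    String tableName = "people";
    HTableDescriptor desc = new HTableDescriptor(tableName);
    desc.addFamily(new HColumnDescriptor("personal"));
    desc.addFamily(new HColumnDescriptor("contactinfo"));
    desc.addFamily(new HColumnDescriptor("creditcard"));
    admin.createTable(desc);
    System.out.printf("%s is available? %b\n", tableName, admin.isTableAvailable(tableName));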
  • 16. Client API
    import static org.apache.hadoop.hbase.util.Bytes.toBytes;

    // Add some data into 'people' table
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "people"); // open the table before writing to it
    Put put = new Put(toBytes("connor-john-m-43299"));
    put.add(toBytes("personal"), toBytes("givenName"), toBytes("John"));
    put.add(toBytes("personal"), toBytes("mi"), toBytes("M"));
    put.add(toBytes("personal"), toBytes("surname"), toBytes("Connor"));
    put.add(toBytes("contactinfo"), toBytes("email"), toBytes("john.connor@gmail.com"));
    table.put(put);
    table.flushCommits();
    table.close();
  • 17. Finding Data: Get (by row key); Scan (by row key ranges, filtering)
  • 18. Get
    // Get a row. Ask for only the data you need.
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "people");
    Get get = new Get(toBytes("connor-john-m-43299"));
    get.setMaxVersions(2);
    get.addFamily(toBytes("personal"));
    get.addColumn(toBytes("contactinfo"), toBytes("email"));
    Result result = table.get(get);
  • 19. Update
    // Update existing values, and add a new one
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "people");
    Put put = new Put(toBytes("connor-john-m-43299"));
    put.add(toBytes("personal"), toBytes("surname"), toBytes("Smith"));
    put.add(toBytes("contactinfo"), toBytes("email"), toBytes("john.m.smith@gmail.com"));
    put.add(toBytes("contactinfo"), toBytes("address"), toBytes("San Diego, CA"));
    table.put(put);
    table.flushCommits();
    table.close();
  • 20. Scans
    // Scan rows, starting at the given row key
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "people");
    Scan scan = new Scan(toBytes("john-"));
    scan.addColumn(toBytes("personal"), toBytes("givenName"));
    scan.addColumn(toBytes("contactinfo"), toBytes("email"));
    scan.addColumn(toBytes("contactinfo"), toBytes("address"));
    scan.setFilter(new PageFilter(numRowsPerPage));
    ResultScanner scanner = table.getScanner(scan);
    for (Result result : scanner) {
        // process result...
    }
    scanner.close(); // release the scanner when done
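Slide 20 limits a scan to numRowsPerPage rows but does not show how to fetch the following pages. One common approach, sketched here under assumptions, is to remember the last row key of a page and start the next scan just after it. The code imitates this with a sorted set instead of a real table; `PagedScan`, `nextPage`, and every row key except "connor-john-m-43299" are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Toy pagination over sorted row keys; hypothetical helper, not the HBase client. */
public class PagedScan {

    /**
     * Returns up to pageSize row keys strictly after lastRow,
     * or from the beginning of the table when lastRow is null.
     */
    public static List<String> nextPage(NavigableSet<String> rowKeys, String lastRow, int pageSize) {
        NavigableSet<String> remaining =
                (lastRow == null) ? rowKeys : rowKeys.tailSet(lastRow, false);
        List<String> page = new ArrayList<>();
        for (String key : remaining) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<String> people = new TreeSet<>(Arrays.asList(
                "connor-john-m-43299",   // row key from the slides
                "connor-sarah-j-10001",  // hypothetical
                "doe-jane-a-20002",      // hypothetical
                "smith-bob-k-30003",     // hypothetical
                "young-amy-l-40004"));   // hypothetical

        int numRowsPerPage = 2;
        String lastRow = null;
        while (true) {
            List<String> page = nextPage(people, lastRow, numRowsPerPage);
            if (page.isEmpty()) {
                break;
            }
            System.out.println(page);
            lastRow = page.get(page.size() - 1); // resume after the last key seen
        }
    }
}
```

Keying each page on the last row seen, rather than on a numeric offset, keeps every page request a plain range scan over the sorted keys.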
  • 21. Time to Code: this is when things start to get hard
  • 22. Set up HBase with Docker: https://registry.hub.docker.com/u/banno/hbase-standalo and https://registry.hub.docker.com/u/oddpoet/hbase-cdh5/
  • 23. Steps: Shell; Java project (Maven, Gradle)