Cloning Twitter With HBase
Dr. Fabio Fumarola
A Twitter Clone
• One of the most successful new Internet services of recent times is Twitter.
• Since its launch, it has grown from niche usage to use by the general public, with celebrities such as Oprah Winfrey, Britney Spears, and Shaquille O'Neal, and politicians such as Barack Obama and Al Gore joining it.
Why Twitter?
• Simple: it does not care what you share, as long as it fits within 140 characters
• A means to have public conversation: Twitter allows a user to tweet and have other users respond using an '@' reply, a comment, or a re-tweet
• Fan versus friend
• Understanding user behavior
• Easy to share through text messaging
• Easy to access through multiple devices and applications
Twitter Stats
• According to Compete (www.compete.com)
Main Features
• Allow users to post status updates (known as 'tweets' in Twitter) to the public.
• Allow users to follow and unfollow other users. A user can follow any other user, but the relationship is not reciprocal: following someone does not make them follow you back.
• Allow users to send public messages directed to particular users using the @ replies convention (in Twitter these are known as mentions).
Main Features
• Allow users to send direct messages to other users. These messages are private to the sender and the recipient, and each direct message has a single recipient.
• Allow users to re-tweet, that is, to forward another user's status in their own status update.
• Provide a public timeline where all statuses are publicly available for viewing.
• Provide APIs to allow access by external applications.
HBase
HBase: Features
• Strictly consistent reads and writes.
• Automatic and configurable sharding of tables.
• Automatic failover support between RegionServers.
• Base classes for MapReduce jobs.
• Easy-to-use Java API.
• Block cache and Bloom filters for real-time queries.
HBase: Features
• Query predicate push-down via server-side Filters.
• Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options.
• Extensible JRuby-based (JIRB) shell.
• Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.
HBase: Installation
• It can be run in three settings:
– Single-node standalone
– Pseudo-distributed single-machine
– Fully-distributed cluster
• We will see how to install HBase using Docker
Single Node
Single-node standalone
• Source code at https://github.com/fabiofumarola/NoSQLDatabasesCourses
• It uses the local file system rather than HDFS, so it is not suitable for production.
• Download the tar distribution
• Edit hbase-site.xml
• Start HBase via start-hbase.sh
• Use jps to check whether HBase is running
hbase-site.xml
The folders are created automatically by HBase
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///hbase-data/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/hbase-data/zookeeper</value>
  </property>
</configuration>
Single-node standalone
• Build the image
– docker build --tag=wheretolive/hbase:single ./
• Run the image
– docker run -d -p 2181:2181 -p 60010:60010 -p 60000:60000 -p 60020:60020 -p 60030:60030 -h hbase --name=hbase wheretolive/hbase:single
Pseudo Distributed
Pseudo-distributed
• Running HBase in this mode means that each daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process.
• Here we can store the data in HDFS if it is available
• The main change is in hbase-site.xml
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
Pseudo-distributed
• Build the image
– docker build --tag=wheretolive/hbase:pseudo ./
• Run the image
– docker run -d -p 2181:2181 -p 60010:60010 -p 60000:60000 -p 60020:60020 -p 60030:60030 -h hbase --name=hbase wheretolive/hbase:pseudo
Interacting with the HBase Shell
HBase Shell
• Start the shell
$ ./bin/hbase shell
hbase(main):001:0>
• Create a table
hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.4170 seconds
=> Hbase::Table - test
• List the tables
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0180 seconds
=> ["test"]
HBase shell: describe
hbase(main):034:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', BLOOMFILTER => 'ROW', VERSIONS => '1',
IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
DATA_BLOCK_ENCODING => 'NONE',
TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS =>
'0', BLOCKCACHE => 'true', BLOCKSIZE => '65536',
REPLICATION_SCOPE => '0'}
1 row(s) in 0.0480 seconds
HBase shell: put data
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds
HBase shell: get
hbase(main):007:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1421762485768, value=value1
1 row(s) in 0.0350 seconds
HBase shell: incr
hbase(main):027:0> incr 'test', 'row3', 'cf:count', 1
COUNTER VALUE = 1
0 row(s) in 0.0070 seconds
hbase(main):028:0> incr 'test', 'row3', 'cf:count', 1
COUNTER VALUE = 2
0 row(s) in 0.0210 seconds
# Get the counter (prompts 029 and 030 are not shown; the value 4 implies two further increments in between)
hbase(main):031:0> get_counter 'test', 'row3', 'cf:count'
COUNTER VALUE = 4
HBase shell: scan
hbase(main):006:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1430940122422, value=value1
row2 column=cf:b, timestamp=1430940126703, value=value2
row3 column=cf:c, timestamp=1430940130700, value=value3
3 row(s) in 0.0470 seconds
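The scan returns rows in row-key order. A useful mental model, sketched here in plain Java, treats a table as a sorted map from row key to a sorted map from column (family:qualifier) to value. This is only an in-memory illustration: `SortedTableSketch` and its methods are hypothetical names, not part of the HBase client API, and timestamps, versions, and byte-array keys are ignored.

```java
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for an HBase table: NOT the HBase client API.
final class SortedTableSketch {
    // row key -> (column "family:qualifier" -> value); both levels are kept sorted.
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    // Plays the role of the shell's: put 'test', row, column, value
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    // Plays the role of the shell's: get 'test', row (empty map if the row is absent)
    Map<String, String> get(String row) {
        NavigableMap<String, String> cells = rows.get(row);
        if (cells == null) {
            return Collections.emptyMap();
        }
        return Collections.unmodifiableMap(cells);
    }

    // Plays the role of the shell's: scan 'test' (rows come back in sorted key order)
    NavigableMap<String, NavigableMap<String, String>> scan() {
        return Collections.unmodifiableNavigableMap(rows);
    }

    public static void main(String[] args) {
        SortedTableSketch test = new SortedTableSketch();
        // Insert out of order on purpose; the scan is still sorted by row key.
        test.put("row3", "cf:c", "value3");
        test.put("row1", "cf:a", "value1");
        test.put("row2", "cf:b", "value2");
        test.scan().forEach((row, cells) -> System.out.println(row + " " + cells));
    }
}
```

The point of the sketch is that ordering comes from the key, not from insertion order, which is why row-key design matters for the data layout discussed later.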
HBase shell: disable and drop
hbase(main):008:0> disable 'test'
0 row(s) in 1.1820 seconds
hbase(main):009:0> enable 'test'
0 row(s) in 0.1770 seconds
# A table must be disabled before it can be dropped; prompt 010 (not shown) presumably disabled 'test' again
hbase(main):011:0> drop 'test'
0 row(s) in 0.1370 seconds
https://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/
Data Layout
Users: Identifier
• We need to represent users with their username, user ID, password, the set of users following a given user, the set of users a given user follows, and so on.
• The first question is: how should we identify a user?
• One solution is to associate a unique ID with every user.
• Every other reference to this user is then made by ID.
– Create a table that stores all the IDs
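The "table that stores all the IDs" can be pictured as a counter plus a lookup from username to ID. The sketch below is a plain-Java, in-memory illustration under that assumption: `UserIdRegistry` and `idFor` are hypothetical names, and the `AtomicLong` merely stands in for an atomic counter cell of the kind the shell's `incr` command updates.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory illustration; a real deployment would keep both the
// counter and the lookup in HBase rather than in process memory.
final class UserIdRegistry {
    // Stand-in for a counter cell; the first allocated ID is 1.
    private final AtomicLong lastId = new AtomicLong(0);
    private final Map<String, Long> idByUsername = new ConcurrentHashMap<>();

    // Returns the existing ID for a username, or atomically allocates a new one.
    long idFor(String username) {
        return idByUsername.computeIfAbsent(username, u -> lastId.incrementAndGet());
    }

    public static void main(String[] args) {
        UserIdRegistry registry = new UserIdRegistry();
        System.out.println("alice -> " + registry.idFor("alice"));
        System.out.println("bob   -> " + registry.idFor("bob"));
        // Asking again for a known username returns the ID allocated earlier.
        System.out.println("alice -> " + registry.idFor("alice"));
    }
}
```

Every other structure can then refer to a user by the returned ID, so a username change would touch only this lookup.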
Users
package HBaseIA.TwitBase.model;

public abstract class User {
  public String user;
  public String name;
  public String email;
  public String password;

  // The password is deliberately left out of the string form.
  @Override
  public String toString() {
    return String.format("<User: %s, %s, %s>", user, name, email);
  }
}
Twits
// Assumption: DateTime is the Joda-Time class.
import org.joda.time.DateTime;

public abstract class Twit {
  public String user;
  public DateTime dt;
  public String text;

  @Override
  public String toString() {
    return String.format("<Twit: %s %s %s>", user, dt, text);
  }
}
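A `Twit` carries a user and a timestamp, which suggests a row key built from both. The following is a minimal sketch of one possible layout, which the slides do not state: a fixed-length MD5 hash of the user name followed by a reversed timestamp, so that a user's newest twits sort first. `TwitKeySketch` and `rowKey` are hypothetical names, and epoch milliseconds replace `DateTime` to keep the example JDK-only (Java 9 or later for `Arrays.compareUnsigned`).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical row-key builder; the key layout is an assumption, not taken from the slides.
final class TwitKeySketch {
    static final int MD5_LENGTH = 16; // MD5 digests are always 16 bytes

    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    // Assumed layout: 16-byte hash of the user, then 8-byte (Long.MAX_VALUE - epochMillis).
    static byte[] rowKey(String user, long epochMillis) {
        return ByteBuffer.allocate(MD5_LENGTH + Long.BYTES)
                .put(md5(user))
                .putLong(Long.MAX_VALUE - epochMillis)
                .array();
    }

    public static void main(String[] args) {
        byte[] older = rowKey("alice", 1_430_940_122_422L);
        byte[] newer = rowKey("alice", 1_430_940_130_700L);
        // Row keys are byte arrays ordered lexicographically, so compare them unsigned.
        System.out.println("newer sorts first: " + (Arrays.compareUnsigned(newer, older) < 0));
    }
}
```

With such a layout, all twits of one user share a 16-byte prefix, so they sit next to each other and a scan over that prefix reads them newest first.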
Followers, following and updates
• A user might have users who follow them, which we'll call their followers.
• A user might follow other users, which we'll call their following.
public abstract class Relation {
  public String relation;
  public String from;
  public String to;

  @Override
  public String toString() {
    return String.format("<Relation: %s %s %s>", from, relation, to);
  }
}
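Because following is not reciprocal, "whom does this user follow" and "who follows this user" are two different questions. One way to answer both cheaply, sketched here as an assumption rather than as the approach of the referenced code, is to record every relation in both directions. `RelationIndexSketch` and its methods are hypothetical names, and the maps stand in for tables.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical in-memory illustration of storing each relation in both directions.
final class RelationIndexSketch {
    private final Map<String, SortedSet<String>> following = new HashMap<>(); // from -> users followed
    private final Map<String, SortedSet<String>> followers = new HashMap<>(); // to -> users following

    // Records "from follows to"; the reverse edge is NOT implied.
    void follow(String from, String to) {
        following.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
        followers.computeIfAbsent(to, k -> new TreeSet<>()).add(from);
    }

    // Removes "from follows to" from both directions, if present.
    void unfollow(String from, String to) {
        SortedSet<String> out = following.get(from);
        if (out != null) {
            out.remove(to);
        }
        SortedSet<String> in = followers.get(to);
        if (in != null) {
            in.remove(from);
        }
    }

    SortedSet<String> followingOf(String user) {
        return Collections.unmodifiableSortedSet(following.getOrDefault(user, new TreeSet<>()));
    }

    SortedSet<String> followersOf(String user) {
        return Collections.unmodifiableSortedSet(followers.getOrDefault(user, new TreeSet<>()));
    }

    public static void main(String[] args) {
        RelationIndexSketch index = new RelationIndexSketch();
        index.follow("alice", "bob");
        index.follow("carol", "bob");
        System.out.println("bob's followers: " + index.followersOf("bob"));
        System.out.println("bob's following: " + index.followingOf("bob"));
    }
}
```

The trade-off is that each follow or unfollow writes twice, in exchange for both lookups being a single read.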
Let us analyze the code in depth
• http://www.manning.com/dimidukkhurana/
• https://github.com/hbaseinaction/twitbase
• https://github.com/hbaseinaction

8b. Column Oriented Databases Lab

  • 2. A Twitter Clone • One of the most successful new Internet services of recent times is Twitter. • Since its launch it has exploded from niche usage to usage by the general populace, with celebrities such as Oprah Winfrey, Britney Spears, and Shaquille O'Neal, and politicians such as Barack Obama and Al Gore jumping into it. 2
  • 3. Why Twitter? • Simple: it does not care what you share, as a long it is less than 140 characters • A means to have public conversation: Twitter allows a user to tweet and have users respond using '@' reply, comment, or re-tweet • Fan versus friend • Understanding user behavior • Easy to share through text messaging • Easy to access through multiple devices and applications 3
  • 4. Twitter Stats • According to Compete (www.compete.com) 4
  • 5. Main Features • Allow users to post status updates (known as 'tweets' in Twitter) to the public. • Allow users to follow and unfollow other users. Users can follow any other user but it is not reciprocal. • Allow users to send public messages directed to particular users using the @ replies convention (in Twitter this is known as mentions) 5
  • 6. Main Features • Allow users to send direct messages to other users, messages are private to the sender and the recipient user only (direct messages are only to a single recipient). • Allow users to re-tweet or forward another user's status in their own status update. • Provide a public timeline where all statuses are publicly available for viewing. • Provide APIs to allow external applications access. 6
  • 8. Hbase: Features • Strictly consistent reads and writes. • Automatic and configurable sharding of tables • Automatic failover support between RegionServers. • Base classes for MapReduce jobs • Easy java API • Block cache and Bloom Filters for real-time queries. 8
  • 9. Hbase: Features • Query predicate push down via server side Filters • Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options • Extensible jruby-based (JIRB) shell • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX 9
  • 10. Hbase: Installation • It can be run in 3 settings: – Single-node standalone – Pseudo-distributed single-machine – Fully-distributed cluster • We will see how to install HBase using Docker 10
  • 12. Single-node standalone • Source code at https://github.com/fabiofumarola/NoSQLDatabasesCourses • It uses the local file system not HDFS (not for production). • Download the tar distribution • Edit hbase-site.xml • Start HBase via start-hbase.sh • We can use jps to test if HBase is running 12
  • 13. Hbase-site.xml The folders are created automatically by HBase <configuration> <property> <name>hbase.rootdir</name> <value>file:///hbase-data/hbase</value> </property> <property> <name>hbase.zookeeper.property.dataDir</name> <value>/hbase-data/zookeeper</value> </property> </configuration> 13
  • 14. Single-node standalone • Build the image – docker build –tag=wheretolive/hbase:single ./ • Run the image – docker run –d –p 2181:2181 -p 60010:60010 -p 60000:60000 -p 60020:60020 -p 60030:60030 –h hbase --name=hbase wheretolive/hbase:single 14
  • 16. Pseudo-distributed • Run HBase in this mode means that each daemon (HMaster, HRegionServer and Zookpeeper) run as separate process. • Here we can store the data into HDFS if it is available • The main change is the hbase-site.xml 16 <configuration> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> </configuration>
  • 17. Pseudo-distributed • Build the image – docker build –tag=wheretolive/hbase:pseudo ./ • Run the image – docker run –d –p 2181:2181 -p 60010:60010 -p 60000:60000 -p 60020:60020 -p 60030:60030 –h hbase --name=hbase wheretolive/hbase:pseudo 17
  • 18. Interacting with the Hbase Shell 18
  • 19. HBase Shell • Start the shell • Create a table • List the tables 19 $ ./bin/hbase shell hbase(main):001:0> hbase(main):001:0> create 'test', 'cf' 0 row(s) in 0.4170 seconds => Hbase::Table - test hbase(main):002:0> list 'test' TABLE test 1 row(s) in 0.0180 seconds => ["test"]
  • 20. HBase shell 20 hbase(main):034:0> describe 'test' Table test is ENABLED test COLUMN FAMILIES DESCRIPTION {NAME => 'cf', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} 1 row(s) in 0.0480 seconds
  • 21. HBase shell: put data
      hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
      0 row(s) in 0.0850 seconds
      hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
      0 row(s) in 0.0110 seconds
      hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
      0 row(s) in 0.0100 seconds
  • 22. HBase shell: get
      hbase(main):007:0> get 'test', 'row1'
      COLUMN   CELL
      cf:a     timestamp=1421762485768, value=value1
      1 row(s) in 0.0350 seconds
  • 23. HBase shell: incr
      hbase(main):027:0> incr 'test', 'row3', 'cf:count', 1
      COUNTER VALUE = 1
      0 row(s) in 0.0070 seconds
      hbase(main):028:0> incr 'test', 'row3', 'cf:count', 1
      COUNTER VALUE = 2
      0 row(s) in 0.0210 seconds
      # Get the counter
      hbase(main):031:0> get_counter 'test', 'row3', 'cf:count'
      COUNTER VALUE = 4
  • 24. HBase shell: scan
      hbase(main):006:0> scan 'test'
      ROW    COLUMN+CELL
      row1   column=cf:a, timestamp=1430940122422, value=value1
      row2   column=cf:b, timestamp=1430940126703, value=value2
      row3   column=cf:c, timestamp=1430940130700, value=value3
      3 row(s) in 0.0470 seconds
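The put, get and scan commands above rely on HBase keeping rows sorted by row key. A minimal in-memory sketch of that behaviour follows. It is not the HBase client API: the class `SortedTableModel` and its methods are hypothetical names used only for illustration, and it assumes ASCII row keys so that Java string order matches HBase's byte order.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for an HBase table: row key -> (column -> value).
// It only models the ordering semantics; it is NOT the HBase client API.
class SortedTableModel {
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    // Analogous to: put 'test', row, column, value
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    // Analogous to: get 'test', row (an empty map means the row does not exist)
    Map<String, String> get(String row) {
        return rows.getOrDefault(row, new TreeMap<>());
    }

    // Analogous to a scan bounded by a start row (inclusive) and a stop row (exclusive).
    NavigableMap<String, NavigableMap<String, String>> scan(String startRow, String stopRow) {
        return rows.subMap(startRow, true, stopRow, false);
    }

    public static void main(String[] args) {
        SortedTableModel table = new SortedTableModel();
        // Insert out of order; the sorted map still returns rows by key.
        table.put("row3", "cf:c", "value3");
        table.put("row1", "cf:a", "value1");
        table.put("row2", "cf:b", "value2");
        System.out.println(table.scan("row1", "row3").keySet()); // prints [row1, row2]
    }
}
```

Because rows come back in key order regardless of insertion order, the choice of row key decides which rows a range scan can fetch together.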
  • 25. HBase shell: disable, enable and drop
    • A table must be disabled before it can be dropped, so disable it again if it has been re-enabled.
      hbase(main):008:0> disable 'test'
      0 row(s) in 1.1820 seconds
      hbase(main):009:0> enable 'test'
      0 row(s) in 0.1770 seconds
      hbase(main):011:0> drop 'test'
      0 row(s) in 0.1370 seconds
    • https://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/
  • 27. Users: Identifier
    • We need to represent users, each with a username, user ID, password, the set of users following them, the set of users they follow, and so on.
    • The first question is: how should we identify a user?
    • A solution is to associate a unique ID with every user.
    • Every other reference to the user is made by ID.
      – Create a table that stores all the IDs
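One way to hand out a unique ID per user is a counter that is incremented atomically, much as the shell's incr command does for a counter cell. The sketch below models that idea in plain Java. The class `UserIdAllocator` and the method `idFor` are hypothetical names, and the in-memory map merely stands in for the table of IDs mentioned on the slide.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory sketch of "one unique ID per user".
// In HBase the counter would be an atomically incremented cell (assumption).
class UserIdAllocator {
    // Stands in for the counter cell; the first allocated ID is 1.
    private final AtomicLong nextUserId = new AtomicLong(0);
    // Stands in for the table that maps usernames to IDs.
    private final ConcurrentMap<String, Long> idByUsername = new ConcurrentHashMap<>();

    // Returns the existing ID for a username, or allocates the next one.
    long idFor(String username) {
        return idByUsername.computeIfAbsent(username, u -> nextUserId.incrementAndGet());
    }

    public static void main(String[] args) {
        UserIdAllocator ids = new UserIdAllocator();
        System.out.println(ids.idFor("alice")); // prints 1
        System.out.println(ids.idFor("bob"));   // prints 2
        System.out.println(ids.idFor("alice")); // prints 1 (already registered)
    }
}
```

Every other record can then refer to the numeric ID, so the stored references do not depend on the username.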
  • 28. Users
      package HBaseIA.TwitBase.model;

      public abstract class User {
        public String user;
        public String name;
        public String email;
        public String password;

        @Override
        public String toString() {
          return String.format("<User: %s, %s, %s>", user, name, email);
        }
      }
  • 29. Twits
      // DateTime is assumed to be the Joda-Time class
      import org.joda.time.DateTime;

      public abstract class Twit {
        public String user;
        public DateTime dt;
        public String text;

        @Override
        public String toString() {
          return String.format("<Twit: %s %s %s>", user, dt, text);
        }
      }
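A Twit pairs a user with a timestamp. Since HBase returns rows in ascending row-key order, one common way to read a user's newest twits first is to put a reversed timestamp in the key. The sketch below shows a hypothetical layout: the class `TwitRowKey`, the `|` separator and the zero-padded decimal encoding are assumptions for illustration, not necessarily the key format TwitBase uses.

```java
import java.util.Locale;

// Hypothetical row-key layout for twits: "<user>|<reversed timestamp>".
// Illustration only; not necessarily the layout used by TwitBase.
class TwitRowKey {
    // Builds a key whose lexicographic order puts the newest twit first.
    static String of(String user, long epochMillis) {
        if (epochMillis < 0) {
            throw new IllegalArgumentException("epochMillis must be non-negative");
        }
        // A larger timestamp gives a smaller reversed value.
        long reversed = Long.MAX_VALUE - epochMillis;
        // Zero-pad to 19 digits (the width of Long.MAX_VALUE) so that
        // string order matches numeric order.
        return user + "|" + String.format(Locale.ROOT, "%019d", reversed);
    }

    public static void main(String[] args) {
        String older = of("alice", 1000L);
        String newer = of("alice", 2000L);
        // The newer key compares lower, so a scan would return it first.
        System.out.println(newer.compareTo(older) < 0); // prints true
    }
}
```

The fixed width matters: without padding, keys of different lengths would sort as strings rather than as numbers.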
  • 30. Followers, following and updates
    • A user might have users who follow them, which we'll call their followers.
    • A user might follow other users, which we'll call their following.
      public abstract class Relation {
        public String relation;
        public String from;
        public String to;

        @Override
        public String toString() {
          return String.format("<Relation: %s %s %s>", from, relation, to);
        }
      }
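The from/relation/to triple maps naturally onto a composite key in a sorted store: recording each edge once per direction lets both "who does X follow" and "who follows X" be answered with a prefix range. The sketch below models this with a sorted set. The class `RelationIndex`, the relation names and the `|` separator are hypothetical, and usernames are assumed not to contain `|`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// In-memory sketch of follow relations as composite keys "<from>|<relation>|<to>".
// Hypothetical layout for illustration; not the HBase client API.
class RelationIndex {
    static final String FOLLOWS = "follows";
    static final String FOLLOWED_BY = "followedBy";

    private final NavigableSet<String> keys = new TreeSet<>();

    private static String key(String from, String relation, String to) {
        return from + "|" + relation + "|" + to;
    }

    // Records that `follower` follows `followed`, once per direction.
    void follow(String follower, String followed) {
        keys.add(key(follower, FOLLOWS, followed));
        keys.add(key(followed, FOLLOWED_BY, follower));
    }

    // Returns the `to` part of every key that starts with "<user>|<relation>|".
    private List<String> targets(String user, String relation) {
        String prefix = key(user, relation, "");
        List<String> result = new ArrayList<>();
        // Range covering every key that begins with the prefix.
        for (String k : keys.subSet(prefix, true, prefix + Character.MAX_VALUE, false)) {
            result.add(k.substring(prefix.length()));
        }
        return result;
    }

    List<String> following(String user) { return targets(user, FOLLOWS); }

    List<String> followers(String user) { return targets(user, FOLLOWED_BY); }

    public static void main(String[] args) {
        RelationIndex index = new RelationIndex();
        index.follow("alice", "bob");
        index.follow("carol", "bob");
        System.out.println(index.followers("bob"));    // prints [alice, carol]
        System.out.println(index.following("alice"));  // prints [bob]
        System.out.println(index.followers("alice"));  // prints [] (following is not reciprocal)
    }
}
```

Storing the edge twice trades extra writes for cheap reads in both directions.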
  • 31. Let us analyze the code in depth
    • http://www.manning.com/dimidukkhurana/
    • https://github.com/hbaseinaction/twitbase
    • https://github.com/hbaseinaction

Editor's notes

  1. You need to run HBase on HDFS to ensure all writes are preserved. Running against the local filesystem is intended as a shortcut to get you familiar with how the general system works, as the very first phase of evaluation.
  2. We use the next_user_id key to always get a unique ID for every new user. Then we use this unique ID to name the key holding a hash with the user's data. This is a common design pattern with key-value stores. Keep it in mind.