HBase schema design Big Data TechCon Boston
1. HBase schema design
Amandeep Khurana | Solutions Architect
Big Data TechCon, Boston, April 2013
Friday, April 12, 13
2. About me
• Solutions Architect, Cloudera Inc
• Amazon Web Services
• Interested in large scale distributed systems
• Co-author, HBase in Action
• Twitter: amansk
[Book cover: HBase in Action, Nick Dimiduk and Amandeep Khurana, Manning]
3. About the talk
• Data model recap
• Data modeling thought process
• Tools and techniques
4. HBase is ...
• Column family oriented database
  • Column family oriented
  • Tables consisting of rows and columns
• Persisted Map
  • Sparse
  • Multi dimensional
  • Sorted
  • Indexed by rowkey, column and timestamp
• Key Value store
  • [rowkey, col family, col qualifier, timestamp] -> cell value
5. HBase is not ...
• A relational database
  • No SQL query language
  • No joins
  • No secondary indexing
  • No transactions
6. Data Model recap
It's not a relational database system
7. Important terms
• Table
  • Consists of rows and columns
• Row
  • Has a bunch of columns.
  • Identified by a rowkey (primary key)
• Column Qualifier
  • Dynamic column name
• Column Family
  • Column groups - logical and physical (similar access pattern)
• Cell
  • The actual element that contains the data for a row-column intersection
• Version
  • Every cell has multiple versions.
8. Data coordinates
• Row is addressed using rowkey
• Cell is addressed using [rowkey + family + qualifier]
9. Tabular representation
Column family: info
Rowkey        name                    email                   password
GrandpaD      Fyodor Dostoyevsky      fyodor@brothers.net     abc123
HMS_Surprise  Patrick O'Brien         aubrey@sea.com          abc123
SirDoyle      Sir Arthur Conan Doyle  art@TheQueensMen.co.uk  abc123
TheRealMT     Mark Twain              samuel@clemens.org      Langhorne (ts1=1329088321289), abc123 (ts2=1329088818321)
The table is lexicographically sorted on the rowkeys.
The coordinates used to identify data in an HBase table are: (1) rowkey, (2) column family, (3) column qualifier, (4) version.
Each cell has multiple versions, typically represented by the timestamp of when they were inserted into the table (ts2>ts1).
10. Key-Value store
Keys                                        Values
[TheRealMT, info, password, 1329088818321]  abc123
[TheRealMT, info, password, 1329088321289]  Langhorne
Each key together with its value is a single KeyValue instance.
11. Key-Value store
1 Start with coordinates of full precision
[TheRealMT, info, password, 1329088818321] -> abc123
2 Drop version and you're left with a map of version to values
[TheRealMT, info, password] ->
{
  1329088818321 : "abc123",
  1329088321289 : "Langhorne"
}
3 Omit qualifier and you have a map of qualifiers to the previous maps
[TheRealMT, info] ->
{
  "email" : {
    1329088321289 : "samuel@clemens.org"
  },
  "name" : {
    1329088321289 : "Mark Twain"
  },
  "password" : {
    1329088818321 : "abc123",
    1329088321289 : "Langhorne"
  }
}
4 Finally, drop the column family and you have a row, a map of maps
[TheRealMT] ->
{
  "info" : {
    "email" : {
      1329088321289 : "samuel@clemens.org"
    },
    "name" : {
      1329088321289 : "Mark Twain"
    },
    "password" : {
      1329088818321 : "abc123",
      1329088321289 : "Langhorne"
    }
  }
}
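The map-of-maps view of a row can be sketched with nested sorted maps. This is a minimal in-memory illustration, not the HBase client API: the class RowAsMapOfMaps and its methods are hypothetical names, and plain strings stand in for the byte arrays HBase actually stores.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of one row: family -> qualifier -> timestamp -> value.
class RowAsMapOfMaps {
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families =
            new TreeMap<>();

    // Store one version of a cell. Versions are kept newest first.
    void put(String family, String qualifier, long timestamp, String value) {
        families.computeIfAbsent(family, f -> new TreeMap<>())
                .computeIfAbsent(qualifier, q -> new TreeMap<Long, String>(Comparator.reverseOrder()))
                .put(timestamp, value);
    }

    // Dropping the version from the coordinates leaves a map of version to value.
    NavigableMap<Long, String> versions(String family, String qualifier) {
        return families.get(family).get(qualifier);
    }

    // The first entry is the newest version because of the reversed ordering.
    String latest(String family, String qualifier) {
        return versions(family, qualifier).firstEntry().getValue();
    }

    public static void main(String[] args) {
        RowAsMapOfMaps row = new RowAsMapOfMaps();
        row.put("info", "password", 1329088321289L, "Langhorne");
        row.put("info", "password", 1329088818321L, "abc123");
        System.out.println(row.latest("info", "password"));
        System.out.println(row.versions("info", "password"));
    }
}
```

Each level of nesting corresponds to one coordinate dropped from the full [rowkey, family, qualifier, timestamp] key.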
13. HFiles and physical data model
• HFiles are
  • Immutable
  • Sorted on rowkey + qualifier + timestamp
  • In the context of a column family per region
"TheRealMT" , "info" , "email" , 1329088321289 , "samuel@clemens.org"
"TheRealMT" , "info" , "name" , 1329088321289 , "Mark Twain"
"TheRealMT" , "info" , "password" , 1329088818321 , "abc123"
"TheRealMT" , "info" , "password" , 1329088321289 , "Langhorne"
HFile for the info column family in the users table
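The sort order in the HFile listing (rowkey, then qualifier, then newest timestamp first) can be sketched as a comparator. This is an illustration only: KeyValueEntry and HFileOrder are hypothetical names, and strings are compared here for brevity where HBase compares raw bytes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a KeyValue inside a single column family's HFile.
class KeyValueEntry {
    final String row;
    final String qualifier;
    final long timestamp;
    final String value;

    KeyValueEntry(String row, String qualifier, long timestamp, String value) {
        this.row = row;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.value = value;
    }
}

class HFileOrder {
    // Rowkey ascending, then qualifier ascending, then timestamp descending.
    static final Comparator<KeyValueEntry> ORDER = (a, b) -> {
        int byRow = a.row.compareTo(b.row);
        if (byRow != 0) return byRow;
        int byQualifier = a.qualifier.compareTo(b.qualifier);
        if (byQualifier != 0) return byQualifier;
        return Long.compare(b.timestamp, a.timestamp); // newest version first
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<KeyValueEntry> sorted(List<KeyValueEntry> entries) {
        List<KeyValueEntry> copy = new ArrayList<>(entries);
        copy.sort(ORDER);
        return copy;
    }
}
```

Sorting the four TheRealMT entries with this comparator reproduces the order of the listing: email, name, then the two password versions with the newer one first.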
14. Thinking through the design
... it's a database after all
15. But isn't HBase schema-less?
• Number of tables
• Rowkey design
• Number of column families per table. What goes into what column family
• Column qualifier names
• What goes into the cells
• Number of versions
16. Rowkeys
• Rowkey design is the single most important aspect of HBase table design
• The only way to address rows in HBase
17. TwitBase relationships
• Users follow users
• Relationships need to be persisted for usage later on
• Model tables for the expected access patterns
• Read pattern
  • Who does A follow?
  • Who follows A?
  • Does A follow B?
• Write pattern
  • A follows B
  • A unfollows B
18. Start simple
• Adjacency list
Column family: follows
row key: userid
column qualifier: followed user number
cell value: followed userid
Rowkey     follows (column qualifier:cell value)
TheFakeMT  1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers
TheRealMT  1:HRogers  2:Olivia
19. Optimizing the adjacency list
• We need a count
• Where does a new followed user go?
Rowkey     follows
TheFakeMT  1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers  count:4
TheRealMT  1:HRogers  2:Olivia  count:2
20. Adding a new user
Row that needs to be updated: TheFakeMT
Rowkey     follows
TheFakeMT  1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers  count:4
TheRealMT  1:HRogers  2:Olivia  count:2
Client code:
Step 1: Get current count
  TheFakeMT : follows: {count -> 4}
Step 2: Update count (increment count)
  TheFakeMT : follows: {count -> 5}
Step 3: Add new entry
  TheFakeMT : follows: {5 -> MTFanBoy2, count -> 5}
Step 4: Write the new data to HBase
Rowkey     follows
TheFakeMT  1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers  5:MTFanBoy2  count:5
TheRealMT  1:HRogers  2:Olivia  count:2
21. Transactions == not good
• HBase doesn't have native support (think scale)
• Don't want to complicate client side logic
• Only solution -> simplify schema
Rowkey     follows
TheFakeMT  TheRealMT:1  MTFanBoy:1  Olivia:1  HRogers:1
TheRealMT  HRogers:1  Olivia:1
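The difference between the counter-based adjacency list and the simplified schema can be sketched against an in-memory map. This is an assumption-laden illustration, not HBase client code: FollowsSchemas and its methods are hypothetical, and a TreeMap stands in for the table.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the follows table: rowkey -> qualifier -> value.
class FollowsSchemas {
    final NavigableMap<String, NavigableMap<String, String>> table = new TreeMap<>();

    private NavigableMap<String, String> row(String rowkey) {
        return table.computeIfAbsent(rowkey, k -> new TreeMap<>());
    }

    // Counter schema: a read followed by a dependent write. In a real cluster another
    // client could change the count between the two, hence the need for a transaction.
    void followWithCounter(String follower, String followed) {
        NavigableMap<String, String> row = row(follower);
        int count = Integer.parseInt(row.getOrDefault("count", "0")); // step 1: get count
        int next = count + 1;                                          // step 2: update count
        row.put(String.valueOf(next), followed);                       // step 3: add new entry
        row.put("count", String.valueOf(next));                        // step 4: write back
    }

    // Simplified schema: the followed userid is the qualifier, so a single write suffices.
    void follow(String follower, String followed) {
        row(follower).put(followed, "1");
    }

    void unfollow(String follower, String followed) {
        row(follower).remove(followed);
    }

    // "Does A follow B?" becomes a lookup of one known qualifier.
    boolean follows(String follower, String followed) {
        return table.containsKey(follower) && table.get(follower).containsKey(followed);
    }
}
```

With the simplified schema, following and unfollowing are independent single-cell writes, so no read-modify-write cycle is needed on the client.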
22. Revisit the questions
• Read pattern
  • Who all does A follow?
  • Who all follows A?
  • Does A follow B?
• Write pattern
  • A follows B
  • A unfollows B
24. Denormalization
• Second table for reverse relationship
• Otherwise scan across entire table and affect read performance
[Figure: chart with axes "Write performance" and "Read performance" and regions labeled Normalization, Dreamland, Poor design, Denormalization]
25. More optimizations
• Convert into tall-narrow table
• Leverage rowkey indexing better
• Gets -> short Scans
CF : f
row key: follower + followed
CQ: followed user's name
cell value: 1
Keeping the column family and column qualifier names short reduces the data transferred over the network back to the client. The KeyValue objects become smaller.
The + in the row key refers to concatenating the two values. You could delimit using any character you like, e.g. A-B or A,B
26. Tall-narrow table example
• Denormalization is the way to go
Rowkey               f (column qualifier:cell value)
TheFakeMT+TheRealMT  Mark Twain:1
TheFakeMT+MTFanBoy   Amandeep Khurana:1
TheFakeMT+Olivia     Olivia Clemens:1
TheFakeMT+HRogers    Henry Rogers:1
TheRealMT+Olivia     Olivia Clemens:1
TheRealMT+HRogers    Henry Rogers:1
Putting the user name in the column qualifier saves you from looking up the users table for the name of the user given an id. You can simply list out names or ids while looking at relationships just from this table. The downside of this is that you need to update the name in all the cells if the user updates their name in their profile. This is classic denormalization.
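How the tall-narrow layout turns "Who does A follow?" into a short scan can be sketched with a sorted map. This is a minimal illustration under assumptions: TallNarrowFollows is a hypothetical class, the '+' delimiter follows the slide, and for brevity the followed user's name is stored as the map value rather than as a column qualifier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the tall-narrow table, sorted by rowkey.
class TallNarrowFollows {
    private static final char DELIMITER = '+';

    // rowkey (follower + followed) -> followed user's name
    private final NavigableMap<String, String> rows = new TreeMap<>();

    void follow(String follower, String followed, String followedName) {
        rows.put(follower + DELIMITER + followed, followedName);
    }

    // "Does A follow B?" is a single-row lookup, the equivalent of a Get.
    boolean follows(String follower, String followed) {
        return rows.containsKey(follower + DELIMITER + followed);
    }

    // "Who does A follow?" is a short scan over the rows sharing the prefix "A+".
    List<String> followedBy(String follower) {
        String start = follower + DELIMITER;
        String stop = follower + (char) (DELIMITER + 1); // first possible key past the prefix
        return new ArrayList<>(rows.subMap(start, true, stop, false).values());
    }
}
```

Results come back in rowkey order, so for TheFakeMT the followed users are listed by their ids: HRogers, MTFanBoy, Olivia, TheRealMT.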
27. Uniform rowkey length
• MD5 the userids -> 16 bytes + 16 bytes rowkeys
• Better distribution of load
CF : f
row key: md5(follower)md5(followed)
CQ: followed userid
cell value: followed user's name
Using MD5 of the user ids gives you fixed lengths instead of variable length user ids. You don't need concatenation logic anymore.
28. Uniform rowkey length (continued)
Rowkey                         f (column qualifier:cell value)
MD5(TheFakeMT) MD5(TheRealMT)  TheRealMT:Mark Twain
MD5(TheFakeMT) MD5(MTFanBoy)   MTFanBoy:Amandeep Khurana
MD5(TheFakeMT) MD5(Olivia)     Olivia:Olivia Clemens
MD5(TheFakeMT) MD5(HRogers)    HRogers:Henry Rogers
MD5(TheRealMT) MD5(Olivia)     Olivia:Olivia Clemens
MD5(TheRealMT) MD5(HRogers)    HRogers:Henry Rogers
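Building the fixed-length md5(follower)md5(followed) rowkey can be sketched with the JDK's MessageDigest. The class Md5Rowkey and its method names are hypothetical, and UTF-8 encoding of the userids is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Rowkey {
    // MD5 always yields 16 bytes, whatever the length of the userid.
    static byte[] md5(String userId) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(userId.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // 16 bytes of md5(follower) followed by 16 bytes of md5(followed): always 32 bytes.
    static byte[] rowkey(String follower, String followed) {
        byte[] key = new byte[32];
        System.arraycopy(md5(follower), 0, key, 0, 16);
        System.arraycopy(md5(followed), 0, key, 16, 16);
        return key;
    }
}
```

Because every key is exactly 32 bytes, the first 16 bytes can serve as the scan prefix for one follower without any delimiter handling.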
29. Tall v/s Wide tables storage footprint
Logical representation of an HBase table.
We'll look at what it means to Get() row r5 from this table.
     CF1           CF2
r1   c1:v1         c1:v9  c6:v2
r2   c1:v2  c3:v6
r3   c2:v3         c5:v6
r4   c2:v4
r5   c1:v1  c3:v5  c7:v8
Actual physical storage of the table
HFile for CF1    HFile for CF2
r1:CF1:c1:t1:v1  r1:CF2:c1:t1:v9
r2:CF1:c1:t2:v2  r1:CF2:c6:t4:v2
r2:CF1:c3:t3:v6  r3:CF2:c5:t4:v6
r3:CF1:c2:t1:v3  r5:CF2:c7:t3:v8
r4:CF1:c2:t1:v4
r5:CF1:c1:t2:v1
r5:CF1:c3:t3:v5
Result object returned for a Get() on row r5 (KeyValue objects)
r5:CF1:c1:t2:v1
r5:CF1:c3:t3:v5
r5:CF2:c7:t3:v8
Structure of a KeyValue object
Key: Row Key, Col Fam, Col Qual, Time Stamp
Value: Cell Value
30. Rowkey design
• Single most important aspect of designing tables
• Depends on expected access patterns
• HFiles are sorted on Key part of KeyValue objects
"TheRealMT" , "info" , "email" , 1329088321289 , "samuel@clemens.org"
"TheRealMT" , "info" , "name" , 1329088321289 , "Mark Twain"
"TheRealMT" , "info" , "password" , 1329088818321 , "abc123"
"TheRealMT" , "info" , "password" , 1329088321289 , "Langhorne"
HFile for the info column family in the users table
31. Write optimized
• Distribute writes across the cluster
  • Issue most pronounced with time series data
• Hashing
hash("TheRealMT") -> random byte[]
• Salting
// numRegionServers stands in for <number of region servers>
// floorMod keeps the salt non-negative even when the hash code is negative
int salt = Math.floorMod(Long.hashCode(timestamp), numRegionServers);
byte[] rowkey = Bytes.add(Bytes.toBytes(salt), Bytes.toBytes("|"),
    Bytes.toBytes(timestamp));
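The salting idea can also be sketched with the JDK alone, which makes the byte layout explicit. This is an illustrative sketch rather than the speaker's code: SaltedRowkey is a hypothetical class, the bucket count is a parameter the caller chooses (for example, the number of region servers), and ByteBuffer replaces the HBase Bytes helper.

```java
import java.nio.ByteBuffer;

class SaltedRowkey {
    // Salt derived from the timestamp, always in the range [0, buckets).
    static int salt(long timestamp, int buckets) {
        return Math.floorMod(Long.hashCode(timestamp), buckets);
    }

    // Layout: 4-byte salt, one '|' delimiter byte, 8-byte timestamp (13 bytes in total).
    static byte[] rowkey(long timestamp, int buckets) {
        return ByteBuffer.allocate(4 + 1 + 8)
                .putInt(salt(timestamp, buckets))
                .put((byte) '|')
                .putLong(timestamp)
                .array();
    }
}
```

The trade-off is on the read side: rows for a contiguous time range are now spread over all salt values, so a reader has to scan each bucket and merge the results.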
32. Read optimized
• Data to be accessed together should be stored together
• eg: twit streams - last 10 twits by the users I follow
[Figure: the same twits under two rowkey layouts, each column listed in sorted order]
Olivia1     1Olivia
Olivia2     1TheRealMT
Olivia5     2Olivia
Olivia7     2TheFakeMT
Olivia9     2TheRealMT
TheFakeMT2  3TheFakeMT
TheFakeMT3  4TheFakeMT
TheFakeMT4  5Olivia
TheFakeMT5  5TheFakeMT
TheFakeMT6  5TheRealMT
TheRealMT1  6TheFakeMT
TheRealMT2  7Olivia
TheRealMT5  8TheRealMT
TheRealMT8  9Olivia
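How the two rowkey layouts in the figure order the same twits can be sketched with a sorted set. This is only an illustration of sort order: TwitKeyLayouts and its methods are hypothetical, the single-digit ids mirror the figure, and larger ids would need zero-padding to sort numerically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TwitKeyLayouts {
    // User-first layout: one user's twits are adjacent in sorted order.
    static String userFirst(String user, int twitId) {
        return user + twitId;
    }

    // Id-first layout: twits from different users interleave in sorted order.
    static String idFirst(String user, int twitId) {
        return twitId + user;
    }

    // Under the user-first layout, one user's twits form a single contiguous key range.
    // Simplification: another userid that starts with this one would also match.
    static List<String> twitsOf(TreeSet<String> userFirstKeys, String user) {
        return new ArrayList<>(userFirstKeys.subSet(user, user + Character.MAX_VALUE));
    }
}
```

Which layout stores the right data together depends on the read pattern: the user-first layout serves "all twits by one user" as one range, while the id-first layout groups twits written around the same time.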
33. Relational to Non relational
• Relational concepts
  • Entities
  • Attributes
  • Relationships
• Entities
  • Table is a table. Not much going on there
  • Users table contains... users. Those are entities
  • Good place to start
34. Relational to Non relational
• Attributes
  • Identifying
    • Primary keys. Compound keys
    • Maps to rowkeys
  • Non-identifying
    • Other columns
    • Maps to column qualifiers and cells
• Relationships
  • Foreign keys, junction tables, joins.
  • Non-existent in HBase. Instead try to denormalize
35. Nested Entities
• Column Qualifiers can contain data instead of just a column name
[Figure: structure of a table with nested entities]
hbase table
  row key
    column family
      fixed qualifier → timestamp → value
Nested entities
  repeating entity
    variable qualifier → timestamp → value
36. Schema design summary
• Schema can make or break the performance you get
• Rowkey is the single most important thing
  • Use tricks like hashing and salting
• Denormalize to your advantage
  • There are no joins
• Isolate access patterns
  • Separate CFs or even separate tables
• Shorter names -> lower storage footprint
• Column qualifiers can be used to store data and not just column names
  • Big difference as compared to RDBMS