www.huawei.com
HUAWEI TECHNOLOGIES CO., LTD.
Faster HBase queries
Introducing hindex – Secondary indexes for HBase
Bhupendra Kumar Jain
bhupendra.jain@huawei.com
HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 2
$ whoami
• Senior System Architect @ Huawei India
• System Lead for Huawei Hadoop-HBase component
• Apache HBase Contributor
• 10+ years of experience in the Business Intelligence domain
Agenda
• HBase – A brief introduction
• Introduction to hindex
• Usage
• Test Results
Apache HBase
• An open-source, distributed, versioned, non-relational database
• Modeled after Google's BigTable
• Leverages the distributed data storage provided by HDFS
• Allows random read/write access to data in HDFS
• Goal: hosting very large tables, billions of rows × millions of columns
Example table:
key  name  gender  dept  title  mobile  salary  dob
123  Raj   M       OIH   SE     534     1230    …
135  Ram   M       OIH   SSE    254     1750    …
141  Anu   M       OIH   TL     123     2100    …
142  Pia   F       BDI   TL     326     2270    …
145  Jay   M       SOA   SA     521     4300    …
148  Suma  F       SOA   SSE    126     1550    …
152  Som   M       OIH   SE     665     1270    …
…    …     …       …     …      …       ..      …
HBase – Introduction
• Sorted: rows are stored in lexicographic order of their keys
• Sparse: a row can have any number of columns, and empty cells are not stored
• Multi-dimensional: data is versioned
Example, Key=123 (title is SE at T1 and SSE at T2):
column family  column  timestamp  value
cf1            name    T1         Raj
cf1            gender  T1         M
cf1            dept    T1         OIH
cf1            title   T1         SE
cf1            title   T2         SSE
cf1            mobile  T1         534
cf2            salary  T3         1230
cf2            dob     T1         19880830
(key, column family, column, timestamp) -> value
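The four-part address above can be pictured as a nested map. The following is a minimal Python sketch of that idea, not HBase code: the `Row` class and its `put`/`get` methods are hypothetical names chosen for illustration, and the integer timestamps 1, 2 and 3 stand in for T1, T2 and T3.

```python
from collections import defaultdict


class Row:
    """All cells of one row key (illustrative model, not the HBase API)."""

    def __init__(self):
        # (column family, column) -> {timestamp: value}
        self.cells = defaultdict(dict)

    def put(self, family, column, timestamp, value):
        # A new timestamp adds a version; older versions are kept.
        self.cells[(family, column)][timestamp] = value

    def get(self, family, column, timestamp=None):
        versions = self.cells.get((family, column))
        if not versions:
            return None  # sparse: a missing cell simply does not exist
        if timestamp is None:
            timestamp = max(versions)  # default to the newest version
        return versions.get(timestamp)


row = Row()  # plays the role of Key=123
row.put("cf1", "title", 1, "SE")
row.put("cf1", "title", 2, "SSE")
row.put("cf2", "salary", 3, "1230")
```

Reading `row.get("cf1", "title")` returns the newest version, while passing `timestamp=1` returns the older one.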
HBase – Introduction
• Distributed: leverages HDFS; data is replicated across nodes, which protects against a failing node
• Auto-sharded: a table is divided into regions, which are split and redistributed as data grows
• Column families: within a region, the data of each column family is stored in HFiles (HFile1 to HFile4 in the example)
• META records the key range of each region:
Region  Range
R1      120-145
R2      145-170
R3      …
…       …
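The META lookup can be sketched as a search over sorted region start keys. This is an illustration only: the `META` list and `locate_region` function are hypothetical names, the ranges are the two shown on the slide, and the end key of each range is treated as exclusive (an assumption suggested by 145 appearing in both ranges).

```python
import bisect

# (start key, end key, region name), sorted by start key.
META = [("120", "145", "R1"), ("145", "170", "R2")]


def locate_region(row_key, meta=META):
    """Return the region whose range [start, end) contains row_key, or None."""
    starts = [start for start, _end, _name in meta]
    i = bisect.bisect_right(starts, row_key) - 1  # last region starting <= row_key
    if i < 0:
        return None  # row_key sorts before the first region
    _start, end, name = meta[i]
    return name if row_key < end else None
```

Keys are compared as strings, which matches the lexicographic ordering of row keys.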
HBase – Introduction
[Diagram: HMaster and two region servers (Region Server 1, Region Server 2) hosting Region 1 and Region 2. A region has one store per column family (Store cf1, Store cf2); each store has a Memstore and HFiles (HFile1 to HFile4); each HFile consists of blocks HBlock1 to HBlockN.]
• Master, region servers and ZooKeeper
• Tables are divided horizontally into regions
• Columns are grouped into column families, a vertical partition of the table
• Memstore in memory, HFiles in DFS. HFiles are logically split into smaller blocks; data is read and written in blocks
• MapReduce integration
Coprocessors
• Allow client-supplied code to run on the server side
• Extend the functionality of HBase without changing the kernel
• Observers, like triggers
  • Run extended functionality before or after an action (pre-action, action, post-action) through hooks provided by the coprocessor framework
• Endpoints, like stored procedures
  • Can be invoked at any time from the client
  • The endpoint implementation is then executed remotely at the target region(s)
  • Results from those executions are returned to the client
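The observer idea (pre-action, action, post-action) can be sketched in a few lines of Python. This is a toy model, not the HBase coprocessor framework: `Region`, `AuditObserver`, `pre_put` and `post_put` are hypothetical names that only loosely mirror the pre/post hook pattern.

```python
class Region:
    """Toy region that calls observer hooks around each put."""

    def __init__(self):
        self.rows = {}
        self.observers = []

    def put(self, key, value):
        for observer in self.observers:
            observer.pre_put(self, key, value)   # pre-action hook
        self.rows[key] = value                   # the action itself
        for observer in self.observers:
            observer.post_put(self, key, value)  # post-action hook


class AuditObserver:
    """Example observer that records when its hooks fire."""

    def __init__(self):
        self.log = []

    def pre_put(self, region, key, value):
        self.log.append(("pre", key))

    def post_put(self, region, key, value):
        self.log.append(("post", key))


region = Region()
audit = AuditObserver()
region.observers.append(audit)
region.put("123", "Raj")
```

The client only calls `put`; the extra behaviour runs on the server side without any change to the `Region` code path itself.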
Filters
Source: Lars George, HBase: The Definitive Guide, O'Reilly Media, 2011
HBase – Query by Rowkey
SELECT NAME
FROM PERSON
WHERE KEY=141
• A lookup by row key goes straight to the matching row (here key 141, Anu)
NOTE: HBase has no native SQL-like query interface. SQL is used here only for ease of understanding; an equivalent query can be expressed with scanners and filters.
HBase – Query by Column (w/o Index)
SELECT NAME
FROM PERSON
WHERE MOBILE=123
• Without an index on the column, this query needs a full table scan
• With billions of records, a full table scan severely hurts performance
• Side effects: client timeouts and expiring leases
Introducing hindex
• Coprocessor-based, server-side implementation of secondary indexing
• One separate index table per user table, shared by all indexes of that table
• Region-wise indexing (aka local indexing)
• A custom load balancer co-locates the index table regions with the actual table regions
• Index table rowkey construction:
region start key + index name + indexed column(s) value + user table rowkey
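The rowkey construction above can be sketched as a plain concatenation. The function name `index_row_key` and the sample values are hypothetical, and the sketch glosses over how the parts are delimited or padded; a real implementation would need some way to keep the fields from running together.

```python
def index_row_key(region_start_key, index_name, column_values, user_row_key):
    """Concatenate the four parts in the order given above."""
    return region_start_key + index_name + "".join(column_values) + user_row_key


# Index entries for three user rows in a region that starts at key "A".
keys = sorted(
    index_row_key("A", "idx1", [value], row)
    for row, value in [("AAB", "5"), ("AAC", "3"), ("ABA", "5")]
)
```

Because the indexed value comes before the user rowkey, entries with the same value sort next to each other, so a lookup by value becomes a short range scan.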
hindex: Architecture
[Diagram: Client App with HBase Client and ClientExt; HMaster with Balancer and Coprocessor Host; RegionServer with Coprocessor Host; Indexing Coprocessor.]
Primary user table      Secondary index table
RowKey  cf1:col1        RowKey (index table cf)
001     A               001_A_001
002     B               001_A_005
003     Z               001_A_006
004     C               001_B_002
005     A               001_C_004
006     A               001_Z_003
…       …               …
• The coprocessor maintains the index data
• A custom LoadBalancer handles co-location
• The client extension allows index details to be specified when a table is created; it is not needed for reads and writes
HBase – Query by Column (w/ Index)
SELECT NAME
FROM PERSON
WHERE MOBILE=123
Index column: MOBILE
idx1 entries (mobile_key), kept per region:
Region with keys 123-142    Region with keys 145-152
123_141                     126_148
254_135                     521_145
326_142                     665_152
534_123                     …
• The index entry 123_141 leads directly to row 141
• The index is maintained per region, not globally
• Network calls are avoided, because the index entries sit next to the rows they point to
• Region movement and region splits have to be handled
Regions Co-location
[Diagram: Client, HMaster with Balancer, and region servers RS1 and RS2. The actual table and the index table each have regions R1 (A-B) and R2 (B-C). Labelled step: create table with regions.]
Put operation
• Table -> t1, column family -> cf
• Indexes -> idx1(cf:q1) and idx2(cf:q2)
• Index table -> t1_idx
[Diagram: Client, HRegionServer with Coprocessor, user region R1 (A-B) and index region R1 (A-B).]
Client:       Put 't1', 'AAB', 'cf:q1', '5', 'cf:q2', 'z1'
Coprocessor:  Put 't1_idx', 'Aidx15AAB'
              Put 't1_idx', 'Aidx2z1AAB'
Scan Operation
1) Create a scanner for the index region at the server side
[Diagram: Client, HRegionServer with Coprocessor, user region R1 (A-B) and index region R1 (A-B).]
Client:       create scanner (condition cf:q1=5)
Coprocessor:  create scanner on index region
              Start row: Aidx15
              Stop row:  Aidx16
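The stop row is the start row with its last byte incremented, so the range covers every key that begins with the start row. A minimal sketch of that computation follows; `prefix_scan_range` is a hypothetical helper name, and the sample index keys are made up for illustration.

```python
def prefix_scan_range(prefix: bytes):
    """Return (start, stop) such that start <= key < stop for every key
    beginning with prefix. stop is None when no upper bound exists."""
    stop = bytearray(prefix)
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    while stop and stop[-1] == 0xFF:
        stop.pop()
    if not stop:
        return prefix, None
    stop[-1] += 1
    return prefix, bytes(stop)


start, stop = prefix_scan_range(b"Aidx15")

# Hypothetical contents of one index region, in sorted order.
index_keys = [b"Aidx14AAA", b"Aidx15AAB", b"Aidx15ABC", b"Aidx16AAD", b"Aidx2z1AAB"]
matches = [key for key in index_keys if start <= key < stop]
```

Only the entries for cf:q1=5 fall inside the range; everything else in the index region is never touched.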
Scan Operation
2) Scan the index table at the server side and seek to the exact rows in the user table
[Diagram: Client, HRegionServer with Coprocessor, user region R1 and index region R1; numbered steps 1 to 5, labelled next() and "seek to exact row".]
Scan Operation
• The coprocessor reads the index and seeks to the exact rows in the user table
• Seeks on HFiles use the row keys obtained from the index data
  • HFiles are read block by block
  • The default block size is 64 KB
  • Blocks in which the data cannot be present are not read from HDFS at all
  • Sometimes a whole HFile is skipped
• Index details never have to be sent back to the client, which avoids extra network usage
hindex: Usage
SELECT NAME
FROM PERSON
WHERE (DEPT='OIH' OR TITLE='TL')
AND (MOBILE > 400 AND MOBILE < 500)
• Conditions are expressed as filters (with equality or range conditions)
• AND -> filter list with MUST_PASS_ALL
• OR -> filter list with MUST_PASS_ONE
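How the WHERE clause maps onto nested filter lists can be sketched in Python. This models only the boolean structure, not the HBase filter API: `column_filter` and `filter_list` are hypothetical helpers, rows are plain dictionaries, and the range condition is read as 400 < MOBILE < 500 (an assumption about the intended query).

```python
import operator

MUST_PASS_ALL = "MUST_PASS_ALL"  # AND
MUST_PASS_ONE = "MUST_PASS_ONE"  # OR


def column_filter(column, compare, value):
    """Equality or range condition on a single column."""
    return lambda row: column in row and compare(row[column], value)


def filter_list(mode, *filters):
    """Combine filters with AND (MUST_PASS_ALL) or OR (MUST_PASS_ONE)."""
    combine = all if mode == MUST_PASS_ALL else any
    return lambda row: combine(f(row) for f in filters)


query = filter_list(
    MUST_PASS_ALL,
    filter_list(
        MUST_PASS_ONE,
        column_filter("dept", operator.eq, "OIH"),
        column_filter("title", operator.eq, "TL"),
    ),
    filter_list(
        MUST_PASS_ALL,
        column_filter("mobile", operator.gt, 400),
        column_filter("mobile", operator.lt, 500),
    ),
)
```

Calling `query(row)` on a dictionary such as `{"dept": "OIH", "title": "SE", "mobile": 450}` returns whether the row passes the whole expression.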
HBase – Query with AND
SELECT NAME
FROM PERSON
WHERE DEPT=OIH
AND TITLE=TL
key  name  gender  dept  title  mobile
123  Raj   M       OIH   SE     534
135  Ram   M       OIH   SSE    254
141  Anu   M       OIH   TL     123
142  Pia   F       BDI   TL     326
145  Jay   M       SOA   SA     521
148  Suma  F       SOA   SSE    126
152  Som   M       OIH   TL     665
…    …     …       …     …      …
idx1 (dept) and idx2 (title) entries, stored together in one index table:
idx1_BDI_142    idx2_SA_145
idx1_OIH_123    idx2_SE_123
idx1_OIH_135    idx2_SSE_135
idx1_OIH_141    idx2_SSE_148
idx1_OIH_152    idx2_TL_141
idx1_SOA_145    idx2_TL_142
idx1_SOA_148    idx2_TL_152
…               …
1) A single index table per user table is easier to co-locate
2) Index table row keys contain the index name, so the data of each index is stored together
Create two scanners:
1) start: idx1_OIH, end: idx1_OII
2) start: idx2_TL, end: idx2_TM
HBase – Query with AND
SELECT NAME
FROM PERSON
WHERE DEPT=OIH
AND TITLE=TL
key  name  gender  dept  title  mobile
123  Raj   M       OIH   SE     534
135  Ram   M       SOA   SSE    254
141  Anu   M       OIH   SA     123
142  Pia   F       OIH   TL     326
145  Jay   M       SOA   TL     521
148  Suma  F       SOA   TL     126
150        M       SOA   TL     325
152  Som   M       OIH   TL     665
160  Su    F       BDI   TL     928
…    …     …       …     …      …
idx1 (dept): idx1_BDI_160, idx1_OIH_123, idx1_OIH_141, idx1_OIH_142, idx1_OIH_152, idx1_SOA_135, idx1_SOA_145, idx1_SOA_148, idx1_SOA_150, …
idx2 (title): idx2_SA_141, …, idx2_TL_142, idx2_TL_145, idx2_TL_148, idx2_TL_150, idx2_TL_152, idx2_TL_160
Walkthrough:
1. Scanner-1 is at 123 and scanner-2 at 142. 123 < 142, so scanner-1 can jump to idx1_OIH_142.
2. 142 = 142 meets the condition: fetch the required data (Pia) and move both scanners to the next entry.
3. Scanner-1 is at 152 and scanner-2 at 145. 152 > 145, so scanner-2 can jump to idx2_TL_152.
4. 152 = 152 meets the condition: fetch the required data (Som) and move both scanners to the next entry.
5. Scanner-1 reaches its end: close both scanners.
Result: Pia, Som
Two scanners:
1) start: idx1_OIH, end: idx1_OII
2) start: idx2_TL, end: idx2_TM
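The two-scanner AND evaluation is a sorted-list intersection in which the scanner that is behind seeks forward instead of stepping entry by entry. A minimal sketch, assuming each scanner yields the user row keys of its index range in sorted order; `intersect` is a hypothetical name and `bisect_left` stands in for a scanner seek.

```python
import bisect


def intersect(keys1, keys2):
    """Row keys present in both sorted lists (AND of two index scans)."""
    result = []
    i = j = 0
    while i < len(keys1) and j < len(keys2):
        a, b = keys1[i], keys2[j]
        if a == b:
            result.append(a)  # both conditions hold for this row
            i += 1
            j += 1
        elif a < b:
            i = bisect.bisect_left(keys1, b, i)  # scanner-1 jumps ahead to b
        else:
            j = bisect.bisect_left(keys2, a, j)  # scanner-2 jumps ahead to a
    return result  # stops as soon as either scanner reaches its end


dept_oih = ["123", "141", "142", "152"]               # idx1_OIH_*
title_tl = ["142", "145", "148", "150", "152", "160"]  # idx2_TL_*
matching_rows = intersect(dept_oih, title_tl)
```

With the slide data, the loop takes the same jumps as the walkthrough and ends with rows 142 and 152 (Pia and Som).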
HBase – Query with AND (w/ 2-column index)
SELECT NAME
FROM PERSON
WHERE DEPT=OIH
AND TITLE=TL
The user table is the same as in the previous AND example. idx1 is now a single index on (dept, title):
idx1_BDI_TL_160
idx1_OIH_SA_141
idx1_OIH_SE_123
idx1_OIH_TL_142
idx1_OIH_TL_152
idx1_SOA_SSE_135
idx1_SOA_TL_145
idx1_SOA_TL_148
idx1_SOA_TL_150
…
1 scanner:
start: idx1_OIH_TL, end: idx1_OIH_TM
Walkthrough:
1. idx1_OIH_TL_142 meets the condition: fetch the required data (Pia).
2. idx1_OIH_TL_152 meets the condition: fetch the required data (Som).
3. The scanner reaches its end: close the scanner.
Result: Pia, Som
HBase – Query with OR
SELECT NAME
FROM PERSON
WHERE DEPT=OIH
OR TITLE=TL
The user table and the index entries are the same as in the two-scanner AND walkthrough:
idx1 (dept): idx1_BDI_160, idx1_OIH_123, idx1_OIH_141, idx1_OIH_142, idx1_OIH_152, idx1_SOA_135, idx1_SOA_145, idx1_SOA_148, idx1_SOA_150, …
idx2 (title): idx2_SA_141, …, idx2_TL_142, idx2_TL_145, idx2_TL_148, idx2_TL_150, idx2_TL_152, idx2_TL_160
Walkthrough:
1. Scanner-1 is at 123 and scanner-2 at 142. 123 <= 142: get the required results (Raj) and keep moving scanner-1 to the next entry until it exceeds 142.
2. 141 <= 142: get the required results (Anu) and move scanner-1 again.
3. 142 == 142: get the required results (Pia) and move both scanners in this case.
4. Scanner-1 is at 152 and scanner-2 at 145. 152 >= 145, so scanner-2 is behind: get the required results (Jay) and keep moving scanner-2 to the next entry until it exceeds 152.
5. Scanner-2 passes 148 (Suma) and 150 in the same way.
6. 152 == 152: get the required results (Som).
Results so far: Raj, Anu, Pia, Jay, Suma, Som
Two scanners:
1) start: idx1_OIH, end: idx1_OII
2) start: idx2_TL, end: idx2_TM
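The OR evaluation is a merge of two sorted key streams that emits each row once. A minimal sketch under the same assumption as before, namely that each scanner yields user row keys in sorted order; `union` is a hypothetical name.

```python
def union(keys1, keys2):
    """Row keys present in either sorted list, without duplicates."""
    result = []
    i = j = 0
    while i < len(keys1) and j < len(keys2):
        a, b = keys1[i], keys2[j]
        if a == b:
            result.append(a)  # matched by both indexes: emit once, move both
            i += 1
            j += 1
        elif a < b:
            result.append(a)  # scanner-1 is behind: emit and advance it
            i += 1
        else:
            result.append(b)  # scanner-2 is behind: emit and advance it
            j += 1
    # One scanner is exhausted; the rest of the other one all qualifies.
    result.extend(keys1[i:])
    result.extend(keys2[j:])
    return result


dept_oih = ["123", "141", "142", "152"]               # idx1_OIH_*
title_tl = ["142", "145", "148", "150", "152", "160"]  # idx2_TL_*
matching_rows = union(dept_oih, title_tl)
```

Because both inputs are sorted by user row key, the result is also in row key order, which lets the server fetch the user rows with forward seeks only.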
HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 67
Region Split
Rowkey cf:col1
01 A
02 A
03 C
04 B
05 X
06 A
07 A
01
09
Rowkey cf
01_A_01
01_A_02
01_A_06
01_A_07
01_B_04
01_C_03
01_X_05
01
09
Split at
this point
User table region Index table region
Rowkey cf:col1
01 A
02 A
03 C
04 B
Rowkey cf:col1
05 X
06 A
07 A
Rowkey cf
01_A_01
01_A_02
01_B_04
01_C_03
01
05
05
09
01
05
Rowkey Cf
05_A_06
05_A_07
05_X_05
05
09
 Explicit split on index region is
avoided (using custom split policy
for index tables)
 When user table region splits,
corresponding index region also
splits
 Split key for index region same as
that of user region
HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 68
Region Split
User table region (start key 01, end key 09)
Rowkey   cf:col1
01       A
02       A
03       C
04       B
05       X
06       A
07       A
Index table region (start key 01, end key 09)
Rowkey (cf)
01_A_01
01_A_02
01_A_06
01_A_07
01_B_04
01_C_03
01_X_05
User table region: HalfStoreFileReader – Daughter A, HalfStoreFileReader – Daughter B
Index table region: IndexHalfStoreFileReader – Daughter A, IndexHalfStoreFileReader – Daughter B
 Custom HalfStoreFileReader for reading index daughter regions.
 IndexHalfStoreFileReader – both half store file readers start at the same point, i.e. the beginning of the file
 Checks the actual table rowkey and decides whether the KV belongs to the daughter or not
 IndexHalfStoreFileReader for daughter B changes the key as per the daughter start key.
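A rough Python model of that reading logic follows. It is an illustration under simplifying assumptions, not the IndexHalfStoreFileReader implementation: the parent's index file is a plain sorted list, fields are separated by underscores, and `read_daughter` is a made-up name.

```python
# Parent index file for region [01, 09), sorted by indexed value and then user rowkey.
PARENT_INDEX_FILE = ["01_A_01", "01_A_02", "01_A_06", "01_A_07", "01_B_04", "01_C_03", "01_X_05"]

def read_daughter(parent_file, split_key, daughter):
    """Toy half reader over the parent's index file.

    Both daughters scan from the beginning of the file, because entries are
    ordered by indexed value rather than by user rowkey. Each entry is kept
    or skipped by inspecting the user rowkey at the end of the index rowkey.
    """
    out = []
    for entry in parent_file:
        _start, value, user_key = entry.split("_")
        if daughter == "A" and user_key < split_key:
            out.append(entry)  # daughter A keeps the parent's start key
        elif daughter == "B" and user_key >= split_key:
            # Daughter B rewrites the prefix to its own start key (the split key).
            out.append(f"{split_key}_{value}_{user_key}")
    return out

print(read_daughter(PARENT_INDEX_FILE, "05", "A"))
print(read_daughter(PARENT_INDEX_FILE, "05", "B"))
```

A midpoint cut of the file, which works for user table store files, would not work here: entries for user rowkeys 06 and 07 sit near the top of the index file because their indexed value is A.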
Usage: Getting Started
 For HBase 0.94.x
 https://github.com/Huawei-Hadoop/hindex/
 For HBase 0.98 or trunk
 https://issues.apache.org/jira/browse/HBASE-10222
Usage: Configurations
Name                               Value
hbase.coprocessor.master.classes   org.apache.hadoop.hbase.index.coprocessor.master.IndexMasterObserver
hbase.coprocessor.region.classes   org.apache.hadoop.hbase.index.coprocessor.regionserver.IndexRegionObserver
hbase.coprocessor.wal.classes      org.apache.hadoop.hbase.index.coprocessor.wal.IndexWALObserver
hbase.master.loadbalancer.class    org.apache.hadoop.hbase.index.SecIndexLoadBalancer
hbase.use.secondary.index          true
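The slide does not say where these settings go; assuming they belong in hbase-site.xml like other HBase properties, the short Python sketch below renders the table as `<property>` entries. The property names and values are copied from the table; the helper `render_hbase_site` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Name/value pairs copied from the configuration table.
HINDEX_PROPERTIES = {
    "hbase.coprocessor.master.classes": "org.apache.hadoop.hbase.index.coprocessor.master.IndexMasterObserver",
    "hbase.coprocessor.region.classes": "org.apache.hadoop.hbase.index.coprocessor.regionserver.IndexRegionObserver",
    "hbase.coprocessor.wal.classes": "org.apache.hadoop.hbase.index.coprocessor.wal.IndexWALObserver",
    "hbase.master.loadbalancer.class": "org.apache.hadoop.hbase.index.SecIndexLoadBalancer",
    "hbase.use.secondary.index": "true",
}

def render_hbase_site(properties):
    """Build a <configuration> document with one <property> element per setting."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = render_hbase_site(HINDEX_PROPERTIES)
print(xml_text)
```

In an existing hbase-site.xml the generated `<property>` elements would be merged into the current `<configuration>` element rather than replacing the file.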
Usage: Creating Index
Usage: Tools
 TableIndexer tool to create index(es) for existing data
$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.TableIndexer -Dtablename.to.index=table -Dtable.columns.index='IDX1=>cf1:[q1->datatype& length]'
 Bulk load tool to load user data into the user table and index it at the same time
$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.IndexImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir <tablename> <hdfs-data-inputdir>
$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.IndexLoadIncrementalHFiles <hdfs://storefileoutput> <tablename>
 Tool to check region co-location and repair any co-location mismatches
$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.util.SecondaryIndexColocator
Test Results: Put Performance
Hardware   Architecture: x86_64; CPU(s): 24 (2 threads per core); RS heap size: 8 GB
Topology   5 Region Servers; 100 Regions (user table)
Data       100 GB data; 500 bytes per record
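For a sense of scale, the record count implied by this setup can be estimated with a few lines of Python. The numbers come from the table above; reading "GB" as 10^9 bytes and spreading records evenly across regions are assumptions of this sketch, and `record_estimate` is a made-up helper.

```python
def record_estimate(data_gb, record_bytes, regions, gb=10**9):
    """Rough record counts implied by the test setup; gb=10**9 assumes decimal gigabytes."""
    total = data_gb * gb // record_bytes
    per_region = total // regions  # assumes an even spread across regions
    return total, per_region

total, per_region = record_estimate(data_gb=100, record_bytes=500, regions=100)
print(total, per_region)
```

Under these assumptions the put test writes about 200 million records, or roughly 2 million per user table region.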
Test Results: Scan Performance
Search for a column value
idx1 -> cf:q1
idx2 -> cf:q2
Hardware   Architecture: x86_64; CPU(s): 24 (2 threads per core); RS heap size: 8 GB
Topology   5 Region Servers; 100 Regions (user table)
Data       50 GB data; 500 bytes per record
Test Results: Query with AND
idx1 -> cf:q1
idx2 -> cf:q2
Hardware   Architecture: x86_64; CPU(s): 24 (2 threads per core); RS heap size: 8 GB
Topology   5 Region Servers; 100 Regions (user table)
Data       50 GB data; 500 bytes per record
Test Results: Scan w/ Range Query
idx3 -> cf:q3
Hardware   Architecture: x86_64; CPU(s): 24 (2 threads per core); RS heap size: 8 GB
Topology   5 Region Servers; 100 Regions (user table)
Data       50 GB data; 500 bytes per record
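How a range condition can be served from a sorted index is sketched below in Python. This is a generic illustration, not the hindex implementation: the q3 values are borrowed from the salary column shown earlier in the deck, values are zero-padded to a fixed width so that string order matches numeric order (an assumption of this sketch), and the names `index_key` and `range_scan` are hypothetical.

```python
from bisect import bisect_left

def index_key(index_name, value, user_key, width=6):
    """Zero-pad the numeric value so that string order matches numeric order."""
    return f"{index_name}_{value:0{width}d}_{user_key}"

def range_scan(index, index_name, low, high, width=6):
    """Return user rowkeys whose indexed value v satisfies low <= v < high."""
    start = f"{index_name}_{low:0{width}d}"
    stop = f"{index_name}_{high:0{width}d}"
    hits = index[bisect_left(index, start):bisect_left(index, stop)]
    return [k.rsplit("_", 1)[1] for k in hits]

# Hypothetical q3 values: user rowkey -> salary.
ROWS = {"123": 1230, "135": 1750, "141": 2100, "142": 2270, "145": 4300}
INDEX = sorted(index_key("idx3", v, k) for k, v in ROWS.items())

print(range_scan(INDEX, "idx3", 1500, 2500))
```

The range condition becomes a single start/stop pair on the index, so only matching entries are read instead of every row.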
Test Results: Scan w/ Multi Column Index
idx4 -> cf:q1,cf:q3
Hardware   Architecture: x86_64; CPU(s): 24 (2 threads per core); RS heap size: 8 GB
Topology   5 Region Servers; 100 Regions (user table)
Data       50 GB data; 500 bytes per record
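One way to picture a multi-column index is a composite rowkey in which the first column's value precedes the second's. The Python sketch below is a toy model under stated assumptions, not hindex code: both fields are padded to fixed widths, the sample values reuse the dept and salary columns shown earlier in the deck, and `composite_key` and `scan_eq_then_range` are made-up names.

```python
from bisect import bisect_left

def composite_key(q1, q3, user_key, q1_width=4, q3_width=6):
    """Index rowkey for a two-column index: fixed-width q1, fixed-width q3, then the user rowkey."""
    return f"idx4_{q1:<{q1_width}}_{q3:0{q3_width}d}_{user_key}"

def scan_eq_then_range(index, q1, low, high, q1_width=4, q3_width=6):
    """q1 == value AND low <= q3 < high, read as one contiguous range of the sorted index."""
    prefix = f"idx4_{q1:<{q1_width}}_"
    start = f"{prefix}{low:0{q3_width}d}"
    stop = f"{prefix}{high:0{q3_width}d}"
    hits = index[bisect_left(index, start):bisect_left(index, stop)]
    return [k.rsplit("_", 1)[1] for k in hits]

# Hypothetical values: user rowkey -> (q1 as dept, q3 as salary).
ROWS = {
    "123": ("OIH", 1230),
    "135": ("OIH", 1750),
    "141": ("OIH", 2100),
    "142": ("BDI", 2270),
    "145": ("SOA", 4300),
}
INDEX = sorted(composite_key(d, s, k) for k, (d, s) in ROWS.items())

print(scan_eq_then_range(INDEX, "OIH", 1500, 2500))
```

With the equality column first, a condition on both columns needs one scan over one index rather than two scans whose results must be combined.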
Summary
 Design
 Supports multiple indexes and multi-column indexes on a table
 Supports indexing on part of a column value
 Supports equal and range condition scans using index
 Supports dynamic add/drop index
 Supports hints to skip the index scan or to select specific indexes to use in the scan.
 Intelligent Filter evaluation
 Application usage
 No changes required to perform read and write operations.
 Use IndexAdmin (client extension) to perform admin operations like create, enable, disable and drop on indexed table.
 Need not perform admin operations separately on index table.
 Upgrade/Integration
 Minimal code changes in HBase kernel. HBase version upgrade is very easy.
Roadmap
 Contribute to HBase community (In progress – refer HBASE-9203)
 HBCK tool support for Secondary index tables
 Pluggable Scan-Evaluation
Q & A
https://github.com/Huawei-Hadoop/hindex/
mail to: bhupendra.jain@huawei.com
Thank you
www.huawei.com
Copyright©2011 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without limitation, statements
regarding the future financial and operating results, future product portfolio, new technology, etc. There are a
number of factors that could cause actual results and developments to differ materially from those expressed
or implied in the predictive statements. Therefore, such information is provided for reference purpose only and
constitutes neither an offer nor an acceptance. Huawei may change the information at any time without notice.

Faster HBase queries

  • 1. www.huawei.com Security Level: HUAWEI TECHNOLOGIES CO., LTD. Faster HBase queries Introducing hindex – Secondary indexes for HBase Bhupendra Kumar Jain bhupendra.jain@huawei.com
  • 2. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 2 $ whoami  Senior System Architect @ Huawei India  System Lead for Huawei Hadoop-HBase component  Apache HBase Contributor  10+ years of experience in Business Intelligence Domain
  • 3. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 3 Agenda  HBase – A brief introduction  Introduction to hindex  Usage  Test Results
  • 4. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 4 Agenda  HBase – A brief introduction  Introduction to hindex  Usage  Test Results
  • 5. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 5 Apache HBase name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 …  an open-source, distributed, versioned, non-relational database  modeled after Google’s BigTable  leverages distributed data storage provided by HDFS  allows random, read/write access to data in HDFS  GOAL: hosting very large tables - billions of rows X millions of columns
  • 6. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 6 HBase – Introduction name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Sorted lexicographically
  • 7. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 7 HBase – Introduction name gender dept title mobile Raj M SE 534 Ram M SSE Anu M 123 Pia F 326 salary dob 1230 … … … … key 123 135 141 142 Jay SOA SA 521 Suma SOA SSE Som M OIH SE … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Sorted Sparse any number of columns
  • 8. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 8 HBase – Introduction Sorted Sparse name gender dept title mobile Raj M OIH SE 534 Ram M SSE Anu M 123 Pia F 326 salary dob 1230 … … … … key 123 135 141 142 Jay SOA SA 521 Suma SOA SSE Som M OIH SE … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Multi-dimensional SSE SSE TL data is versioned Key=123 cf1 name T1 Raj gender T1 M dept T1 OIH title T1 SE title T2 SSE mobile T1 534 cf2 salary T3 1230 dob T1 19880830 SSE (key, column family, column, timestamp) -> value
  • 9. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 9 HBase – Introduction Sorted DistributedSparse leverages HDFS, data replicated across nodes name gender dept title mobile Raj M SE 534 Ram M SSE Anu M 123 Pia F 326 salary dob 1230 … … … … key 123 135 141 142 Jay SOA SA 521 Suma SOA SSE Som M OIH SE … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Protects against failing node Multi-dimensional
  • 10. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 10 HBase – Introduction name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Regions Sorted Sparse Auto-sharded split and re-distributed as data growsDistributedMulti-dimensional
  • 11. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 11 HBase – Introduction name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Regions Region Range R1 120-145 R2 145-170 R3 … … … METASorted Sparse Auto-shardedDistributedMulti-dimensional
  • 12. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 12 HBase – Introduction name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Regions Column Families Region Range R1 120-145 R2 145-170 R3 … … … META
  • 13. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 13 HBase – Introduction name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … Regions Column Families HFile1 HFile3 HFile2 HFile4 Region Range R1 120-145 R2 145-170 R3 … … … META
  • 14. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 15 HBase – Introduction HMaster Mem store HFile1 HFile2 Store cf1 Mem store HFile3 HBlock1 HBlock2 : HBlockN HFile4 Store cf2 Region 1 Region 2 Region Server 1 Region Server 2  Master, region servers and zookeeper  Table horizontally divided into regions  Columns grouped into Column families – Vertical partition of tables  Memstore, HFiles in DFS. HFiles logically split into smaller blocks. Data read write happen as blocks  MapReduce integration HBlock1 HBlock2 : HBlockN HBlock1 HBlock2 : HBlockN HBlock1 HBlock2 : HBlockN
  • 15. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 16 Coprocessors  Allow to run client-supplied code on server-side,  Can extend the functionality of HBase without changing the kernel Pre-Action Action Post-Action Client  Observers – Like triggers  Runs extended functionality before or after an action through hooks provided by coprocessor framework.  EndPoints – Like Stored procedures  Can run any time from client  The endpoint implementation will then be executed remotely at the target region(s)  results from those executions will be returned to the client.
  • 16. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 17 Filters Source: Lars, George, HBase The Definitive Guide, O’Reilly Media. 2011
  • 17. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 18 HBase – Query name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE KEY=141 NOTE: HBase does not have native support for SQL like query interface. It is used here for the sake of easy understanding. Similar query can be done using scanners and filters.
  • 18. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 19 HBase – Query by Rowkey name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE KEY=141
  • 19. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 20 HBase – Query by Column (w/o Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE MOBILE=123
  • 20. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 21 HBase – Query by Column (w/o Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE MOBILE=123
  • 21. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 22 HBase – Query by Column (w/o Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE MOBILE=123 Full table scan
  • 22. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 23 HBase – Query by Column (w/o Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE MOBILE=123 Full table scan  Billions of records  full table scan evil to performance
  • 23. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 24 HBase – Query by Column (w/o Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE MOBILE=123 Full table scan  Billions of records  full table scan evil to performance  Side effects – Client timeouts, lease expiring
  • 24. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 25 Agenda  HBase – A brief introduction  Introduction to hindex  Usage  Test Results
  • 25. Introducing hindex: a coprocessor-based, server-side implementation of secondary indexing; one separate index table, shared by all indexes of a user table; region-wise indexing (aka local indexing); a custom load balancer co-locates the index table regions with the corresponding user table regions; index table rowkey = region start key + index name + indexed column value(s) + user table rowkey.
  • 26. hindex: Architecture. Diagram components: Client App, HBase Client, ClientExt, HMaster, Balancer, Coprocessor Host, RegionServer, Indexing Coprocessor. Primary user table (RowKey, cf1:col1): 001 A, 002 B, 003 Z, 004 C, 005 A, 006 A, … Secondary index table (RowKey): 001_A_001, 001_A_005, 001_A_006, 001_B_002, 001_C_004, 001_Z_003, … Notes: the coprocessor handles the index data; a custom LoadBalancer does the co-location; the client extension allows specifying index details while creating a table and is not needed for reads or writes.
  • 27. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 28 HBase – Query by Column (w/ Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … SELECT NAME FROM PERSON WHERE MOBILE=123 Index column: MOBILE
  • 28. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 31 HBase – Query by Column (w/ Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 idx1 123_141 254_135 326_142 534_123 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … 126_148 521_145 665_152 … SELECT NAME FROM PERSON WHERE MOBILE=123
  • 29. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 32 HBase – Query by Column (w/ Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … idx1 123_141 254_135 326_142 534_123 126_148 521_145 665_152 … Regions Index maintained per region, not globally
  • 30. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 33 HBase – Query by Column (w/ Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … idx1 123_141 254_135 326_142 534_123 126_148 521_145 665_152 … Regions Index maintained per region, not globally Network calls avoided
  • 31. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 34 HBase – Query by Column (w/ Index) name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 salary dob 1230 … 1750 … 2100 … 2270 … key 123 135 141 142 Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH SE 665 … … … … … 4300 … 1550 … 1270 … .. … 145 148 152 … idx1 123_141 254_135 326_142 534_123 126_148 521_145 665_152 … Regions Index maintained per region, not globally Handle  Region Movement  Region Splits
  • 32. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 35 Regions Co-location HMaster Balancer RS1 RS2 R1 R2 Client R1 R2 Actual Table Index Table A B B C A B B C Create table with regions R1 R2 R1 R2
  • 33. Put operation. Table: t1; column family: cf. Indexes: idx1(cf:q1) and idx2(cf:q2). Index table: t1_idx. Diagram: on the HRegionServer, user region R1 and index region R1 both cover the key range A to B. The client issues put 't1', 'AAB', 'cf:q1', '5', 'cf:q2', 'z1'; the coprocessor then issues put 't1_idx', 'Aidx15AAB' and put 't1_idx', 'Aidx2z1AAB'.
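
The index puts on the slide above follow directly from the rowkey layout (region start key + index name + indexed value + user rowkey). A minimal Python sketch of that derivation is given here, assuming plain string concatenation exactly as the slide draws it; the function name `index_rowkey` and the dictionaries are illustrative and are not part of hindex, which presumably encodes values with a fixed length or type since the TableIndexer option takes a datatype and length.

```python
def index_rowkey(region_start_key: str, index_name: str, value: str, user_rowkey: str) -> str:
    """Concatenate the four parts in the order given on the 'Introducing hindex' slide."""
    return region_start_key + index_name + value + user_rowkey

# The put from the slide: row 'AAB' in table t1, landing in the region that starts at 'A'.
put = {"row": "AAB", "cells": {"cf:q1": "5", "cf:q2": "z1"}}
region_start_key = "A"

# Index definitions from the slide: idx1 covers cf:q1, idx2 covers cf:q2.
indexes = {"idx1": "cf:q1", "idx2": "cf:q2"}

# One index row per index whose column is present in the put.
index_puts = [
    index_rowkey(region_start_key, name, put["cells"][column], put["row"])
    for name, column in indexes.items()
    if column in put["cells"]
]

print(index_puts)
```

Because the index row starts with the region start key, it sorts into the index region that covers the same key range as the user region, which is what makes co-location possible.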
  • 34. Scan Operation, step 1: create a scanner for the index region at the server side. Diagram: the client creates a scanner with the condition cf:q1=5; the coprocessor on the HRegionServer creates a scanner on index region R1 (key range A to B, co-located with user region R1) with start row Aidx15 and stop row Aidx16.
  • 35. Scan Operation, step 2: scan the index table at the server side and seek to the exact rows in the user table. Diagram: the client calls next(); the coprocessor reads index region R1 and seeks to the exact row in user region R1 (steps numbered 1 to 5).
  • 36. Scan Operation: the coprocessor reads the index and seeks to the exact row in the user table; seeks on HFiles use the rowkey obtained from the index data; HFiles are read block by block, and the default block size is 64 KB; block reads from HDFS are skipped where the data is not present at all, and sometimes a whole HFile is skipped; index details are never sent back to the client, which avoids extra network usage.
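
The scan steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a sorted list stands in for the index region, a dict stands in for the user region, and the rows `AAF` and `AAC` are invented for illustration. The helper `prefix_stop_row` shows how a stop row such as `Aidx16` can be derived from the start row `Aidx15` by incrementing the last byte.

```python
from bisect import bisect_left

def prefix_stop_row(prefix: bytes) -> bytes:
    """Smallest key that sorts after every key starting with `prefix`.

    Trailing 0xFF bytes cannot be incremented, so they are dropped first.
    An empty result means "no upper bound".
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    return trimmed[:-1] + bytes([trimmed[-1] + 1])

def scan_range(sorted_keys, start: bytes, stop: bytes):
    """Return the keys in [start, stop) from a sorted list, like a bounded scanner."""
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, stop) if stop else len(sorted_keys)
    return sorted_keys[lo:hi]

# Toy index region (sorted rowkeys) and toy user region; AAF and AAC are hypothetical rows.
index_region = sorted([b"Aidx15AAB", b"Aidx15AAF", b"Aidx17AAC", b"Aidx2z1AAB"])
user_region = {
    b"AAB": {"cf:q1": "5", "cf:q2": "z1"},
    b"AAF": {"cf:q1": "5"},
    b"AAC": {"cf:q1": "7"},
}

# Condition cf:q1 = 5 on idx1 becomes the range [Aidx15, Aidx16) as on the slide.
start = b"Aidx15"
stop = prefix_stop_row(start)

# The user rowkey is the tail of each matching index rowkey.
matched_rows = [key[len(start):] for key in scan_range(index_region, start, stop)]

# "Seek to exact row": a direct lookup per rowkey instead of a full scan.
results = [user_region[row] for row in matched_rows]
print(stop, matched_rows)
```

With variable-length values a bare prefix such as `Aidx15` would also match the value `50`, which is one reason a real encoding would need fixed-width or delimited values.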
  • 37. hindex: Usage. Example query: SELECT NAME FROM PERSON WHERE (DEPT='OIH' OR TITLE='TL') AND (MOBILE > 400 AND MOBILE < 500). Conditions are expressed as filters (with equality or range conditions): AND maps to a filter list with MUST_PASS_ALL; OR maps to a filter list with MUST_PASS_ONE.
  • 38. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 41 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 key 123 135 141 142 idx1 BDI_142 OIH_123 OIH_141 OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH TL 665 … … … … … 145 148 152 … OIH_152 SOA_145 SOA_148 … idx2 TL_141 TL_142 TL_152 SA_145 SE_123 SSE_135 SSE_148 …
  • 39. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 42 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 key 123 135 141 142 idx1_BDI_142 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH TL 665 … … … … … 145 148 152 … idx1_OIH_152 idx1_SOA_145 idx1_SOA_148 … idx2_TL_141 idx2_TL_142 idx2_TL_152 idx2_SA_145 … 1) Single index table per user table, easier to collocate 2) Index table row keys have index name to store each index data together
  • 40. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 43 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M OIH SSE 254 Anu M OIH TL 123 Pia F BDI TL 326 key 123 135 141 142 idx1_BDI_142 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA SA 521 Suma F SOA SSE 126 Som M OIH TL 665 … … … … … 145 148 152 … idx1_OIH_152 idx1_SOA_145 idx1_SOA_148 … idx2_TL_141 idx2_TL_142 idx2_TL_152 idx2_SA_145 … Create two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 41. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 44 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 Idx2_TL_148 Idx2_TL_150 idx2_TL_152 Idx2_TL_160 idx2_SA_141 … two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 42. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 45 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 Idx2_TL_148 Idx2_TL_150 idx2_TL_152 Idx2_TL_160 idx2_SA_141 … 123 < 142 scanner-1 can jump to idx1_OIH_142 two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 43. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 46 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 Idx2_TL_148 Idx2_TL_150 idx2_TL_152 Idx2_TL_160 idx2_SA_141 … two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 44. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 47 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 Idx2_TL_148 Idx2_TL_150 idx2_TL_152 Idx2_TL_160 idx2_SA_141 … 142 = 142 (meets our condition) - Can fetch required data - Move both scanners to next Pia two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 45. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 48 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 Idx2_TL_148 Idx2_TL_150 idx2_TL_152 Idx2_TL_160 idx2_SA_141 … 152 > 145 scanner-2 can jump to idx2_TL_152 Pia two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 46. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 49 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 Idx2_TL_160 idx2_SA_141 … 152 = 152 (meets our condition) - Can fetch required data - Move both scanners to next Pia Som two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 47. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 50 HBase – Query with AND name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Scanner-1 reaches end Close both scanners Pia Som two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
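
The two-scanner walk-through for the AND query amounts to intersecting two sorted streams of user rowkeys, with the lagging scanner jumping forward to the other scanner's position. The sketch below replays the slide data in Python; `intersect_sorted` is an illustrative name, and using binary search for the jump is an assumption about how a seek could be modelled, not a description of the hindex code.

```python
from bisect import bisect_left

def intersect_sorted(left, right):
    """Intersect two sorted rowkey lists, jumping the lagging side forward."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            # Both conditions hold for this row: emit it and move both scanners.
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            # Scanner 1 is behind: jump straight to the first key >= right[j].
            i = bisect_left(left, right[j], i)
        else:
            # Scanner 2 is behind: jump straight to the first key >= left[i].
            j = bisect_left(right, left[i], j)
    return out

# User rowkeys taken from the index rows on the slides.
dept_oih = ["123", "141", "142", "152"]                   # idx1_OIH_*
title_tl = ["142", "145", "148", "150", "152", "160"]     # idx2_TL_*
names = {"142": "Pia", "152": "Som"}

matches = intersect_sorted(dept_oih, title_tl)
print(matches, [names[k] for k in matches])
```

The loop ends as soon as either scanner runs out, which corresponds to the final slide where scanner-1 reaches its end and both scanners are closed.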
  • 48. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 51 HBase – Query with AND (w/ 2-column index) name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_TL_160 idx1_OIH_SA_141 idx1_OIH_TL_142 idx1_OIH_TL_152 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_SOA_SSE_135 idx1_SOA_TL_145 Idx1_SOA_TL_148 Idx1_SOA_TL_150 Idx1_SOA_TL_150 … 1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM
  • 49. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 52 HBase – Query with AND (w/ 2-column index) name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_TL_160 idx1_OIH_SA_141 idx1_OIH_TL_142 idx1_OIH_TL_152 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_SOA_SSE_135 idx1_SOA_TL_145 Idx1_SOA_TL_148 Idx1_SOA_TL_150 Idx1_SOA_TL_150 … 1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM
  • 50. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 53 HBase – Query with AND (w/ 2-column index) name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_TL_160 idx1_OIH_SA_141 idx1_OIH_TL_142 idx1_OIH_TL_152 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_SOA_SSE_135 idx1_SOA_TL_145 Idx1_SOA_TL_148 Idx1_SOA_TL_150 Idx1_SOA_TL_150 … Pia Meets condition - Can fetch required data 1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM
  • 51. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 54 HBase – Query with AND (w/ 2-column index) name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_TL_160 idx1_OIH_SA_141 idx1_OIH_TL_142 idx1_OIH_TL_152 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_SOA_SSE_135 idx1_SOA_TL_145 Idx1_SOA_TL_148 Idx1_SOA_TL_150 Idx1_SOA_TL_150 … Pia Som Meets condition - Can fetch required data 1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM
  • 52. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 55 HBase – Query with AND (w/ 2-column index) name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_TL_160 idx1_OIH_SA_141 idx1_OIH_TL_142 idx1_OIH_TL_152 SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_SOA_SSE_135 idx1_SOA_TL_145 Idx1_SOA_TL_148 Idx1_SOA_TL_150 Idx1_SOA_TL_150 … Pia Som Scanner reaches end Close the scanner 1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM
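
With a two-column index the same AND query needs only one range scan, because both values are part of the index rowkey. The Python sketch below builds composite keys from the user rows shown on the slides and scans the range from `idx1_OIH_TL` to `idx1_OIH_TM`. The underscore-delimited layout follows the slide drawing, the region start key is left out as on the slides, and the function name is illustrative.

```python
from bisect import bisect_left

# User rows from the slides: rowkey -> (name, dept, title). Row 150 has no name on the slide.
rows = {
    "123": ("Raj", "OIH", "SE"),
    "135": ("Ram", "SOA", "SSE"),
    "141": ("Anu", "OIH", "SA"),
    "142": ("Pia", "OIH", "TL"),
    "145": ("Jay", "SOA", "TL"),
    "148": ("Suma", "SOA", "TL"),
    "150": ("", "SOA", "TL"),
    "152": ("Som", "OIH", "TL"),
    "160": ("Su", "BDI", "TL"),
}

def composite_key(index_name, dept, title, rowkey):
    """Index rowkey for a two-column index on (dept, title)."""
    return "_".join([index_name, dept, title, rowkey])

# The index region is simply the sorted list of composite keys.
index_region = sorted(composite_key("idx1", dept, title, key)
                      for key, (_, dept, title) in rows.items())

# One scanner: start idx1_OIH_TL, stop idx1_OIH_TM (last character incremented).
start, stop = "idx1_OIH_TL", "idx1_OIH_TM"
hits = index_region[bisect_left(index_region, start):bisect_left(index_region, stop)]

# The user rowkey is the last component of each hit.
matched = [rows[hit.rsplit("_", 1)[1]][0] for hit in hits]
print(hits, matched)
```

Compared with the two-scanner approach, no index entries outside the matching range are touched at all.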
  • 53. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 56 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … 123 <= 142 - Get required results - Move scanner-1 to next till exceeds 142 two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 54. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 57 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj 123 <= 142 - Get required results - Move scanner-1 to next till exceeds 142 two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 55. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 58 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … 141 <= 142 - Get required results - Move scanner-1 to next till exceeds 142 Raj two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 56. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 59 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu 141 <= 142 - Get required results - Move scanner-1 to next till exceeds 142 two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 57. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 60 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … 142 == 142 - Get required results - Move both scanners in this case Raj Anu two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 58. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 61 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 Idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu Pia 142 == 142 - Get required results - Move both scanners in this case two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 59. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 62 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu Pia 152 >= 145 (scanner-2 is behind) - Get required results - Move scanner-2 to next till exceeds 152 two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 60. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 63 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu Pia Jay 152 >= 145 (scanner-2 is behind) - Get required results - Move scanner-2 to next till exceeds 152 two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 61. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 64 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu Pia Jay Suma two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
  • 62. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 65 HBase – Query with OR key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu Pia Jay Suma two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326
  • 63. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 66 HBase – Query with OR name gender dept title mobile Raj M OIH SE 534 Ram M SOA SSE 254 Anu M OIH SA 123 Pia F OIH TL 326 key 123 135 141 142 idx1_BDI_160 idx1_OIH_123 idx1_OIH_141 idx1_OIH_142 SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL Jay M SOA TL 521 Suma F SOA TL 126 M SOA TL 325 Som M OIH TL 665 Su F BDI TL 928 … … … … … 145 148 150 152 160 … idx1_OIH_152 idx1_SOA_135 idx1_SOA_148 Idx1_SOA_150 … idx2_TL_142 idx2_TL_145 idx2_TL_148 idx2_TL_150 idx2_TL_152 idx2_TL_160 idx2_SA_141 … Raj Anu Pia Jay Suma Som two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM
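
For the OR query, the walk-through is a duplicate-free merge of the two sorted index streams: always take the smaller rowkey, and when both scanners point at the same rowkey, emit it once and advance both. A minimal Python sketch with the slide data follows; `union_sorted` is an illustrative name and the lists stand in for the two index scanners.

```python
def union_sorted(left, right):
    """Merge two sorted rowkey lists, emitting each rowkey once."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            # Row satisfies both conditions: emit once, move both scanners.
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            # Scanner 1 is behind: emit its row and advance it.
            out.append(left[i])
            i += 1
        else:
            # Scanner 2 is behind: emit its row and advance it.
            out.append(right[j])
            j += 1
    # One scanner is exhausted; everything left on the other side still matches.
    out.extend(left[i:])
    out.extend(right[j:])
    return out

# User rowkeys taken from the index rows on the slides.
dept_oih = ["123", "141", "142", "152"]                   # idx1_OIH_*
title_tl = ["142", "145", "148", "150", "152", "160"]     # idx2_TL_*

matches = union_sorted(dept_oih, title_tl)
print(matches)
```

Unlike the AND case, neither scanner can skip ahead, since every entry in either range is part of the result.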
  • 64. Region Split. Before the split: the user table region (01 to 09) holds rowkey, cf:col1 pairs 01 A, 02 A, 03 C, 04 B, 05 X, 06 A, 07 A; the index table region (01 to 09) holds rowkeys 01_A_01, 01_A_02, 01_A_06, 01_A_07, 01_B_04, 01_C_03, 01_X_05. Split point: 05. After the split: user daughter (01 to 05) holds 01 A, 02 A, 03 C, 04 B; user daughter (05 to 09) holds 05 X, 06 A, 07 A; index daughter (01 to 05) holds 01_A_01, 01_A_02, 01_B_04, 01_C_03; index daughter (05 to 09) holds 05_A_06, 05_A_07, 05_X_05. Notes: an explicit split on the index region is avoided (using a custom split policy for index tables); when a user table region splits, the corresponding index region also splits; the split key for the index region is the same as that of the user region.
  • 65. Region Split. Diagram: the user table region (01 to 09, rows 01 A, 02 A, 03 C, 04 B, 05 X, 06 A, 07 A) is read through HalfStoreFileReader for daughter A and daughter B; the index table region (01 to 09, rows 01_A_01, 01_A_02, 01_A_06, 01_A_07, 01_B_04, 01_C_03, 01_X_05) is read through IndexHalfStoreFileReader for daughter A and daughter B. Notes: a custom HalfStoreFileReader, IndexHalfStoreFileReader, is used for reading index daughter regions; both index half-store-file readers start at the same point, the beginning of the file; for each KV the reader checks the user table rowkey inside the index key and decides whether the KV belongs to its daughter; the IndexHalfStoreFileReader for daughter B also changes the key to carry the daughter's start key.
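
The index split cannot cut the parent file at a single midpoint, because index rows are ordered by indexed value, not by user rowkey. The Python sketch below imitates what the two slides describe: both daughters read the parent from the beginning, each keeps the entries whose user rowkey falls on its side of the split key, and daughter B rewrites the leading region start key. The underscore layout and the function name are taken from the slide drawing or invented for illustration; this is not the IndexHalfStoreFileReader code.

```python
def split_index_region(parent_index_keys, split_key):
    """Divide parent index rows between two daughters by the embedded user rowkey."""
    daughter_a, daughter_b = [], []
    for key in parent_index_keys:
        region_start, value, user_rowkey = key.split("_")
        if user_rowkey < split_key:
            # Daughter A keeps the parent's start key, so the row is unchanged.
            daughter_a.append(key)
        else:
            # Daughter B starts at the split key, so the prefix is rewritten.
            daughter_b.append("_".join([split_key, value, user_rowkey]))
    # Prefix rewriting keeps relative order here, but sort to make that explicit.
    return sorted(daughter_a), sorted(daughter_b)

# Parent index region from the slide (region start key 01), split at user rowkey 05.
parent = ["01_A_01", "01_A_02", "01_A_06", "01_A_07", "01_B_04", "01_C_03", "01_X_05"]
daughter_a, daughter_b = split_index_region(parent, "05")
print(daughter_a)
print(daughter_b)
```

The output matches the daughter index regions drawn on the first Region Split slide.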
  • 66. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 69 Agenda  HBase – A brief introduction  Introduction to hindex  Usage  Test Results
  • 67. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 70 Usage: Getting Started  For HBase 0.94.x  https://github.com/Huawei-Hadoop/hindex/  For HBase 0.98 or trunk  https://issues.apache.org/jira/browse/HBASE-10222
  • 68. Usage: Configurations (name = value): hbase.coprocessor.master.classes = org.apache.hadoop.hbase.index.coprocessor.master.IndexMasterObserver; hbase.coprocessor.region.classes = org.apache.hadoop.hbase.index.coprocessor.regionserver.IndexRegionObserver; hbase.coprocessor.wal.classes = org.apache.hadoop.hbase.index.coprocessor.wal.IndexWALObserver; hbase.master.loadbalancer.class = org.apache.hadoop.hbase.index.SecIndexLoadBalancer; hbase.use.secondary.index = true
  • 69. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 72 Usage: Creating Index
  • 70. Usage: Tools. (1) TableIndexer tool to create index(es) for existing data: $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.TableIndexer -Dtablename.to.index=table -Dtable.columns.index='IDX1=>cf1:[q1->datatype& length]' (2) Bulk load tool to load user data into the user table and index it at the same time: $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.IndexImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir <tablename> <hdfs-data-inputdir> and then $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.IndexLoadIncrementalHFiles <hdfs://storefileoutput> <tablename> (3) Tool to check region co-location and repair any co-location mismatches: $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.util.SecondaryIndexColocator
  • 71. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 74 Agenda  HBase – A brief introduction  Introduction to hindex  Usage  Test Results
  • 72. Test Results: Put Performance. Hardware: architecture x86_64; 24 CPUs (2 threads per core); RegionServer heap size 8 GB. Topology: 5 region servers; 100 regions (user table). Data: 100 GB; 500 bytes per record.
  • 73. Test Results: Scan Performance. Indexes: idx1 -> cf:q1, idx2 -> cf:q2. Query: search for a column value. Hardware: architecture x86_64; 24 CPUs (2 threads per core); RegionServer heap size 8 GB. Topology: 5 region servers; 100 regions (user table). Data: 50 GB; 500 bytes per record.
  • 74. Test Results: Query with AND. Indexes: idx1 -> cf:q1, idx2 -> cf:q2. Hardware: architecture x86_64; 24 CPUs (2 threads per core); RegionServer heap size 8 GB. Topology: 5 region servers; 100 regions (user table). Data: 50 GB; 500 bytes per record.
  • 75. Test Results: Scan w/ Range Query. Index: idx3 -> cf:q3. Hardware: architecture x86_64; 24 CPUs (2 threads per core); RegionServer heap size 8 GB. Topology: 5 region servers; 100 regions (user table). Data: 50 GB; 500 bytes per record.
  • 76. Test Results: Scan w/ Multi Column Index. Index: idx4 -> cf:q1,cf:q3. Hardware: architecture x86_64; 24 CPUs (2 threads per core); RegionServer heap size 8 GB. Topology: 5 region servers; 100 regions (user table). Data: 50 GB; 500 bytes per record.
  • 77. Summary. Design: supports multiple indexes and multi-column indexes on a table; supports indexing on part of a column value; supports equality and range condition scans using an index; supports dynamically adding and dropping indexes; supports hints to skip the index scan or to name specific indexes to use in the scan; intelligent filter evaluation. Application usage: no changes are required to perform read and write operations; use IndexAdmin (client extension) to perform admin operations such as create, enable, disable and drop on an indexed table; admin operations need not be performed separately on the index table. Upgrade/integration: minimal code changes in the HBase kernel, so an HBase version upgrade is very easy.
  • 78. Roadmap: contribute to the HBase community (in progress, refer to HBASE-9203); HBCK tool support for secondary index tables; pluggable scan evaluation.
  • 79. HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 82 Q & A https://github.com/Huawei-Hadoop/hindex/ mail to: bhupendra.jain@huawei.com
  • 80. Thank you www.huawei.com Copyright©2011 Huawei Technologies Co., Ltd. All Rights Reserved.

Editor's notes

  1. Good afternoon everyone. I am . In this session, we are going to learn about hindex, which adds the capability to index HBase table columns. [next]
  2. I have been an active contributor to Apache HBase for around 2 years now. I am a software engineer with Huawei R&D in Bangalore. Our team delivers a stable Hadoop distribution that is used internally by various Huawei products and solutions. In 2013, we added the capability to index HBase table columns, and Huawei has recently open-sourced it. [next]
  3. I am not sure how many of us here are familiar with HBase, so we will start this session with a very brief overview. We will follow that with an introduction to hindex and cover some details of how we have implemented it. We will also learn about using hindex, and cover a few tools that we have enhanced or developed for use with indexed tables. At the end, we will look at some performance benchmark results to see how much we could potentially benefit from hindex. [next]
  4. So let us get started with an overview of HBase. [next]
  5. Apache HBase is an open-source key-value store that belongs to the family of NoSQL data stores. It is inspired by Google’s BigTable, and it uses HDFS for the actual storage of data files. HBase allows random read and write access to data. The goal of the Apache HBase project is to be able to host very large tables that could contain billions of rows and millions of columns. [next]
  6. As I mentioned, HBase is a key-value store, so you retrieve data using a key. We can think of an HBase table as a map whose keys are sorted lexicographically. So if we know the key, HBase can easily find our data without having to scan large amounts of data. [next]
  7. A row in an HBase table can have an arbitrary number of columns and does not necessarily contain all the columns. So the table could be very sparse, depending on usage. It is worth noting that no storage is needed for absent information: there is simply no cell for a column that does not have a value. [next]
  8. So we said that HBase is a key-value store, like a map. A value is uniquely identified not just by the rowkey but also by more dimensions: column family, column qualifier, and timestamp. So you could think of this as a table cell that can have different values for different timestamps, and we can use this dimension for versioning. [next]
  9. HBase sits on top of HDFS, leveraging its capability of replicating data across multiple nodes. This distribution provides a layer of protection against, say, a node within the cluster failing. [next]
  10. As the data in an HBase table grows, it is automatically split and redistributed. This auto-sharding is transparent to the user at runtime. Each shard (or split) is called a region in HBase terminology. [next]
  11. So an HBase table generally comprises multiple regions, each holding a contiguous range of rows. The list of regions, and the range of keys each one contains, is stored in a special META table. [next]
  12. The HBase Architecture has two main services:  Hmaster : coordinates the HBase Cluster Responsible for Administarive operations Load balancing Region server failures  HRegionServer: responsible to handle a subset of the table’s data. Read/write happen through region servers. Mapreduce integration:
  13. To implement hindex, we have leveraged some specific Hbase capabilities, like coprocessors and filters. Let me walk you through some of these. There are two kind of coprocessors, we have used “Observer” type in hindex. Observer type of coprocessor is like a hook, that can be called before and after an Hbase action in invoked. So if you want to implement security checks, you would implement such a check in an “Observer” type of coprocessor, and it can be configured to be called whenever data is read. [next]
  14. To limit the output of a scan you can use filters. To search for a column value you can use a single column value filter.
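A small Python sketch of what such a filter does during a scan, assuming an in-memory table; the function names are illustrative and do not mirror the HBase client API. The sample rows come from the table shown on the earlier slides.

```python
def scan(table, column_filter=None):
    """Yield (rowkey, row) in rowkey order, keeping rows the filter accepts."""
    for rowkey in sorted(table):
        row = table[rowkey]
        if column_filter is None or column_filter(row):
            yield rowkey, row


def single_column_value_equals(column, value):
    """Build a predicate that keeps rows whose `column` equals `value`."""
    return lambda row: row.get(column) == value


table = {
    "123": {"name": "Raj", "mobile": "534"},
    "135": {"name": "Ram", "mobile": "254"},
    "141": {"name": "Anu", "mobile": "123"},
}
matches = list(scan(table, single_column_value_equals("mobile", "123")))
```

The filter trims the output, but the scan still visits every row to evaluate it.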
  15. Point queries (lookups by rowkey) are very fast in HBase.
  16. We need to go through the table record by record and check whether the mobile number is equal to 123.
  17. That is a full table scan.
  18. To avoid full table scans we can create secondary indexes on columns of interest in the query condition.
  19. We have added secondary index support to HBase. Now let us look at the details.
  20. In the indexing coprocessor, everything is done by the server.
  21. Now let us see how queries are handled with indexes.
  22. Basically, index data is maintained in inverted tables, as shown here. Note: consider moving this slide up, before hindex is introduced.
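To make the inverted-table idea concrete, here is a Python sketch in which each index row is the indexed value followed by the user rowkey, so a sorted index can answer a value lookup with a short range scan. The key layout and the separator are assumptions for illustration, not the actual hindex rowkey format.

```python
import bisect

user_table = {
    "123": {"mobile": "534"},
    "135": {"mobile": "254"},
    "141": {"mobile": "123"},
    "142": {"mobile": "326"},
}

SEP = "\x00"  # hypothetical separator between indexed value and user rowkey

# Inverted table: sorted index rowkeys of the form <value><SEP><user rowkey>.
index_rows = sorted(row["mobile"] + SEP + key for key, row in user_table.items())


def lookup(value):
    """Return the user rowkeys whose indexed column equals `value`."""
    prefix = value + SEP
    start = bisect.bisect_left(index_rows, prefix)  # seek to the first candidate
    hits = []
    for index_key in index_rows[start:]:
        if not index_key.startswith(prefix):
            break  # sorted order: no later entry can match
        hits.append(index_key[len(prefix):])
    return hits
```

Looking up mobile number "123" seeks straight to the matching index row and returns user rowkey "141" without touching the other rows.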
  23. When a put touches indexed columns, the coprocessor prepares the corresponding index puts and adds them to the index region. Since the user region and the index region are colocated, writing to the index table incurs no extra network overhead. This needs to be highlighted.
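A toy Python model of that write path, assuming one indexed column and a user region and index region held side by side in the same object. The class and method names are hypothetical, and handling of updates and deletes is left out.

```python
class ColocatedRegions:
    """Toy user region and index region that live together on one server."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.user_region = {}      # rowkey -> {column: value}
        self.index_region = set()  # (indexed value, user rowkey)

    def put(self, rowkey, columns):
        self._pre_put(rowkey, columns)  # hook runs before the user write
        self.user_region.setdefault(rowkey, {}).update(columns)

    def _pre_put(self, rowkey, columns):
        # Only puts that touch the indexed column produce an index put.
        if self.indexed_column in columns:
            self.index_region.add((columns[self.indexed_column], rowkey))


regions = ColocatedRegions(indexed_column="mobile")
regions.put("141", {"name": "Anu", "mobile": "123"})
regions.put("142", {"name": "Pia"})  # no indexed column, so no index put
```

Both writes happen inside the same object here, standing in for the point that the index put stays local to the server.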
  24. Explain that when a scan carries filter conditions, we first check whether a best-fit index is available for those filters. If one is, we create scanners on the index region and set them aside for use when fetching records.
  25. Basically, in the coprocessor we first find the best indexes to scan and identify the key ranges for those indexes.
  26. Find the maximum rowkey and jump the other index to that maximum.
  27. It is basically like the merge phase of merge sort. We compare the current rowkeys of the two indexes and continue scanning the index with the smaller rowkey, for as long as its rowkey is smaller than or equal to the rowkey in the right index. If we get a bigger rowkey, we switch to the right index and scan it.
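The merge described in the last few slides can be sketched in Python as an intersection of two sorted rowkey lists, where the lagging side jumps forward to the other side's current key. This is an illustrative sketch over in-memory lists, not the hindex scanner code; the sample rowkeys are the dept = OIH and title = SE rows from the earlier table.

```python
import bisect


def intersect_sorted(left, right):
    """Intersect two ascending rowkey lists by advancing the lagging side."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            result.append(left[i])  # rowkey present in both indexes
            i += 1
            j += 1
        elif left[i] < right[j]:
            # Jump the left index forward to the first key >= right[j].
            i = bisect.bisect_left(left, right[j], i)
        else:
            # Jump the right index forward to the first key >= left[i].
            j = bisect.bisect_left(right, left[i], j)
    return result


dept_oih = ["123", "135", "141", "152"]  # rowkeys from the index on dept
title_se = ["123", "152"]                # rowkeys from the index on title
both = intersect_sorted(dept_oih, title_se)
```

After matching "123", the left side sits at "135" while the right side is at "152", so the left side jumps directly to "152" and skips "141" without comparing it.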
  28. To try hindex with HBase 0.94.x, take the code from https://github.com/Huawei-Hadoop/hindex/ and to try it with 0.98 or trunk, take the patch at https://issues.apache.org/jira/browse/HBASE-10222
  29. Configure the above properties on the servers, for both master and region server instances.
  30. Explain the steps: create an index specification, add columns to it, collect all the indices, and add the indices as meta information to the HTD (table descriptor).
  31. Explain in this order: 1) topology, 2) table details, 3) data size and records, 4) results.
  32. The scan tests were done with 50 GB of data, with a record size of 500 bytes.
  33. In general, when multiple indexes match, we would select the index with fewer records. For that we need statistics about the data, which are very difficult to maintain. We can observe that index merge performs on par with scanning only the index that has fewer records.