Meet HBase 2.0 and Phoenix-5.0

1 © Hortonworks Inc. 2011–2018. All rights reserved
Meet HBase 2.0 and Phoenix 5.0
Ankit Singhal and Rajeshbabu Chintaguntla
HBase & Phoenix Engineer in Hortonworks

About Us
Ankit Singhal
- Phoenix PMC
- HBase dev @Hortonworks
Rajeshbabu Chintaguntla
- Phoenix PMC and HBase Committer
- HBase dev @Hortonworks

Outline
Recent releases
Versioning / Compatibility
Changes in behavior
Major Features in HBase-2.0
Phoenix- 5.0

Outline
Recent releases
Changes in behavior
Phoenix-5.0
Picture taken from: https://www.ostraining.com/blog/joomla/content-versioning-a-sneak-peek-at-joomla-s-newest-proposed-feature/

2018 (repo and releases)
(master) 3.0.0-SNAPSHOT
(branch-1) 1.5.0-SNAPSHOT
(branch-1.4)
1.4.3
1.3.21.3.1
(branch-1.3)
1.2.0 1.2.6
1.1.0 1.1.13
(branch-1.2)
(branch-1.1)
(branch-2) 2.1.0-SNAPSHOT
1.4.4
2.0.-beta-1-2 2.0.0 2.1.0
(expected)
3.0.0
1.4.5

How to choose a release
• If you are on 0.98.x or earlier, time to upgrade!
• 1.0 is EOL’ed. Use 1.3.x at least
• Both 1.3.2 and 1.4.4 are pretty stable
• Moving between minor versions is easy for 1.x
• Starting from scratch, use 1.4.4 or 2.0.1(RC)/2.1.0(planned)
Picture taken from:http://meritcd.com/blogs/improve-your-decision-making-improve-your-leadership-2/

Outline
Recent releases
Changes in behavior
Phoenix-5.0

Semantic Versioning
Starting with the 1.0 release, HBase works toward Semantic Versioning
MAJOR.MINOR.PATCH[-identifiers]
PATCH: only BC bug fixes.
MINOR: BC new features
MAJOR: Incompatible changes

To be, or not to be (Compatible)

To be, or not to be (Compatible)
• Compatibility is NOT a simple yes or no
• Many dimensions
• source, binary, wire, command line, dependencies etc
• Read https://hbase.apache.org/book.html#upgrading

HBase API surface
• Client API
• Explicitly marked with InterfaceAudience.Public
• Get/Put/Table/Connection, etc
• LimitedPrivate API
• Explicitly marked with InterfaceAudience.LimitedPrivate
• Coprocessors, replication APIs
• Private API
• Explicitly marked with InterfaceAudience.Private
• All other classes not marked
• Also InterfaceAudience.{Stable,Evolving,Unstable}

Major Minor Patch
Client-Server Wire Compatibility ✗ ✓ ✓
Server-Server Compatibility ✗ ✓ ✓
File Format Compatibility ✗* ✓ ✓
Client API Compatibility ✗ ✓ ✓
Client Binary Compatibility ✗ ✗ ✓
Server Side Limited API Compatibility ✗ ✗*/✓* ✓
Dependency Compatibility ✗ ✓ ✓
Operation Compatibility ✗ ✗ ✓
Compatibilities

Outline
Recent releases
Changes in behavior
Phoenix-5.0

Changes in behavior – HBase-2.0
• JDK-8+ only
• Hadoop-2.7+ and Hadoop-3 support
• Filter and Coprocessor changes
• Managed connections are gone.
• “Rolling upgradable not supported” from 1.x
• Updated dependencies
• Deprecated client interfaces are gone
https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit#heading=h.v21r9nz8g01j

How to migrate on HBase-2.0
• 2.0 contains more API clean up
• Cleanup PB and guava “leaks” into the API
• Some deprecated APIs (HConnection, HTable, HBaseAdmin, etc) going away
• Start using JDK-8 (and G1). You will like it.
• 1.x client should be able to do read / write / scan against 2.0 clusters
• Some DDL / Admin operations may not work during Rolling upgrade

Outline
Recent releases
Changes in behavior
Phoenix-5.0

Offheaping Read Path
• Goals
• Make use of all available RAM
• Reduce temporary garbage for reading
• Make bucket cache as fast as LRU cache
• HBase block cache can be configured as L1 / L2
• “Bucket cache” can be backed by onheap, off-heap or SSD
• Different sized buckets in 4MB slices
• HFile blocks placed in different buckets
• Cell comparison only used to deal with byte[]
• Copy the whole block to heap

https://www.slideshare.net/HBaseCon/offheaping-the-apache-hbase-read-path

https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in1

Offheap Read-Path in Production - The Alibaba story (https://blogs.apache.org/hbase/)

Offheaping Write Path
• Reduce GC, improve throughput predictability
• Use off-heap buffer pools to read RPC requests
• Remove garbage created in Codec parsing, MSLAB copying, etc
• MSLAB is used for heap fragmentation
• Now MSLAB can allocate offheap pooled buffers [2MB]

Offheaping Write Path
• Allows bigger total memstore space
• Cell data goes off-heap
• Cell metadata (and CSLM#Entry, etc) is on-heap (~100 byte overhead)
• Async WAL is used to avoid copying Cells into heap
• Works with new “Compacting Memstore”

Compacting Memstore
• In memory flushes
• Reduce index overhead per cell
• Gains proportional to cell size
• In memory compaction
• Eliminate duplicates
• Gains are proportional to duplicates
• Reduce flushes to disk
• Reduce file creation
• Reduce write amplification

Compacting Memstore
https://www.slideshare.net/HBaseCon/apache-hbase-accelerated-inmemory-flush-and-compaction

https://www.slideshare.net/HBaseCon/apache-hbase-accelerated-inmemory-flush-and-compaction

Async Client
• A complete new client maintained in HBase
• Different than OpenTSDB/asynchbase
• Fully async end to end
• Based on Netty and JDK-8 CompletableFutures
• Includes most critical functionality including Admin / DDL
• Some advanced functionality (read replicas, etc) are not done yet
• Have both sync and async client.

Async Client

Region Assignment
• Region assignment is one of the pain points
• Complete overhaul of region assignment
• Based on the internal “Procedure V2” framework
• Series of operations are broken down into states in State Machine
• State’s are serialized to durable storage
• Uses a WAL on HDFS
• Backup master can recover the state from where left off

Backup / Restore
• Need a reliable native Backup mechanism

Backup / Restore
• Full backups and Incremental backups
• Backup destination to a remote FS
• HDFS, S3, ADLS, WASB, and any Hadoop-compatible FS
• Full backup based on “snapshots”
• Incremental backup ships Write-Ahead-Logs
• Backup + Replication are complimentary (for a full DR solution)

Backup / Restore
• Flexible restore options

FileSystem Quotas
• HBase supports quotas
• Request count or size per sec,min,hour,day (per server)
• Num tables / num regions in namespace
• New work enables quotas for disk space usage
• Master periodically gathers info about disk space usage
• Regionservers enforce limit when notified from master
• Limits per table or per namespace (not per user)
• Violation policy: {DISABLE, NO_WRITES_COMPACTIONS, NO_WRITES, NO_INSERTS}

FileSystem Quotas
• Much more complex when snapshots are in the picture (similar to quotas with hard
links in FS)
• Snapshot usage is counted against the originating table
• A File is only counted once (even original table, or snapshot or restored table refers to
it)
• More info: HBASE-16961 => ‘prod_website’, LIMIT => ‘7125GB’
hbase> set_quota TYPE => SIZE, VIOLATION_POLICY
=>DELETES_WITH_COMPACTIONS, NAMESPACE =>
‘prod_website’, LIMIT => ‘7125GB’

Will be released for the first time
• These will be released for the first time in Apache (although other backports exist)
• Medium Object Support
• Region Server groups
• Favored Nodes enhancements
• HBase-Spark module
• WAL’s and data files in separate FileSystems

Outline
Recent releases
Changes in behavior
Phoenix- 5.0

Phoenix 5.0 - Outline
• API cleanup
• Index Improvements.
• Join Improvements.
• Major Features

HBase 0.98 Name(s) HBase 1.x + 2.x name(s)
HConnectionManager,
ConnectionManager
ConnectionFactory
HConnection, ClusterConnection Connection
HBaseAdmin Admin
HTable(HTableInterface) Table
RegionLocator
HTableDescriptor, HColumnDescriptor,
HRegionInfo
TableDescriptor,
ColumnFamilyDescriptor,
RegionInfo

Join Improvements

Cost based Optimizations for Joins
• Phoenix support two Join algorithms
• Hash Join
• Sort Merge Join.
• Prior to 4.14.0 user need to explicitly Hint compiler to use Sort Merge Join when
required.
• From 4.14.0 onwards best join algorithm selection happen automatically based on
query join predicate and ordering

• Sort merge join wins over Hash Join when both the children are ordered.
• Hash-join wins over sort-merge-join w/ smaller side ordered.
• Hash-join wins over sort-merge-join w/o any side ordered.
• Hash-join wins over sort-merge-join w/ both sides ordered in an order-by query – client
need to sort merge the results.
• Multi-table join: a mix of join strategies chosen based on cost.

Index Improvements

Index Write Failure Handling
• Phoenix supports many strategies for handling index failures.
• Blocking writes to data table until index restored to consistent state.
• Disable index with timestamp at which failure occurred and rebuild index in the background until
consistent state reached.
• Disable index and manual index rebuilding from MR tool to avoid overhead at server.
• Kill the RS so that the index updates replayed or recovered to consistent state.

Index Write Failure Handling (Cont’d..)
Client Retries On Index Failure
• Prior to Phoenix 5.0.0 the retry happen at server on index write failures and disable
index even if the write not succeeded after multiple retries.
• This adds unnecessary overhead to server and block the handlers.
• Now the client handles the retries without any blocking.

Incremental Index Rebuilding
• When data accumulated to rebuild is more(hours or days of data) after index failures
then it’s difficult to rebuild the indexes in single pass.
• So in latest version the index rebuilding happen in batches with check pointing by
timestamp so that any failures happen in the middle rebuilding starts from last
successful checkpoint.
• Automatic rebuilding stops when user initiates manual index rebuilding.

MR based Index Rebuilding
• Alter index with REBUILD ASYNC option.
• Use IndexTool to rebuild the index.

Global Index performance Improvement
• Presplit index table before asynchronous index building.
• Until unless we know about the indexed column value data set it’s difficult to pre define the
boundaries of index table.
• Automatic process has implemented in 4.14.0 version where tool identifies the approximate split
keys by collecting the index key values from data table sample(or full table) where the sampling
rate can be changed from 0 to 100%.
• The tool builds Equi-depth histogram to identify the approximate index region boundaries.

Local Index performance Improvements
• As of now when we scan through local index we need to touch base all the regions
irrespective of filters.
• In the current version we can filter the regions to scan for local indexes based on data
table leading primary key filter conditions.

Kafka Integration

Apache Kafka Integration
• Apache Kafka plugin helps to reliably and efficiently stream
large amounts of data onto HBase using Phoenix API. As part
of this plugin we are providing PhoenixConsumer to receive
messages from KafkaProducer.

Major Features

Support for GRANT and REVOKE commands
• Uses HBase ACLs only, Same Permissions are R - Read, W - Write, X - Execute, C - Create
and A - Admin.
• Access needs to be given on data table which automatically get propagated to all its
indexes and views
• No need to have WRITE access on SYSTEM.CATALOG table now for DDLs

Query log
• Enable centralized recall and analyzation of queries executed against Phoenix.
• We have around 18+ metrics logged in the table
• Logging is done Asynchronously and can be controlled through log
level(TRACE,DEBUG,INFO,OFF)

Python driver for Phoenix Query Server
• It implements the Python DB 2.0 API to access Phoenix via the Phoenix Query Server
• Example

Persistent Subquery Results Cache - WIP
• Persisting expensive subqueries data beyond life of query helps in significant
performance gain when the same subquery repeated in multiple queries.
• Make use of coprocessor based server cache to avoid caching at client.
• WIP at https://issues.apache.org/jira/browse/PHOENIX-4666

Upgrades/New
• Apache Omid Integration for Transactions Support( along with Tephra)
• Hive 3.0
• Hadoop 3.0
• Spark 2.3.0

Questions?

Thank you

Meet HBase 2.0 and Phoenix-5.0

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Meet HBase 2.0 and Phoenix-5.0

Similar to Meet HBase 2.0 and Phoenix-5.0 (20)

More from DataWorks Summit

More from DataWorks Summit (20)

Recently uploaded

Recently uploaded (20)

Meet HBase 2.0 and Phoenix-5.0

Editor's Notes