This document summarizes approaches to achieving high availability in HBase. It discusses HBase region replicas, asynchronous WAL replication, and the timeline consistency introduced in HBase 1.1 to provide increased read availability. It also describes WANdisco's non-stop HBase implementation using Paxos consensus and Facebook's HydraBase, which uses Raft consensus and designates region replicas as active or witness. The document compares these approaches on attributes such as consensus algorithm, read/write availability, strong consistency, and support for multi-datacenter deployments.
4. HBASE HIGH AVAILABILITY
BEYOND MTTR …
30 seconds of downtime (an aggressive MTTR target) is still too much
CERTAIN CLASSES OF APPS ARE WILLING TO TOLERATE DECREASED CONSISTENCY GUARANTEES IN FAVOR OF AVAILABILITY
» Especially for READs
INTRODUCED REGION REPLICAS & TIMELINE CONSISTENCY IN HBASE
» Community effort in the Apache JIRA HBASE-10070
5. HBASE-10070: “REGION REPLICA”
FOR EVERY REGION OF THE TABLE, THERE CAN BE MORE THAN ONE REPLICA
» Each region replica is hosted by a different region server
ONE REPLICA PER REGION IS THE “PRIMARY”
» Only the primary accepts WRITEs
» All reads from this replica return the most recent data
OTHER REPLICAS, CALLED “SECONDARIES”, FOLLOW THE PRIMARY
» Via asynchronous WAL replication
» May not have received the most recent data
» Serve data from the same data files (HFiles) as the primary
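A toy illustration of this primary/secondary model in plain Java (invented classes, not the actual HBase API): the primary appends each write to its WAL and is always current; a secondary only sees WAL entries that have been shipped to it asynchronously, so a timeline read from a secondary may return an older value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one region with a primary and secondary replicas.
// Names (RegionReplica, replicateOneBatch) are illustrative, not HBase's.
class RegionReplica {
    final Map<String, String> store = new HashMap<>(); // stands in for memstore + HFiles
    String get(String key) { return store.get(key); }
}

class PrimaryReplica extends RegionReplica {
    final List<String[]> wal = new ArrayList<>(); // append-only write-ahead log

    void put(String key, String value) {          // only the primary accepts writes
        wal.add(new String[] { key, value });
        store.put(key, value);
    }

    // Async WAL replication, modeled as an explicit batch ship of the
    // first `upTo` WAL entries to a secondary.
    void replicateOneBatch(RegionReplica secondary, int upTo) {
        for (int i = 0; i < upTo && i < wal.size(); i++)
            secondary.store.put(wal.get(i)[0], wal.get(i)[1]);
    }
}

public class TimelineDemo {
    public static void main(String[] args) {
        PrimaryReplica primary = new PrimaryReplica();
        RegionReplica secondary = new RegionReplica();

        primary.put("x", "1");
        primary.replicateOneBatch(secondary, 1); // secondary caught up to x=1
        primary.put("x", "2");                   // not yet shipped

        System.out.println(primary.get("x"));    // prints 2: primary is current
        System.out.println(secondary.get("x"));  // prints 1: timeline read may be stale
    }
}
```

The gap between the two reads is exactly the window in which a timeline-consistent read from a secondary can lag the primary.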
11. THE CURRENT STATE OF HBASE-10070
• Available in HBase 1.1.0
• Provides HA reads but not writes
• The simplest of these approaches to implement
• Consensus-based HA requires more work than an approach with no consensus
13. HIGH AVAILABILITY – SOLUTION PARAMETERS
• Where does the WAL go?
• Are Region Servers active-active or active-standby?
• Do you use a quorum?
• What consensus algorithm is used to achieve quorum?
15. WANDISCO: ARCHITECTURE
• WAL on the local file system
• Multiple Region Servers host each region
• Reads and writes to a key can be served by any replica
• Paxos-based consensus
16. WANDISCO: WRITE PATH
• All replicas are equal for reads and writes
• The client sends a Put to any one replica
• The chosen replica coordinates with its peers
• Memstore and WAL are updated only after consensus is achieved
17. WANDISCO: WRITE PATH
[Diagram: the client sends WRITE X=2 to one replica (replica_id=R1); R1 coordinates with R2 and R3, and each replica records X=2 in its WAL and memstore; a network partition isolates one replica.]
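This write path can be sketched as a toy simulation, with a simplified majority-ack round standing in for the Paxos rounds the real implementation runs; all class and method names here are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy replica: WAL and memstore are updated only after consensus succeeds.
class Replica {
    final List<String> wal = new ArrayList<>();
    final Map<String, Integer> memstore = new HashMap<>();
    boolean reachable = true;                   // false = partitioned / failed

    boolean proposeAck() { return reachable; }  // stands in for a Paxos accept

    void apply(String key, int value) {
        wal.add(key + "=" + value);
        memstore.put(key, value);
    }
}

public class WriteQuorumDemo {
    // The client sends the Put to any one replica, which coordinates with
    // its peers; the write is applied only if a majority acknowledges it.
    static boolean put(List<Replica> peers, String key, int value) {
        int acks = 0;
        for (Replica r : peers) if (r.proposeAck()) acks++;
        if (acks <= peers.size() / 2) return false;     // no majority: reject
        for (Replica r : peers) if (r.reachable) r.apply(key, value);
        return true;
    }

    public static void main(String[] args) {
        List<Replica> peers = new ArrayList<>();
        for (int i = 0; i < 3; i++) peers.add(new Replica());
        peers.get(2).reachable = false;                 // one replica partitioned

        System.out.println(put(peers, "x", 2));         // prints true: 2 of 3 acked
        System.out.println(peers.get(0).memstore.get("x")); // prints 2
        System.out.println(peers.get(2).memstore.get("x")); // prints null: missed it
    }
}
```

With two of three replicas reachable the write still commits, which is the availability property the diagram above illustrates; lose the majority and the write is rejected instead.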
18. WANDISCO: READ PATH
• Reads go to any one replica in the set
• On timeout, the client does a coordinated read against the other replicas
• Could potentially return stale data
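A minimal sketch of that read flow, with the timeout modeled as a boolean flag and the coordinated read simplified to trying the remaining replicas in turn (all names invented for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReadReplica {
    final Map<String, Integer> data = new HashMap<>();
    boolean responsive = true;        // false models a read timeout

    Integer read(String key) {
        if (!responsive) return null; // caller treats null as a timeout
        return data.get(key);
    }
}

public class ReadPathDemo {
    // Try one replica; on timeout, fall back to the others.
    // A replica that lags replication can still answer, so the
    // value returned may be stale.
    static Integer get(List<ReadReplica> replicas, String key) {
        for (ReadReplica r : replicas) {
            Integer v = r.read(key);
            if (v != null) return v;
        }
        return null;                  // every replica timed out
    }

    public static void main(String[] args) {
        List<ReadReplica> replicas = new ArrayList<>();
        for (int i = 0; i < 3; i++) replicas.add(new ReadReplica());
        replicas.get(0).data.put("x", 2);
        replicas.get(1).data.put("x", 1);   // lagging replica: stale value
        replicas.get(2).data.put("x", 2);

        replicas.get(0).responsive = false; // first choice times out
        System.out.println(get(replicas, "x")); // prints 1: fallback returned stale data
    }
}
```

The fallback keeps reads available through a timeout, at the cost of possibly surfacing a stale value, matching the last bullet above.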
21. WANDISCO: REGION SERVER FAILURE
• No downtime on reads or writes upon RS failure
• A failed RS will not persist writes
• A failed RS might be on the read path; the client re-issues the read on timeout