| HBaseCon 2016 | May 24, 2016
Rolling Out Apache HBase
for Mobile Offerings at Visa
Partha Saha
pasaha@visa.com
CW Chung
cchung@visa.com
What this talk is about – A choice of NoSQL at Visa
• Scale: over 100 billion rows kept as history, from the most recent
• Speed: millisecond response times for writes and reads
• Real-time: data loaded in real time
An example of a mobile offering
1. Add card to wallet
2. Pay for purchase
3. See your transaction right away, along with recent history (need NoSQL here)
We chose HBase as a NoSQL solution.
We built a scalable and real-time Transaction History
Service.
We migrated prominent Mobile wallet offerings to the
Service.
This talk is about our learnings over the last year.
This talk …
1. We assume some knowledge of, and familiarity with, HBase.
2. We used HBase 1.0.0 with Cloudera Distribution CDH 5.4.3, so our observations
are based on that version of HBase.
3. We cover the important learning events along the way to adopting HBase
at Visa.
– These can help new teams adopting HBase avoid the same pitfalls.
– Our learning continues as we take on more interesting and challenging
opportunities.
Is YCSB a good way to compare NoSQL options?
It is actually not…
• Unless you know how to configure your NoSQL options for optimal performance…
• You may be driven to another solution because its performance seems “smoother”
and is easier to explain with rudimentary knowledge.
[Figure: two line charts, each plotting a single series ("Series2"), with y-axes of 0 to 60000 and 0 to 40000; axis titles were not recoverable.]
• It is a great tool however to observe how system configuration changes
performance, and explore the configuration space for various workloads.
Our YCSB experience…
• Very easy to set up!
• We got a baseline of HBase performance on the cluster, and reran it after
significant configuration and application code changes.
• Key parameters used:
– # of client threads
– # of operations
– # of records in the data set
– Workload mix of read/update/insert. (We added a 100% insert/update workload.)
• We used a bash driver script to test various combinations of parameters.
• The latency measurement type can be histogram or timeseries. Both were useful.
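A parameter sweep over the settings listed on this slide can be sketched as follows. This is a minimal sketch in Python rather than the bash driver the talk mentions, and it only builds the command lines without running them. The `hbase10` binding name, the workload file paths, and the chosen values are assumptions for illustration; `-threads`, `recordcount`, `operationcount`, and `measurementtype` are standard YCSB options.

```python
import itertools
import shlex


def ycsb_commands(workloads, thread_counts, record_counts, op_counts,
                  measurement_types=("histogram", "timeseries"),
                  binding="hbase10"):
    """Yield one YCSB 'run' command line per parameter combination."""
    combos = itertools.product(workloads, thread_counts, record_counts,
                               op_counts, measurement_types)
    for workload, threads, records, ops, mtype in combos:
        args = [
            "bin/ycsb", "run", binding,       # binding name is an assumption
            "-P", workload,                   # workload mix (read/update/insert)
            "-threads", str(threads),         # number of client threads
            "-p", f"recordcount={records}",   # records in the data set
            "-p", f"operationcount={ops}",    # number of operations
            "-p", f"measurementtype={mtype}", # histogram or timeseries
        ]
        yield " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    for cmd in ycsb_commands(
            workloads=["workloads/workloada", "workloads/workloadb"],
            thread_counts=[8, 32],
            record_counts=[1_000_000],
            op_counts=[500_000]):
        print(cmd)
```

Generating the commands separately from running them makes it easy to review the sweep before spending cluster time on it.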
Should you design yourself out of major compactions?
Not worth the trouble when you are starting…
• An argument may be made that if we need an “N” day rolling look-back, we can
have daily tables that we create ahead of time and delete once they fall outside the
look-back window. We can then reason about how to compact each daily file. Will that
make the system operate better?
• Write amplification is a well-known problem and gets a lot of attention, but
worrying about it during the early design stages seemed like premature optimization.
• We thought that we could always optimize later, through rolling compactions and
the diurnal patterns of traffic, once the patterns of reads and writes were fully
understood.
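For reference, the daily-table scheme that this slide argues against adopting early can be sketched as below. This is a hypothetical illustration, not something the talk describes building: the `txn_YYYYMMDD` naming, the one-day pre-creation margin, and the function names are all assumptions.

```python
from datetime import date, timedelta


def table_name(day, prefix="txn_"):
    """Hypothetical naming scheme: one table per calendar day."""
    return prefix + day.strftime("%Y%m%d")


def plan_daily_tables(today, lookback_days, existing, precreate_days=1):
    """Return (to_create, to_delete) for an N-day rolling look-back window.

    The window covers today and the previous lookback_days - 1 days, plus
    precreate_days future tables created ahead of time. `existing` is
    assumed to list only daily tables that follow the naming scheme.
    """
    first = today - timedelta(days=lookback_days - 1)
    last = today + timedelta(days=precreate_days)
    wanted = set()
    day = first
    while day <= last:
        wanted.add(table_name(day))
        day += timedelta(days=1)
    to_create = sorted(wanted - set(existing))
    to_delete = sorted(set(existing) - wanted)
    return to_create, to_delete
```

Dropping a whole table once it leaves the window avoids rewriting its data, which is the appeal of the scheme; the cost is the extra table management shown here.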
Does your design need transactional support?
We analyzed our secondary and primary key reads/writes.
[Diagram: two tables and the operations on them]
Primary key: Fact
– pk1
– pk2
Secondary key: Associations
– sk1: {pk1}
– sk2: {pk1, pk2}
Operations: register associations; query keys for facts
• By tracing reads and failures through updates, we concluded that
inconsistencies were short-lived.
• Otherwise, we would have used a transaction support library.
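Tracing a failure through an update can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the authors' implementation: plain dictionaries stand in for the two HBase tables, and writing the fact before registering the association is an assumed ordering chosen for illustration.

```python
facts = {}         # primary key -> fact
associations = {}  # secondary key -> set of primary keys


def write_fact(pk, fact):
    facts[pk] = fact


def register_association(sk, pk):
    associations.setdefault(sk, set()).add(pk)


def query_by_secondary(sk):
    """Look up primary keys by secondary key, then fetch each fact."""
    pks = sorted(associations.get(sk, ()))
    # Skip keys whose fact is missing, so readers never see a dangling row.
    return [facts[pk] for pk in pks if pk in facts]


# Step 1 succeeds, then the writer fails before step 2.
write_fact("pk1", "purchase #1")
before_retry = query_by_secondary("sk1")  # fact not yet reachable via sk1

# The writer retries the whole update; both steps are idempotent.
write_fact("pk1", "purchase #1")
register_association("sk1", "pk1")
after_retry = query_by_secondary("sk1")
```

In this model the inconsistency lasts only until the retry completes, and a reader during that window sees an incomplete result rather than a dangling one.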
How do you learn HBase hands-on without going into Production?
We built a Continuous Integration and Learning Environment
[Diagram labels: git/Stash, Build Server, Bamboo, Artifactory Client, Chef Client, Test Server]
Bamboo plan:
– Checkout
– Build
– Upload
– Deploy
– Run test
How do you get Operations ready for HBase in Production?
We allocated one developer for 1 day/week to monitor
production problems …
Sites: Bangalore, India and Foster City, CA, USA
1. We shadowed real production.
2. Any production problem was given priority by the whole team.
3. We used 2 sites for 24x7 eyes.
4. We added alerting and monitoring dashboards.
5. We launched only when we met certain metrics.
Loading data in real-time as it is read
We used a micro-batch approach
[Diagram: micro-batch pipeline (250 ms micro-batches). Recoverable labels:]
• Input: Stream 1, Stream 2, … Stream N, read with tail
• Pre-Processor
• Loading: Listing & Sender and Tracker (1 per stream); Loader Master and Receiver; Loader Workers (1..N per master), each with a Batch Processor, an LLF Reader or Stream Reader, and an HBase Load
• Notification: Listing & Sender and Tracker (1 per stream); Notification Master and Receiver; Notification Workers (1..N per master), each with a Batch Processor, an LLF Reader or Stream Reader, an HBase Registration Query, and Send Notification
• IPC (one link in each path)
• Control and State Files (reads/writes)
We had to build an approach to remember and retry from
any point in each stream
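Remembering a position in each stream and retrying from it can be sketched as follows. This is a minimal sketch under assumptions, not the authors' loader: the JSON state-file layout, the `load_batch` callback, and the function names are hypothetical, and a real loader would collect roughly 250 ms of input per micro-batch rather than a fixed number of lines.

```python
import json
import os


def read_offset(state_path):
    """Return the last committed byte offset, or 0 if no state exists yet."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as f:
        return json.load(f)["offset"]


def commit_offset(state_path, offset):
    """Write to a temp file and rename, so a crash never leaves a torn file."""
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp, state_path)


def run_micro_batch(stream_path, state_path, load_batch, max_lines=1000):
    """Read one micro-batch after the committed offset and load it.

    The offset is committed only after load_batch succeeds, so a failure
    makes the next call retry the same records (at-least-once delivery).
    Returns the number of records loaded.
    """
    offset = read_offset(state_path)
    batch = []
    with open(stream_path, "rb") as f:
        f.seek(offset)
        for _ in range(max_lines):
            line = f.readline()
            if not line.endswith(b"\n"):  # EOF or a partially written line
                break
            batch.append(line.rstrip(b"\n").decode("utf-8"))
            offset = f.tell()
    if batch:
        load_batch(batch)                 # may raise; offset stays put
        commit_offset(state_path, offset)
    return len(batch)
```

Because records can be replayed after a failure, this design relies on the load being idempotent, which holds when the same row key and value are simply written again.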
Reading via Web Servers
The web-services Front End
[Diagram: web-services front end. Recoverable labels:]
• Gateway and Load Balancer
• Web Service Wrapper, Rest Controller, API
• Request Transform, Response Transform, Domain Objects
• BusinessComponent, DataService
• Access, Authorization, Encryption, Utility, Audit
• Load Distribution Plugin, Cache, Subscription Service, Failover Service
• Audit Listener, Audit DB, MQ, Config Service
• HBaseAPI, HBase Plugin, HBase Cluster
Availability
We used 2 data centers to get availability
[Diagram: Data Center 1 and Data Center 2, each with its own streams, plus replication of non-native streams]
When one data center is down, the other writes on its behalf into shadow
tables. Once it is back up, the shadow tables are drained so that it can catch up.
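The shadow-table idea can be sketched with a toy model. This is a minimal sketch, assuming an in-memory stand-in for each data center and a direct forward of every write to the peer; the class and function names are hypothetical, and the talk does not describe the actual mechanism at this level.

```python
class DataCenter:
    """Toy stand-in for one data center's HBase tables."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.table = {}   # rows this data center serves
        self.shadow = []  # rows held on behalf of the peer while it is down


def write(local, peer, key, value):
    """Write locally; forward to the peer, or park the row if it is down."""
    local.table[key] = value
    if peer.up:
        peer.table[key] = value
    else:
        local.shadow.append((key, value))


def drain(local, peer):
    """Replay parked rows, oldest first, once the peer is back up."""
    if not peer.up:
        return 0
    drained = 0
    while local.shadow:
        key, value = local.shadow.pop(0)
        peer.table[key] = value
        drained += 1
    return drained
```

Replaying in arrival order keeps the last write for a key as the winner on the recovering side, which matches what the surviving side already holds.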
Learning your Data Center clock
HBase is sensitive to clock skew…
• Kerberos services do not tolerate more than a few minutes of clock skew.
• Small skews generate warnings; large skews kill region servers.
Client retries
Client retries & IOExceptions
• Default HBase timeout/retry settings can take tens of minutes to time out:
– hbase.rpc.timeout: 60 sec
– hbase.client.retries.number: 35
– hbase.client.pause: 100 msec (grows to 10 sec quickly after back-off)
– Longer still once you factor in potential retries by ZooKeeper!
– Blogs by Lars Hofhansl: “HBase Client timeouts”, “HBase client response times”
• We chose a fail-fast strategy, as the end-user device will do an end-to-end retry.
• Timeout/retry settings: 1 sec timeout, 3 total tries.
– Works well within the same data center, as well as across data centers
• However, once in a while, clients see IOExceptions!
– Caused by the Region Server (busy in GC, major/minor compaction, … ?)
– Or the network?
– Or the client itself?
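The "tens of minutes" figure can be checked with rough arithmetic. The sketch below is an approximation under stated assumptions: the multiplier table follows `HConstants.RETRY_BACKOFF` as found in HBase 1.x (verify it against your version), jitter and ZooKeeper retries are ignored, the retry count is treated as the total number of tries, every try is assumed to burn the full RPC timeout, and the fail-fast case assumes the default 100 msec pause, which the slide does not state.

```python
# Back-off multipliers applied to hbase.client.pause, following
# HConstants.RETRY_BACKOFF in HBase 1.x (assumed; verify for your version).
RETRY_BACKOFF = (1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200)


def pause_ms(base_pause_ms, retry_index):
    """Pause before retry number retry_index (0-based), without jitter."""
    idx = min(retry_index, len(RETRY_BACKOFF) - 1)
    return base_pause_ms * RETRY_BACKOFF[idx]


def worst_case_seconds(rpc_timeout_s, tries, base_pause_ms):
    """Upper bound if every try waits out the full RPC timeout."""
    total_pause_ms = sum(pause_ms(base_pause_ms, i) for i in range(tries - 1))
    return tries * rpc_timeout_s + total_pause_ms / 1000.0


if __name__ == "__main__":
    default = worst_case_seconds(rpc_timeout_s=60, tries=35, base_pause_ms=100)
    fail_fast = worst_case_seconds(rpc_timeout_s=1, tries=3, base_pause_ms=100)
    print(f"defaults:  {default / 60:.1f} min")
    print(f"fail-fast: {fail_fast:.1f} s")
```

Under these assumptions the default settings come to roughly 43 minutes in the worst case, against about 3.3 seconds for the fail-fast settings.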
Correlating client exceptions
• Client side:
– Turn on HBase client debugging:
• log4j.logger.org.apache.hadoop.hbase.client=DEBUG
• log4j.logger.org.apache.hadoop.hbase.ipc=DEBUG
– Catch the exceptions and print the specific Region Server name:
• IOException, RetriesExhaustedWithDetailsException
• Server side:
– Then look into the log of that specific Region Server.
• This works well when you know the specific server causing the IOExceptions.
– What if you do not?
Correlating client exceptions
• Build Root Cause Analysis software to:
– Collect the relevant logs from the sources:
• Client: application logs, HBase client logs, GC logs
• Hadoop server: HBase, HDFS, ZooKeeper server and GC logs
• Cluster events: Cloudera Manager API
• Other logs: KDC logs, Kerberos canary, network latency monitoring
– Parse the logs (single-line, multi-line text, JSON, XML) into CSV files.
– Normalize date and time formats, apply date and time range filtering.
– Apply text filtering and text reduction on verbose lines.
– Output: events CSV, sorted by time and server, suitable for grep/awk/sort, Hive/SQL.
• Quickly get a complete view of the sequence of events across the various services.
• Sometimes it identifies the smoking gun (e.g. an exception caused by GC).
• Still useful in the few cases when no smoking gun can be found!
– Troubleshooting is also a process of elimination.
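The parse, normalize, filter, and sort steps can be sketched as below. This is a minimal sketch, not the authors' tool: the two timestamp layouts, the truncation used as "text reduction", and all function names are assumptions, and multi-line, JSON, and XML inputs are left out.

```python
import csv
import io
import re
from datetime import datetime

# Two timestamp layouts assumed for illustration:
#   log4j style:  2016-05-24 10:15:30,123 WARN ...
#   ISO style:    2016-05-24T10:15:29.900 GC pause ...
_PATTERNS = [
    (re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (.*)$"),
     "%Y-%m-%d %H:%M:%S,%f"),
    (re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}) (.*)$"),
     "%Y-%m-%dT%H:%M:%S.%f"),
]


def parse_line(line):
    """Return (timestamp, message), or None if no known layout matches."""
    for regex, fmt in _PATTERNS:
        m = regex.match(line)
        if m:
            return datetime.strptime(m.group(1), fmt), m.group(2)
    return None


def build_events(sources, start, end, max_len=120):
    """Merge {server: [lines]} into rows sorted by time, then server."""
    rows = []
    for server, lines in sources.items():
        for line in lines:
            parsed = parse_line(line.rstrip("\n"))
            if parsed is None:
                continue  # a fuller version would fold continuation lines in
            ts, msg = parsed
            if start <= ts <= end:
                rows.append((ts, server, msg[:max_len]))  # crude text reduction
    rows.sort(key=lambda r: (r[0], r[1]))
    return rows


def to_csv(rows):
    """Render events as CSV with one normalized timestamp format."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time", "server", "message"])
    for ts, server, msg in rows:
        writer.writerow([ts.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3], server, msg])
    return out.getvalue()
```

With every source reduced to the same three columns, a GC pause on a region server and a client exception a fraction of a second later end up on adjacent lines.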
Kerberos Gotchas
Kerberos Gotchas – what we have learned
• Use the FQDN for hostnames (Fully Qualified Domain Name, like server123.abc.com)
• Use TCP rather than UDP (set udp_preference_limit = 1 in krb5.conf)
• KDC (MIT Kerberos) server:
– Configure it to start several KDC processes to handle bursty traffic (use the -w option).
– Set up a backup KDC for higher availability.
• Debugging tips:
– $ export KRB5_TRACE=/dev/stderr (or to a file)
– JVM option: -Dsun.security.krb5.debug=true
• Kerberos support is built into the Java JRE, using internal classes:
– Oracle JDK: com.sun classes; on IBM AIX: com.ibm
– Hadoop is built and tested against the Oracle JDK (mileage on the AIX JDK varies).
• Good references (besides the usual documents on Kerberos, and the HBase user mailing list):
– Steve Loughran: Hadoop and Kerberos: The Madness beyond the Gate.
– HBase and Hadoop common source code: UserGroupInformation.java.
Kerberos Gotchas – what we learned
– Renewing a TGT (Ticket Granting Ticket)
• After a successful kinit, the application principal gets a Kerberos TGT.
• By default, the TGT is good for 10 hours.
• For long-running applications, 10 hours is obviously not enough: the TGT needs to be renewed.
• Initially we used a process/thread to do a kinit once every few hours.
– We still ran into some IOExceptions at the time of TGT renewal.
– This is not the recommended way for long-running applications.
• Now we use the UGI API (UserGroupInformation): loginUserFromKeytab( ).
– Does not require a separate process/thread to do TGT renewal.
– The Hadoop/HBase client class library catches the exception due to TGT expiration, and does a
reloginFromKeytab( ) to renew the TGT automatically.
– We are also considering spawning a thread to proactively invoke checkTGTAndReloginFromKeytab( ).
– Ongoing investigation: the client occasionally still experiences a momentary IOException around the
time of ticket renewal.
– Referral ticket: when one realm is set up to trust another realm, be aware of the additional
KDC calls that result when the kinit principal is from the trusted realm.
Garbage Collection
• Use G1 on Oracle JDK 1.8
• We basically use the settings recommended by folks at HBaseCon 2015.
– By Eric Kaczmarek, Yanping Wang, Liqi Yi
• Set target GC pause to 100 msec; Young Gen to ~1GB.
• Our observations are consistent with their published results:
– Observed GC time in production:
• 100 msec or less: 67%
• 400 msec or less: 99.8%
• It is important to track the actual production GC time, as the Production and Test clusters
show somewhat different distributions.
GC Duration comparison: production vs perf cluster
GC: How Good is MaxGCPauseMillis as a Target?
MaxGCPauseMillis = 100; all GC times in msec.

Metric                               | Production Cluster         | Test Cluster
# of GC events                       | 165192                     | 199883
Avg / Std Dev / Max                  | 87.1 / 64.9 / 1530         | 81.9 / 37.2 / 1370
50th percentile (median)             | 80                         | 90
95th / 99th / 99.9th / 99.99th pct.  | 210 / 270 / 450 / 660      | 120 / 140 / 510 / 780
Events within 100 / 200 / 300 / 400  | 67% / 95% / 99.4% / 99.8%  | 85% / 99.4% / 99.6% / 99.8%
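Statistics of the kind tabulated on this slide can be computed from a list of pause durations as sketched below. This is a minimal sketch with assumptions: it uses the nearest-rank percentile definition (the talk does not say which definition was used), the function names are hypothetical, and extracting the pause durations from GC logs is out of scope.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list, for 0 < pct <= 100."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100.0))
    return sorted_values[rank - 1]


def fraction_within(values, threshold_ms):
    """Share of pauses no longer than threshold_ms, as a percentage."""
    return 100.0 * sum(1 for v in values if v <= threshold_ms) / len(values)


def summarize(pauses_ms, target_ms=100):
    """Summarize GC pauses against a MaxGCPauseMillis-style target."""
    ordered = sorted(pauses_ms)
    return {
        "events": len(ordered),
        "max": ordered[-1],
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "within_target_pct": fraction_within(ordered, target_ms),
    }
```

Reporting both the percentiles and the share of events within the target shows how far the tail drifts past a pause goal that G1 treats as a target, not a guarantee.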
In Conclusion…
Adopting an open source product is a journey…
• Learning from previous adoption successes is crucial. If a use case has not been
tried, analyzed, or written about before, chances are we will have to pay for the learning, and
having alternate choices is a good idea.
• Making only one major technology change at a time is always a good idea.
• Setting appropriate expectations through team members and agile processes is
important.
• Going to a production scenario early, in shadow mode, and learning through frequent
releases is helpful.
• We believe extra capacity for peak workloads was very helpful.
• Having the source code is very useful in learning and troubleshooting.
It Takes a Village! Thank you!
Alexandr Peyko
Amit Sharma
Anthony Chu
Arindam Chakraborty
Artem Savinov
Aviral Agarwal
Bala Saravanan Kannan
Ben Crane
Carl Duque
Chetan Talanki
Debasis Mullick
Deepankar Palit
Hong Zhu
Igor Karpenko
Igor Peller
Igor Ulianitski
Jay Gardner
Jim Gordon
Karthikeyan Manickavasagan
Liang Gao
Murali Reddy
Nandakumar Jayakumar
Nimish Shah
Peter Meigs
Pradyot Sikdar
Praveen Rudraraju
Rajat Raj
Raj Merchia
Ralph Blore
Ranjan Dutta
Ricardo De Ocampo Domingo
Robert Walsh
Sabu Peter
Sam Hamilton
Sandeep Reddy
Satyaban Nandi
Soumya Das
Srijoy Aditya
Srinivas Reddy Surasani
Suchismita Nayak
Suresh Pulikara
Ujjwal Kumar
Vikash Talanki
Vinay Sarda
Waqar Hasan
Winnie Chau
Xuepeng (Hans) Li
Yanyan Hao
Yusuf Rahaman
Amandeep Khurana
Jeongho Park
Jugoslav Djajic
Justin Hayes
Michael Stack

Contenu connexe

Tendances

HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsightHBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsightHBaseCon
 
HBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay SearchHBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay SearchCloudera, Inc.
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC timeHBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC timeMichael Stack
 
HBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusHBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusMichael Stack
 
Unified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaUnified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaDataWorks Summit
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...HBaseCon
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at NeteaseHBaseCon
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseCloudera, Inc.
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path HBaseCon
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and CloudHBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and CloudMichael Stack
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian MeyersAmazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyershuguk
 
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBaseHBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBaseMichael Stack
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Omid Vahdaty
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesMichael Stack
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale Hakka Labs
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera FieldHBaseCon
 
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...Redis Labs
 

Tendances (20)

HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsightHBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
 
HBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay SearchHBaseCon 2013: Near Real Time Indexing for eBay Search
HBaseCon 2013: Near Real Time Indexing for eBay Search
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC timeHBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
 
HBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project StatusHBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project Status
 
Unified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaUnified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache Samza
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
 
HBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBaseHBaseCon 2013: ETL for Apache HBase
HBaseCon 2013: ETL for Apache HBase
 
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWSHBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
HBaseCon 2015: Graph Processing of Stock Market Order Flow in HBase on AWS
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and CloudHBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian MeyersAmazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyers
 
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBaseHBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale
 
Tales from the Cloudera Field
Tales from the Cloudera FieldTales from the Cloudera Field
Tales from the Cloudera Field
 
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
 

En vedette

Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb HBaseCon
 
Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search HBaseCon
 
Apache HBase - Just the Basics
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the BasicsHBaseCon
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiHBaseCon
 
Solving Multi-tenancy and G1GC in Apache HBase
Solving Multi-tenancy and G1GC in Apache HBase Solving Multi-tenancy and G1GC in Apache HBase
Solving Multi-tenancy and G1GC in Apache HBase HBaseCon
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory HBaseCon
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction HBaseCon
 
Apache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBaseApache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBaseHBaseCon
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseHBaseCon
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon
 
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightOptimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsight
Optimizing Apache HBase for Cloud Storage in Microsoft Azure HDInsightHBaseCon
 
In Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPIn Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPHBaseCon
 
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web ArchivingHBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web Archiving
HBaseCon 2015: Warcbase - Scaling 'Out' and 'Down' HBase for Web ArchivingHBaseCon
 
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...
A Graph Service for Global Web Entities Traversal and Reputation Evaluation B...HBaseCon
 
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...
HBaseCon 2013: Apache Drill - A Community-driven Initiative to Deliver ANSI S...Cloudera, Inc.
 
Real-time HBase: Lessons from the Cloud
Real-time HBase: Lessons from the CloudReal-time HBase: Lessons from the Cloud
Real-time HBase: Lessons from the CloudHBaseCon
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBaseHBaseCon
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon
 
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems Cloudera, Inc.
 

En vedette (20)

Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb
 
Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search Improvements to Apache HBase and Its Applications in Alibaba Search
Improvements to Apache HBase and Its Applications in Alibaba Search
 
Apache HBase - Just the Basics
Apache HBase - Just the BasicsApache HBase - Just the Basics
Apache HBase - Just the Basics
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at Xiaomi
 
Solving Multi-tenancy and G1GC in Apache HBase
Solving Multi-tenancy and G1GC in Apache HBase Solving Multi-tenancy and G1GC in Apache HBase
Solving Multi-tenancy and G1GC in Apache HBase
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction
 
Apache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBaseApache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBase
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBase
 
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
HBaseCon 2015: S2Graph - A Large-scale Graph Database with HBaseHBaseCon 2015: S2Graph - A Large-scale Graph Database with HBase
Rolling Out Apache HBase for Mobile Offerings at Visa

  • 1. Rolling Out Apache HBase for Mobile Offerings at Visa. Partha Saha (pasaha@visa.com), CW Chung (cchung@visa.com). HBaseCon 2016, May 24, 2016.
  • 2. What this talk is about: a choice of NoSQL at Visa. Scale: over 100 billion rows as history from most recent. Speed: millisecond response times for write/read. Real-time: data loaded in real time.
  • 3. An example of a mobile offering: add card to wallet, pay for purchase, see your transaction right away along with recent history. Need NoSQL here.
  • 4. We chose HBase as a NoSQL solution. We built a scalable and real-time Transaction History Service. We migrated prominent Mobile wallet offerings to the Service. This talk is about our learnings over the last year.
  • 5. This talk … 1. We assume some knowledge of and familiarity with HBase. 2. We used HBase 1.0.0 with Cloudera Distribution CDH 5.4.3, so our observations are based on that version of HBase. 3. We cover the important learning events along the way of adopting HBase at Visa: (a) these can help new teams adopting HBase avoid the same pitfalls; (b) our learning continues as we take on more interesting and challenging opportunities.
  • 6. Is YCSB a good way to compare NoSQL options?
  • 7. It is actually not… • Unless you know how to configure your NoSQL options for optimal performance… • You may be driven to another solution because its performance seems “smoother” and easier to explain with rudimentary knowledge. [Two charts, one data series each; axis labels not recoverable] • It is, however, a great tool for observing how system configuration changes performance and for exploring the configuration space for various workloads.
  • 8. Our YCSB experience… • Very easy to set up! • Got a baseline of HBase performance for the cluster. Reran after significant configuration and application code changes. • Key parameters used: – number of client threads – number of operations – number of records in the data set – workload mix of read/update/insert (we added a 100% insert/update workload) – a bash driver script to test various combinations of parameters. • Latency measurement type can be histogram or timeseries. Both were useful.
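The parameter sweep described on this slide can be sketched as follows. The talk used a bash driver script; this is a Python stand-in that only builds the command lines, it does not run them. The function name, the `hbase10` binding name, the workload path, and the `columnfamily=family` value are assumptions for illustration and should be checked against the YCSB release in use.

```python
import itertools
import shlex


def ycsb_commands(threads, operation_counts, record_counts, workloads,
                  binding="hbase10", measurement="histogram",
                  column_family="family"):
    """Build one `ycsb run` command line per parameter combination.

    All argument values are illustrative assumptions; nothing here is
    taken from the cluster described in the talk.
    """
    commands = []
    for t, ops, recs, workload in itertools.product(
            threads, operation_counts, record_counts, workloads):
        args = [
            "bin/ycsb", "run", binding,
            "-P", workload,                        # workload property file
            "-threads", str(t),                    # number of client threads
            "-p", f"operationcount={ops}",         # number of operations
            "-p", f"recordcount={recs}",           # records in the data set
            "-p", f"measurementtype={measurement}",  # histogram or timeseries
            "-p", f"columnfamily={column_family}",
        ]
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    for cmd in ycsb_commands([8, 16], [100000], [1000000],
                             ["workloads/workloada"]):
        print(cmd)
```

Generating the commands separately from running them makes it easy to review the sweep before it touches a cluster, and to rerun exactly the same set after a configuration change.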
  • 9. Should you design yourself out of major compactions?
  • 10. Not worth the trouble when you are starting… • An argument may be made that if we need an “N”-day rolling look-back, we can have daily tables that we create ahead of time and delete once they fall outside the look-back window. We can then reason about how to compact each daily file. Will that make the system operate better? • Write amplification is a well-known problem and gets a lot of attention, but worrying about it during the early design stages seemed like premature optimization. • We thought that we could always optimize later, through rolling compactions and diurnal patterns of traffic, once the patterns of reads and writes were fully understood.
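For reference, the daily-table scheme that this slide considers (and sets aside as premature) can be sketched as simple date arithmetic. The table-name prefix and function names are hypothetical; the talk does not give a naming scheme.

```python
from datetime import date, timedelta


def daily_table(day, prefix="txn_history_"):
    """Hypothetical naming scheme: one table per calendar day."""
    return f"{prefix}{day:%Y%m%d}"


def rolling_window(today, lookback_days):
    """Return (tables to keep, table to pre-create, table to drop).

    Keeps `lookback_days` daily tables ending today, pre-creates
    tomorrow's table, and drops the one that just left the window.
    """
    keep = [daily_table(today - timedelta(days=i))
            for i in range(lookback_days)]
    create = daily_table(today + timedelta(days=1))
    drop = daily_table(today - timedelta(days=lookback_days))
    return keep, create, drop


if __name__ == "__main__":
    print(rolling_window(date(2016, 5, 24), 3))
```

Under such a scheme only the current day's table receives writes, so older tables could be compacted once and left alone; the cost is that every look-back read fans out across N tables.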
  • 11. Does your design need transactional support?
  • 12. We analyzed our secondary and primary key reads/writes.
      [Diagram: a fact table maps each primary key to a fact (pk1, pk2); an association table maps each secondary key to a set of primary keys (sk1 → {pk1}, sk2 → {pk1, pk2}). Operations shown: “Register associations” and “Query keys for facts”.]
      • By tracing reads and failures through updates, we concluded that inconsistencies were short-lived.
      • Otherwise, we would have used a transaction support library.
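The two-table layout on this slide can be modeled in memory to see why an inconsistency is short-lived. This is a sketch only: the class and method names are hypothetical, plain dictionaries stand in for the HBase tables, and the slide does not say in which order facts and associations were written. The reader here simply skips an association whose fact has not arrived yet.

```python
class TransactionStore:
    """In-memory stand-in for the fact table and the association table."""

    def __init__(self):
        self.facts = {}          # primary key -> fact
        self.associations = {}   # secondary key -> set of primary keys

    def put_fact(self, pk, fact):
        self.facts[pk] = fact

    def register_association(self, sk, pk):
        self.associations.setdefault(sk, set()).add(pk)

    def query_facts(self, sk):
        """Look up the keys for sk, then fetch whichever facts already exist."""
        pks = sorted(self.associations.get(sk, ()))
        return [self.facts[pk] for pk in pks if pk in self.facts]
```

Between `register_association("sk2", "pk2")` and `put_fact("pk2", ...)`, a query on `sk2` returns only the facts already written; once the second write lands, the next query sees both without any cross-table transaction.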
  • 13. How do you get hands-on learning about HBase without going into Production?
  • 14. We built a Continuous Integration and Learning Environment
      [Diagram components: Build Server, git/Stash, Bamboo, Artifactory, Client, Chef Client, Test Server.]
      Bamboo plan: Checkout, Build, Upload, Deploy, Run test.
  • 15. How do you get Operations ready for HBase in Production?
  • 16. We allocated one developer for 1 day/week to monitor production problems…
      Sites: Bangalore, India and Foster City, CA, USA
      1. We shadowed the real production.
      2. Any production problem was given priority by the whole team.
      3. We used 2 sites for 24x7 eyes.
      4. We added alert and monitoring dashboards.
      5. We launched only when we met certain metrics.
  • 17. Loading data in real-time as it is read
  • 18. We used a micro-batch approach
      [Diagram: Stream 1, Stream 2, … Stream N are tailed by a Pre-Processor. Two pipelines follow, connected over IPC:
          – Loading: Listing & Sender Tracker, Loader Master Receiver, Loader Workers. Each worker runs Batch Processors (LLF Reader or Stream Reader) that perform the HBase Load.
          – Notification: Listing & Sender Tracker, Notification Master Receiver, Notification Workers. Each worker runs Batch Processors (LLF Reader or Stream Reader) that perform an HBase Registration Query or HBase Query and then Send Notification.
        Labels: Micro-Batch (250 ms); Control and State Files (reads/writes); 1 per Stream; 1..N per Master.]
      We had to build an approach to remember and retry from any point in each stream.
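One way to “remember and retry from any point” is to persist a per-stream offset only after a batch has been loaded. The sketch below assumes a newline-delimited stream file and a JSON state file; the function names, the file layout, and the 500-record batch cap are hypothetical and not taken from the talk. The 250 ms interval comes from the slide.

```python
import json
import os
import time


def load_offset(state_path):
    """Return the last committed byte offset, or 0 if nothing was committed."""
    try:
        with open(state_path) as f:
            return json.load(f)["offset"]
    except FileNotFoundError:
        return 0


def save_offset(state_path, offset):
    """Commit the offset atomically: write a temp file, then rename over the old one."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp_path, state_path)


def read_batch(stream_path, offset, max_records):
    """Read up to max_records complete lines starting at the given byte offset."""
    records = []
    with open(stream_path, "rb") as f:
        f.seek(offset)
        while len(records) < max_records:
            line = f.readline()
            if not line.endswith(b"\n"):   # end of file or a half-written record
                break
            records.append(line[:-1].decode("utf-8"))
            offset = f.tell()
    return records, offset


def run_once(stream_path, state_path, load, max_records=500):
    """Process one micro-batch; the offset advances only if load() succeeds."""
    offset = load_offset(state_path)
    records, new_offset = read_batch(stream_path, offset, max_records)
    if records:
        load(records)                      # if this raises, the offset is not advanced
        save_offset(state_path, new_offset)
    return len(records)


def run_forever(stream_path, state_path, load, interval_seconds=0.25):
    """Tail the stream in 250 ms micro-batches."""
    while True:
        run_once(stream_path, state_path, load)
        time.sleep(interval_seconds)
```

If the process dies between `load()` and `save_offset()`, the same batch is replayed on restart, so this gives at-least-once delivery and relies on the load step being safe to repeat.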
  • 19. Reading via Web Servers
  • 20. The web-services Front End
      [Diagram components: Gateway and Load Balancer; Web Service Wrapper; Rest Controller API; Request Transform; Response Transform; Domain Objects; Audit Listener; BusinessComponent; DataService; HBase API; HBase Plugin; HBase Cluster; Load Distribution Plugin; Cache; Subscription Service; Failover Service; Config Service; Access Authorization; Encryption Utility; Audit; Audit DB; MQ.]
  • 21. Availability
  • 22. We used 2 data centers to get availability
      [Diagram: Data Center 1 streams and Data Center 2 streams, with replication of non-native streams.]
      When one data center is down, the other writes on its behalf into shadow tables; the shadow tables are then drained so that the data center that was down can catch up.
  • 23. Learning your Data Center clock
  • 24. HBase is sensitive to clock skew…
      • Kerberos services do not tolerate more than a few minutes of clock skew.
      • Warnings are generated for small skews; large skews kill region servers.
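A monitoring check for these thresholds might look like the sketch below. The default values are assumptions to verify against the cluster's own configuration: 10 s and 30 s are what `hbase.master.warningclockskew` and `hbase.master.maxclockskew` are commonly documented to default to, and 300 s is the usual MIT Kerberos `clockskew` default. The function name and the returned messages are made up for illustration.

```python
def classify_clock_skew(skew_ms, hbase_warn_ms=10_000, hbase_max_ms=30_000,
                        kerberos_max_ms=300_000):
    """Return the problems to expect for a given clock skew, in milliseconds."""
    skew = abs(skew_ms)                    # direction of the skew does not matter
    findings = []
    if skew > hbase_max_ms:
        findings.append("hbase: region server likely rejected by the master")
    elif skew > hbase_warn_ms:
        findings.append("hbase: clock skew warning")
    if skew > kerberos_max_ms:
        findings.append("kerberos: authentication likely to fail")
    return findings
```

Feeding this from whatever measures host-to-host offset (for example an NTP monitor) lets an alert fire while the skew is still only in the warning range.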
  • 25. Client retries
  • 26. Client retries & IOExceptions
      • Default HBase timeout/retry settings can take tens of minutes to time out:
          – hbase.rpc.timeout: 60 sec
          – hbase.client.retries.number: 35
          – hbase.client.pause: 100 msec (grows to 10 sec quickly after back-off)
          – Longer still when potential retries by ZooKeeper are factored in!
          – Blogs by Lars Hofhansl: “HBase Client timeouts”, “HBase client response times”
      • We chose a fail-fast strategy, as the end-user device does an end-to-end retry.
      • Timeout/retry settings: 1 sec timeout, 3 total tries.
          – Works well within the same data center, as well as across data centers.
      • However, once in a while, clients see IOExceptions!
          – Caused by the Region Server (busy in GC, major/minor compaction, …?)
          – Or the network?
          – Or the client itself?
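The “tens of minutes” figure can be reproduced with a little arithmetic. The sketch below rests on several assumptions: the back-off multiplier table is written from memory of `HConstants.RETRY_BACKOFF` in HBase 1.x and should be checked against the source of the version in use, the small random jitter the client adds is ignored, every attempt is assumed to run into the full RPC timeout, and the retry count is treated as the total number of attempts.

```python
# Back-off multipliers applied to hbase.client.pause (assumed, see lead-in).
RETRY_BACKOFF = (1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200)


def pause_ms(base_pause_ms, tries):
    """Pause before the next attempt, after `tries` earlier failures (0-based)."""
    index = min(tries, len(RETRY_BACKOFF) - 1)   # stay on the last multiplier
    return base_pause_ms * RETRY_BACKOFF[index]


def worst_case_seconds(rpc_timeout_ms, attempts, base_pause_ms):
    """Upper bound if every attempt times out, with a pause between attempts."""
    total_ms = attempts * rpc_timeout_ms
    total_ms += sum(pause_ms(base_pause_ms, i) for i in range(attempts - 1))
    return total_ms / 1000.0
```

Under these assumptions, `worst_case_seconds(60_000, 35, 100)` comes to 2608.1 seconds, about 43 minutes, while the fail-fast settings, `worst_case_seconds(1_000, 3, 100)`, come to 3.3 seconds if the pause is left at 100 msec.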
  • 27. Correlating client exceptions
  • 28. Correlating client exceptions
      • Client side:
          – Turn on HBase client debug logging:
              • log4j.logger.org.apache.hadoop.hbase.client=DEBUG
              • log4j.logger.org.apache.hadoop.hbase.ipc=DEBUG
          – Catch the exceptions and print the specific Region Server name:
              • IOException, RetriesExhaustedWithDetailsException
      • Server side:
          – Then look into the Region Server log of that specific server.
      • Works well when you know the specific server causing the IOExceptions.
          – What if not?
  • 29. Correlating client exceptions
      • Build Root Cause Analysis software to:
          – Collect the relevant logs from the sources:
              • Client: application logs, HBase client logs, GC logs
              • Hadoop server: HBase, HDFS, ZooKeeper server and GC logs
              • Cluster events: Cloudera Manager API
              • Other logs: KDC logs, Kerberos canary, network latency monitoring
          – Parse the logs (single-line text, multi-line text, JSON, XML) into CSV files.
          – Normalize date and time formats; apply date and time range filtering.
          – Apply text filtering and text reduction on verbose lines.
          – Output: an events CSV, sorted by time and server, suitable for grep/awk/sort and Hive/SQL.
      • Quickly get a total view of the sequence of events across the various services.
      • Sometimes it identifies the smoking gun (e.g. an exception caused by GC).
      • Still useful in the few cases when no smoking gun can be found!
          – Troubleshooting is also a process of elimination.
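The parse, normalize, filter, and sort steps listed on this slide can be sketched for a single log format. This is not the authors' tool: the function names are hypothetical, only log4j-style lines of the form `YYYY-MM-DD HH:MM:SS,mmm ...` are handled, timestamps are assumed to share one time zone, and text reduction is reduced to simple truncation.

```python
import csv
import io
import re
from datetime import datetime

# Matches the default log4j timestamp prefix, e.g. "2016-05-24 10:00:02,500 WARN ..."
LOG4J_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) (.*)$")


def parse_log4j(lines, server, source):
    """Turn log lines into events; stack-trace lines fold into the previous event."""
    events = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOG4J_LINE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            stamp = stamp.replace(microsecond=int(match.group(2)) * 1000)
            events.append({"time": stamp, "server": server,
                           "source": source, "message": match.group(3)})
        elif events and line.strip():
            events[-1]["message"] += " | " + line.strip()
    return events


def merge_events(event_lists, start=None, end=None, max_message_len=200):
    """Filter by time range, shorten verbose messages, sort by time then server."""
    merged = []
    for events in event_lists:
        for event in events:
            if start is not None and event["time"] < start:
                continue
            if end is not None and event["time"] > end:
                continue
            merged.append(dict(event, message=event["message"][:max_message_len]))
    merged.sort(key=lambda e: (e["time"], e["server"]))
    return merged


def to_csv(events):
    """Render events as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["time", "server", "source", "message"])
    for e in events:
        writer.writerow([e["time"].isoformat(timespec="milliseconds"),
                         e["server"], e["source"], e["message"]])
    return buffer.getvalue()
```

With client and Region Server logs merged this way, a client-side exception and a server-side pause a few hundred milliseconds earlier end up on adjacent rows, which is the kind of side-by-side view the slide describes.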
  • 30. Kerberos Gotchas
  • 31. Kerberos Gotchas – what we have learned
      • Use FQDN hostnames (Fully Qualified Domain Name, like server123.abc.com).
      • Use TCP rather than UDP (set udp_preference_limit = 1 in krb5.conf).
      • KDC (MIT Kerberos) server:
          – Configure it to start several KDC worker processes to handle bursty traffic (use the -w option).
          – Set up a backup KDC for higher availability.
      • Debugging tips:
          – $ export KRB5_TRACE=/dev/stderr (or to a file)
          – JVM option: -Dsun.security.krb5.debug=true
      • Kerberos support is built into the Java JRE, using internal classes:
          – Oracle JDK: com.sun classes; on IBM AIX: com.ibm
          – Hadoop is built and tested against the Oracle JDK (mileage on the AIX JDK varies).
      • Good references (besides the usual documents on Kerberos and the HBase user mailing list):
          – Steve Loughran: Hadoop and Kerberos: The Madness beyond the Gate.
          – HBase and Hadoop common source code: UserGroupInformation.java.
  • 32. Kerberos Gotchas – what we learned – Renewing a TGT (Ticket Granting Ticket)
      • After a successful kinit, the application principal gets a Kerberos TGT.
      • By default, the TGT is good for 10 hours.
      • For long-running applications, 10 hours is obviously not enough: the TGT needs to be renewed.
      • Initially we used a process/thread to run kinit once every few hours.
          – Still ran into some IOExceptions at the time of TGT renewal.
          – Not the recommended way for long-running applications.
      • Now we use the UGI API (UserGroupInformation): loginUserFromKeytab().
          – Does not require a separate process/thread to do TGT renewal.
          – The Hadoop/HBase client class library catches the exception caused by TGT expiration and calls reloginFromKeytab() to renew the TGT automatically.
          – Also considering spawning a thread to proactively invoke checkTGTAndReloginFromKeytab().
          – Ongoing investigation: the client occasionally still experiences a momentary IOException around the time of ticket renewal.
          – Referral ticket: when one realm is set up to trust another realm, be aware of the additional KDC calls that result when the kinit principal is from the trusted realm.
  • 33. Garbage Collection
  • 34. Garbage Collection
      • Use G1 on Oracle JDK 1.8.
      • Basically using the settings recommended at HBaseCon 2015 by Eric Kaczmarek, Yanping Wang, and Liqi Yi.
      • Set the target GC pause to 100 msec; Young Gen to ~1 GB.
      • Observations consistent with their published results:
          – Observed GC time in production:
              • 100 msec or less: 67%
              • 400 msec or less: 99.8%
      • It is important to track the actual production GC time, as the Production and Test clusters show somewhat different distributions.
  • 35. GC Duration comparison: production vs perf cluster
  • 36. GC: How Good is MaxGCPauseMillis as a Target?
      MaxGCPauseMillis = 100

      Metric                                            Production Cluster            Test Cluster
      # of GC events                                    165192                        199883
      Avg / Std Dev / Max (msec)                        87.1 / 64.9 / 1530            81.9 / 37.2 / 1370
      50th percentile, median (msec)                    80                            90
      95th / 99th / 99.9th / 99.99th percentile (msec)  210 / 270 / 450 / 660         120 / 140 / 510 / 780
      % of events within 100 / 200 / 300 / 400 msec     67% / 95% / 99.4% / 99.8%     85% / 99.4% / 99.6% / 99.8%
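Rows like those in the table above can be computed from a list of GC pause durations extracted from the GC logs. The sketch below uses the nearest-rank percentile definition, which is an assumption: the talk does not say how its percentiles were calculated. The function names are hypothetical.

```python
import math
import statistics


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct between 0 and 100)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def fraction_within(values, threshold_ms):
    """Share of pauses that are at or below the threshold."""
    return sum(1 for v in values if v <= threshold_ms) / len(values)


def summarize_pauses(pauses_ms, thresholds_ms=(100, 200, 300, 400)):
    """Summary statistics for a list of GC pause durations in milliseconds."""
    ordered = sorted(pauses_ms)
    return {
        "count": len(ordered),
        "avg": statistics.mean(ordered),
        "std_dev": statistics.pstdev(ordered),
        "max": ordered[-1],
        "percentiles": {p: percentile(ordered, p) for p in (50, 95, 99, 99.9, 99.99)},
        "within": {t: fraction_within(ordered, t) for t in thresholds_ms},
    }
```

Running the same summary over production and test GC logs gives a direct check of how often the MaxGCPauseMillis target is actually met on each cluster.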
  • 37. In Conclusion…
  • 38. Adopting an open source product is a journey…
      • Learning from previous adoption successes is crucial. If a use case has not been tried, analyzed, or written about before, chances are we will have to pay for the learning, and having alternate choices is a good idea.
      • Making only one major technology change at a time is always a good idea.
      • Setting appropriate expectations through team members and agile processes is important.
      • Going into a production scenario early as a shadow, and learning through frequent releases, is helpful.
      • We believe extra capacity for peak workloads was very helpful.
      • Having the source code is very useful for learning and troubleshooting.
  • 39. It Takes a Village! Thank you! Alexandr Peyko Amit Sharma Anthony Chu Arindam Chakraborty Artem Savinov Aviral Agarwal Bala Saravanan Kannan Ben Crane Carl Duque Chetan Talanki Debasis Mullick Deepankar Palit Hong Zhu Igor Karpenko Igor Peller Igor Ulianitski Jay Gardner Jim Gordon Karthikeyan Manickavasagan Liang Gao Murali Reddy Nandakumar Jayakumar Nimish Shah Peter Meigs Pradyot Sikdar Praveen Rudraraju Rajat Raj Raj Merchia Ralph Blore Ranjan Dutta Ricardo De Ocampo Domingo Robert Walsh Sabu Peter Sam Hamilton Sandeep Reddy Satyaban Nandi Soumya Das Srijoy Aditya Srinivas Reddy Surasani Suchismita Nayak Suresh Pulikara Ujjwal Kumar Vikash Talanki Vinay Sarda Waqar Hasan Winnie Chau Xuepeng (Hans) Li Yanyan Hao Yusuf Rahaman Amandeep Khurana Jeongho Park Jugoslav Djajic Justin Hayes Michael Stack