4. Motivation
§ Traditional DBMS:
• SQL ✓
• Consistent ✓
• Scalable ✗
§ NoSQL stores:
• SQL ✗
• Consistent ✗
• Scalable ✓
§ Some use cases require
• Scalability
• Consistent updates
5. Motivation
§ Lower latency
• MapReduce latencies measured in hours
• Low-latency systems usually don't provide strong consistency guarantees
§ Requested feature
• Support for transactions in HBase is a recurrent request
§ Percolator
• Google implemented transactions on BigTable
9. Omid
§ ‘Hope’ in Persian
§ Optimistic Concurrency Control system
• Lock free
• Aborts transactions at commit time
§ Implements Snapshot Isolation
§ Low overhead
§ https://github.com/yahoo/omid
10. Snapshot Isolation
§ Each transaction reads from its own snapshot
§ Snapshot:
• Immutable
• Identified by creation time
• Contains all values committed before creation time
§ Transactions conflict if two conditions apply:
A) Write to the same row
B) Overlap in time
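§ Stated compactly (a sketch, not from the slides; W(T) is T's write set, [s, c] its start-to-commit interval):

  \text{conflict}(T_1, T_2) \iff W(T_1) \cap W(T_2) \neq \emptyset \,\wedge\, [s_1, c_1] \cap [s_2, c_2] \neq \emptyset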
11. Overlapping transactions
[Figure: Transactions 1–4 on a time axis; Transaction 2 overlaps 1 and 3, Transaction 4 starts after the others end]
§ Transaction 2 could conflict with 1 or 3
• If they write to the same row
§ Transaction 4 doesn’t conflict
• No overlapping transactions
12. Simple API
§ Based on Java Transaction API and HBase API
§ TransactionManager:
Transaction begin();
void commit(Transaction t) throws RollbackException;
void rollback(Transaction t);
§ TTable:
Result get(Transaction t, Get g);
void put(Transaction t, Put p);
ResultScanner getScanner(Transaction t, Scan s);
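§ A minimal usage sketch of the API above (how the TransactionManager tm and the TTable table are obtained is deployment-specific and assumed here):

  Transaction t = tm.begin();
  try {
      Put p = new Put(Bytes.toBytes("row1"));
      p.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("v"));
      table.put(t, p);              // write buffered under transaction t
      tm.commit(t);                 // throws RollbackException on conflict
  } catch (RollbackException e) {
      // aborted at commit time (optimistic concurrency); caller may retry
  }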
14. Architecture
1. Centralized server
2. Transactional metadata replicated to clients
3. Store transaction ids in HBase timestamps
[Diagram: HBase clients connect to the centralized Omid Status Oracle (SO) (1); transactional metadata is replicated from the SO to the clients (2); clients read and write the HBase Region Servers directly, with transaction ids stored in HBase timestamps (3)]
15. Omid Performance
§ Centralized server = bottleneck?
• Focus on good performance
• In our experiments, it never became the bottleneck
[Plot: latency in ms (0–30) vs. throughput in TPS (1K–100K) for transaction sizes of 2, 4, 8, 16, 32, 128, and 512 rows]
16. Metadata Replication
§ Transactional metadata is replicated to clients
§ Clients guarantee Snapshot Isolation
• Ignore data not in their snapshot
• Talk directly to HBase
• Conflicts resolved by the Status Oracle
§ Expensive, but scalable up to 1000 clients
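§ A sketch of the visibility rule each client applies (names are illustrative, not Omid's actual API):

  // A cell written at timestamp writeTs is in transaction r's snapshot
  // only if the writer committed before r started.
  boolean isVisible(long writeTs, Transaction r) {
      Long commitTs = commitMetadata.get(writeTs);  // replicated from the SO
      return commitTs != null && commitTs < r.getStartTimestamp();
  }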
17. Fault Tolerance
§ Omid uses BookKeeper for fault tolerance
• BookKeeper is a Distributed Write Ahead Log
§ Before answering a commit request
• Log it to BookKeeper asynchronously
• Wait for a reply
• Notify the client
§ If the Status Oracle crashes
• Recover the state from BookKeeper
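§ A sketch of the commit path using BookKeeper's asyncAddEntry (commitRecord, replyCommitted, and replyAborted are illustrative names):

  // Ack the client only after the commit record is durable in the log.
  lh.asyncAddEntry(commitRecord, (rc, handle, entryId, ctx) -> {
      if (rc == BKException.Code.OK) {
          replyCommitted(client);
      } else {
          replyAborted(client);     // log write failed; don't ack the commit
      }
  }, null);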
18. Fault Tolerance Overhead
§ Omid batches writes to BookKeeper (see the sketch below)
• Flush every 5 ms or when the batch exceeds 1 KB
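§ Flush-policy sketch (illustrative names; the real batching logic lives inside the Status Oracle):

  // Send the pending batch when it exceeds 1 KB or 5 ms have elapsed.
  if (pending.size() > 1024 || clock.millis() - lastFlushMillis >= 5) {
      flushToBookKeeper(pending);
      lastFlushMillis = clock.millis();
      pending.reset();
  }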
§ Recovery: reads log’s tail (bounded time)
[Plot: throughput in TPS over time in s; throughput drops during a Status Oracle failure and recovers, with downtime < 25 s]
20. TF-IDF
§ Term Frequency – Inverse Document Frequency
• How important a word is to a document
• Useful for search engines
§ Given a set of words (query)
• Return documents with highest TF-IDF
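§ For reference, the standard formulation (the slides don't fix a variant; N is the number of documents, df(t) the number containing term t):

  \mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{\mathrm{df}(t)}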
21. TF-IDF on Tweets
§ Given a set of words
• Return relevant HashTags
§ Document
• A collection of tweets with the same hashtag
§ Update the index incrementally
22. Implementation
§ Read the tweet stream, put tweets in a queue
§ Workers process each tweet in parallel
• One transaction per tweet
• Update frequencies consistently
• In case of abort, retry
§ Queries have a consistent view of the database
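§ Worker-loop sketch using the API from slide 12 (queue, updateFrequencies, and the unbounded retry policy are illustrative):

  while (running) {
      Tweet tweet = queue.take();          // blocking read from the queue
      for (;;) {
          Transaction t = tm.begin();
          try {
              updateFrequencies(t, tweet); // TTable puts on tf/df counters
              tm.commit(t);
              break;                       // committed; move to next tweet
          } catch (RollbackException e) {
              // write-write conflict with another worker; retry the tweet
          }
      }
  }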
23. Problems
§ Hard to distribute load to other machines
§ Complex processing causes big transactions
• More likely to conflict
• More expensive to retry
§ API is too low level for data processing
25. Future work
§ Framework for Incremental Processing
• Simpler API
• Trigger-based, easy to decompose operations
• Auto-scaling
§ Integrate Omid with other Data Stores
26. Contributors
Ben Reed
Flavio Junqueira
Francisco Pérez-Sorrosal
Ivan Kelly
Matthieu Morel
Maysam Yabandeh