3. Features
• Seamless integration with Hadoop
• Distributed operation
– Fault tolerance
– Load balancing
– Easily add/remove nodes
• Non-technical reasons
– Large community
– Large-scale online use cases
4. When should I use HBase?
• Not for all problems
• Hundreds of millions or billions of rows
• You can live without the extra features an
RDBMS provides (typed columns, secondary indexes,
transactions)
• Enough hardware
5. History
• 2006: BigTable paper published by Google.
• 2006 (end of year): HBase development starts.
• 2008: HBase becomes Hadoop sub-project.
• 2010: HBase becomes Apache top-level project.