2. NoSQL in the Real World
In 2006
Google developed and uses Bigtable [1]
Amazon developed and uses Dynamo [2]
CouchDB [3] is in production use
In 2009
Yahoo! developed and uses Sherpa [4]
LinkedIn developed and uses Voldemort [5]
Facebook developed and used Cassandra [6], now uses HBase [7]
Other companies use Cassandra [6]
Several companies use MongoDB [8]
In 2011
Oracle introduced its own NoSQL database [9]
Karamjit Kaur (TU) NoSQL Databases April 2013 2 / 27
3. Introduction to NoSQL Databases
NoSQL = “No SQL”
NoSQL = “Not Only SQL”
Carlo Strozzi used the term NoSQL in 1998 to name his lightweight,
open-source relational database that did not expose the standard SQL
interface [10].
In 2009, Eric Evans reintroduced the term to describe the growing
non-RDBMS movement [11].
Broadly refers to a set of data stores that do not use SQL or a relational
model to store data.
4. NoSQL Database Model
Does Not
Use SQL as the query language
Require fixed table schemas
Support join operations
Give all ACID properties (provides BASE [12] instead)
Does
Scale horizontally [13]
Provide eventual consistency [14]
Support shared nothing architecture
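Eventual consistency can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not how any particular product works: the `Replica` class, its last-write-wins rule, and the caller-supplied integer timestamps are all hypothetical simplifications.

```python
class Replica:
    """Toy replica that resolves conflicts with last-write-wins (an assumed policy)."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Keep the incoming value only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Anti-entropy pass: pull every entry from the other replica.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


replica_a = Replica("a")
replica_b = Replica("b")

# Each replica accepts a write on its own, so reads can disagree for a while.
replica_a.write("x", "old", timestamp=1)
replica_b.write("x", "new", timestamp=2)
stale_read = replica_a.read("x")

# After the replicas exchange state, both converge on the newest value.
replica_a.sync_from(replica_b)
replica_b.sync_from(replica_a)
converged = (replica_a.read("x"), replica_b.read("x"))
```

Between the two writes and the sync, `replica_a` serves a stale value. That window is the price paid for accepting writes without coordination.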
5. Why Now?
Three V’s
Velocity – Speed of data in and out
Volume – Large amount of data, Scalability
Variety – Semi-structured or unstructured data, Impedance mismatch
Availability of cheap main memory
Change in architecture – from Web 1.0 to Web 2.0+
Need for high availability
High personnel cost
6. Benefits of Relational Databases
Incredibly mature
Provides immediate consistency
Integrity of data is enforced
Efficient use of storage space if properly normalized
Powerful query language
Help is plentiful and easy to find
Supported by everyone and everything
7. Problems with Relational Databases
Vertical scaling (scaling up) [13]
Replication with strong consistency limits availability
Single point of failure
Object relational impedance mismatch [15]
Static, rigid and inflexible design
Poor handling of semi-structured and non-structured data
Expensive join operations due to normalization
8. NoSQL Advantages
No unwanted complexity
High throughput
Horizontal scalability
Economical
Avoidance of expensive object-relational mapping
Flexible data model
Reduced DBA workload
10. Key-Value Data Store
Data is organized as an associative array of entries
Key-based storage, update and retrieval
Allow the application developer to store schema-less data
Fast storage and retrieval
Transparent partition and replication (based on keys)
Most famous key-value data store: Amazon’s Dynamo [2]
Other examples: Redis [16], Voldemort [5]
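A minimal sketch of the key-value idea, assuming an in-memory store whose partitions stand in for separate nodes. `PartitionedKVStore` and its hash-modulo placement are hypothetical; real systems such as Dynamo use more elaborate schemes.

```python
import hashlib


class PartitionedKVStore:
    """Toy key-value store that spreads keys over a fixed number of partitions."""

    def __init__(self, num_partitions=4):
        # Each dict stands in for one node holding a share of the keys.
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition_for(self, key):
        # A stable hash of the key decides which partition owns it.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key, default=None):
        return self.partitions[self._partition_for(key)].get(key, default)

    def delete(self, key):
        self.partitions[self._partition_for(key)].pop(key, None)


kv_store = PartitionedKVStore()
# Values are opaque to the store, so no schema is declared anywhere.
kv_store.put("session:42", {"user": "alice", "cart": ["book"]})
kv_store.put("session:43", "any value type is accepted")
session = kv_store.get("session:42")
```

The caller sees only `put`, `get` and `delete`; which partition holds a key is decided inside the store, which is what makes the partitioning transparent.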
12. Document Data Store
Stores, retrieves and manages semi-structured data
Support multiple types of documents and nested documents too
Each document is identified by a unique key
Provides an API that allows retrieving documents based on their contents
Different documents may have different fields
Examples: CouchDB [3], MongoDB [8]
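A rough sketch of content-based retrieval, assuming documents are plain Python dictionaries. `DocumentStore`, `find` and the dotted-path syntax are hypothetical and do not mirror the API of any real document database.

```python
_MISSING = object()  # sentinel so a stored None differs from an absent field


def _lookup(document, dotted_path):
    """Follow a path such as 'address.city' through nested dictionaries."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


class DocumentStore:
    """Toy document store: unique key -> semi-structured document."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, document):
        self._docs[doc_id] = document

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, dotted_path, value):
        # Query by content: return the ids of documents whose field matches.
        return [
            doc_id
            for doc_id, doc in self._docs.items()
            if _lookup(doc, dotted_path) == value
        ]


doc_store = DocumentStore()
# Documents in one store need not share the same fields.
doc_store.insert("u1", {"name": "Alice", "address": {"city": "Delhi"}})
doc_store.insert("u2", {"name": "Bob", "tags": ["admin"]})
doc_store.insert("o1", {"title": "Invoice", "address": {"city": "Delhi"}})
delhi_ids = doc_store.find("address.city", "Delhi")
```

Unlike a pure key-value store, the store looks inside each value, so a query can match on a nested field that only some documents have.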
14. Column-oriented Data Store
Also called extensible record stores
Data is stored column-wise instead of row-wise
A group of columns is called a column family and is analogous to a table
in a relational database
Columns of a table are distributed over multiple nodes by using column
groups
New columns can be easily added to column families
Each row can have a different set of columns
Allows versioning of data
Most famous column-oriented data store: Google’s Bigtable [1]
Other examples: Cassandra [6], HBase [7]
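The column-family layout can be sketched as nested maps: row key, then column family, then column, then timestamped versions. This is a simplified illustration; `ColumnFamilyTable` and its caller-supplied integer timestamps are assumptions, not the interface of Bigtable or any other system.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy table: row key -> column family -> column -> versioned values."""

    def __init__(self, families):
        # Column families are fixed up front; columns inside them are not.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value, timestamp):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        versions = self.rows[row_key][family].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda pair: pair[0])  # oldest first, newest last

    def get(self, row_key, family, column, timestamp=None):
        # Return the newest value, or the newest one at or before `timestamp`.
        versions = self.rows.get(row_key, {}).get(family, {}).get(column, [])
        visible = [v for t, v in versions if timestamp is None or t <= timestamp]
        return visible[-1] if visible else None


table = ColumnFamilyTable(families=["info"])
table.put("row1", "info", "name", "Alice", timestamp=1)
table.put("row1", "info", "name", "Alicia", timestamp=2)
# A second row can carry a column that the first row never defined.
table.put("row2", "info", "email", "bob@example.com", timestamp=1)

latest_name = table.get("row1", "info", "name")
earlier_name = table.get("row1", "info", "name", timestamp=1)
```

Adding the `email` column required no schema change, and the older value of `name` stays readable by timestamp.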
16. Graph-based Data Store
Employ nodes (like entities), properties (attributes), and edges
(relationships)
Faster for associative data sets
Can scale to large data sets without joins
Every element contains a direct pointer to its adjacent element
Traverse graph to find the data
Efficient for representing social networks and storing sparse data
Examples: Neo4j [15], InfiniteGraph [17]
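A small sketch of traversal over adjacency lists, assuming a directed property graph held in memory. The `Graph` class and its `within` method are hypothetical names chosen for illustration.

```python
from collections import deque


class Graph:
    """Toy property graph: nodes with properties, labelled directed edges."""

    def __init__(self):
        self.properties = {}  # node id -> dict of properties
        self.edges = {}       # node id -> list of (label, neighbour id)

    def add_node(self, node_id, **props):
        self.properties[node_id] = props
        self.edges.setdefault(node_id, [])

    def add_edge(self, source, label, target):
        # Each node keeps direct references to its adjacent nodes.
        self.edges[source].append((label, target))

    def within(self, start, label, max_hops):
        """Breadth-first walk along `label` edges, up to `max_hops` away."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for edge_label, neighbour in self.edges[node]:
                if edge_label == label and neighbour not in seen:
                    seen.add(neighbour)
                    found.append(neighbour)
                    queue.append((neighbour, hops + 1))
        return found


graph = Graph()
for person in ("alice", "bob", "carol", "dave"):
    graph.add_node(person, kind="person")
graph.add_edge("alice", "FRIEND", "bob")
graph.add_edge("bob", "FRIEND", "carol")
graph.add_edge("carol", "FRIEND", "dave")

friends_of_friends = graph.within("alice", "FRIEND", max_hops=2)
```

The query follows stored edges from node to node instead of joining tables, so its cost depends on the neighbourhood visited rather than on the total size of the data set.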
18. NoSQL Disadvantages
Not mature enough
Lack of support
Standardization pending
Less expertise
Require redesigning
Reluctance of enterprises to adopt non-ACID databases
19. NewSQL Databases = SQL + NoSQL Databases
Term coined by Matthew Aslett of the 451 Group in the report
“NoSQL, NewSQL and Beyond” [18]
Preserve SQL
Use the traditional ACID notion for transactions
Offer high performance
Offer a scale-out, shared-nothing architecture, capable of running on a
large number of nodes without creating bottlenecks
Examples: VoltDB [19], Xeround [20], NuoDB [21], JustOneDB [22]
20. One Size Does Not Fit All
Redis for user sessions: Rapid access for reads and writes
RDBMS for financial data: Transactional updates and reporting
Riak for shopping cart: High availability across multiple locations
Neo4j for recommendations: Rapidly traverse links between friends,
product purchases and ratings
MongoDB for product catalog: Lots of reads, infrequent writes
Cassandra for analytics and user activity logs: High volume of writes
on multiple nodes
22. References I
[1] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach,
M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: a
distributed storage system for structured data,” in Proc. of the 7th
symposium on Operating systems design and implementation OSDI
’06, Berkeley, CA, 2006, pp. 205–218.
[2] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati,
A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and
W. Vogels, “Dynamo: Amazon’s highly available key-value store,” in
Proc. of twenty-first ACM SIGOPS symposium on Operating systems
principles, Stevenson, Washington, USA, 2007.
[3] (2010) The CouchDB website. [Online]. Available:
http://couchdb.apache.org/
23. References II
[4] D. Arseneau. (2010, Aug.) 10 things you should know about NoSQL
databases. [Online]. Available:
http://www.techrepublic.com/blog/10things/10-things-you-should-know-about-nosql-databases/1772
[5] Project Voldemort: A distributed database. [Online]. Available:
http://project-voldemort.com/
[6] A. Lakshman and P. Malik, “Cassandra - a decentralized structured
storage system,” Technical Report, Cornell University, 2009.
[7] The HBase website. [Online]. Available: http://hbase.apache.org/
[8] The MongoDB website. [Online]. Available:
http://www.mongodb.org/
[9] The Oracle website. [Online]. Available:
http://www.oracle.com/technetwork/products/nosqldb/overview/index.html
24. References III
[10] C. Strozzi. NoSQL: a relational database management system. [Online].
Available: http://www.strozzi.it/cgi-bin/CSA/tw7/I/en_US/nosql/Home%20Page
[11] E. Evans. (2009, May) NoSQL 2009. [Online]. Available:
http://blog.sym-link.com/2009/05/12/nosql_2009.html
[12] D. Pritchett, “BASE: An ACID alternative,” ACM Queue, pp. 48–55,
May 2008.
[13] T. Hoff. (2009, Aug.) An unorthodox approach to database design:
The coming of the shard. [Online]. Available:
http://highscalability.com/unorthodox-approach-database-design-coming-shard
[14] S. Gilbert and N. Lynch, “Brewer’s conjecture and the feasibility of
consistent, available, partition-tolerant web services,” ACM SIGACT
News, vol. 33, pp. 51–59, 2002.
25. References IV
[15] (2006, Nov.) The neo database. [Online]. Available:
http://dist.neo4j.org/neo-technology-introduction.pdf
[16] J. Zawodny, “Redis: Lightweight key/value store that goes the extra
mile,” Linux Magazine, Aug. 2009.
[17] The InfiniteGraph website. [Online]. Available:
http://www.infinitegraph.com/
[18] M. Aslett. (2011, Apr.) NoSQL, NewSQL and beyond: The answer to
sprained relational databases. [Online]. Available:
http://blogs.the451group.com/information_management/2011/04/15/nosql-newsql-and-beyond/
[19] The VoltDB website. [Online]. Available: http://voltdb.com/
[20] The Xeround website. [Online]. Available: http://xeround.com/
[21] The NuoDB website. [Online]. Available: http://www.nuodb.com/
26. References V
[22] The JustOneDB website. [Online]. Available:
http://www.justonedb.com/