4. Why? Not only SQL
• Size
• Distributed data with accelerating growth of data
• Scalability & elasticity (at low cost!)
• Connectedness
• Global linked data
• Semi-structure
• Flexible schemas / semi-structured data
• Complex queries
• Architecture
• Data mining and association toward more complex data modeling
• Transactions / strong consistency / integrity
• Geographic distribution (multiple datacenters)
4/12/2011 Creative Commons Attribution-Share Alike 3.0 4
8. NoSQL Taxonomy
Key-Value stores
• Simple K/V lookups (DHT)
Column stores
• Each key is associated with many attributes (columns)
• NoSQL column stores are actually hybrid row/column stores
• Different from “pure” relational column stores!
Document stores
• Store semi-structured documents (JSON)
• Map/Reduce based materialization, sorting, aggregation, etc.
Graph databases
• Scale, semi-structured data model
More …
12. Why Graph Databases?
Data mining
• You can build algorithms that search for patterns and add AI
Highly critical environments
• You can apply Neo4j to high-load databases to optimize
queries and reduce hardware costs
Engineering in biochemical components
• You can build algorithms that help the study of protein synthesis,
for example
Discrete event simulation
• You can model patterns and behaviors and map everything onto a
graph database
Social graph
• Everything related to users' “tastes” can be organized in a graph
Network architecture
13. When should I use a Graph DB?
Massive data volumes
• Massively distributed architecture required to store the data
• Google, Amazon, Yahoo, Facebook – 10-100K servers
Extreme query workload
• Impossible to efficiently do joins at that scale with an RDBMS
Have a complex and evolving data model
• Big part of domain is expressed as relationships
• Schema flexibility (migration) is not trivial at large scale
• Schema changes can be gradually introduced with NoSQL
• Few mandatory and many optional attributes
• Have SQL queries that span many table joins
Many YES => maybe a Graph DB is a good choice
14. When NOT to use a Graph DB
• Don't have a graph-related problem?
• Requirements don't change much?
• Easy to organize data into:
− Tables, Documents or Key-Value models?
• Few & well-defined relationships in the domain?
• Don't have SQL queries that span many table joins?
Many YES => maybe a Graph DB is not a good choice
15. Undirected Graph
• Dots (vertices) + lines (edges) = graphs.
• The Undirected Graph
Vertices
• All vertices denote the same type of object.
Edges
• All edges denote the same type of relationship.
• All edges denote a symmetric relationship.
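To illustrate the undirected graph described on this slide, here is a minimal Python sketch. The class name `UndirectedGraph`, its methods, and the sample vertices are hypothetical and not from the original deck; it assumes one kind of vertex and one symmetric relationship, stored as adjacency sets.

```python
from collections import defaultdict


class UndirectedGraph:
    """Illustrative only: one vertex type, one symmetric edge type."""

    def __init__(self):
        # Maps each vertex to the set of vertices it is connected to.
        self._adj = defaultdict(set)

    def add_edge(self, a, b):
        # A symmetric relationship is recorded in both directions.
        self._adj[a].add(b)
        self._adj[b].add(a)

    def neighbors(self, v):
        # Return a copy so callers cannot mutate internal state.
        return set(self._adj.get(v, ()))

    def has_edge(self, a, b):
        return b in self._adj.get(a, ())


# Hypothetical sample data.
g = UndirectedGraph()
g.add_edge("alice", "bob")
g.add_edge("bob", "carol")
```

Because each edge is stored in both directions, `g.has_edge("alice", "bob")` and `g.has_edge("bob", "alice")` give the same answer.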
16. Directed, Multiple Relational Graph
Vertices
• Vertices can denote different types of objects.
Edges
• Edges can denote different types of relationships.
• All edges denote an asymmetric relationship.
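A directed, multi-relational graph can be sketched in Python as a set of (source, relation, target) triples. Everything here is an illustrative assumption rather than anything from the deck or the Neo4j API: the `Vertex` and `MultiRelationalGraph` classes, the `kind` labels, and the relation names `KNOWS` and `LIKES`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Vertex:
    name: str
    kind: str  # the type of object, e.g. "Person" or "Movie" (hypothetical)


@dataclass
class MultiRelationalGraph:
    # Each edge is a directed (source, relation, target) triple.
    edges: set = field(default_factory=set)

    def add_edge(self, source, relation, target):
        # Directed: only source -> target is recorded, not the reverse.
        self.edges.add((source, relation, target))

    def out(self, source, relation):
        # Targets reachable from `source` over edges of the given relation.
        return {t for s, r, t in self.edges if s == source and r == relation}


# Hypothetical sample data.
alice = Vertex("alice", "Person")
bob = Vertex("bob", "Person")
matrix = Vertex("The Matrix", "Movie")

g = MultiRelationalGraph()
g.add_edge(alice, "KNOWS", bob)
g.add_edge(alice, "LIKES", matrix)
```

In this sketch `g.out(alice, "KNOWS")` contains `bob`, while `g.out(bob, "KNOWS")` is empty, which is the asymmetry the slide refers to.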
18. Benefits of a Graph Database
• Express your domain as a Graph
− Domain Modeling Friendly
− No O/R mismatch
− Efficient storage of Semi-Structured Information
− Schemaless
• Express Queries as Traversals
− Fast deep traversal instead of slow SQL queries that span many table joins
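As a rough illustration of "queries as traversals", the Python sketch below finds everyone within two hops of a starting vertex using breadth-first search. The `KNOWS` data and the `within_depth` function are hypothetical and are not Neo4j's traversal API; in a relational schema, each additional hop would typically correspond to another self-join.

```python
from collections import deque

# Hypothetical "knows" relationships as an adjacency list.
KNOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": ["frank"],
    "erin": [],
    "frank": [],
}


def within_depth(graph, start, max_depth):
    """Breadth-first traversal: {vertex: hops} for vertices 1..max_depth hops from start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if depth[v] == max_depth:
            continue  # do not expand beyond the requested depth
        for n in graph.get(v, []):
            if n not in depth:  # first visit is the shortest path in BFS
                depth[n] = depth[v] + 1
                queue.append(n)
    del depth[start]  # the start vertex is not part of the result
    return depth


friends_of_friends = within_depth(KNOWS, "alice", 2)
```

Increasing `max_depth` deepens the traversal without changing the shape of the query, which is the point the slide makes against multi-join SQL.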
23. Why Neo4j?
• Widely deployed graph db in the world
• ACID, persistent, embedded/server
• Robust: 24/7 production since 2003
• Mature: lots of production deployments
• Scalable: High Availability, Master failover
• Community: ecosystem of tools, bindings, frameworks
• Product: OSGi, Spatial, RDF, languages
• Available under AGPLv3 and as commercial product
• But the first one is free! For ALL use-cases