This document provides an overview of graph databases and Neo4j. It discusses how graph databases are better suited than relational databases for interconnected data and have simpler data models. Neo4j is highlighted as a graph database that uses nodes, edges and properties to represent data and uses the Cypher query language. It is fully ACID compliant, open source, and has a large active community.
4. Key-Value Stores
Most are based on Dynamo: Amazon's Highly Available Key-Value Store
Data Model:
Global key-value mapping
Big scalable Hash Map
Highly fault tolerant (typically)
Examples:
Redis, Riak, Voldemort
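The "big scalable hash map" idea can be sketched in a few lines of Python. This is only a toy under stated assumptions: the class name `ShardedKeyValueStore`, the shard count, and the use of an MD5-based modulo to pick a shard are all hypothetical choices for illustration. Dynamo-style systems add replication and more careful partitioning, which is where the fault tolerance comes from; none of that is modeled here.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store: one global key -> value mapping split over shards."""

    def __init__(self, shard_count=3):
        # Each dict stands in for one server holding part of the hash map.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash of the key decides which shard owns it.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKeyValueStore()
store.put("user:1", {"name": "Alice"})
store.put("user:2", {"name": "Bob"})
print(store.get("user:1"))
```

The caller only ever sees `put` and `get` on a key; which shard holds the value is an internal detail.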
5. Pros & Cons
Pros:
Simple data model
Scalable
Cons:
Create your own “foreign keys”
Poor for complex data
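The "create your own foreign keys" point can be illustrated with a small sketch. A plain Python dict stands in for the store here, and the key names (`user:1`, `order:10`) and the `order_keys` field are hypothetical. The store has no idea that one value refers to another, so the application follows and validates the references itself.

```python
# A plain dict stands in for the key-value store.
store = {
    "user:1": {"name": "Alice", "order_keys": ["order:10", "order:11"]},
    "order:10": {"total": 25.0},
    "order:11": {"total": 40.0},
}


def orders_for_user(store, user_key):
    """Follow the hand-maintained 'foreign keys' one lookup at a time."""
    user = store[user_key]
    # The store does not know these strings are references, so a dangling
    # key is only noticed (and skipped) here, in application code.
    return [store[key] for key in user["order_keys"] if key in store]


total = sum(order["total"] for order in orders_for_user(store, "user:1"))
print(total)
```

Every extra hop between records costs another lookup written by hand, which is one reason this model gets awkward for complex data.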
6. Column Family
Most are based on Bigtable: Google’s Distributed Storage System for Structured Data
Data Model:
A big table, with column families
MapReduce for querying/processing
Examples:
HBase, HyperTable, Cassandra
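One way to picture "a big table, with column families" is as nested maps: row key, then column family, then column name. The sketch below assumes that simplified shape; the class `ColumnFamilyTable` and its methods are hypothetical and do not correspond to the API of HBase, Hypertable, or Cassandra.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy table: row key -> column family -> column name -> value."""

    def __init__(self, families):
        # Column families are fixed up front; columns inside them are not.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        row = self.rows.get(row_key)
        if row is None:
            return {}
        # Return a copy so callers cannot mutate the stored row.
        return dict(row.get(family, {}))


table = ColumnFamilyTable(["profile", "activity"])
table.put("user1", "profile", "name", "Alice")
table.put("user1", "profile", "email", "alice@example.com")
table.put("user2", "profile", "name", "Bob")  # no email column for this row
print(table.get_family("user1", "profile"))
```

Rows in the same family can carry different columns, which is the sense in which the model accommodates semi-structured data.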
7. Pros & Cons
Pros:
Supports Semi-Structured Data
Naturally Indexed (columns)
Scalable
Cons:
Poor for interconnected data
9. Pros & Cons
Pros:
Simple, powerful data model
Scalable
Cons:
Poor for interconnected data
Query model limited to keys and indexes
Map reduce for larger queries
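When a query cannot be answered by key or index, these stores typically fall back to a map step and a reduce step over the records. The following is a minimal in-memory sketch of that pattern; the helper `map_reduce`, the sample `orders` records, and the two callback functions are hypothetical, and a real system would distribute both phases across machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


orders = [
    {"customer": "alice", "total": 25.0},
    {"customer": "bob", "total": 10.0},
    {"customer": "alice", "total": 40.0},
]


def emit_total(order):
    # Map step: emit one (customer, total) pair per order.
    yield order["customer"], order["total"]


def sum_totals(customer, totals):
    # Reduce step: combine all totals emitted for one customer.
    return sum(totals)


totals = map_reduce(orders, emit_total, sum_totals)
print(totals)
```

Even a simple aggregate like "total per customer" touches every record, which is why this route is reserved for larger queries.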
13. Graph Database definition
A graph database uses a graph structure with nodes, edges
and properties to represent and store data.
By definition, a graph database is any storage system that
provides index-free adjacency. This means that every
element contains a direct pointer to its adjacent element
and no index lookups are necessary.
Graph databases focus on the interconnection between
entities.
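Index-free adjacency can be sketched with ordinary object references: each node keeps direct pointers to its neighbors, so a traversal follows references rather than consulting a separate index. The `Node` class and `reachable_names` helper below are hypothetical names for illustration, not how any particular graph database lays out its storage.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references to adjacent nodes

    def connect(self, other):
        # Store the pointer on both sides so either node can reach the other.
        self.neighbors.append(other)
        other.neighbors.append(self)


def reachable_names(start):
    """Breadth-first traversal that only ever follows neighbor pointers."""
    seen = {start}
    queue = deque([start])
    names = []
    while queue:
        node = queue.popleft()
        names.append(node.name)
        for neighbor in node.neighbors:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return names


alice, bob, carol, dave = Node("Alice"), Node("Bob"), Node("Carol"), Node("Dave")
alice.connect(bob)
bob.connect(carol)
print(reachable_names(alice))
```

The cost of each hop depends on how many neighbors a node has, not on how many nodes exist in total.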
14. Compared with RDBMS
Graph databases are often faster for associative data sets
Map more directly to the structure of object-oriented
applications
Scale more naturally to large data sets as they do not typically
require expensive join operations.
As they depend less on a rigid schema, they are more suitable
to manage ad-hoc and changing data with evolving schemas.
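The join-versus-traversal contrast can be sketched on a "friends of friends" question. The data, the `friendships` pair list, and both function names are hypothetical. The nested loop only illustrates the shape of a self-join; a real relational engine would use indexes and a query planner rather than scanning like this.

```python
people = {1: "Alice", 2: "Bob", 3: "Carol", 4: "Dave"}

# Relational style: friendships live in a separate join table of id pairs.
friendships = [(1, 2), (2, 3), (3, 4)]


def friends_of_friends_join(person_id):
    """Self-join of the friendships table, written as a nested scan."""
    result = set()
    for a, b in friendships:          # first copy of the table
        if a != person_id:
            continue
        for c, d in friendships:      # second copy (the join)
            if c == b and d != person_id:
                result.add(people[d])
    return result


# Graph style: each person already holds its outgoing neighbors.
adjacency = {1: [2], 2: [3], 3: [4], 4: []}


def friends_of_friends_graph(person_id):
    """Two hops along stored adjacency lists, with no table scan."""
    return {
        people[fof]
        for friend in adjacency[person_id]
        for fof in adjacency[friend]
        if fof != person_id
    }


print(friends_of_friends_join(1), friends_of_friends_graph(1))
```

Both functions return the same answer; the difference is that the graph version only visits the records it actually needs.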
18. Edges
Edges are the lines that connect nodes to nodes or nodes to
properties, and they represent the relationship between the
two.
Most of the important information is really stored in the
edges.
Meaningful patterns emerge when one examines the
connections and interconnections of nodes, properties and
edges.
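A small property-graph sketch shows how the interesting facts sit on the edges. The `Node` and `Edge` dataclasses, the relationship types `KNOWS` and `WORKS_AT`, and the sample data are all hypothetical; the point is only that a pattern (who works together) emerges from examining edges rather than nodes alone.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: Node
    target: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


alice = Node("Alice", {"age": 34})
bob = Node("Bob", {"age": 41})
acme = Node("Acme", {"kind": "company"})

edges = [
    Edge(alice, bob, "KNOWS", {"since": 2010}),
    Edge(alice, acme, "WORKS_AT", {"role": "engineer"}),
    Edge(bob, acme, "WORKS_AT", {"role": "analyst"}),
]


def coworkers(edges):
    """Group people by employer using only the WORKS_AT edges."""
    by_employer = {}
    for edge in edges:
        if edge.rel_type == "WORKS_AT":
            by_employer.setdefault(edge.target.name, []).append(edge.source.name)
    # Keep only employers that connect at least two people.
    return {
        employer: sorted(names)
        for employer, names in by_employer.items()
        if len(names) > 1
    }


print(coworkers(edges))
```

Nothing on the Alice or Bob nodes says they are colleagues; that fact exists only in the two edges pointing at the same company.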
21. What is Neo4j?
• A Graph Database
• Property Graph
• Full ACID (atomicity, consistency, isolation, durability)
• High Availability (with Enterprise Edition)
• Capacity of up to 32 Billion Nodes, 32 Billion Relationships,
64 Billion Properties
• Embedded Server
• REST API
22. Key Features
• Runs on major platforms: Mac | Windows | Unix
• Extensive documentation
• Active community
• Open Source
23. CYPHER
Cypher is a declarative graph query language that allows for
expressive and efficient querying and updating of the graph
store without having to write traversal through the graph
structure in code.
24. CYPHER
START: Starting points in the graph, obtained via index lookups or by element IDs.
MATCH: The graph pattern to match, bound to the starting points in START.
WHERE: Filtering criteria.
RETURN: What to return.
CREATE: Creates nodes and relationships.
DELETE: Removes nodes, relationships and properties.
SET: Sets values on properties.
FOREACH: Performs updating actions once per element in a list.
WITH: Divides a query into multiple, distinct parts.
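To make the read clauses concrete, the sketch below pairs an illustrative Cypher query with a hand-written Python equivalent over a tiny in-memory graph. Several things are assumptions: the `Person` label, the `KNOWS` relationship, the sample data, and the helper `run_equivalent` are hypothetical, and the query string uses the later pattern syntax in which MATCH locates its own starting node instead of a separate START clause. The query is not executed here; it is shown only for comparison with the imperative code.

```python
# Illustrative Cypher, kept as a string for comparison only (not executed).
QUERY = """
MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person)
WHERE b.age > 30
RETURN b.name
"""

# Hypothetical in-memory data standing in for the graph store.
people = {"Alice": {"age": 34}, "Bob": {"age": 41}, "Carol": {"age": 27}}
knows = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol")]


def run_equivalent(start_name, min_age):
    """Hand-coded traversal doing what the declarative query describes."""
    # MATCH: follow outgoing KNOWS relationships from the starting node.
    matched = [b for a, b in knows if a == start_name]
    # WHERE: keep only the matches that satisfy the filter.
    filtered = [b for b in matched if people[b]["age"] > min_age]
    # RETURN: project the value the caller asked for (the name).
    return filtered


print(run_equivalent("Alice", 30))
```

The declarative form states the pattern and leaves the traversal to the database, while the Python version has to spell out each step.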
Editor's notes
Dynamo is a set of techniques.
Fault tolerant: the system can continue operating after some of its components fail.