Neo4j is a graph database. It is an embedded, disk-based, fully transactional Java persistence engine that stores data structured in graphs rather than in tables. A graph (mathematical lingo for a network) is a flexible data structure that allows a more agile and rapid style of development.
2. We all know the relational model.
It has been predominant for a long time.
Attendees
username  fullname         registration  speaker  payment
mtiberg   Michael Tiberg   null          no       0
thobe     Tobias Ivarsson  2010-04-07    yes      0
joe       John Doe         2010-02-05    no       700
...       ...              ...           ...      ...
3. The relational model has a few problems, such as:
•poor support for sparse data
•modifying the data model is almost exclusively done through adding tables
Attendees
username  fullname         registration  speaker  payment
mtiberg   Michael Tiberg   null          no       0
thobe     Tobias Ivarsson  2010-04-07    yes      0
joe       John Doe         2010-02-05    no       700
...       ...              ...           ...      ...
Location
username  latitude       longitude       title          publish
thobe     55°36'47.70"N  12°58'34.50"E   Malmö          yes
joe       37°49'36.00"N  122°25'22.00"W  San Francisco  no
...       ...            ...             ...            ...
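The sparse-data point can be pictured with a small sketch. It assumes a property-style record where a missing value is simply left out, and where extra attributes are merged into the same record rather than placed in a new table. The helper name `to_properties` and the dictionary layout are hypothetical; only the sample values come from the tables above.

```python
# Rows from the Attendees table; None stands in for the "null" cell.
attendees = [
    {"username": "mtiberg", "fullname": "Michael Tiberg",
     "registration": None, "speaker": "no", "payment": 0},
    {"username": "thobe", "fullname": "Tobias Ivarsson",
     "registration": "2010-04-07", "speaker": "yes", "payment": 0},
    {"username": "joe", "fullname": "John Doe",
     "registration": "2010-02-05", "speaker": "no", "payment": 700},
]

# Rows from the Location table, keyed by username.
locations = {
    "thobe": {"latitude": "55°36'47.70\"N", "longitude": "12°58'34.50\"E",
              "title": "Malmö", "publish": "yes"},
    "joe": {"latitude": "37°49'36.00\"N", "longitude": "122°25'22.00\"W",
            "title": "San Francisco", "publish": "no"},
}


def to_properties(row, extra=None):
    """Build a property record: drop missing values, merge optional extras."""
    properties = {key: value for key, value in row.items() if value is not None}
    properties.update(extra or {})
    return properties


# One record per attendee; location attributes appear only where they exist.
nodes = {
    row["username"]: to_properties(row, locations.get(row["username"]))
    for row in attendees
}
```

In this sketch `nodes["mtiberg"]` has no `registration` or `latitude` key at all, while `nodes["thobe"]` carries the location attributes alongside the attendee ones.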
4. More complication...
After a while, modeling complex relationships leads to complicated schemas.
Attendees: username, fullname, registration, speaker, payment
Sessions: id, title, time, room, ...
Session attendance: session, user
Location: username, latitude, longitude, title, publish
5. Most focus on scaling to large numbers
Most of the emerging database technologies are concerned with scaling to huge amounts of data and massive load. They do so by making data opaque and distributing elements based on key.
[Figure: elements A, B, C, D, E, F, G]
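Key-based distribution of opaque data can be sketched as follows. This is a minimal illustration using plain hash-modulo placement, which is only one of several key-based schemes and is not claimed to be what any particular product does. The names `shard_for` and `ShardedStore` are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard index from the key alone, using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy key/value store: values are opaque bytes, placement depends on the key."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate machine or partition.
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=4)
store.put("thobe", b"opaque-bytes")
value = store.get("thobe")
```

Because the store never looks inside the value, it can spread elements across shards freely, but it also cannot follow connections between elements without another lookup by key.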
11. Scaling to size vs. Scaling to complexity
[Figure: Size axis vs. Complexity axis, with these categories placed in order: Key/Value stores, Bigtable clones, Document databases, Graph databases]
Billions of nodes and relationships
> 90% of use cases
18. The Property Graph data model
•Nodes
•Relationships between Nodes
•Relationships have Labels
•Relationships are directed, but traversed at equal speed in both directions
•The semantics of the direction is up to the application (LIVES WITH is symmetric, LOVES is not)
•Nodes have key-value properties
•Relationships have key-value properties
[Figure: example property graph]
Node: name: “Mary”, age: 35
Node: name: “James”, age: 32, twitter: “@spam”
Node: brand: “Volvo”, model: “V70”
Relationships: LOVES (shown twice), LIVES WITH, OWNS, DRIVES
Relationship property: item type: “car”
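The model described above can be sketched as a small in-memory structure. This is an illustration in Python, not Neo4j's API: the names `Node`, `Relationship`, `PropertyGraph`, `add_node`, `relate` and `neighbors` are hypothetical. The slide text does not say who OWNS or DRIVES the car, or which relationship carries `item type`, so those choices below are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality: two nodes may share properties
class Node:
    properties: dict = field(default_factory=dict)


@dataclass(eq=False)
class Relationship:
    start: Node
    end: Node
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = []
        self.relationships = []

    def add_node(self, properties=None):
        node = Node(dict(properties or {}))
        self.nodes.append(node)
        return node

    def relate(self, start, end, label, properties=None):
        """Create a directed, labeled relationship with optional properties."""
        rel = Relationship(start, end, label, dict(properties or {}))
        self.relationships.append(rel)
        return rel

    def neighbors(self, node, label, direction="both"):
        """Follow relationships with the given label: "out", "in" or "both".

        This sketch scans one list; a real engine would keep per-node adjacency.
        """
        found = []
        for rel in self.relationships:
            if rel.label != label:
                continue
            if direction in ("out", "both") and rel.start is node:
                found.append(rel.end)
            elif direction in ("in", "both") and rel.end is node:
                found.append(rel.start)
        return found


graph = PropertyGraph()
mary = graph.add_node({"name": "Mary", "age": 35})
james = graph.add_node({"name": "James", "age": 32, "twitter": "@spam"})
car = graph.add_node({"brand": "Volvo", "model": "V70"})

graph.relate(james, mary, "LOVES")
graph.relate(mary, james, "LOVES")       # stored separately: LOVES is one-way
graph.relate(james, mary, "LIVES WITH")  # stored once, read in both directions
owns = graph.relate(mary, car, "OWNS", {"item type": "car"})  # assumed owner
graph.relate(james, car, "DRIVES")                            # assumed driver
```

Asking for `graph.neighbors(mary, "LIVES WITH")` ignores direction and finds James, which is how an application can treat a stored one-way relationship as mutual.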
19. Graphs are whiteboard friendly
An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database. With a Graph Database the model from the whiteboard is implemented directly.
Image credits: Tobias Ivarsson
21. Graphs are whiteboard friendly
[Figure: the whiteboard model implemented as a graph; visible labels: thobe, Joe project blog, Wardrobe Strength, Hello Joe, Modularizing Jython, Neo4j performance analysis. Image credits: Tobias Ivarsson]
22. What is Neo4j?
๏ Neo4j is a Graph Database
• Non-relational (“#nosql”), transactional (ACID), embedded
• Data is stored as a Graph / Network
‣Nodes and Relationships with properties
‣“Property Graph” or “edge-labeled multidigraph”
๏ Neo4j is Open Source / Free (as in speech) Software
• AGPLv3
• Commercial (“dual license”) license available
‣Free (as in beer) for first server installation
‣Inexpensive (as in startup-friendly) when you grow
Prices are available at http://neotechnology.com/
Contact us if you have questions and/or special license needs (e.g. if you want an evaluation license)
23. More about Neo4j
๏ Neo4j is stable
• In 24/7 operation since 2003
๏ Neo4j is in active development
• Neo Technology received VC funding October 2009
๏ Neo4j delivers high performance graph operations
• traverses 1’000’000+ relationships / second on commodity hardware
25. Path exists in social network
๏ Each person has on average 50 friends
[Figure: social network with persons Tobias, Emil, Johan, Peter]
Database              # persons  query time
Relational database   1 000      2 000 ms
Neo4j Graph Database  1 000      2 ms
Neo4j Graph Database  1 000 000  2 ms
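A "path exists" check can be sketched as a breadth-first search over friend lists. This is a generic illustration, not the query used for the measurements above. The function names, the `max_depth` parameter and the friendships between the four named persons are all assumptions made for the example.

```python
from collections import deque


def befriend(friends, a, b):
    """Record a mutual friendship in an adjacency dictionary."""
    friends.setdefault(a, set()).add(b)
    friends.setdefault(b, set()).add(a)


def path_exists(friends, start, goal, max_depth=4):
    """Breadth-first search: is goal reachable from start within max_depth hops?"""
    if start == goal:
        return True
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == goal:
                return True
            if friend not in visited:
                visited.add(friend)
                queue.append((friend, depth + 1))
    return False


# Assumed friendships: Tobias - Emil - Johan - Peter (a chain of three hops).
friends = {}
befriend(friends, "Tobias", "Emil")
befriend(friends, "Emil", "Johan")
befriend(friends, "Johan", "Peter")

connected = path_exists(friends, "Tobias", "Peter")
too_far = path_exists(friends, "Tobias", "Peter", max_depth=2)
```

The search only touches persons reachable within the depth limit, so its cost follows the size of the explored neighborhood rather than the total number of persons stored.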