Neo4j and Python: Playing with graph data.
This talk introduces graphs, why they are useful, and how to use the Neo4j graph database efficiently from Python, with the help of py2neo, for some day-to-day applications.
Follow the development of the talk at http://www.sonalraj.com/pycon14.html
Check out the details at the PyCon India 2014 website: http://in.pycon.org/funnel/2014/252-neo4j-and-python-playing-with-graph-data
Neo4j and Python: Playing with graph data - PyCon India 2014 Talk.
1. PyCon India 2014• • created by Sonal Raj •
Neo4j and Python
Playing with graph data
Graph Everything
Sonal Raj
2. PyCon India 2014• • created by Sonal Raj •
The Plan for today
• Step One: Graphs and NOSQL
• Step Two: Neo4j and Cypher
• Step Three: Py2neo and REST
• Step Four: Use Cases
4. PyCon India 2014• • created by Sonal Raj •
Once upon a time..
• Relational databases ruled the earth . .
• Data was stored in tables, rows and columns
• Connections were made using primary keys and foreign keys . .
• That’s all that is relational about them
• No on-the-fly structural (schema) changes
• Horrible for interconnected data (joins, really?)
14. PyCon India 2014• • created by Sonal Raj •
In the NOSQL Space
Elastic scaling – Scale out, not up
Big data, Transaction Friendly
Economical, can run on commodity hardware
End of the DBA rule
Flexible Data models
20. PyCon India 2014• • created by Sonal Raj •
Some Graphs
we overlook . .
23. PyCon India 2014• • created by Sonal Raj •
Apart from that
• Fraud Analyses
• Investment securities & debt analysis
• Recommendation Engines
• Impact Analysis in networks
25. PyCon India 2014• • created by Sonal Raj •
So, Why Graphs ?
• Increasing Connectivity of Data
• Increasing Semi-Structuredness
• Rising Complexity
Seven Bridges of Königsberg
Leonhard Euler in 1735
30. PyCon India 2014• • created by Sonal Raj •
Property Graphs
- Has nodes
- Has properties for each node
- Has relationships
- Has properties for each relationship
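The four ingredients above can be sketched in plain Python. This is only an illustrative in-memory model, not how Neo4j stores data; the class names Node and Relationship and the sample data are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carrying arbitrary key/value properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed link between two nodes, with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data, for illustration only
alice = Node({"name": "Alice"})
bob = Node({"name": "Bob"})
knows = Relationship(alice, "KNOWS", bob, {"since": 2014})

print(knows.start.properties["name"], knows.type, knows.end.properties["name"])
```

Both nodes and relationships hold a property dictionary, which is the feature that separates a property graph from a plain node-and-edge graph.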
31. PyCon India 2014• • created by Sonal Raj •
Building Blocks of a Graph Database
• Nodes
• Relationships
• Labels
• Properties
34. PyCon India 2014• • created by Sonal Raj •
Data Models
Native Graphs: inherently store data as nodes and relationships.
The Other ones . . . : data stored in tables, with joins and aggregates used to simulate a graph.
40. PyCon India 2014• • created by Sonal Raj •
Why Neo4j ?
Schema-less property graph
Handles complex connected data efficiently
Fully ACID Transactions
Highly Scalable, High Availability Clusters
REST API for servers. Can be embedded in applications on the JVM.
Cypher: a declarative query language
Graph DB with good native Python bindings . .
41. PyCon India 2014• • created by Sonal Raj •
Cypher in action
• Highly expressive query language
• Cares about ‘what’ rather than ‘how’ to retrieve from the graph.
• Uses pattern matching expressions.
• To make life easy for some, it is inspired by SQL.
A pattern for two nodes, 1 and 2, connected by a relationship of type label:
(1)-[:label]-(2)
The same pattern as a query (legacy START syntax, looking the nodes up by id):
START n=node(1), m=node(2)
MATCH (n)-[r:label]-(m)
RETURN r
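To make the "what, not how" idea concrete, here is a minimal pure-Python sketch of what an undirected pattern such as (n)-[r:label]-(m) asks for. The tuple representation, the sample data and the function name match_pattern are all hypothetical; Neo4j's own matcher works very differently internally.

```python
# Each relationship is a (start_id, type, end_id) tuple; hypothetical sample data.
relationships = [
    (1, "label", 2),
    (2, "other", 3),
    (3, "label", 1),
]


def match_pattern(rels, n, rel_type, m):
    """Return relationships of rel_type that connect n and m in either direction."""
    return [
        r for r in rels
        if r[1] == rel_type and {r[0], r[2]} == {n, m}
    ]


print(match_pattern(relationships, 1, "label", 2))
```

The caller only describes the shape it wants (two node ids and a relationship type); how the candidates are scanned is left to the matcher, which is the point of a declarative language.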
44. PyCon India 2014• • created by Sonal Raj •
Cypher in action
Create
CREATE (n:Person { name : 'Chuck Norris', title : 'Analyst' })
RETURN n
MATCH (a:Person),(b:Person)
WHERE a.name = 'Chuck' AND b.name = 'Rajani'
CREATE (a)-[r:RELTYPE { name : 'cannot_find' }]->(b)
RETURN r
Read
MATCH (n) RETURN n // everything is returned
MATCH (n:Label) RETURN n // all nodes with a specific label
MATCH (Titanic { title:'Titanic' })<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person
45. PyCon India 2014• • created by Sonal Raj •
Cypher in action
Update
MATCH (n { name: 'Andres' })
SET n.surname = 'Taylor'
RETURN n
MATCH (peter { name: 'Peter' })
SET peter += { hungry: TRUE , position: 'Entrepreneur' }
Delete
MATCH (n { name: 'Peter' })
REMOVE n.title
REMOVE n:German
RETURN n
SET n.name = NULL // setting a property to NULL also removes it
46. PyCon India 2014• • created by Sonal Raj •
REST in peace !!
Create
POST http://localhost:7474/db/data/node
{
  "foo" : "bar"
}
POST http://localhost:7474/db/data/node/1/relationships
{
  "to" : "http://localhost:7474/db/data/node/10",
  "type" : "LOVES",
  "data" : {
    "foo" : "bar"
  }
}
POST http://localhost:7474/db/data/schema/index/person
{
  "property_keys" : [ "name" ]
}
47. PyCon India 2014• • created by Sonal Raj •
REST in peace !!
Read
GET http://localhost:7474/db/data/node/144
GET http://localhost:7474/db/data/relationship/65
GET http://localhost:7474/db/data/relationship/61/properties
GET http://localhost:7474/db/data/schema/index/user
Update
PUT http://localhost:7474/db/data/relationship/66/properties
{
  "happy" : false
}
PUT http://localhost:7474/db/data/relationship/60/properties/cost
"deadly"
Delete
DELETE http://localhost:7474/db/data/node/308
DELETE http://localhost:7474/db/data/relationship/58
DELETE http://localhost:7474/db/data/schema/index/SomeLabel/name
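The REST calls above can also be prepared from Python's standard library. The sketch below only builds the requests and does not send them, so it runs without a server; the helper name build_request is hypothetical, and the base URL is the one used on the slides.

```python
import json
from urllib.request import Request

BASE = "http://localhost:7474/db/data"  # server URL as used on the slides


def build_request(method, path, payload=None):
    """Prepare (but do not send) a JSON request for the REST API."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    return Request(BASE + path, data=data, headers=headers, method=method)


create_node = build_request("POST", "/node", {"foo": "bar"})
read_node = build_request("GET", "/node/144")
delete_node = build_request("DELETE", "/node/308")

# To actually send a request against a running server:
#   from urllib.request import urlopen
#   with urlopen(create_node) as response:
#       print(json.load(response))
print(create_node.get_method(), create_node.full_url)
```

Separating request construction from sending makes it easy to inspect the method, URL and body before touching a live database.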
50. PyCon India 2014• • created by Sonal Raj •
For the pythonistas
As simple as that!
from py2neo import neo4j, node, rel
graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")
# Integers inside rel() refer to the position of a node in this create() call
die_hard = graph_db.create(
    node(name="Bruce Willis"),
    node(name="John McClane"),
    node(name="Alan Rickman"),
    node(name="Hans Gruber"),
    node(name="Nakatomi Plaza"),
    rel(0, "PLAYS", 1),
    rel(2, "PLAYS", 3),
    rel(1, "VISITS", 4),
    rel(3, "STEALS_FROM", 4),
    rel(1, "KILLS", 3),
)
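The integers passed to rel(...) point at nodes by their position in the same create() call. The pure-Python sketch below does not use py2neo at all; it only illustrates how such positional references resolve to names. The function name resolve and the tuple layout are made up for this illustration.

```python
# Node names in the order they appear in the create() call
nodes = ["Bruce Willis", "John McClane", "Alan Rickman", "Hans Gruber", "Nakatomi Plaza"]

# (start position, relationship type, end position)
rels = [
    (0, "PLAYS", 1),
    (2, "PLAYS", 3),
    (1, "VISITS", 4),
    (3, "STEALS_FROM", 4),
    (1, "KILLS", 3),
]


def resolve(nodes, rels):
    """Replace positional references with the node names they point to."""
    return [(nodes[start], rel_type, nodes[end]) for start, rel_type, end in rels]


for start, rel_type, end in resolve(nodes, rels):
    print(f"{start} -[{rel_type}]-> {end}")
```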
51. PyCon India 2014• • created by Sonal Raj •
For the pythonistas
graphdb
• clear()
• create(*abstracts)
• delete(*entities)
• delete_index(content_type, index_name)
• find(label, property_key=None, property_value=None)
• get_index(content_type, index_name)
• get_indexed_node(index_name, key, value)
• ...
52. PyCon India 2014• • created by Sonal Raj •
For the pythonistas
graphdb (continued)
• get_indexed_relationship(index_name, key, value)
• get_properties(*entities)
• match(start_node=None, rel_type=None, end_node=None, bidirectional=False, limit=None)
• match_one(start_node=None, rel_type=None, end_node=None, bidirectional=False)
• node(id_)
• get_or_create_index(content_type, index_name, config=None)
• get_or_create_indexed_node(index_name, key, value, properties=None)
53. PyCon India 2014• • created by Sonal Raj •
Complexity Handling
“A graph database without traversals is just a persistent graph”
54. PyCon India 2014• • created by Sonal Raj •
Paths with py2neo
# Create paths
from py2neo import neo4j, node
a, b, c = node(name="Alice"), node(name="Bob"), node(name="Carol")
abc = neo4j.Path(a, "KNOWS", b, "KNOWS", c)
d, e = node(name="Doctor"), node(name="Easter")
de = neo4j.Path(d, "KNOWS", e)
# Join paths
abcde = neo4j.Path.join(abc, "KNOWS", de)
# Commit to the db (graph_db is a GraphDatabaseService instance)
abcde.get_or_create(graph_db)
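What joining two paths means can be shown without a database. In this sketch a path is just a list that alternates node names and relationship types; that representation and the function name join are assumptions for illustration, not py2neo's internal model.

```python
def join(left, rel_type, right):
    """Connect the last node of left to the first node of right with rel_type."""
    return left + [rel_type] + right


abc = ["Alice", "KNOWS", "Bob", "KNOWS", "Carol"]
de = ["Doctor", "KNOWS", "Easter"]
abcde = join(abc, "KNOWS", de)

path_nodes = abcde[0::2]  # every even position holds a node
path_rels = abcde[1::2]   # every odd position holds a relationship type

print(path_nodes)
print(path_rels)
```

A path with five nodes always has four relationships, which is why the joining relationship has to be supplied explicitly.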
55. PyCon India 2014• • created by Sonal Raj •
Schema, Indices with py2neo
# The classes
py2neo.neo4j.Schema
py2neo.neo4j.Index
# Schema methods
create_index(label, property_key)
drop_index(label, property_key)
get_indexed_property_keys(label)
# Index method
add_if_none(key, value, entity)
# Apache Lucene query
people = graph_db.get_or_create_index(neo4j.Node, "People")
s_people = people.query("family_name:S*")
59. PyCon India 2014• • created by Sonal Raj •
Cypher with py2neo
# Create transaction object
from py2neo import cypher
session = cypher.Session("http://localhost:7474/")
tx = session.create_transaction()
# Add statements, then execute or commit
tx.append("some cypher query")
tx.append("some cypher query")
tx.execute()
tx.append("some cypher query")
tx.commit()
# The classical way
from py2neo import neo4j
graph_db = neo4j.GraphDatabaseService()
query = neo4j.CypherQuery(graph_db, "your cypher query")
query.execute()
# query.stream()
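A transaction groups several statements so that they succeed or fail together. As a rough sketch of what travels over the wire, the snippet below builds the JSON body accepted by the Neo4j 2.x transactional HTTP endpoint (for example /db/data/transaction/commit). The helper name transaction_payload and the sample queries are hypothetical, and nothing is sent to a server.

```python
import json


def transaction_payload(*statements):
    """Bundle (query, parameters) pairs into one transactional request body."""
    return {
        "statements": [
            {"statement": query, "parameters": params}
            for query, params in statements
        ]
    }


payload = transaction_payload(
    ("CREATE (n:Person {name: {name}}) RETURN n", {"name": "Alice"}),
    ("MATCH (n:Person) RETURN count(n)", {}),
)

print(json.dumps(payload, indent=2))
```

Passing values as parameters, rather than formatting them into the query string, keeps the statements reusable and avoids quoting problems.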
60. PyCon India 2014• • created by Sonal Raj •
Command Line neotool
# Syntax of operation
neotool [<option>] <command> <args>
# or
python -m py2neo.tool ..
# Some serious examples
neotool clear
neotool cypher "start n=node(1) return n, n.name?"
neotool cypher-csv "start n=node(1) return n.name, n.age?"
neotool cypher-tsv "start n=node(1) return n.name, n.age?"
# Guess what, you can also access the shell
neotool shell
62. PyCon India 2014• • created by Sonal Raj •
Neo4j level 2
• Batch Inserter
• High Availability
• Built-in online backup tools
• HTTPS support
Neo4j Framework
• GraphUnit, for unit testing Neo4j
• Libraries for performance and API testing
• Batch Transaction tools
• Transaction Event tools
• Some other utilities . .