Presentation given at the national PHP conference in Kielce, Poland, in October 2011. It introduces graph databases to PHP developers and takes a practical look at OrientDB.
26-34. Recommendations
[Diagram, animated over nine slides: a graph with the nodes John, Rome, Milan, Mr Bean, Se7en, Fun, Thriller, Cinema A, Cinema B and Cinema C, connected by edges labelled "lives in", "likes", "type", "shows" and "location". From slide 27 onwards, nodes and edges are marked step by step with ✓ or x.]
Sunday, 23 October 2011
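The traversal animated in the Recommendations slides can be sketched in plain PHP. This is only a minimal sketch: the node names and edge labels come from the slides, but which cinema shows which film and where each cinema is located is assumed here for illustration. The helper names (`outgoing`, `incoming`, `recommendCinemas`) and the edge-list layout are hypothetical and are not part of any OrientDB API.

```php
<?php
// Hypothetical edge list: [source, label, target].
// Names and labels come from the slides; the exact wiring is an assumption.
$edges = [
    ['John', 'lives in', 'Rome'],
    ['John', 'likes', 'Mr Bean'],
    ['Mr Bean', 'type', 'Fun'],
    ['Se7en', 'type', 'Thriller'],
    ['Cinema A', 'shows', 'Mr Bean'],
    ['Cinema B', 'shows', 'Mr Bean'],
    ['Cinema C', 'shows', 'Se7en'],
    ['Cinema A', 'location', 'Rome'],
    ['Cinema B', 'location', 'Milan'],
    ['Cinema C', 'location', 'Milan'],
];

// Targets of the edges with the given label that leave $node.
function outgoing(array $edges, string $node, string $label): array
{
    $targets = [];
    foreach ($edges as [$from, $edgeLabel, $to]) {
        if ($from === $node && $edgeLabel === $label) {
            $targets[] = $to;
        }
    }
    return $targets;
}

// Sources of the edges with the given label that arrive at $node.
function incoming(array $edges, string $node, string $label): array
{
    $sources = [];
    foreach ($edges as [$from, $edgeLabel, $to]) {
        if ($to === $node && $edgeLabel === $label) {
            $sources[] = $from;
        }
    }
    return $sources;
}

// Cinemas located where the person lives that show a film the person likes.
// Returns an array of cinema => film.
function recommendCinemas(array $edges, string $person): array
{
    $cities = outgoing($edges, $person, 'lives in');
    $result = [];
    foreach (outgoing($edges, $person, 'likes') as $film) {
        foreach (incoming($edges, $film, 'shows') as $cinema) {
            // Keep the cinema only if it is located in one of the person's cities.
            if (array_intersect(outgoing($edges, $cinema, 'location'), $cities)) {
                $result[$cinema] = $film;
            }
        }
    }
    return $result;
}

print_r(recommendCinemas($edges, 'John'));
```

With the wiring assumed above, only Cinema A is returned for John: Cinema B shows a film he likes but is in Milan, and Cinema C shows a film he has not liked. The linear scans in `outgoing` and `incoming` are for brevity only; a graph database would follow the edges stored on each node instead.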
44-45. Given your dataset, organize some clusters
Are there nodes that cannot belong to any cluster?
They probably have some properties that differ from the average.
ACHTUNG!
TERRORISTEN!
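One simple way to look for nodes whose properties differ from the average is a z-score check. The sketch below is an illustration, not the method used in the talk: it assumes each node carries a single numeric property (a hypothetical `degree`, the number of connections) and flags values more than two standard deviations from the mean. The function name `findOutliers` and the threshold are assumptions.

```php
<?php
// Return the keys whose value lies more than $threshold standard deviations
// away from the mean (population standard deviation).
function findOutliers(array $values, float $threshold = 2.0): array
{
    $n = count($values);
    if ($n === 0) {
        return [];
    }
    $mean = array_sum($values) / $n;

    $squaredDiffs = 0.0;
    foreach ($values as $value) {
        $squaredDiffs += ($value - $mean) ** 2;
    }
    $stdDev = sqrt($squaredDiffs / $n);
    if ($stdDev == 0.0) {
        return []; // all values are identical, so nothing stands out
    }

    $outliers = [];
    foreach ($values as $key => $value) {
        if (abs($value - $mean) / $stdDev > $threshold) {
            $outliers[] = $key;
        }
    }
    return $outliers;
}

// Hypothetical data: number of connections per node.
$degree = ['a' => 3, 'b' => 4, 'c' => 3, 'd' => 5, 'e' => 4, 'f' => 40];
print_r(findOutliers($degree));
```

Here node `f` is the only one flagged. A real clustering step would work on several properties at once; a single-property z-score is just the smallest example of "different from the average".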
46. But... why GraphDB?
47-48. Representing a Graph in:
✓ Relational Database (MySQL, Oracle)
✓ Document Oriented DB (MongoDB, CouchDB)
✓ XML Database (MarkLogic, eXist-db)
http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
49. Where is the difference?
50. GraphDB
A graph database is any storage system that provides index-free adjacency.
http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
51. Step by step example
Given a list of people, find their homepages
53-55. Tree-based DB WAY
1. David Funaro
2. put in the Search Engine
3. find http://davidfunaro.com
The cost to find a single friend's HP (homepage) grows as the friends' HP table grows.
56-57. GraphDB WAY
It is as if the GraphDB held an additional piece of information (the anchor <a>).
1. get the embedded information (index): www.odino.org
58. GraphDB WAY
The anchor works as a local index to reach the document = index-free adjacency
<a href="http://odino.org">
Alessandro Nadalin
</a>
59-60. Local cost
The local cost is O(k) = constant
62. Local cost
Thus, as the graph grows in size, the cost of a local step remains the same.
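The contrast between the two ways can be sketched in plain PHP. This is a minimal sketch under stated assumptions: the table layout, the `Node` class and the function `findHomepage` are hypothetical, and a binary search over a sorted array stands in for the index a tree-based database would use. Only the names and URLs come from the slides.

```php
<?php
// Tree-based way: homepages live in a separate table sorted by name, and
// every lookup searches it. The number of comparisons grows with the table.
function findHomepage(array $sortedTable, string $name): ?string
{
    $low = 0;
    $high = count($sortedTable) - 1;
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        $cmp = strcmp($sortedTable[$mid]['name'], $name);
        if ($cmp === 0) {
            return $sortedTable[$mid]['homepage'];
        }
        if ($cmp < 0) {
            $low = $mid + 1;
        } else {
            $high = $mid - 1;
        }
    }
    return null;
}

// Graph way: each node stores direct references to its adjacent nodes,
// like the <a> anchor embedded in a page.
class Node
{
    public $value;
    public $adjacent = []; // edge label => Node

    public function __construct($value)
    {
        $this->value = $value;
    }
}

$table = [
    ['name' => 'Alessandro Nadalin', 'homepage' => 'http://odino.org'],
    ['name' => 'David Funaro', 'homepage' => 'http://davidfunaro.com'],
];
echo findHomepage($table, 'David Funaro'), PHP_EOL;

$person = new Node('Alessandro Nadalin');
$person->adjacent['homepage'] = new Node('http://odino.org');
// One hop along the stored reference; no global table is consulted.
echo $person->adjacent['homepage']->value, PHP_EOL;
```

The table lookup needs more comparisons as more people are added, while the hop from a person to its homepage touches only that person's own references, however many other nodes exist.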
63. Any database can implicitly represent a graph, BUT only a graph database makes the graph structure explicit.
64. Benchmark
Depth  RDBMS     Graph
1      100ms     30ms
2      1000ms    500ms
3      10000ms   3000ms
4      100000ms  50000ms
5      N/A       100000ms
• 1 million vertices
• 4 million edges
• Scale-free topology
• Postgres vs Neo4j
• Both Hash and BTree
http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/
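To make "depth" in the benchmark concrete, a traversal to depth n follows every outgoing edge n times from a starting vertex. The sketch below illustrates that idea on an in-memory adjacency list; it is not the benchmark code, and the graph data and the function name `traverse` are hypothetical.

```php
<?php
// Hypothetical adjacency list: vertex id => ids of adjacent vertices.
$graph = [
    1 => [2, 3],
    2 => [4],
    3 => [4, 5],
    4 => [6],
    5 => [],
    6 => [],
];

// Return the vertices reached after exactly $depth steps from $start.
// Duplicates are kept, so the result has one entry per path.
function traverse(array $graph, int $start, int $depth): array
{
    $frontier = [$start];
    for ($step = 0; $step < $depth; $step++) {
        $next = [];
        foreach ($frontier as $vertex) {
            foreach ($graph[$vertex] ?? [] as $neighbour) {
                $next[] = $neighbour;
            }
        }
        $frontier = $next;
    }
    return $frontier;
}

print_r(traverse($graph, 1, 2));
```

Each extra level multiplies the number of paths to follow, which is why the timings in the table grow so quickly with depth.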
65. community that is building and feeding the GraphDB ecosystem
GraphDB community
TinkerPop
Stack
Databases
66. Blueprints: data model and its implementations
Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to the JDBC, but for graph databases.
https://github.com/tinkerpop/blueprints/wiki/
67. Pipes: a dataflow framework using process graphs
It provides a collection of "pipes" that are connected together to form processing pipelines.
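The idea of connecting pipes into a processing pipeline can be sketched with PHP generators. This is an illustration of the concept only, not the TinkerPop Pipes API (which is a Java library); the function names `filterPipe` and `mapPipe` and the sample data are hypothetical.

```php
<?php
// A pipe takes an iterable and lazily yields items to the next pipe.

// Lets through only the items accepted by $predicate.
function filterPipe(iterable $input, callable $predicate): Generator
{
    foreach ($input as $item) {
        if ($predicate($item)) {
            yield $item;
        }
    }
}

// Transforms each item with $transform.
function mapPipe(iterable $input, callable $transform): Generator
{
    foreach ($input as $item) {
        yield $transform($item);
    }
}

$films = [
    ['title' => 'Mr Bean', 'type' => 'Fun'],
    ['title' => 'Se7en', 'type' => 'Thriller'],
];

// Connect the pipes: keep the films of type "Fun", then emit their titles.
$pipeline = mapPipe(
    filterPipe($films, function ($film) {
        return $film['type'] === 'Fun';
    }),
    function ($film) {
        return $film['title'];
    }
);

foreach ($pipeline as $title) {
    echo $title, PHP_EOL;
}
```

Because generators are lazy, each item flows through the whole chain before the next one is read, so a pipeline never has to hold the full intermediate result in memory.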
68. Gremlin: a graph-based programming language
A Turing-complete, graph-based programming language that compiles Gremlin syntax down to Pipes.
69. Rexster: a RESTful graph shell
It allows Blueprints graphs to be exposed through a RESTful API (HTTP).
115. Somebody started writing the binary-protocol binding
https://github.com/AntonTerekhov/OrientDB-PHP
(beta0.4.1, 28 April 2010)
116. $db = new OrientDB($host, $port);
$record = $db->recordLoad('1:1', '*:-1');
// $record instance of OrientDBRecord
133-136. use Congow\Orient\Query;
$query = new Query();
$query->from(array('users'))->where('username = ?', "admin");
echo $query->getRaw();
// SELECT FROM users WHERE username = "admin"
137. $query->select(array('name', 'username', 'email'), false)
->from(array('12:0', '12:1'), false)
->where('any() traverse ( any() like "%danger%" )')
->orWhere("1 = ?", 1)
->andWhere("links = ?", 1)
->limit(20)
->orderBy('username')
->orderBy('name', true, true)
->range("12:0", "12:1");
SELECT name, username, email
FROM [12:0, 12:1]
WHERE any() traverse ( any() like "%danger%" )
OR 1 = "1" AND links = "1"
ORDER BY name, username
LIMIT 20
RANGE 12:0 12:1