(c) Luca Garulli Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 1
www.orientechnologies.com
Luca Garulli –
Founder and CEO
@Orient Technologies Ltd
Author of OrientDB
www.twitter.com/lgarulli
Why Relationships
are cool
but the “JOIN” sucks
BigData & Graphs
In Rome
1979
First Relational DBMS available as product
2009
NoSQL movement
Hey, 30 years is a very
long time in the IT field!
Before 2009 teams of developers
always fought to select:
Operating System
Programming Language
Middleware (App-Servers)
What about the Database?
One of the main reasons RDBMS users
resist moving to a NoSQL product
is the
complexity of the model:
OK, NoSQL products are great for
BigData and BigScale
but...
...what about the model?
What is the NoSQL answer
for managing complex domains?
Key-Value stores?
Column-based?
Document databases?
(no relationship support)
Graph databases!
Why
don’t most NoSQL
products
support
relationships
between entities?
To understand why,
let’s see how
a Relational DBMS
manages them
Domain: the super minimal “Selling App”
Registry system: Customer, Address
Order system: Order, Stock
How does a
Relational DBMS
manage relationships?
Relational World: 1-1 Relationships
JOIN Customer.Address -> Address.Id

Customer
Id (primary key)  Name   Address (foreign key)
10                Luca   34
11                Jill   44
34                John   54
56                Mark   66
88                Steve  68

Address
Id (primary key)  Location
34                Rome
44                London
54                Moscow
66                New Mexico
68                Palo Alto
Relational World: 1-N Relationships
Inverse JOIN Address.Customer -> Customer.Id

Customer
Id  Name
10  Luca
11  Jill
34  John
56  Mark
88  Steve

Address
Id  Customer  Location
24  10        Rome
33  10        London
44  34        Moscow
66  56        Cologne
68  88        Palo Alto
Relational World: N-M Relationships
Additional table with 2 JOINs
(1) CustomerAddress.Id -> Customer.Id and
(2) CustomerAddress.Address -> Address.Id

Customer
Id  Name
10  Luca
11  Jill
34  John
56  Mark
88  Steve

Address
Id  Location
24  Rome
33  London
44  Moscow
66  Cologne
68  Palo Alto

CustomerAddress
Id  Address
10  24
10  33
34  44
What’s wrong with the
Relational Model?
These are all JOINs executed
every time you traverse a
relationship!
The JOIN is the evil!

CustomerAddress
Id  Address
10  24
10  33
34  24
A JOIN means searching for a key in
another table
The first rule to improve performance
is indexing all the keys
An index speeds up searches, but slows down
inserts, updates and deletes
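To make that cost concrete, here is a minimal Java sketch (not from the talk) of the 1-1 example. The Address table is indexed on its primary key with a `TreeMap`, which is a balanced tree, so the JOIN becomes one tree lookup per Customer row. The class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only: models a JOIN as one index lookup per row.
class JoinAsLookup {

    // The Address table indexed on its primary key.
    // TreeMap is a balanced (red-black) tree, so get() costs O(log N).
    static TreeMap<Integer, String> addressIndex() {
        TreeMap<Integer, String> index = new TreeMap<>();
        index.put(34, "Rome");
        index.put(44, "London");
        index.put(54, "Moscow");
        index.put(66, "New Mexico");
        index.put(68, "Palo Alto");
        return index;
    }

    // JOIN Customer.Address -> Address.Id: one index lookup per Customer row.
    static List<String> join(String[] names, int[] addressKeys,
                             TreeMap<Integer, String> index) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            String location = index.get(addressKeys[i]); // tree search, not a direct jump
            rows.add(names[i] + " -> " + location);
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] names = {"Luca", "Jill", "John", "Mark", "Steve"};
        int[] addressKeys = {34, 44, 54, 66, 68};
        for (String row : join(names, addressKeys, addressIndex())) {
            System.out.println(row);
        }
    }
}
```

Every row of the result pays for its own search in the tree, which is the point the following slides build on.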
So in the best case a JOIN is a lookup
into an index
This is done for every single JOIN!
If you traverse hundreds of relationships
you’re executing hundreds of JOINs
Index Lookup
is it really that fast?
Index Lookup: how does it work?
Think of an
Address Book
where we have to find
Luca’s phone
number
Most index algorithms are
similar and based on
balanced trees
A-Z: A-L | M-Z
A-L: A-D | E-L          M-Z: M-R | S-Z
A-D: A-B | C-D          E-L: E-G | H-L
E-G: E-F | G            H-L: H-J | K-L
Luca
Found!
This lookup took 5
steps, and the number of steps
grows with the index
size!
Can you imagine
how many steps a
lookup takes in an
index with millions or billions
of records?
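A rough answer can be computed. The sketch below is an illustration rather than anything from the slides: it counts the worst-case number of halving steps a binary search needs over a given number of sorted keys. Real database indexes are usually B-trees with a much higher fan-out, so they need fewer levels, but the growth is still logarithmic. The class and method names are assumptions.

```java
// Illustrative sketch: worst-case steps of a binary search over `size` sorted keys.
class LookupSteps {

    // Each step discards half of the remaining keys,
    // so the result is floor(log2(size)) + 1 for size >= 1.
    static int worstCaseSteps(long size) {
        int steps = 0;
        long remaining = size;
        while (remaining > 0) {
            remaining /= 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        long[] sizes = {1_000L, 1_000_000L, 1_000_000_000L};
        for (long size : sizes) {
            System.out.println(size + " keys -> up to " + worstCaseSteps(size) + " steps");
        }
    }
}
```

The step count grows slowly, but it is paid again for every lookup, and a query that traverses many relationships performs many lookups.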
And this JOIN is executed
for each involved table,
multiplied
by each scanned record!
Querying more tables can easily
produce millions of JOINs/Lookups!
Here is the rule: more entries
= more lookup steps = slower JOIN
Oh! This is why
the performance of my database
drops as
it becomes bigger,
and bigger,
and bigger!
What about
Document Databases
like MongoDB?
How MongoDB manages relationships:
{
  "_id" : "292846512",
  "type" : "Order",
  "number" : 1223,
  "customer" : "123456789"
}
MongoDB uses the same approach:
it stores the _id of the connected
documents. At run time it looks up
the _id by using an index.
Is there a better way to
manage relationships?
“A graph database is any
storage system
that provides
index-free adjacency”
- Marko Rodriguez
(author of TinkerPop Blueprints)
How does GraphDB manage
index-free relationships?
Every developer knows
the Relational Model,
but who knows the
Graph one?
Back to school:
Graph Theory crash course
Basic Graph
Luca --Likes--> NoSQL Day
Property Graph Model*
Vertex: Luca
  name: Luca
  surname: Garulli
  company: Orient Tech
Vertex: NoSQL Day
  date: Nov 15, 2013
Edge: Likes
  since: 2013
Vertices and Edges
can have properties
Edges are
directed
* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
Property Graph Model
Vertices: Luca, NoSQL Day
Edge: Likes
  since: 2013
Edge: Speaks
  title: «Switching...»
  abstract: «This talk presents...»
An Edge connects 2
vertices: use multiple edges
to represent 1-N and N-M
relationships
Property Graph Model
Vertices: Daniel, Luca, NoSQL Day, Udine
Edges: Likes, Organizes, FriendOf, located, Studies
Congratulations, this is your diploma in
«Graph Theory»
Graph theory
is so simple, yet so
powerful
Let’s go back
to the Graph Stuff
How does OrientDB
manage relationships?
OrientDB: traverse a relationship
Luca (vertex), RID = #13:35
  label : 'Customer'
  name : 'Luca'
Rome (vertex), RID = #13:100
  label : 'Address'
  name : 'Rome'
The Record ID (RID)
is the physical position of the record
OrientDB: traverse a relationship
Luca (vertex), RID = #13:35
  out : [#14:54]
  label : 'Customer'
  name : 'Luca'
Lives (edge), RID = #14:54
  out : [#13:35]
  in : [#13:100]
  label : 'Lives'
Rome (vertex), RID = #13:100
  in : [#14:54]
  label : 'Address'
  name : 'Rome'
The Edge’s RID is saved
inside both vertices, as
«out» and «in»
OrientDB: traverse -> outgoing
Luca (#13:35) out: [#14:54] -> Lives (#14:54) in: [#13:100] -> Rome (#13:100)
OrientDB: traverse <- incoming
Rome (#13:100) in: [#14:54] -> Lives (#14:54) out: [#13:35] -> Luca (#13:35)
A GraphDB handles a relationship as a
physical LINK to the record,
assigned when the edge is created.
On the other side, an
RDBMS computes the
relationship every time you query the database.
Isn’t that crazy?!
This means jumping from an
O(log N) algorithm to a near O(1) one:
the traversal cost is no longer affected
by database size!
This is huge in the BigData age
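The idea of a physical link can be sketched in plain Java. This is only an in-memory analogy, not OrientDB's storage engine: each vertex holds direct references to its edges, so following a relationship is a pointer hop. The cost of a hop depends on how many edges that vertex has, not on how many records the database holds. All class and method names here are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of index-free adjacency using object references.
class IndexFreeAdjacency {

    static class Vertex {
        final String label;
        final String name;
        final List<Edge> out = new ArrayList<>(); // edges leaving this vertex
        final List<Edge> in = new ArrayList<>();  // edges entering this vertex

        Vertex(String label, String name) {
            this.label = label;
            this.name = name;
        }
    }

    static class Edge {
        final String label;
        final Vertex from; // the vertex the edge goes out of
        final Vertex to;   // the vertex the edge goes into

        Edge(String label, Vertex from, Vertex to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    // The link is stored on both vertices once, when the edge is created.
    static Edge connect(Vertex from, String label, Vertex to) {
        Edge edge = new Edge(label, from, to);
        from.out.add(edge);
        to.in.add(edge);
        return edge;
    }

    // Follow outgoing edges with the given label: no index, no search by key.
    static List<Vertex> out(Vertex start, String label) {
        List<Vertex> result = new ArrayList<>();
        for (Edge edge : start.out) {
            if (edge.label.equals(label)) result.add(edge.to);
        }
        return result;
    }

    // Follow incoming edges with the given label back to their source.
    static List<Vertex> in(Vertex start, String label) {
        List<Vertex> result = new ArrayList<>();
        for (Edge edge : start.in) {
            if (edge.label.equals(label)) result.add(edge.from);
        }
        return result;
    }

    public static void main(String[] args) {
        Vertex luca = new Vertex("Customer", "Luca");
        Vertex rome = new Vertex("Address", "Rome");
        connect(luca, "Lives", rome);
        System.out.println(out(luca, "Lives").get(0).name);
        System.out.println(in(rome, "Lives").get(0).name);
    }
}
```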
an Open Source (Apache licensed)
document-graph NoSQL dbms
In the Blueprints micro-benchmark,
on commodity hardware with a hot cache, OrientDB
traverses 29.6 million
records in less than 5 seconds:
about 6 million nodes traversed per second!
Do not try this at home
with an RDBMS*!
*unless you live in Google’s server farm
Create the graph in SQL
$luca> cd bin
$luca> ./console.sh
OrientDB console v.1.6.1 (www.orientdb.org)
Type 'help' to display all the commands supported.
orientdb> create vertex Customer set name = 'Luca'
Created vertex #13:35 in 0.03 secs
orientdb> create vertex Address set name = 'Rome'
Created vertex #13:100 in 0.02 secs
orientdb> create edge Lives from #13:35 to #13:100
Created edge #14:54 in 0.02 secs
Create the graph in Java
import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.impls.orient.OrientGraph;

Graph graph = new OrientGraph("local:/tmp/db/graph");
// Create the two vertices and set their properties
Vertex luca = graph.addVertex("class:Customer");
luca.setProperty("name", "Luca");
Vertex rome = graph.addVertex("class:Address");
rome.setProperty("name", "Rome");
// Connect them with a "Lives" edge going from Luca to Rome
Edge edge = luca.addEdge("Lives", rome);
graph.shutdown();
Query the graph in SQL
orientdb> select in('Lives') from Address where name = 'Rome'
---+------+---------+--------------------+--------------------+--------+
  #| RID  |@class   |label               |out_Lives           |in      |
---+------+---------+--------------------+--------------------+--------+
  0| 13:35|Customer |Luca                |[#14:54]            |        |
---+------+---------+--------------------+--------------------+--------+
1 item(s) found. Query executed in 0.007 sec(s).
Incoming vertices
More on query power
orientdb> select sum( out('Order').total ) from Customer
          where name = 'Luca'
orientdb> traverse both('Friend')
          from Customer while $depth <= 7
orientdb> select from (
            traverse both('Friend')
            from Customer while $depth <= 7
          ) where @class='Customer' and city.name = 'Udine'
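What a depth-limited traversal does can be sketched in plain Java. This is an assumption-laden illustration, not OrientDB code: friendships are kept in an in-memory map, edges are followed in both directions as `both('Friend')` does, and expansion stops at a maximum depth, like `while $depth <= 7`. It uses breadth-first order for simplicity and does not claim to mirror OrientDB's own strategy. The names A to D are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of a depth-limited traversal over undirected "Friend" edges.
class DepthLimitedTraversal {

    // Store the edge on both endpoints so it can be followed in either direction.
    static void addFriend(Map<String, List<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns every vertex reachable from `start` within `maxDepth` hops,
    // mapped to the depth at which it was first reached.
    static Map<String, Integer> traverse(Map<String, List<String>> graph,
                                         String start, int maxDepth) {
        Map<String, Integer> depthOf = new LinkedHashMap<>();
        ArrayDeque<String> queue = new ArrayDeque<>();
        depthOf.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            int depth = depthOf.get(current);
            if (depth == maxDepth) continue; // do not expand past the limit
            for (String friend : graph.getOrDefault(current, Collections.emptyList())) {
                if (!depthOf.containsKey(friend)) {
                    depthOf.put(friend, depth + 1);
                    queue.add(friend);
                }
            }
        }
        return depthOf;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        addFriend(graph, "A", "B"); // hypothetical friendship chain A-B-C-D
        addFriend(graph, "B", "C");
        addFriend(graph, "C", "D");
        System.out.println(traverse(graph, "A", 2));
    }
}
```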
Query vs traversal
Once you have a well-connected database
in the form of a Super Graph, you can
cross records instead of querying them!
All you need is a few “Root Vertices”
from which to start traversing
Query vs traversal
Vertices: Customers, Luca, Mark, Jill, Order 2332, Order 8834, White Soap, Stocks, Special Customers
This is a
root vertex
Root Vertices can be enriched by
Meta Graphs
to decorate Graphs with
additional information
and make retrieval
easier and faster
Temporal based Meta Graph
Calendar
Year 2013
Month April 2013
Day 9/4/2013
Hour 9/4/2013 09:00
Hour 9/4/2013 10:00
Orders: Order 2332, Order 2333, Order 2334
Location based Meta Graph
Location
Country Italy
Region Lazio
State RM
City Rome
City Fiumicino
Orders: Order 2332, Order 2333, Order 2334
Mix & Merge graphs
Location: Country Italy, Region Lazio, State RM, City Rome, City Fiumicino
Calendar: Year 2013, Month April 2013, Day 9/4/2013, Hour 9/4/2013 09:00, Hour 9/4/2013 10:00
Orders: Order 2332, Order 2333, Order 2334
Get all the orders
sold in “Fiumicino” city
on 9/4/2013 at 10:00
Start from Calendar, look for Hour 10:00
Found 2 Orders,
now filter by
incoming edges
Only “Order 2333” has
incoming connections
with “Fiumicino”
Or start from Location, look for Fiumicino
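Whichever root you start from, the answer is the set of orders linked to both meta-graph vertices. Here is a minimal Java sketch of that filtering step. The edge assignments in `main` are assumptions: the slides say only that Hour 10:00 leads to two orders and that only Order 2333 is connected to Fiumicino. The class and method names are illustrative too.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch: combine two meta graphs by keeping the shared orders.
class MetaGraphQuery {

    // Start from the orders reached through one vertex and keep only
    // those that are also reached through the other vertex.
    static Set<String> ordersLinkedToBoth(Set<String> fromFirst, Set<String> fromSecond) {
        Set<String> result = new LinkedHashSet<>(fromFirst);
        result.retainAll(fromSecond);
        return result;
    }

    public static void main(String[] args) {
        // Assumed edges: which orders hang off each meta-graph vertex.
        Set<String> atHour10 = new LinkedHashSet<>(Arrays.asList("Order 2333", "Order 2334"));
        Set<String> inFiumicino = new LinkedHashSet<>(Arrays.asList("Order 2333"));

        // Starting from Calendar or from Location gives the same result.
        System.out.println(ordersLinkedToBoth(atHour10, inFiumicino));
        System.out.println(ordersLinkedToBoth(inFiumicino, atHour10));
    }
}
```

In practice it is usually cheaper to start from the side with fewer connected orders, since fewer candidates then need to be checked.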
Recommendation system
Vertices: Luca, Jill, Enrico
Edges: Friend, Friend
Recommendation system
Vertices: Luca, Jill, Enrico, Da Carlone, La Mediterranea, Meridionale, Eaitaly
Edges: Friend (x2), Eats (x5)
Recommendation system
select both('Friend')
from Person where name = 'Luca'
Recommendation system
select both('Friend').out('Eats')
from Person where name = 'Luca'
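The two hops in that query, friends first and then the places they eat at, can be sketched in plain Java. This is an illustration under assumptions: the slides do not show which friend eats where, so the `Eats` data in `main` is hypothetical. The sketch also removes duplicate places, which is a choice made here and not something the query is claimed to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of a two-hop recommendation: person -> friends -> places.
class FriendRecommendations {

    // `friends` is assumed to list each friendship from both sides.
    static List<String> recommend(Map<String, List<String>> friends,
                                  Map<String, List<String>> eats,
                                  String person) {
        Set<String> places = new LinkedHashSet<>(); // keeps first-seen order, drops repeats
        for (String friend : friends.getOrDefault(person, Collections.emptyList())) {
            places.addAll(eats.getOrDefault(friend, Collections.emptyList()));
        }
        return new ArrayList<>(places);
    }

    public static void main(String[] args) {
        Map<String, List<String>> friends = new HashMap<>();
        friends.put("Luca", Arrays.asList("Jill", "Enrico"));

        // Hypothetical Eats edges, for illustration only.
        Map<String, List<String>> eats = new HashMap<>();
        eats.put("Jill", Arrays.asList("Da Carlone", "La Mediterranea"));
        eats.put("Enrico", Arrays.asList("La Mediterranea", "Meridionale"));

        System.out.println(recommend(friends, eats, "Luca"));
    }
}
```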
This is your database
Get the last customer who bought 'Barolo'
select last(out('Order').in('Customer')) from Stock
where name = 'Barolo'
#34:22
Get his country
select out('City') from #34:22
Udine, Italy
#55:12
Get orders from that country
select in('Customer') from #55:12
Let’s move like a
Spider
on the web
OrientDB = {
flexibility of Document databases
+ complexity of the Graph model
+ Object Oriented concepts
+ super fast Index
+ powerful SQL dialect
+ multi-master replication and sharding}
0 config
download, unzip, run!
cut & paste the db
150,000
records per second
(flat records, no index, on commodity hw)
Schema-less
schema is not mandatory, relaxed model,
collect heterogeneous documents all together
Schema-full
schema with constraints on fields and validation rules
Customer.age > 17
Customer.address not null
Customer.surname is mandatory
Customer.email matches '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'
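To show what such rules check, here is a plain Java sketch of the four constraints applied to a single record. It is not OrientDB's validation engine. The class and method names and the messages are assumptions, and so is the case-insensitive flag on the e-mail pattern, which is written with upper-case letters only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch: the slide's four Customer constraints as plain checks.
class CustomerValidation {

    // Assumption: the pattern is meant to be matched ignoring case.
    private static final Pattern EMAIL = Pattern.compile(
            "\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b",
            Pattern.CASE_INSENSITIVE);

    // Returns one message per violated rule; an empty list means the record is valid.
    static List<String> violations(int age, String address, String surname, String email) {
        List<String> problems = new ArrayList<>();
        if (age <= 17) {
            problems.add("age must be > 17");
        }
        if (address == null) {
            problems.add("address must not be null");
        }
        if (surname == null || surname.isEmpty()) {
            problems.add("surname is mandatory");
        }
        if (email == null || !EMAIL.matcher(email).matches()) {
            problems.add("email does not match the pattern");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(violations(35, "Rome", "Garulli", "luca@example.com"));
        System.out.println(violations(16, null, "", "not-an-email"));
    }
}
```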
Schema-mixed
schema with mandatory and optional fields + constraints
the best of schema-less and schema-full modes
ACID Transactions
db.begin();
try {
  // your code
  ...
  db.commit();
} catch( Exception e ) {
  db.rollback();
}
SQL
select * from employee where name like '%Jay%' and status=0
Why invent
yet another language when
100% of developers already
know SQL?
OrientDB starts from SQL
and extends it with new
operators for graph manipulation
For most of the queries
a programmer needs every day,
SQL is simpler,
more readable and
more compact than
scripting (Map/Reduce)
SQL & relationships
select from Account where address.city.country.name = 'Italy'
select from Account where addresses contains (city.country.name = 'Italy')
SQL & trees/graphs
select out('friend') from V where name = 'Luca' and surname = 'Garulli'
select out[@class='knows'] from V where name = 'Jay' and surname = 'Miner'
traverse friends from #13:55 where $depth < 7
SQL sub queries
select from (
  traverse friends from Profile where $depth < 7
) where home.city.name = 'Cologne'
SQL & strings
select from Profile where name.toUpperCase() = 'LUCA'
select from City where country.name.substring(1,3).toUpperCase() = 'TAL'
select from Agenda where phones contains ( number.indexOf( '+39' ) > -1 )
select from Agenda where email matches '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'
SQL & schema-less
select from Profile where any() like '%Jay%'
select from Stock where all() is not null
SQL & collections
select from Tree where children contains ( married = true )
select from Tree where children containsAll ( married = true )
select from User where roles containsKey 'shutdown'
select from Graph where edges.size() > 0
Native JSON
ODocument doc = new ODocument().fromJSON(
  "{" +
  " '@rid' : '26:10'," +
  " '@class' : 'Developer'," +
  " 'name' : 'Luca'," +
  " 'surname' : 'Garulli'," +
  " 'out' : [ #10:33, #10:232 ]" +
  "}" );
Always Free
Open Source Apache 2 license
free for any purpose,
even commercial
Some clients
Kondoot
Scenari
www.orientechnologies.com
Thanks!
Luca Garulli –
Founder and CEO
www.twitter.com/lgarulli

Contenu connexe

En vedette

Model-Driven Development of Semantic Mashup Applications with the Open-Source...
Model-Driven Development of Semantic Mashup Applications with the Open-Source...Model-Driven Development of Semantic Mashup Applications with the Open-Source...
Model-Driven Development of Semantic Mashup Applications with the Open-Source...InfoGrid.org
 
Neo4j -[:LOVES]-> Cypher
Neo4j -[:LOVES]-> CypherNeo4j -[:LOVES]-> Cypher
Neo4j -[:LOVES]-> Cypherjexp
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageNeo4j
 
Modern symmetric cipher
Modern symmetric cipherModern symmetric cipher
Modern symmetric cipherRupesh Mishra
 
Metadata and the Power of Pattern-Finding
Metadata and the Power of Pattern-FindingMetadata and the Power of Pattern-Finding
Metadata and the Power of Pattern-FindingDATAVERSITY
 
MongoGraph - MongoDB Meets the Semantic Web
MongoGraph - MongoDB Meets the Semantic WebMongoGraph - MongoDB Meets the Semantic Web
MongoGraph - MongoDB Meets the Semantic WebDATAVERSITY
 
Document Classification with Neo4j
Document Classification with Neo4jDocument Classification with Neo4j
Document Classification with Neo4jKenny Bastani
 
Neo4j - graph database for recommendations
Neo4j - graph database for recommendationsNeo4j - graph database for recommendations
Neo4j - graph database for recommendationsproksik
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph DatabasesInfiniteGraph
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jWilliam Lyon
 
Natural Language Processing with Neo4j
Natural Language Processing with Neo4jNatural Language Processing with Neo4j
Natural Language Processing with Neo4jKenny Bastani
 
Importing Data into Neo4j quickly and easily - StackOverflow
Importing Data into Neo4j quickly and easily - StackOverflowImporting Data into Neo4j quickly and easily - StackOverflow
Importing Data into Neo4j quickly and easily - StackOverflowNeo4j
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j InternalsTobias Lindaaker
 
Intro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesIntro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesNeo4j
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentationjexp
 
Using MongoDB as a high performance graph database
Using MongoDB as a high performance graph databaseUsing MongoDB as a high performance graph database
Using MongoDB as a high performance graph databaseChris Clarke
 
The Importance of MDM - Eternal Management of the Data Mind
The Importance of MDM - Eternal Management of the Data MindThe Importance of MDM - Eternal Management of the Data Mind
The Importance of MDM - Eternal Management of the Data MindDATAVERSITY
 
Anatomy of a Modern Node.js Application Architecture
Anatomy of a Modern Node.js Application Architecture Anatomy of a Modern Node.js Application Architecture
Anatomy of a Modern Node.js Application Architecture AppDynamics
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 

Why relationships are cool but join sucks - Big Data & Graphs in Rome

  • 1. Why Relationships are cool but the “JOIN” sucks. BigData & Graphs in Rome. Luca Garulli, Founder and CEO @Orient Technologies Ltd, author of OrientDB. www.orientechnologies.com, www.twitter.com/lgarulli. (c) Luca Garulli, licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.
  • 2. 1979: first relational DBMS available as a product. 2009: the NoSQL movement.
  • 3. 1979: first relational DBMS available as a product. 2009: the NoSQL movement. Hey, 30 years in the IT field is huge!
  • 4. Before 2009, teams of developers always fought over the choice of operating system, programming language, and middleware (app servers). What about the database?
  • 5. One of the main reasons RDBMS users resist moving to a NoSQL product is the complexity of the model: OK, NoSQL products are great for BigData and BigScale, but...
  • 6. ...what about the model?
  • 7. What is the NoSQL answer to managing complex domains? Key-value stores? Column-based? Document databases? (No relationship support.) Graph databases!
  • 8. Why don’t most NoSQL products support relationships between entities?
  • 9. To understand why, let’s see how relational DBMSs manage them.
  • 10. Domain: the super minimal “Selling App”. Entities: Customer, Address, Order, Stock. Systems: Registry system, Order system.
  • 11. Domain: the super minimal “Selling App”. Entities: Customer, Address, Order, Stock. Systems: Registry system, Order system. How does a relational DBMS manage relationships?
  • 12. Relational World: 1-1 Relationships. JOIN Customer.Address -> Address.Id
      Customer (primary key: Id; foreign key: Address)
        Id   Name    Address
        10   Luca    34
        11   Jill    44
        34   John    54
        56   Mark    66
        88   Steve   68
      Address (primary key: Id)
        Id   Location
        34   Rome
        44   London
        54   Moscow
        66   New Mexico
        68   Palo Alto
  • 13. Relational World: 1-N Relationships. Inverse JOIN Address.Customer -> Customer.Id
      Customer
        Id   Name
        10   Luca
        11   Jill
        34   John
        56   Mark
        88   Steve
      Address
        Id   Customer   Location
        24   10         Rome
        33   10         London
        44   34         Moscow
        66   56         Cologne
        68   88         Palo Alto
  • 14. Relational World: N-M Relationships. Additional table with 2 JOINs: (1) CustomerAddress.Id -> Customer.Id and (2) CustomerAddress.Address -> Address.Id
      Customer
        Id   Name
        10   Luca
        11   Jill
        34   John
        56   Mark
        88   Steve
      Address
        Id   Location
        24   Rome
        33   London
        44   Moscow
        66   Cologne
        68   Palo Alto
      CustomerAddress
        Id   Address
        10   24
        10   33
        34   44
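As a rough illustration of the N-M case on slide 14, the two JOINs can be tried with Python's built-in sqlite3 module. This is only a sketch: the table and column names follow the slide, while the in-memory database and the helper name locations_of are assumptions made for the example.

```python
import sqlite3

# In-memory database holding the three tables from slide 14.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Address (Id INTEGER PRIMARY KEY, Location TEXT);
CREATE TABLE CustomerAddress (
    Id INTEGER REFERENCES Customer(Id),
    Address INTEGER REFERENCES Address(Id)
);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?)",
                 [(10, "Luca"), (11, "Jill"), (34, "John"), (56, "Mark"), (88, "Steve")])
conn.executemany("INSERT INTO Address VALUES (?, ?)",
                 [(24, "Rome"), (33, "London"), (44, "Moscow"), (66, "Cologne"), (68, "Palo Alto")])
conn.executemany("INSERT INTO CustomerAddress VALUES (?, ?)",
                 [(10, 24), (10, 33), (34, 44)])


def locations_of(name):
    """Hypothetical helper: traverse Customer -> CustomerAddress -> Address."""
    rows = conn.execute(
        """
        SELECT a.Location
        FROM Customer c
        JOIN CustomerAddress ca ON ca.Id = c.Id       -- JOIN (1)
        JOIN Address a ON a.Id = ca.Address           -- JOIN (2)
        WHERE c.Name = ?
        ORDER BY a.Id
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


print(locations_of("Luca"))
```

Every call to locations_of pays for both JOINs again, which is the cost the following slides examine.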
  • 15. What’s wrong with the Relational Model?
  • 16. The JOIN is the evil! These are all JOINs, executed every time you traverse a relationship!
      Customer
        Id   Name
        10   Luca
        11   Jill
        34   John
        56   Mark
        88   Steve
      Address
        Id   Location
        24   Rome
        33   London
        44   Moscow
        66   Cologne
        68   Palo Alto
      CustomerAddress
        Id   Address
        10   24
        10   33
        34   24
  • 17. A JOIN means searching for a key in another table. The first rule to improve performance is to index all the keys. An index speeds up searches, but slows down inserts, updates, and deletes.
  • 18. So in the best case a JOIN is a lookup into an index. This is done for every single JOIN! If you traverse hundreds of relationships, you’re executing hundreds of JOINs.
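The idea that a JOIN is, at best, one index lookup per traversed relationship can be sketched in a few lines. The class SortedIndex below is a hypothetical stand-in for a database index (a sorted key list searched with bisect), not how any particular RDBMS implements it. The data is the 1-1 example from slide 12.

```python
from bisect import bisect_left


class SortedIndex:
    """Hypothetical stand-in for an index: sorted keys plus row positions."""

    def __init__(self, rows, key):
        pairs = sorted((row[key], pos) for pos, row in enumerate(rows))
        self.keys = [k for k, _ in pairs]
        self.positions = [p for _, p in pairs]
        self.lookups = 0  # how many times the index was consulted

    def find(self, value):
        self.lookups += 1
        i = bisect_left(self.keys, value)
        if i < len(self.keys) and self.keys[i] == value:
            return self.positions[i]
        return None


customers = [
    {"Id": 10, "Name": "Luca", "Address": 34},
    {"Id": 11, "Name": "Jill", "Address": 44},
    {"Id": 34, "Name": "John", "Address": 54},
    {"Id": 56, "Name": "Mark", "Address": 66},
    {"Id": 88, "Name": "Steve", "Address": 68},
]
addresses = [
    {"Id": 34, "Location": "Rome"},
    {"Id": 44, "Location": "London"},
    {"Id": 54, "Location": "Moscow"},
    {"Id": 66, "Location": "New Mexico"},
    {"Id": 68, "Location": "Palo Alto"},
]

address_index = SortedIndex(addresses, "Id")

# JOIN Customer.Address -> Address.Id: one index lookup per customer row.
pairs = []
for customer in customers:
    pos = address_index.find(customer["Address"])
    if pos is not None:
        pairs.append((customer["Name"], addresses[pos]["Location"]))

print(pairs)
print("index lookups:", address_index.lookups)
```

Five customers traversed means five lookups, and traversing hundreds of relationships means hundreds of them.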
  • 19. (c) Luca Garulli Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 19 Index Lookup is it really that fast?
  • 20. Index Lookup: how does it work? Think of an address book where we have to find Luca’s phone number. [Tree: A-Z -> A-L, M-Z]
  • 21. Index Lookup: how does it work? Index algorithms are all similar and based on balanced trees. [Tree: A-Z -> A-L, M-Z; A-L -> A-D, E-L; M-Z -> M-R, S-Z]
  • 22. Index Lookup: how does it work? [Tree, next level: A-D -> A-B, C-D; E-L -> E-G, H-L]
  • 23. Index Lookup: how does it work? [Tree, next level: E-G -> E-F, G; H-L -> H-J, K-L]
  • 24. Index Lookup: how does it work? [Path followed: A-Z -> A-L -> E-L -> H-L -> K-L -> Luca] Found! This lookup took 5 steps, and the number of steps grows with the index size!
  • 25. Can you imagine how many steps a lookup takes in an index with millions or billions of records?
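As a rough answer to that question, the following Java sketch counts how many times a balanced tree has to narrow its search range before a single entry is left. It assumes an idealized tree in which every node splits its range into `fanout` equal parts; the class name `IndexLookupSketch` and the `fanout` parameter are illustrative assumptions, not part of any database API. Real B-tree indexes use a large fan-out, so they need far fewer steps than a binary split, but the growth stays logarithmic in both cases.

```java
class IndexLookupSketch {
    // Number of narrowing steps needed to isolate one entry among `entries`,
    // when each step keeps only 1/fanout of the remaining range (rounded up).
    static int lookupSteps(long entries, int fanout) {
        int steps = 0;
        long remaining = entries;
        while (remaining > 1) {
            remaining = (remaining + fanout - 1) / fanout; // ceiling division
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        long[] sizes = { 1_000L, 1_000_000L, 1_000_000_000L };
        for (long size : sizes) {
            System.out.println(size + " entries: "
                + lookupSteps(size, 2) + " steps (binary split), "
                + lookupSteps(size, 100) + " steps (fan-out 100)");
        }
    }
}
```

The step count is small for a single lookup; it matters once that lookup is repeated for every traversed relationship.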
  • 26. And this JOIN is executed for each involved table, multiplied by the number of scanned records!
  • 27. Querying more tables can easily produce millions of JOINs/lookups! Here is the rule: more entries = more lookup steps = slower JOIN
  • 28. Oh! This is why the performance of my database drops as it becomes bigger, and bigger, and bigger!
  • 29. What about Document Databases like MongoDB?
  • 30. How MongoDB manages relationships: { "_id" : "292846512", "type" : "Order", "number" : 1223, "customer" : "123456789" }
  • 31. MongoDB uses the same approach: it stores the _id of the connected documents. At run-time it looks up the _id by using an index.
  • 32. Is there a better way to manage relationships?
  • 33. “A graph database is any storage system that provides index-free adjacency” - Marko Rodriguez (author of TinkerPop Blueprints)
  • 34. How does a GraphDB manage index-free relationships?
  • 35. Every developer knows the Relational Model, but who knows the Graph one?
  • 36. Back to school: Graph Theory crash course
  • 37. Basic Graph. [Diagram: vertex Luca connected to vertex NoSQL Day by the edge Likes]
  • 38. Property Graph Model*. Vertices and Edges can have properties. Edges are directed. [Diagram: vertex Luca (name: Luca, surname: Garulli, company: Orient Tech); vertex NoSQL Day (date: Nov 15, 2013); edge Likes (since: 2013)] * https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
  • 39. Property Graph Model. An Edge connects 2 vertices: use multiple edges to represent 1-N and N-M relationships. [Diagram: vertices Luca and NoSQL Day; edge Likes (since: 2013); edge Speaks (title: «Switching...», abstract: «This talk presents...»)]
  • 40. Property Graph Model. [Diagram: vertices Daniel, Luca, NoSQL Day, Udine; edges Likes, Organizes, FriendOf, located, Studies]
  • 41. Congratulations, this is your diploma in «Graph Theory»
  • 42. Graph theory is so simple, yet so powerful
  • 43. Let’s go back to the Graph Stuff: how does OrientDB manage relationships?
  • 44. OrientDB: traverse a relationship. The Record ID (RID) is the physical position
      Luca (vertex): RID = #13:35, label: 'Customer', name: 'Luca'
      Rome (vertex): RID = #13:100, label: 'Address', name: 'Rome'
  • 45. OrientDB: traverse a relationship. The Edge’s RID is saved inside both vertices, as «out» and «in»
      Luca (vertex): RID = #13:35, out: [#14:54], label: 'Customer', name: 'Luca'
      Lives (edge): RID = #14:54, out: [#13:35], in: [#13:100], label: 'Lives'
      Rome (vertex): RID = #13:100, in: [#14:54], label: 'Address', name: 'Rome'
  • 46. OrientDB: traverse -> outgoing. [Same records as slide 45: from Luca (#13:35), follow out: [#14:54] to the Lives edge, then its in: [#13:100] to Rome]
  • 47. OrientDB: traverse <- incoming. [Same records as slide 45: from Rome (#13:100), follow in: [#14:54] to the Lives edge, then its out: [#13:35] to Luca]
  • 48. A GraphDB handles a relationship as a physical LINK to the record, assigned when the edge is created. On the other side, an RDBMS computes the relationship every time you query the database. Isn’t that crazy?!
  • 49. This means jumping from an O(log N) algorithm to a near O(1) one: the traversal cost is no longer affected by the database size! This is huge in the BigData age
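The idea of index-free adjacency can be sketched in plain Java with object references standing in for RIDs. This is only an in-memory analogy under that assumption, not how OrientDB stores records on disk; the names `AdjacencySketch`, `outEdges`, `inEdges` and `connect` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AdjacencySketch {
    static class Vertex {
        final String label;
        final String name;
        // Direct links to the connected edges (the «out» and «in» fields of the slides).
        final List<Edge> outEdges = new ArrayList<>();
        final List<Edge> inEdges = new ArrayList<>();

        Vertex(String label, String name) {
            this.label = label;
            this.name = name;
        }

        // Follows outgoing edges with the given label: no key search, only link hops.
        List<Vertex> out(String edgeLabel) {
            List<Vertex> result = new ArrayList<>();
            for (Edge edge : outEdges) {
                if (edge.label.equals(edgeLabel)) result.add(edge.to);
            }
            return result;
        }

        // Follows incoming edges with the given label back to their source vertices.
        List<Vertex> in(String edgeLabel) {
            List<Vertex> result = new ArrayList<>();
            for (Edge edge : inEdges) {
                if (edge.label.equals(edgeLabel)) result.add(edge.from);
            }
            return result;
        }
    }

    static class Edge {
        final String label;
        final Vertex from; // the edge's «out» side
        final Vertex to;   // the edge's «in» side

        Edge(String label, Vertex from, Vertex to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    // The links are written once, when the edge is created.
    static Edge connect(Vertex from, String label, Vertex to) {
        Edge edge = new Edge(label, from, to);
        from.outEdges.add(edge);
        to.inEdges.add(edge);
        return edge;
    }

    public static void main(String[] args) {
        Vertex luca = new Vertex("Customer", "Luca");
        Vertex rome = new Vertex("Address", "Rome");
        connect(luca, "Lives", rome);
        System.out.println(luca.out("Lives").get(0).name); // prints "Rome"
        System.out.println(rome.in("Lives").get(0).name);  // prints "Luca"
    }
}
```

The cost of a hop depends on how many edges the vertex has, not on how many vertices the whole database holds.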
  • 50. OrientDB: an Open Source (Apache licensed) document-graph NoSQL DBMS
  • 51. OrientDB in the Blueprints micro-benchmark, on common hw, with a hot cache, traverses 29.6 million records in less than 5 seconds: about 6 million nodes traversed per second! Do not try this at home with an RDBMS* (*unless you live in Google’s server farm)
  • 52. Create the graph in SQL
      $luca> cd bin
      $luca> ./console.sh
      OrientDB console v.1.6.1 (www.orientdb.org)
      Type 'help' to display all the commands supported.
      orientdb> create vertex Customer set name = 'Luca'
      Created vertex #13:35 in 0.03 secs
      orientdb> create vertex Address set name = 'Rome'
      Created vertex #13:100 in 0.02 secs
      orientdb> create edge Lives from #13:35 to #13:100
      Created edge #14:54 in 0.02 secs
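  • 53. Create the graph in Java
      import com.tinkerpop.blueprints.Edge;
      import com.tinkerpop.blueprints.Graph;
      import com.tinkerpop.blueprints.Vertex;
      import com.tinkerpop.blueprints.impls.orient.OrientGraph;

      // Open (or create) the graph database
      Graph graph = new OrientGraph( "local:/tmp/db/graph" );
      // Create the two vertices and set their properties
      Vertex luca = graph.addVertex( "class:Customer" );
      luca.setProperty( "name", "Luca" );
      Vertex rome = graph.addVertex( "class:Address" );
      rome.setProperty( "name", "Rome" );
      // Connect them with a "Lives" edge, then close the database
      Edge edge = luca.addEdge( "Lives", rome );
      graph.shutdown();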
  • 54. Query the graph in SQL (in('Lives') selects the incoming vertices)
      orientdb> select in('Lives') from Address where name = 'Rome'
      ---+------+---------+--------------------+--------------------+--------+
        #| RID  |@class   |label               |out_Lives           |in      |
      ---+------+---------+--------------------+--------------------+--------+
        0| 13:35|Customer |Luca                |[#14:54]            |        |
      ---+------+---------+--------------------+--------------------+--------+
      1 item(s) found. Query executed in 0.007 sec(s).
  • 55. More on query power
      orientdb> select sum( out('Order').total ) from Customer where name = 'Luca'
      orientdb> traverse both('Friend') from Customer while $depth <= 7
      orientdb> select from ( traverse both('Friend') from Customer while $depth <= 7 ) where @class = 'Customer' and city.name = 'Udine'
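What a depth-bounded traversal such as `traverse both('Friend') ... while $depth <= 7` computes can be sketched with a small in-memory graph. The sketch below is an assumption-laden illustration: it visits vertices breadth-first, which is a choice made here for clarity and not a statement about OrientDB's own traversal strategy, and the names `TraverseSketch`, `friend` and `traverse` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TraverseSketch {
    // Registers an undirected "Friend" link, so it can be followed in both directions.
    static void friend(Map<String, List<String>> friends, String a, String b) {
        friends.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        friends.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Returns every person reachable from `start` within `maxDepth` hops,
    // in visiting order; the start itself is at depth 0.
    static Set<String> traverse(Map<String, List<String>> friends, String start, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> frontier = new ArrayDeque<>();
        visited.add(start);
        depth.put(start, 0);
        frontier.add(start);
        while (!frontier.isEmpty()) {
            String current = frontier.poll();
            int currentDepth = depth.get(current);
            if (currentDepth == maxDepth) continue; // do not expand beyond the limit
            for (String next : friends.getOrDefault(current, List.of())) {
                if (visited.add(next)) {            // true only the first time we see `next`
                    depth.put(next, currentDepth + 1);
                    frontier.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> friends = new HashMap<>();
        friend(friends, "Luca", "Jill");
        friend(friends, "Jill", "Enrico");
        friend(friends, "Enrico", "Daniel");
        System.out.println(traverse(friends, "Luca", 2)); // prints [Luca, Jill, Enrico]
    }
}
```

The third query on the slide then applies an ordinary `where` filter to the records such a traversal returns.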
  • 56. Query vs traversal. Once you have a well connected database in the form of a Super Graph, you can cross records instead of querying them! All you need is a few “Root Vertices” from which to start traversing
  • 57. Query vs traversal. [Diagram: vertices Customers, Luca, Mark, Jill, Order 2332, Order 8834, White Soap, Stocks, Special Customers; callout: “This is a root vertex”]
  • 58. Root Vertices can be enriched by Meta Graphs, which decorate graphs with additional information and make retrieval easier and faster
  • 59. Temporal based Meta Graph. [Diagram: Calendar; Year 2013; Month April 2013; Day 9/4/2013; Hour 9/4/2013 09:00; Hour 9/4/2013 10:00; Order 2332; Order 2333; Order 2334]
  • 60. Location based Meta Graph. [Diagram: Location; Country Italy; Region Lazio; State RM; City Rome; City Fiumicino; Order 2332; Order 2333; Order 2334]
  • 61. Mix & Merge graphs. [Diagram: the Calendar and Location meta graphs of slides 59 and 60, sharing the vertices Order 2332, Order 2333 and Order 2334]
  • 62. Get all the orders sold in “Fiumicino” city on 9/4/2013 at 10:00. [Same diagram as slide 61]
  • 63. Start from Calendar, look for Hour 10:00. [Same diagram as slide 61]
  • 64. Start from Calendar, look for Hour 10:00. Found 2 Orders, now filter by incoming edges. [Same diagram as slide 61]
  • 65. Start from Calendar, look for Hour 10:00. Only “Order 2333” has incoming connections with “Fiumicino”. [Same diagram as slide 61]
  • 66. Or start from Location, look for Fiumicino. [Same diagram as slide 61]
  • 67. Start from Location, look for Fiumicino. [Same diagram as slide 61]
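The two entry points (Calendar or Location) lead to the same answer, because the result is the set of orders attached to both meta-graph vertices. A minimal Java sketch of that filter follows. Which order hangs off which Hour and City vertex is partly an assumption here: the slides only state that Hour 10:00 has two orders and that Order 2333 is the one connected to Fiumicino; the remaining links, and the names `MetaGraphSketch` and `ordersAt`, are illustrative.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MetaGraphSketch {
    // Orders linked from each Hour vertex of the Calendar meta graph (partly assumed).
    static final Map<String, Set<Integer>> ORDERS_BY_HOUR = Map.of(
        "9/4/2013 09:00", Set.of(2332),
        "9/4/2013 10:00", Set.of(2333, 2334));

    // Orders linked from each City vertex of the Location meta graph (partly assumed).
    static final Map<String, Set<Integer>> ORDERS_BY_CITY = Map.of(
        "Fiumicino", Set.of(2332, 2333),
        "Rome", Set.of(2334));

    // Starts from the Hour vertex, then keeps only the orders also linked from the City vertex.
    static Set<Integer> ordersAt(String hour, String city) {
        Set<Integer> result = new TreeSet<>(ORDERS_BY_HOUR.getOrDefault(hour, Set.of()));
        result.retainAll(ORDERS_BY_CITY.getOrDefault(city, Set.of()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ordersAt("9/4/2013 10:00", "Fiumicino")); // prints [2333]
    }
}
```

Starting from the vertex with fewer attached orders keeps the candidate set small, which is a reasonable way to choose between the two entry points.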
  • 68. Recommendation system. [Diagram: vertices Luca, Jill, Enrico; two Friend edges]
  • 69. Recommendation system. [Diagram: vertices Luca, Jill, Enrico, Da Carlone, La Mediterranea, Meridionale, Eaitaly; two Friend edges; five Eats edges]
  • 70. Recommendation system. [Same diagram as slide 69] select both('Friend') from Person where name = 'Luca'
  • 71. Recommendation system. [Same diagram as slide 69] select both('Friend').out('Eats') from Person where name = 'Luca'
  • 72. Recommendation system. [Same diagram as slide 69] select both('Friend').out('Eats') from Person where name = 'Luca'
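The semantics of `both('Friend').out('Eats')` can be spelled out in a few lines of Java: first collect the friends in either direction, then collect the places those friends eat at. The edge lists below are assumptions made for illustration, since the flattened diagram does not say who eats where; the names `RecommendationSketch`, `bothFriend` and `outEats` are hypothetical. The sketch scans edge lists for brevity, so it shows what the query returns, not how a graph database would execute it.

```java
import java.util.Set;
import java.util.TreeSet;

class RecommendationSketch {
    // Friend edges as {from, to} pairs (assumed direction).
    static final String[][] FRIEND = { {"Luca", "Jill"}, {"Luca", "Enrico"} };

    // Eats edges as {person, restaurant} pairs (assumed assignment).
    static final String[][] EATS = {
        {"Luca", "Eaitaly"},
        {"Jill", "Da Carlone"}, {"Jill", "La Mediterranea"},
        {"Enrico", "La Mediterranea"}, {"Enrico", "Meridionale"}
    };

    // both('Friend'): friends reachable through a Friend edge in either direction.
    static Set<String> bothFriend(String person) {
        Set<String> friends = new TreeSet<>();
        for (String[] edge : FRIEND) {
            if (edge[0].equals(person)) friends.add(edge[1]);
            if (edge[1].equals(person)) friends.add(edge[0]);
        }
        return friends;
    }

    // out('Eats'): restaurants reached by following Eats edges out of the given people.
    static Set<String> outEats(Set<String> people) {
        Set<String> restaurants = new TreeSet<>();
        for (String[] edge : EATS) {
            if (people.contains(edge[0])) restaurants.add(edge[1]);
        }
        return restaurants;
    }

    public static void main(String[] args) {
        System.out.println(outEats(bothFriend("Luca")));
    }
}
```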
  • 73. This is your database
  • 74. Get the last customer who bought 'Barolo': select last(out('Order').in('Customer')) from Stock where name = 'Barolo' [result: #34:22]
  • 75. Get his country: select out('City') from #34:22 [result: Udine, Italy, #55:12]
  • 76. Get orders from that country: select in('Customer') from #55:12
  • 77. Let’s move like a spider on the web
  • 78. OrientDB = { flexibility of Document databases + complexity of the Graph model + Object Oriented concepts + super fast Index + powerful SQL dialect + multi-master replication and sharding }
  • 79. Ø config: download, unzip, run! Cut & paste the db
  • 80. 150,000 records per second (flat records, no index, on commodity hw)
  • 81. Schema-less: schema is not mandatory, relaxed model, collect heterogeneous documents all together
  • 82. Schema-full: schema with constraints on fields and validation rules
      Customer.age > 17
      Customer.address not null
      Customer.surname is mandatory
      Customer.email matches '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'
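To show what such rules check, here is a small Java sketch that applies the four constraints to a customer held in a plain map. It does not use the OrientDB schema API; the class `CustomerRulesSketch`, the method `violations` and the error messages are hypothetical, and the case-insensitive flag on the e-mail pattern is an assumption added so that lowercase addresses pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class CustomerRulesSketch {
    // Same expression as on the slide; CASE_INSENSITIVE is an assumption of this sketch.
    static final Pattern EMAIL = Pattern.compile(
        "\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b", Pattern.CASE_INSENSITIVE);

    // Returns the list of violated rules; an empty list means the customer is valid.
    static List<String> violations(Map<String, Object> customer) {
        List<String> errors = new ArrayList<>();
        Object age = customer.get("age");
        if (age instanceof Integer && (Integer) age <= 17) {
            errors.add("age must be > 17");
        }
        if (customer.get("address") == null) {
            errors.add("address must not be null");
        }
        if (!customer.containsKey("surname")) {
            errors.add("surname is mandatory");
        }
        Object email = customer.get("email");
        if (email != null && !EMAIL.matcher(email.toString()).matches()) {
            errors.add("email is not valid");
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Object> customer = Map.of(
            "age", 16, "surname", "Garulli", "email", "not-an-email");
        System.out.println(violations(customer));
    }
}
```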
  • 83. Schema-mixed: schema with mandatory and optional fields + constraints, the best of schema-less and schema-full modes
  • 84. ACID Transactions
      db.begin();
      try {
        // your code
        ...
        db.commit();
      } catch( Exception e ) {
        db.rollback();
      }
  • 85. SQL: select * from employee where name like '%Jay%' and status=0
  • 86. Why reinvent yet another language when 100% of developers already know SQL? OrientDB begins from SQL but improves it with new operators for graph manipulation
  • 87. For most of the queries a programmer needs every day, SQL is simpler, more readable and more compact than scripting (Map/Reduce)
  • 88. SQL & relationships
      select from Account where address.city.country.name = 'Italy'
      select from Account where addresses contains (city.country.name = 'Italy')
  • 89. SQL & trees/graphs
      select out('friend') from V where name = 'Luca' and surname = 'Garulli'
      select out[@class='knows'] from V where name = 'Jay' and surname = 'Miner'
      traverse friends from #13:55 where $depth < 7
  • 90. SQL sub queries
      select from ( traverse friends from Profile where $depth < 7 ) where home.city.name = 'Cologne'
  • 91. SQL & strings
      select from Profile where name.toUpperCase() = 'LUCA'
      select from City where country.name.substring(1,3).toUpperCase() = 'TAL'
      select from Agenda where phones contains ( number.indexOf( '+39' ) > -1 )
      select from Agenda where email matches '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'
  • 92. SQL & schema-less
      select from Profile where any() like '%Jay%'
      select from Stock where all() is not null
  • 93. SQL & collections
      select from Tree where children contains ( married = true )
      select from Tree where children containsAll ( married = true )
      select from User where roles containsKey 'shutdown'
      select from Graph where edges.size() > 0
  • 94. Native JSON
      ODocument doc = new ODocument().fromJSON(
        "{ '@rid' : '26:10', '@class' : 'Developer', 'name' : 'Luca', 'surname' : 'Garulli', 'out' : [ #10:33, #10:232 ] }" );
  • 95. Always Free: Open Source, Apache 2 license, free for any purpose, even commercial
  • 96. Some clients: Kondoot, Scenari
  • 97. Thanks! Luca Garulli, Founder and CEO. www.orientechnologies.com www.twitter.com/lgarulli

Editor's notes

  1. Good afternoon! Today I’d like to show you a new way to design a database. In 1970 Relational DBMS