Optimizing Cypher
Queries in Neo4j
Wes Freeman (@wefreema)
Mark Needham (@markhneedham)
Today's schedule
• Brief overview of Cypher syntax
• Graph global vs Graph local queries
• Labels and indexes
• Optimization patterns
• Profiling Cypher queries
• Applying optimization patterns
Cypher Syntax
• Statement parts
o Optional: Querying part (MATCH|WHERE)
o Optional: Updating part (CREATE|MERGE)
o Optional: Returning part (WITH|RETURN)
• Parts can be chained together
Cypher Syntax - Refresher
MATCH (n:Label)-[r:LINKED]->(m)
WHERE n.prop = "..."
RETURN n, r, m
Starting points
• Graph scan (global; potentially slow)
• Label scan (usually reserved for aggregation
queries; not ideal)
• Label property index lookup (local; good!)
Introducing the football dataset
The 1.9 global scan
O(n)
n = # of nodes
START pl = node(*)
MATCH (pl)-[:played]->(stats)
WHERE pl.name = "Wayne Rooney"
RETURN stats
150ms w/ 30k nodes, 120k rels
The 2.0 global scan
MATCH (pl)-[:played]->(stats)
WHERE pl.name = "Wayne Rooney"
RETURN stats
130ms w/ 30k nodes, 120k rels
O(n)
n = # of nodes
Why is it a global scan?
• Cypher is a pattern matching language
• It doesn't discriminate unless you tell it to
o It must try to start at all nodes to find this pattern, as
specified
Introduce a label
Label your starting points
CREATE (player:Player
{name: "Wayne Rooney"} )
O(k)
k = # of nodes with that label
Label scan
MATCH (pl:Player)-[:played]->(stats)
WHERE pl.name = "Wayne Rooney"
RETURN stats
80ms w/ 30k nodes, 120k rels (~900 :Player nodes)
Indexes don't come for free
CREATE INDEX ON :Player(name)
OR
CREATE CONSTRAINT ON (pl:Player)
ASSERT pl.name IS UNIQUE
O(log k)
k = # of nodes with that label
Index lookup
MATCH (pl:Player)-[:played]->(stats)
WHERE pl.name = "Wayne Rooney"
RETURN stats
6ms w/ 30k nodes, 120k rels (~900 :Player nodes)
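The difference between the three starting points can be illustrated outside the database. The Python sketch below is a toy model, not Neo4j's implementation: the 30,000 synthetic nodes, the dictionary layout, and the function names are all assumptions made for illustration. A sorted list with binary search stands in for an index with O(log k) lookups, and each scan reports how many nodes it had to examine.

```python
from bisect import bisect_left

# Synthetic graph: 30,000 nodes, of which 900 carry the label "Player".
nodes = [
    {"id": i, "label": "Player" if i < 900 else "Other", "name": f"node-{i}"}
    for i in range(30_000)
]
nodes[450]["name"] = "Wayne Rooney"


def global_scan(name):
    """Examine every node in the graph: O(n)."""
    examined = 0
    hits = []
    for node in nodes:
        examined += 1
        if node["name"] == name:
            hits.append(node["id"])
    return hits, examined


# Label store: only the nodes a label scan would visit.
players = [n for n in nodes if n["label"] == "Player"]


def label_scan(name):
    """Examine only the labelled nodes: O(k)."""
    examined = 0
    hits = []
    for node in players:
        examined += 1
        if node["name"] == name:
            hits.append(node["id"])
    return hits, examined


# Index: (name, id) pairs kept sorted so lookups can use binary search.
index = sorted((n["name"], n["id"]) for n in players)
index_keys = [key for key, _ in index]


def index_lookup(name):
    """Binary search over the sorted keys: O(log k)."""
    pos = bisect_left(index_keys, name)
    hits = []
    while pos < len(index) and index[pos][0] == name:
        hits.append(index[pos][1])
        pos += 1
    return hits


print(global_scan("Wayne Rooney"))   # ([450], 30000)
print(label_scan("Wayne Rooney"))    # ([450], 900)
print(index_lookup("Wayne Rooney"))  # [450]
```

With these sizes the global scan examines 30,000 nodes and the label scan 900, mirroring the O(n) and O(k) costs on the slides.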
Optimization Patterns
• Avoid cartesian products
• Avoid patterns in the WHERE clause
• Start MATCH patterns at the lowest
cardinality and expand outward
• Separate MATCH patterns with minimal
expansion at each stage
Introducing the movie data set
Anti-pattern: Cartesian Products
MATCH (m:Movie), (p:Person)
Subtle Cartesian Products
MATCH (p:Person)-[:KNOWS]->(c)
WHERE p.name="Tom Hanks"
WITH c
MATCH (k:Keyword)
RETURN c, k
Counting Cartesian Products
MATCH (pl:Player),(t:Team),(g:Game)
RETURN COUNT(DISTINCT pl),
COUNT(DISTINCT t),
COUNT(DISTINCT g)
80000 ms w/ ~900 players, ~40 teams, ~1200 games
Better Counting
MATCH (pl:Player)
WITH COUNT(pl) as players
MATCH (t:Team)
WITH COUNT(t) as teams, players
MATCH (g:Game)
RETURN COUNT(g) as games, teams, players
8 ms w/ ~900 players, ~40 teams, ~1200 games
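The gap between the two counting queries follows from simple arithmetic on the row counts. The Python sketch below uses the approximate cardinalities quoted on the slides; the variable names are illustrative, and the "staged" figure assumes each MATCH ... WITH COUNT(...) stage visits its nodes once and passes a single row on.

```python
from itertools import product

# Approximate cardinalities quoted on the slides.
players, teams, games = 900, 40, 1200

# MATCH (pl:Player),(t:Team),(g:Game) builds every combination of the three.
cartesian_rows = players * teams * games

# Three separate MATCH ... WITH COUNT(...) stages visit each node once.
staged_rows = players + teams + games

print(cartesian_rows)  # 43200000
print(staged_rows)     # 2140

# The same effect on a tiny example that can be enumerated directly.
small = list(product(range(3), range(2), range(4)))
print(len(small))      # 24
```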
Directions on patterns
MATCH (p:Person)-[:ACTED_IN]-(m)
WHERE p.name = "Tom Hanks"
RETURN m
Parameterize your queries
MATCH (p:Person)-[:ACTED_IN]-(m)
WHERE p.name = {name}
RETURN m
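One reason commonly given for parameters is that the query text stays identical between calls, so a cached execution plan can be reused. The Python sketch below models that idea with a dictionary keyed by query text. It is an assumption-laden toy, not Neo4j's planner: the run function, the cache, and the actor names other than "Tom Hanks" are made up for illustration.

```python
# Toy plan cache keyed by query text; filling it stands in for parsing/planning.
plan_cache = {}


def run(query, params=None):
    """Return the cached plan for this query text, compiling on a miss."""
    if query not in plan_cache:
        plan_cache[query] = f"plan #{len(plan_cache) + 1}"
    return plan_cache[query], params or {}


names = ["Tom Hanks", "Meg Ryan", "Keanu Reeves"]  # illustrative values

# Literal values: every query string differs, so each one is compiled.
for name in names:
    run(f'MATCH (p:Person)-[:ACTED_IN]-(m) WHERE p.name = "{name}" RETURN m')
literal_plans = len(plan_cache)

# Parameter placeholder: one query string, compiled once and reused.
for name in names:
    run("MATCH (p:Person)-[:ACTED_IN]-(m) WHERE p.name = {name} RETURN m",
        {"name": name})
parameterized_plans = len(plan_cache) - literal_plans

print(literal_plans, parameterized_plans)  # 3 1
```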
Fast predicates first
Bad:
MATCH (t:Team)-[:played_in]->(g)
WHERE NOT (t)-[:home_team]->(g)
AND g.away_goals > g.home_goals
RETURN t, COUNT(g)
Better:
MATCH (t:Team)-[:played_in]->(g)
WHERE g.away_goals > g.home_goals
AND NOT (t)-[:home_team]->(g)
RETURN t, COUNT(g)
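Why the order matters can be sketched in Python, assuming predicates are evaluated left to right and the second is skipped once the first fails, as the slide implies. The game records and function names below are hypothetical; not_home_team stands in for the pattern predicate that needs extra graph lookups, and the counter records how often it runs.

```python
# Hypothetical games; "home" plays the role of the pattern check
# NOT (t)-[:home_team]->(g), which needs extra graph lookups.
games = [
    {"home_goals": 2, "away_goals": 3, "home": False},
    {"home_goals": 1, "away_goals": 0, "home": False},
    {"home_goals": 0, "away_goals": 0, "home": True},
    {"home_goals": 4, "away_goals": 1, "home": False},
]

calls = {"expensive": 0}


def not_home_team(game):
    """Stand-in for the costly pattern predicate; counts its invocations."""
    calls["expensive"] += 1
    return not game["home"]


def away_win(game):
    """Cheap property comparison."""
    return game["away_goals"] > game["home_goals"]


# Expensive predicate first: it runs for every row.
calls["expensive"] = 0
slow = [g for g in games if not_home_team(g) and away_win(g)]
calls_expensive_first = calls["expensive"]

# Cheap predicate first: `and` short-circuits, so the expensive one
# only runs for rows that already passed the cheap test.
calls["expensive"] = 0
fast = [g for g in games if away_win(g) and not_home_team(g)]
calls_cheap_first = calls["expensive"]

print(calls_expensive_first, calls_cheap_first)  # 4 1
```

Both orderings return the same rows; only the number of expensive checks changes.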
Patterns in WHERE clauses
• Keep them in the MATCH
• The only pattern that needs to be in a
WHERE clause is a NOT
MERGE and CONSTRAINTs
• MERGE is MATCH or CREATE
• MERGE can take advantage of unique
constraints and indexes
MERGE (without index)
MERGE (g:Game
{date:1290257100,
time: 1245,
home_goals: 2,
away_goals: 3,
match_id: 292846,
attendance: 60102})
RETURN g
188 ms w/ ~400 games
Adding an index
CREATE INDEX ON :Game(match_id)
MERGE (with index)
MERGE (g:Game
{date:1290257100,
time: 1245,
home_goals: 2,
away_goals: 3,
match_id: 292846,
attendance: 60102})
RETURN g
6 ms w/ ~400 games
Alternative MERGE approach
MERGE (g:Game { match_id: 292846 })
ON CREATE SET g.date = 1290257100,
g.time = 1245,
g.home_goals = 2,
g.away_goals = 3,
g.attendance = 60102
RETURN g
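The "MATCH or CREATE" behaviour of a MERGE on a single key can be sketched as a get-or-create in Python. This is a toy model under stated assumptions: the list, the dictionary standing in for an index on match_id, and the merge_game function are hypothetical, and the property values are the ones from the slide.

```python
# Toy store of :Game nodes plus a dictionary standing in for an index on match_id.
games = []
match_id_index = {}


def merge_game(match_id, on_create):
    """Return (node, created): reuse the node with this match_id or create it."""
    node = match_id_index.get(match_id)
    if node is not None:
        # Matched an existing node: the ON CREATE properties are not applied.
        return node, False
    node = {"match_id": match_id, **on_create}
    games.append(node)
    match_id_index[match_id] = node
    return node, True


props = {"date": 1290257100, "time": 1245, "home_goals": 2,
         "away_goals": 3, "attendance": 60102}

first, created_first = merge_game(292846, props)
second, created_second = merge_game(292846, {"attendance": 0})

print(created_first, created_second)  # True False
print(second["attendance"])           # 60102
```

Matching on the key alone means the lookup only has to compare one indexed property, and the remaining properties are written once, when the node is created.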
Profiling queries
• Use the PROFILE keyword in front of the
query
o from webadmin or shell - won't work in browser
• Look for db_hits and rows
• Ignore everything else (for now!)
Reviewing the football dataset
Football Optimization
MATCH (game)<-[:contains_match]-(season:Season),
(team)<-[:away_team]-(game),
(stats)-[:in]->(game),
(team)<-[:for]-(stats)<-[:played]-(player)
WHERE season.name = "2012-2013"
RETURN player.name,
COLLECT(DISTINCT team.name),
SUM(stats.goals) as goals
ORDER BY goals DESC
LIMIT 10
3137 ms w/ ~900 players, ~20 teams, ~400 games
Football Optimization
==> ColumnFilter(symKeys=["player.name", " INTERNAL_AGGREGATEe91b055b-a943-4ddd-9fe8-e746407c504a", " INTERNAL_AGGREGATE240cfcd2-24d9-48a2-8ca9-fb0286f3d323"], returnItemNames=["player.name", "COLLECT(DISTINCT team.name)", "goals"], _rows=10, _db_hits=0)
==> Top(orderBy=["SortItem(Cached( INTERNAL_AGGREGATE240cfcd2-24d9-48a2-8ca9-fb0286f3d323 of type Number),false)"], limit="Literal(10)", _rows=10, _db_hits=0)
==> EagerAggregation(keys=["Cached(player.name of type Any)"], aggregates=["( INTERNAL_AGGREGATEe91b055b-a943-4ddd-9fe8-e746407c504a,Distinct(Collect(Property(team,name(0))),Property(team,name(0))))", "( INTERNAL_AGGREGATE240cfcd2-24d9-48a2-8ca9-fb0286f3d323,Sum(Property(stats,goals(13))))"], _rows=503, _db_hits=10899)
==> Extract(symKeys=["stats", " UNNAMED12", " UNNAMED108", "season", " UNNAMED55", "player", "team", " UNNAMED124", " UNNAMED85", "game"], exprKeys=["player.name"], _rows=5192, _db_hits=5192)
==> PatternMatch(g="(player)-[' UNNAMED124']-(stats)", _rows=5192, _db_hits=0)
==> Filter(pred="Property(season,name(0)) == Literal(2012-2013)", _rows=5192, _db_hits=15542)
==> TraversalMatcher(trail="(season)-[ UNNAMED12:contains_match WHERE true AND true]->(game)<-[ UNNAMED85:in WHERE true AND true]-(stats)-[ UNNAMED108:for WHERE true AND true]->(team)<-[ UNNAMED55:away_team WHERE true AND true]-(game)", _rows=15542, _db_hits=1620462)
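Reading the db_hits and rows off a plan like this by eye gets tedious, so a small script can help. The Python sketch below is only an illustration: the regular expression and variable names are assumptions, and the operator arguments are elided with "..." while the operator names, _rows and _db_hits values are copied from the profile above.

```python
import re

# Operator name, _rows and _db_hits copied from the profile above;
# the argument lists are elided with "..." for brevity.
profile = """\
==> ColumnFilter(..., _rows=10, _db_hits=0)
==> Top(..., _rows=10, _db_hits=0)
==> EagerAggregation(..., _rows=503, _db_hits=10899)
==> Extract(..., _rows=5192, _db_hits=5192)
==> PatternMatch(..., _rows=5192, _db_hits=0)
==> Filter(..., _rows=5192, _db_hits=15542)
==> TraversalMatcher(..., _rows=15542, _db_hits=1620462)
"""

# One match per line: operator name, row count, db hit count.
pattern = re.compile(r"==> (\w+)\(.*_rows=(\d+), _db_hits=(\d+)\)")

operators = [
    (name, int(rows), int(hits))
    for name, rows, hits in pattern.findall(profile)
]

total_hits = sum(hits for _, _, hits in operators)
worst = max(operators, key=lambda op: op[2])

print(total_hits)  # 1652095
print(worst)       # ('TraversalMatcher', 15542, 1620462)
```

Almost all of the db hits come from the TraversalMatcher at the bottom of the plan, which is the part of the query worth restructuring first.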
Break out the match statements
MATCH (game)<-[:contains_match]-(season:Season)
MATCH (team)<-[:away_team]-(game)
MATCH (stats)-[:in]->(game)
MATCH (team)<-[:for]-(stats)<-[:played]-(player)
WHERE season.name = "2012-2013"
RETURN player.name,
COLLECT(DISTINCT team.name),
SUM(stats.goals) as goals
ORDER BY goals DESC
LIMIT 10
200 ms w/ ~900 players, ~20 teams, ~400 games
Start small
• Smallest cardinality label first
• Smallest intermediate result set first
Exploring cardinalities
MATCH (game)<-[:contains_match]-(season:Season)
RETURN COUNT(DISTINCT game), COUNT(DISTINCT season)
1140 games, 3 seasons
MATCH (team)<-[:away_team]-(game:Game)
RETURN COUNT(DISTINCT team), COUNT(DISTINCT game)
25 teams, 1140 games
Exploring cardinalities
MATCH (stats)-[:in]->(game:Game)
RETURN COUNT(DISTINCT stats), COUNT(DISTINCT game)
31117 stats, 1140 games
MATCH (stats)<-[:played]-(player:Player)
RETURN COUNT(DISTINCT stats), COUNT(DISTINCT player)
31117 stats, 880 players
Look for teams first
MATCH (team)<-[:away_team]-(game:Game)
MATCH (game)<-[:contains_match]-(season)
WHERE season.name = "2012-2013"
MATCH (stats)-[:in]->(game)
MATCH (team)<-[:for]-(stats)<-[:played]-(player)
RETURN player.name,
COLLECT(DISTINCT team.name),
SUM(stats.goals) as goals
ORDER BY goals DESC
LIMIT 10
162 ms w/ ~900 players, ~20 teams, ~400 games
Look for teams first
==> ColumnFilter(symKeys=["player.name", " INTERNAL_AGGREGATEbb08f36b-a70d-46b3-9297-b0c7ec85c969", " INTERNAL_AGGREGATE199af213-e3bd-400f-aba9-8ca2a9e153c5"], returnItemNames=["player.name", "COLLECT(DISTINCT team.name)", "goals"], _rows=10, _db_hits=0)
==> Top(orderBy=["SortItem(Cached( INTERNAL_AGGREGATE199af213-e3bd-400f-aba9-8ca2a9e153c5 of type Number),false)"], limit="Literal(10)", _rows=10, _db_hits=0)
==> EagerAggregation(keys=["Cached(player.name of type Any)"], aggregates=["( INTERNAL_AGGREGATEbb08f36b-a70d-46b3-9297-b0c7ec85c969,Distinct(Collect(Property(team,name(0))),Property(team,name(0))))", "( INTERNAL_AGGREGATE199af213-e3bd-400f-aba9-8ca2a9e153c5,Sum(Property(stats,goals(13))))"], _rows=503, _db_hits=10899)
==> Extract(symKeys=["stats", " UNNAMED12", " UNNAMED168", "season", " UNNAMED125", "player", "team", " UNNAMED152", " UNNAMED51", "game"], exprKeys=["player.name"], _rows=5192, _db_hits=5192)
==> PatternMatch(g="(stats)-[' UNNAMED152']-(team),(player)-[' UNNAMED168']-(stats)", _rows=5192, _db_hits=0)
==> PatternMatch(g="(stats)-[' UNNAMED125']-(game)", _rows=10394, _db_hits=0)
==> Filter(pred="Property(season,name(0)) == Literal(2012-2013)", _rows=380, _db_hits=380)
==> PatternMatch(g="(season)-[' UNNAMED51']-(game)", _rows=380, _db_hits=1140)
==> TraversalMatcher(trail="(game)-[ UNNAMED12:away_team WHERE true AND true]->(team)", _rows=1140, _db_hits=1140)
Filter games a bit earlier
MATCH (game)<-[:contains_match]-(season:Season)
WHERE season.name = "2012-2013"
MATCH (team)<-[:away_team]-(game)
MATCH (stats)-[:in]->(game)
MATCH (team)<-[:for]-(stats)<-[:played]-(player)
RETURN player.name,
COLLECT(DISTINCT team.name),
SUM(stats.goals) as goals
ORDER BY goals DESC
LIMIT 10
148 ms w/ ~900 players, ~20 teams, ~400 games
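The benefit of filtering on the season before expanding further can be estimated with back-of-the-envelope arithmetic. The Python sketch below uses the 3 seasons and 1140 games counted earlier (380 games per season); stats_per_game is an assumed round number chosen for illustration, not a measured value.

```python
# Figures from the dataset: 3 seasons, 1140 games in total.
seasons = 3
games_per_season = 380

# Assumed round number of stats nodes per game, for illustration only.
stats_per_game = 27

# Filter late: expand every game to its stats, then keep one season.
late_rows = seasons * games_per_season * stats_per_game

# Filter early: keep one season first, then expand only its games.
early_rows = 1 * games_per_season * stats_per_game

print(late_rows, early_rows)  # 30780 10260
```

Under these assumptions the early filter carries a third of the rows into the later MATCH clauses.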
Filter out stats with no goals
MATCH (game)<-[:contains_match]-(season:Season)
WHERE season.name = "2012-2013"
MATCH (team)<-[:away_team]-(game)
MATCH (stats)-[:in]->(game)
WHERE stats.goals > 0
MATCH (team)<-[:for]-(stats)<-[:played]-(player)
RETURN player.name,
COLLECT(DISTINCT team.name),
SUM(stats.goals) as goals
ORDER BY goals DESC
LIMIT 10
59 ms w/ ~900 players, ~20 teams, ~400 games
Movie query optimization
MATCH (movie:Movie {title: {title} })
MATCH (genre)<-[:HAS_GENRE]-(movie)
MATCH (director)-[:DIRECTED]->(movie)
MATCH (actor)-[:ACTED_IN]->(movie)
MATCH (writer)-[:WRITER_OF]->(movie)
MATCH (actor)-[:ACTED_IN]->(actormovies)
MATCH (movie)-[:HAS_KEYWORD]->(keyword)<-[:HAS_KEYWORD]-(movies:Movie)
WITH DISTINCT movies as related, count(DISTINCT keyword) as weight,
count(DISTINCT actormovies) as actormoviesweight, movie, collect(DISTINCT genre.name) as
genres, collect(DISTINCT director.name) as directors, actor, collect(DISTINCT writer.name)
as writers
ORDER BY weight DESC, actormoviesweight DESC
WITH collect(DISTINCT {name: actor.name, weight: actormoviesweight}) as actors, movie,
collect(DISTINCT {related: {title: related.title}, weight: weight}) as related, genres,
directors, writers
MATCH (movie)-[:HAS_KEYWORD]->(keyword:Keyword)<-[:HAS_KEYWORD]-(movies)
WITH keyword.name as keyword, count(movies) as keyword_weight, movie, related, actors,
genres, directors, writers
ORDER BY keyword_weight
RETURN collect(DISTINCT keyword), movie, actors, related, genres, directors, writers
Movie query optimization
MATCH (movie:Movie {title: 'The Matrix' })
MATCH (genre)<-[:HAS_GENRE]-(movie)
MATCH (director)-[:DIRECTED]->(movie)
MATCH (actor)-[:ACTED_IN]->(movie)
MATCH (writer)-[:WRITER_OF]->(movie)
MATCH (actor)-[:ACTED_IN]->(actormovies)
MATCH (movie)-[:HAS_KEYWORD]->(keyword)<-[:HAS_KEYWORD]-(movies:Movie)
WITH DISTINCT movies as related, count(DISTINCT keyword) as weight,
count(DISTINCT actormovies) as actormoviesweight, movie, collect(DISTINCT genre.name) as
genres, collect(DISTINCT director.name) as directors, actor, collect(DISTINCT writer.name)
as writers
ORDER BY weight DESC, actormoviesweight DESC
WITH collect(DISTINCT {name: actor.name, weight: actormoviesweight}) as actors, movie,
collect(DISTINCT {related: {title: related.title}, weight: weight}) as related, genres,
directors, writers
MATCH (movie)-[:HAS_KEYWORD]->(keyword:Keyword)<-[:HAS_KEYWORD]-(movies)
WITH keyword.name as keyword, count(movies) as keyword_weight, movie, related, actors,
genres, directors, writers
ORDER BY keyword_weight
RETURN collect(DISTINCT keyword), movie, actors, related, genres, directors, writers
Movie query optimization
MATCH (movie:Movie {title: 'The Matrix' })<-[:ACTED_IN]-(actor)
WITH movie, actor, length((actor)-[:ACTED_IN]->()) as actormoviesweight // 1 row per actor
ORDER BY actormoviesweight DESC
WITH movie, collect({name: actor.name, weight: actormoviesweight}) as actors // 1 row
MATCH (movie)-[:HAS_GENRE]->(genre)
WITH movie, actors, collect(genre) as genres // 1 row
MATCH (director)-[:DIRECTED]->(movie)
WITH movie, actors, genres, collect(director.name) as directors // 1 row
MATCH (writer)-[:WRITER_OF]->(movie)
WITH movie, actors, genres, directors, collect(writer.name) as writers // 1 row
MATCH (movie)-[:HAS_KEYWORD]->(keyword)<-[:HAS_KEYWORD]-(movies:Movie)
WITH DISTINCT movies as related, count(DISTINCT keyword.name) as keywords, movie, genres,
directors, actors, writers // 1 row per related movie
ORDER BY keywords DESC
WITH collect(DISTINCT { related: { title: related.title }, weight: keywords }) as related,
movie, actors, genres, directors, writers // 1 row
MATCH (movie)-[:HAS_KEYWORD]->(keyword)
RETURN collect(keyword.name) as keywords, related, movie, actors, genres, directors, writers
10x faster
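The "1 row" comments in the rewritten query point at the underlying idea: chained MATCH clauses multiply rows, while collecting after each step keeps the row count flat. The Python sketch below illustrates this with made-up neighbour counts for a single movie; the lists and names are hypothetical and do not come from the movie dataset.

```python
from itertools import product

# Hypothetical neighbours of one movie; the counts are made up for illustration.
genres = ["g1", "g2", "g3"]
directors = ["d1", "d2"]
actors = [f"a{i}" for i in range(10)]
writers = ["w1", "w2"]

# Chained MATCH clauses without aggregation: every combination becomes a row.
chained_rows = list(product(genres, directors, actors, writers))

# MATCH ... WITH collect(...) after each step: one row that carries lists.
rows_after_collect = [
    {
        "genres": genres,
        "directors": directors,
        "actors": actors,
        "writers": writers,
    }
]

print(len(chained_rows))        # 120
print(len(rows_after_collect))  # 1
```

Every later MATCH in the chained form has to run once per intermediate row, which is where the extra work comes from.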
Design for Queryability
Model
Design for Queryability
Query
Model
Making the implicit explicit
• When you have implicit relationships in the
graph you can sometimes get better query
performance by modeling the relationship
explicitly
Refactor property to node
Bad:
MATCH (g:Game)
WHERE
g.date > 1343779200
AND g.date < 1369094400
RETURN g
Good:
MATCH (s:Season)-[:contains]->(g)
WHERE s.name = "2012-2013"
RETURN g
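A minimal sketch of how the property-to-node refactoring above might be carried out, assuming games carry the epoch-second `date` property used in the "Bad" query and that the season boundaries are the two timestamps shown there. The `:Season` label, `name` property and `contains` relationship come from the "Good" query; running this as a one-off migration, and indexing `:Season(name)`, are assumptions rather than steps the slides describe.

```cypher
// Schema change: run on its own, since Neo4j does not allow schema and data
// updates in the same transaction. Syntax follows the 2.0-era form used
// earlier in this deck.
CREATE INDEX ON :Season(name);

// Data change (assumed one-off migration): create the season node once,
// then link every game whose date falls inside the season's range.
MERGE (s:Season {name: "2012-2013"})
WITH s
MATCH (g:Game)
WHERE g.date > 1343779200
  AND g.date < 1369094400
MERGE (s)-[:contains]->(g);
```

After this runs, the "Good" query can start from a single indexed :Season node and expand outward, rather than filtering every :Game by date. Whether that is actually faster on a given dataset is worth confirming by profiling both versions.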
Conclusion
• Avoid the global scan
• Add indexes / unique constraints
• Split up MATCH statements
• Measure, measure, measure, tweak, repeat
• Soon Cypher will do a lot of this for you!
Bonus tip
• Use transactions / the transactional Cypher
endpoint
Q & A
• If you have questions, send them in

 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 

Dernier (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 

Optimizing Cypher Queries in Neo4j

  • 1. Optimizing Cypher Queries in Neo4j Wes Freeman (@wefreema) Mark Needham (@markhneedham)
  • 2. Today's schedule • Brief overview of cypher syntax • Graph global vs Graph local queries • Labels and indexes • Optimization patterns • Profiling cypher queries • Applying optimization patterns
  • 3. Cypher Syntax • Statement parts o Optional: Querying part (MATCH|WHERE) o Optional: Updating part (CREATE|MERGE) o Optional: Returning part (WITH|RETURN) • Parts can be chained together
  • 4. Cypher Syntax - Refresher MATCH (n:Label)-[r:LINKED]->(m) WHERE n.prop = "..." RETURN n, r, m
  • 5. Starting points • Graph scan (global; potentially slow) • Label scan (usually reserved for aggregation queries; not ideal) • Label property index lookup (local; good!)
  • 7. The 1.9 global scan O(n) n = # of nodes START pl = node(*) MATCH (pl)-[:played]->(stats) WHERE pl.name = "Wayne Rooney" RETURN stats 150ms w/ 30k nodes, 120k rels
  • 8. The 2.0 global scan MATCH (pl)-[:played]->(stats) WHERE pl.name = "Wayne Rooney" RETURN stats 130ms w/ 30k nodes, 120k rels O(n) n = # of nodes
  • 9. Why is it a global scan? • Cypher is a pattern matching language • It doesn't discriminate unless you tell it to o It must try to start at all nodes to find this pattern, as specified
  • 10. Introduce a label Label your starting points CREATE (player:Player {name: "Wayne Rooney"} )
  • 11. Label scan O(k) k = # of nodes with that label MATCH (pl:Player)-[:played]->(stats) WHERE pl.name = "Wayne Rooney" RETURN stats 80ms w/ 30k nodes, 120k rels (~900 :Player nodes)
  • 12. Indexes don't come for free CREATE INDEX ON :Player(name) OR CREATE CONSTRAINT ON pl:Player ASSERT pl.name IS UNIQUE
  • 13. Index lookup O(log k) k = # of nodes with that label MATCH (pl:Player)-[:played]->(stats) WHERE pl.name = "Wayne Rooney" RETURN stats 6ms w/ 30k nodes, 120k rels (~900 :Player nodes)
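
The three starting points above (global scan, label scan, index lookup) can be modelled outside Neo4j to see why the work drops from O(n) to O(k) to O(log k). The following is a minimal Python sketch, not Neo4j's actual storage engine: the toy graph, the `by_label` dictionary standing in for the label store, and the sorted list standing in for the index are all assumptions made for illustration.

```python
from bisect import bisect_left

# Hypothetical toy graph: 30 nodes, every third one labelled Player.
nodes = [
    {"label": "Player" if i % 3 == 0 else "Stats", "name": f"node-{i:02d}"}
    for i in range(30)
]
target = "node-27"


def scan(candidates, name):
    """Compare every candidate against the name; return (matches, comparisons)."""
    comparisons = 0
    matches = []
    for node in candidates:
        comparisons += 1
        if node["name"] == name:
            matches.append(node)
    return matches, comparisons


# Global scan: every node in the graph is a candidate, O(n).
global_matches, global_checks = scan(nodes, target)

# Label scan: only nodes carrying the label are candidates, O(k).
by_label = {}
for node in nodes:
    by_label.setdefault(node["label"], []).append(node)
label_matches, label_checks = scan(by_label["Player"], target)

# Index lookup: a sorted list of names searched by bisection, O(log k).
player_names = sorted(node["name"] for node in by_label["Player"])


def index_lookup(sorted_names, name):
    """Binary search in the sorted name list."""
    i = bisect_left(sorted_names, name)
    return i < len(sorted_names) and sorted_names[i] == name


found_by_index = index_lookup(player_names, target)
```

Here the global scan makes 30 comparisons, the label scan makes 10, and the bisection needs at most about log2(10) steps, which mirrors the ordering of the timings quoted on the slides.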
  • 14. Optimization Patterns • Avoid cartesian products • Avoid patterns in the WHERE clause • Start MATCH patterns at the lowest cardinality and expand outward • Separate MATCH patterns with minimal expansion at each stage
  • 16. Anti-pattern: Cartesian Products MATCH (m:Movie), (p:Person)
  • 17. Subtle Cartesian Products MATCH (p:Person)-[:KNOWS]->(c) WHERE p.name="Tom Hanks" WITH c MATCH (k:Keyword) RETURN c, k
  • 18. Counting Cartesian Products MATCH (pl:Player),(t:Team),(g:Game) RETURN COUNT(DISTINCT pl), COUNT(DISTINCT t), COUNT(DISTINCT g) 80000 ms w/ ~900 players, ~40 teams, ~1200 games
  • 19. Better Counting MATCH (pl:Player) WITH COUNT(pl) as players MATCH (t:Team) WITH COUNT(t) as teams, players MATCH (g:Game) RETURN COUNT(g) as games, teams, players 8ms w/ ~900 players, ~40 teams, ~1200 games
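
The gap between the two counting queries follows from how many rows each one materializes. The arithmetic can be checked with a short Python sketch; the helper names `cartesian_rows` and `separate_rows` are hypothetical, and the sizes are the approximate figures quoted on the slides.

```python
from itertools import product


def cartesian_rows(*sizes):
    """Rows produced when disconnected patterns are matched together."""
    rows = 1
    for size in sizes:
        rows *= size
    return rows


def separate_rows(*sizes):
    """Rows visited when each label is counted on its own and reduced to one row."""
    return sum(sizes)


players, teams, games = 900, 40, 1200  # approximate sizes from the slides

combined = cartesian_rows(players, teams, games)
separate = separate_rows(players, teams, games)

# Sanity check of the product rule on a set small enough to enumerate.
small = list(product(range(3), range(2), range(4)))
assert len(small) == cartesian_rows(3, 2, 4)

print(f"cartesian product: {combined:,} rows")
print(f"separate counts:   {separate:,} rows")
```

With these sizes the single MATCH has to produce 43,200,000 rows before the DISTINCT counts can run, while the chained version touches 2,140 nodes and carries a single row between stages.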
  • 20. Directions on patterns MATCH (p:Person)-[:ACTED_IN]-(m) WHERE p.name = "Tom Hanks" RETURN m
  • 21. Parameterize your queries MATCH (p:Person)-[:ACTED_IN]-(m) WHERE p.name = {name} RETURN m
  • 22. Fast predicates first Bad: MATCH (t:Team)-[:played_in]->(g) WHERE NOT (t)-[:home_team]->(g) AND g.away_goals > g.home_goals RETURN t, COUNT(g)
  • 23. Fast predicates first Better: MATCH (t:Team)-[:played_in]->(g) WHERE g.away_goals > g.home_goals AND NOT (t)-[:home_team]->(g) RETURN t, COUNT(g)
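
Why the order of predicates matters can be illustrated with short-circuit evaluation. The Python sketch below assumes predicates are evaluated left to right and that evaluation stops at the first false one, which is what the slides imply for this Cypher version; the game data and the `not_home_team` stand-in for the pattern check are made up for illustration.

```python
# Counter for how often the expensive check (standing in for the pattern
# predicate NOT (t)-[:home_team]->(g)) is evaluated.
calls = {"pattern": 0}

# Hypothetical rows: one per (team, game) pair reached by the MATCH.
scores = [(2, 1), (0, 3), (1, 1), (0, 2), (3, 0), (1, 2)]
games = [
    {"home_goals": h, "away_goals": a, "team_is_home": i % 2 == 0}
    for i, (h, a) in enumerate(scores)
]


def away_win(game):
    """Cheap predicate: a comparison of two properties."""
    return game["away_goals"] > game["home_goals"]


def not_home_team(game):
    """Expensive predicate: stands in for a pattern check against the graph."""
    calls["pattern"] += 1
    return not game["team_is_home"]


def run(cheap_first):
    """Filter the rows and report (rows kept, expensive evaluations)."""
    calls["pattern"] = 0
    if cheap_first:
        kept = [g for g in games if away_win(g) and not_home_team(g)]
    else:
        kept = [g for g in games if not_home_team(g) and away_win(g)]
    return len(kept), calls["pattern"]


slow = run(cheap_first=False)
fast = run(cheap_first=True)
print("expensive first:", slow)
print("cheap first:    ", fast)
```

Both orderings keep the same three rows, but putting the property comparison first halves the number of expensive checks in this toy data set, from six to three.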
  • 24. Patterns in WHERE clauses • Keep them in the MATCH • The only pattern that needs to be in a WHERE clause is a NOT
  • 25. MERGE and CONSTRAINTs • MERGE is MATCH or CREATE • MERGE can take advantage of unique constraints and indexes
  • 26. MERGE (without index) MERGE (g:Game {date:1290257100, time: 1245, home_goals: 2, away_goals: 3, match_id: 292846, attendance: 60102}) RETURN g 188 ms w/ ~400 games
  • 27. Adding an index CREATE INDEX ON :Game(match_id)
  • 28. MERGE (with index) MERGE (g:Game {date:1290257100, time: 1245, home_goals: 2, away_goals: 3, match_id: 292846, attendance: 60102}) RETURN g 6 ms w/ ~400 games
  • 29. Alternative MERGE approach MERGE (g:Game { match_id: 292846 }) ON CREATE SET g.date = 1290257100 SET g.time = 1245 SET g.home_goals = 2 SET g.away_goals = 3 SET g.attendance = 60102 RETURN g
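
The "MERGE is MATCH or CREATE" behaviour, keyed on a single indexed property, resembles a get-or-create on a dictionary. The following Python sketch is an analogy only: `games_by_match_id` is a hypothetical stand-in for the index on :Game(match_id), and `merge_game` is an illustrative helper, not a Neo4j API.

```python
# Hypothetical stand-in for the index on :Game(match_id).
games_by_match_id = {}


def merge_game(match_id, on_create):
    """Return (game, created): look the game up by its key, create it if absent.

    Properties in on_create are applied only when the node is created,
    which is the role ON CREATE SET plays in Cypher.
    """
    created = match_id not in games_by_match_id
    if created:
        games_by_match_id[match_id] = {"match_id": match_id, **on_create}
    return games_by_match_id[match_id], created


first, created_first = merge_game(
    292846,
    {"date": 1290257100, "time": 1245, "home_goals": 2,
     "away_goals": 3, "attendance": 60102},
)

# A second merge on the same key finds the existing game and leaves it alone.
second, created_second = merge_game(292846, {"attendance": 0})
```

Matching on the key alone means the lookup does not have to compare all six properties, and an existing game is returned unchanged rather than duplicated.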
  • 30. Profiling queries • Use the PROFILE keyword in front of the query o from webadmin or shell - won't work in browser • Look for db_hits and rows • Ignore everything else (for now!)
  • 32. Football Optimization MATCH (game)<-[:contains_match]-(season:Season), (team)<-[:away_team]-(game), (stats)-[:in]->(game), (team)<-[:for]-(stats)<-[:played]-(player) WHERE season.name = "2012-2013" RETURN player.name, COLLECT(DISTINCT team.name), SUM(stats.goals) as goals ORDER BY goals DESC LIMIT 10 3137 ms w/ ~900 players, ~20 teams, ~400 games
  • 33. Football Optimization
    ==> ColumnFilter(symKeys=["player.name", " INTERNAL_AGGREGATEe91b055b-a943-4ddd-9fe8-e746407c504a", " INTERNAL_AGGREGATE240cfcd2-24d9-48a2-8ca9-fb0286f3d323"], returnItemNames=["player.name", "COLLECT(DISTINCT team.name)", "goals"], _rows=10, _db_hits=0)
    ==> Top(orderBy=["SortItem(Cached( INTERNAL_AGGREGATE240cfcd2-24d9-48a2-8ca9-fb0286f3d323 of type Number),false)"], limit="Literal(10)", _rows=10, _db_hits=0)
    ==> EagerAggregation(keys=["Cached(player.name of type Any)"], aggregates=["( INTERNAL_AGGREGATEe91b055b-a943-4ddd-9fe8-e746407c504a,Distinct(Collect(Property(team,name(0))),Property(team,name(0))))", "( INTERNAL_AGGREGATE240cfcd2-24d9-48a2-8ca9-fb0286f3d323,Sum(Property(stats,goals(13))))"], _rows=503, _db_hits=10899)
    ==> Extract(symKeys=["stats", " UNNAMED12", " UNNAMED108", "season", " UNNAMED55", "player", "team", " UNNAMED124", " UNNAMED85", "game"], exprKeys=["player.name"], _rows=5192, _db_hits=5192)
    ==> PatternMatch(g="(player)-[' UNNAMED124']-(stats)", _rows=5192, _db_hits=0)
    ==> Filter(pred="Property(season,name(0)) == Literal(2012-2013)", _rows=5192, _db_hits=15542)
    ==> TraversalMatcher(trail="(season)-[ UNNAMED12:contains_match WHERE true AND true]->(game)<-[ UNNAMED85:in WHERE true AND true]-(stats)-[ UNNAMED108:for WHERE true AND true]->(team)<-[ UNNAMED55:away_team WHERE true AND true]-(game)", _rows=15542, _db_hits=1620462)
  • 34. Break out the match statements MATCH (game)<-[:contains_match]-(season:Season) MATCH (team)<-[:away_team]-(game) MATCH (stats)-[:in]->(game) MATCH (team)<-[:for]-(stats)<-[:played]-(player) WHERE season.name = "2012-2013" RETURN player.name, COLLECT(DISTINCT team.name), SUM(stats.goals) as goals ORDER BY goals DESC LIMIT 10 200 ms w/ ~900 players, ~20 teams, ~400 games
  • 35. Start small • Smallest cardinality label first • Smallest intermediate result set first
  • 36. Exploring cardinalities MATCH (game)<-[:contains_match]-(season:Season) RETURN COUNT(DISTINCT game), COUNT(DISTINCT season) 1140 games, 3 seasons MATCH (team)<-[:away_team]-(game:Game) RETURN COUNT(DISTINCT team), COUNT(DISTINCT game) 25 teams, 1140 games
  • 37. Exploring cardinalities MATCH (stats)-[:in]->(game:Game) RETURN COUNT(DISTINCT stats), COUNT(DISTINCT game) 31117 stats, 1140 games MATCH (stats)<-[:played]-(player:Player) RETURN COUNT(DISTINCT stats), COUNT(DISTINCT player) 31117 stats, 880 players
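
The counts gathered above suggest where a query should start. A small Python sketch can rank the candidate starting points by cardinality; the dictionary keys are illustrative names (the stats nodes carry no label in the slides' queries), and the numbers are the ones reported by the exploration queries.

```python
# Distinct node counts reported by the exploration queries.
cardinalities = {
    "Season": 3,
    "Team": 25,
    "Player": 880,
    "Game": 1140,
    "Stats": 31117,  # illustrative key: the stats nodes are unlabelled in the queries
}

# "Start small": order candidate starting points from fewest nodes to most.
start_order = sorted(cardinalities, key=cardinalities.get)
print(start_order)

# Filtering on one season first should cut the games to roughly a third,
# assuming games are spread evenly across the three seasons.
games_after_season_filter = cardinalities["Game"] // cardinalities["Season"]
print(games_after_season_filter)
```

Season is the smallest set by a wide margin, and the estimate of 380 games after the season filter agrees with the _rows=380 reported by the Filter step in the profile output for the football query.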
  • 38. Look for teams first MATCH (team)<-[:away_team]-(game:Game) MATCH (game)<-[:contains_match]-(season) WHERE season.name = "2012-2013" MATCH (stats)-[:in]->(game) MATCH (team)<-[:for]-(stats)<-[:played]-(player) RETURN player.name, COLLECT(DISTINCT team.name), SUM(stats.goals) as goals ORDER BY goals DESC LIMIT 10 162 ms w/ ~900 players, ~20 teams, ~400 games
  • 39. Look for teams first
    ==> ColumnFilter(symKeys=["player.name", " INTERNAL_AGGREGATEbb08f36b-a70d-46b3-9297-b0c7ec85c969", " INTERNAL_AGGREGATE199af213-e3bd-400f-aba9-8ca2a9e153c5"], returnItemNames=["player.name", "COLLECT(DISTINCT team.name)", "goals"], _rows=10, _db_hits=0)
    ==> Top(orderBy=["SortItem(Cached( INTERNAL_AGGREGATE199af213-e3bd-400f-aba9-8ca2a9e153c5 of type Number),false)"], limit="Literal(10)", _rows=10, _db_hits=0)
    ==> EagerAggregation(keys=["Cached(player.name of type Any)"], aggregates=["( INTERNAL_AGGREGATEbb08f36b-a70d-46b3-9297-b0c7ec85c969,Distinct(Collect(Property(team,name(0))),Property(team,name(0))))", "( INTERNAL_AGGREGATE199af213-e3bd-400f-aba9-8ca2a9e153c5,Sum(Property(stats,goals(13))))"], _rows=503, _db_hits=10899)
    ==> Extract(symKeys=["stats", " UNNAMED12", " UNNAMED168", "season", " UNNAMED125", "player", "team", " UNNAMED152", " UNNAMED51", "game"], exprKeys=["player.name"], _rows=5192, _db_hits=5192)
    ==> PatternMatch(g="(stats)-[' UNNAMED152']-(team),(player)-[' UNNAMED168']-(stats)", _rows=5192, _db_hits=0)
    ==> PatternMatch(g="(stats)-[' UNNAMED125']-(game)", _rows=10394, _db_hits=0)
    ==> Filter(pred="Property(season,name(0)) == Literal(2012-2013)", _rows=380, _db_hits=380)
    ==> PatternMatch(g="(season)-[' UNNAMED51']-(game)", _rows=380, _db_hits=1140)
    ==> TraversalMatcher(trail="(game)-[ UNNAMED12:away_team WHERE true AND true]->(team)", _rows=1140, _db_hits=1140)
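
Since the advice is to look at db_hits and rows, the two profiles in this deck can be compared by totalling their _db_hits. The Python sketch below does this with a regular expression; the plan strings are abbreviated from the slide 33 and slide 39 output (operator names, _rows and _db_hits only), and `total_db_hits` is a hypothetical helper, not part of any Neo4j tooling.

```python
import re

# Operator lines abbreviated from the slide 33 profile (single MATCH).
PLAN_SINGLE_MATCH = """
ColumnFilter(_rows=10, _db_hits=0)
Top(_rows=10, _db_hits=0)
EagerAggregation(_rows=503, _db_hits=10899)
Extract(_rows=5192, _db_hits=5192)
PatternMatch(_rows=5192, _db_hits=0)
Filter(_rows=5192, _db_hits=15542)
TraversalMatcher(_rows=15542, _db_hits=1620462)
"""

# Operator lines abbreviated from the slide 39 profile (teams first).
PLAN_TEAMS_FIRST = """
ColumnFilter(_rows=10, _db_hits=0)
Top(_rows=10, _db_hits=0)
EagerAggregation(_rows=503, _db_hits=10899)
Extract(_rows=5192, _db_hits=5192)
PatternMatch(_rows=5192, _db_hits=0)
PatternMatch(_rows=10394, _db_hits=0)
Filter(_rows=380, _db_hits=380)
PatternMatch(_rows=380, _db_hits=1140)
TraversalMatcher(_rows=1140, _db_hits=1140)
"""


def total_db_hits(plan_text):
    """Sum every _db_hits=N value found in the profile text."""
    return sum(int(n) for n in re.findall(r"_db_hits=(\d+)", plan_text))


before = total_db_hits(PLAN_SINGLE_MATCH)
after = total_db_hits(PLAN_TEAMS_FIRST)
print(f"single MATCH: {before:,} db hits")
print(f"teams first:  {after:,} db hits")
```

The totals come to 1,652,095 and 18,751 db hits respectively. Almost all of the first figure comes from the TraversalMatcher step, which is the part of the plan the rewritten MATCH clauses shrink.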
  • 40. Filter games a bit earlier MATCH (game)<-[:contains_match]-(season:Season) WHERE season.name = "2012-2013" MATCH (team)<-[:away_team]-(game) MATCH (stats)-[:in]->(game) MATCH (team)<-[:for]-(stats)<-[:played]-(player) RETURN player.name, COLLECT(DISTINCT team.name), SUM(stats.goals) as goals ORDER BY goals DESC LIMIT 10 148 ms w/ ~900 players, ~20 teams, ~400 games
  • 41. Filter out stats with no goals MATCH (game)<-[:contains_match]-(season:Season) WHERE season.name = "2012-2013" MATCH (team)<-[:away_team]-(game) MATCH (stats)-[:in]->(game) WHERE stats.goals > 0 MATCH (team)<-[:for]-(stats)<-[:played]-(player) RETURN player.name, COLLECT(DISTINCT team.name), SUM(stats.goals) as goals ORDER BY goals DESC LIMIT 10 59 ms w/ ~900 players, ~20 teams, ~400 games
  • 42. Movie query optimization MATCH (movie:Movie {title: {title} }) MATCH (genre)<-[:HAS_GENRE]-(movie) MATCH (director)-[:DIRECTED]->(movie) MATCH (actor)-[:ACTED_IN]->(movie) MATCH (writer)-[:WRITER_OF]->(movie) MATCH (actor)-[:ACTED_IN]->(actormovies) MATCH (movie)-[:HAS_KEYWORD]->(keyword)<-[:HAS_KEYWORD]-(movies:Movie) WITH DISTINCT movies as related, count(DISTINCT keyword) as weight, count(DISTINCT actormovies) as actormoviesweight, movie, collect(DISTINCT genre.name) as genres, collect(DISTINCT director.name) as directors, actor, collect(DISTINCT writer.name) as writers ORDER BY weight DESC, actormoviesweight DESC WITH collect(DISTINCT {name: actor.name, weight: actormoviesweight}) as actors, movie, collect(DISTINCT {related: {title: related.title}, weight: weight}) as related, genres, directors, writers MATCH (movie)-[:HAS_KEYWORD]->(keyword:Keyword)<-[:HAS_KEYWORD]-(movies) WITH keyword.name as keyword, count(movies) as keyword_weight, movie, related, actors, genres, directors, writers ORDER BY keyword_weight RETURN collect(DISTINCT keyword), movie, actors, related, genres, directors, writers
  • 43. Movie query optimization MATCH (movie:Movie {title: 'The Matrix' }) MATCH (genre)<-[:HAS_GENRE]-(movie) MATCH (director)-[:DIRECTED]->(movie) MATCH (actor)-[:ACTED_IN]->(movie) MATCH (writer)-[:WRITER_OF]->(movie) MATCH (actor)-[:ACTED_IN]->(actormovies) MATCH (movie)-[:HAS_KEYWORD]->(keyword)<-[:HAS_KEYWORD]-(movies:Movie) WITH DISTINCT movies as related, count(DISTINCT keyword) as weight, count(DISTINCT actormovies) as actormoviesweight, movie, collect(DISTINCT genre.name) as genres, collect(DISTINCT director.name) as directors, actor, collect(DISTINCT writer.name) as writers ORDER BY weight DESC, actormoviesweight DESC WITH collect(DISTINCT {name: actor.name, weight: actormoviesweight}) as actors, movie, collect(DISTINCT {related: {title: related.title}, weight: weight}) as related, genres, directors, writers MATCH (movie)-[:HAS_KEYWORD]->(keyword:Keyword)<-[:HAS_KEYWORD]-(movies) WITH keyword.name as keyword, count(movies) as keyword_weight, movie, related, actors, genres, directors, writers ORDER BY keyword_weight RETURN collect(DISTINCT keyword), movie, actors, related, genres, directors, writers
  • 54. Movie query optimization
    MATCH (movie:Movie {title: 'The Matrix' })<-[:ACTED_IN]-(actor)
    WITH movie, actor, length((actor)-[:ACTED_IN]->()) as actormoviesweight // 1 row per actor
    ORDER BY actormoviesweight DESC
    WITH movie, collect({name: actor.name, weight: actormoviesweight}) as actors // 1 row
    MATCH (movie)-[:HAS_GENRE]->(genre)
    WITH movie, actors, collect(genre) as genres // 1 row
    MATCH (director)-[:DIRECTED]->(movie)
    WITH movie, actors, genres, collect(director.name) as directors // 1 row
    MATCH (writer)-[:WRITER_OF]->(movie)
    WITH movie, actors, genres, directors, collect(writer.name) as writers // 1 row
    MATCH (movie)-[:HAS_KEYWORD]->(keyword)<-[:HAS_KEYWORD]-(movies:Movie)
    WITH DISTINCT movies as related, count(DISTINCT keyword.name) as keywords,
         movie, genres, directors, actors, writers // 1 row per related movie
    ORDER BY keywords DESC
    WITH collect(DISTINCT { related: { title: related.title }, weight: keywords }) as related,
         movie, actors, genres, directors, writers // 1 row
    MATCH (movie)-[:HAS_KEYWORD]->(keyword)
    RETURN collect(keyword.name) as keywords, related, movie, actors, genres, directors, writers // 1 row
    10x faster
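The speed-up in the rewritten query comes from aggregating after each MATCH, so that only one row is carried into the next expansion. A reduced sketch of that pattern, using only the genre and director parts of the query (the trimmed RETURN and the `title` alias are illustrative assumptions, not a slide from the deck):

```cypher
// Expand one relationship type, then collapse back to a single row per movie
MATCH (movie:Movie {title: 'The Matrix'})-[:HAS_GENRE]->(genre)
WITH movie, collect(genre.name) AS genres        // 1 row
// The next expansion starts from 1 row instead of 1 row per genre
MATCH (director)-[:DIRECTED]->(movie)
RETURN movie.title AS title, genres,
       collect(director.name) AS directors       // 1 row
```

Without the intermediate WITH, the two MATCH clauses would yield one row per (genre, director) combination, and every further MATCH would multiply that count again. That multiplication is why the unoptimized version needs DISTINCT inside each collect.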
  • 65. Making the implicit explicit • When you have implicit relationships in the graph you can sometimes get better query performance by modeling the relationship explicitly
  • 67. Refactor property to node Bad: MATCH (g:Game) WHERE g.date > 1343779200 AND g.date < 1369094400 RETURN g
  • 68. Refactor property to node Good: MATCH (s:Season)-[:contains]->(g) WHERE s.name = "2012-2013" RETURN g
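One way the refactoring itself might be carried out is sketched below. It reuses the `:Season` label, the `contains` relationship and the timestamps from the two slides above. Treating those two timestamps as the boundaries of the 2012-2013 season is an assumption, as is running this as a one-off migration. The two statements would need to be run separately, since schema changes and data changes cannot share a transaction.

```cypher
// Statement 1 (run on its own): index the property used to look up a season
CREATE INDEX ON :Season(name);

// Statement 2: create the season once, then link every game in its date range
MERGE (s:Season {name: "2012-2013"})
WITH s
MATCH (g:Game)
WHERE g.date > 1343779200 AND g.date < 1369094400
MERGE (s)-[:contains]->(g);
```

After this, the "Good" query starts from a single indexed `:Season` node and follows its relationships, instead of scanning every `:Game` node and comparing dates.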
  • 69. Conclusion • Avoid the global scan • Add indexes / unique constraints • Split up MATCH statements • Measure, measure, measure, tweak, repeat • Soon Cypher will do a lot of this for you!
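For the "measure" step, a query can be prefixed with PROFILE to see which operators ran and how many rows and db hits each one produced. A minimal sketch using the football query from earlier in the deck; where the output appears (neo4j-shell or the browser) and how it is laid out depends on the Neo4j version:

```cypher
// PROFILE executes the query and reports rows and db hits per step of the plan
PROFILE
MATCH (pl:Player)-[:played]->(stats)
WHERE pl.name = "Wayne Rooney"
RETURN stats
```

Comparing the db hits before and after adding a label, an index, or an intermediate WITH gives a concrete number to tweak against.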
  • 70. Bonus tip • Use transactions/transactional cypher endpoint
  • 71. Q & A • If you have questions, send them in