
Graph Search and Discovery for your Dark Data

To achieve new insights within your company, you don't necessarily have to accumulate and crunch a lot of new tracking and tracing data. Much key information already exists in disconnected data silos, stored in different databases and systems. By bringing the relevant pieces together in a single place and connecting the pieces of the puzzle at the right points, you can gain new insights into your existing data and business.
Graph databases offer the unique opportunity to easily cross-reference and connect disparate, variably structured datasets from many sources in a single place.
The connected data is then available for ad-hoc querying and exploration as well as strategic decision making.
As a graph is a flexible and malleable data structure, it can evolve as you add new datasets or draw new connections from the insights or assumptions gained.

This source of decision making will not be your old-school MDM data cemetery, but a flexible instrument that you can use, reuse, discard and recreate for different goals and purposes.

Getting the data into the graph is only one side of the coin. The other, similarly challenging side is how to make the freshly spun web of knowledge available to different types of users.

Technical users have it easy: they can wield Cypher, a powerful graph query language for arbitrary graph pattern matching, filtering, projection and aggregation of relevant data.
For non-technical users, a variety of tools and toolkits are available that use the inherent structures and meta information of the graph (labels, types, relationships and properties) to provide visual or natural language interaction to drill down, look at interesting facets, and connect and correlate interesting information.
Under the hood those tools use the same graph query language to do their job.
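
As a rough illustration of those four capabilities, here is a minimal Cypher sketch. It borrows the Person label, the LOVES relationship type and the name and born properties from the example on the slides; the threshold of 1970 and the limit of 10 are arbitrary assumptions, not values from the talk.

```cypher
// Pattern matching: people and the people they love
// (label, relationship type and properties taken from the slide example).
MATCH (p:Person)-[:LOVES]->(other:Person)
// Filtering: keep only people born after 1970 (arbitrary, assumed threshold).
WHERE p.born > 1970
// Projection and aggregation: one row per person with a count
// and the collected names of the people they love.
RETURN p.name AS person,
       count(other) AS loves,
       collect(other.name) AS lovedNames
ORDER BY loves DESC
LIMIT 10;
```

A visual or natural language tool would typically generate a query of this shape from the user's selections instead of asking them to type it.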

This talk will cover both aspects of this exciting opportunity.

We'll look at how to aggregate, import and connect information from disparate data sources into a dynamic, flexible graph model, and at the means, tools and techniques for graph search and discovery on top of that fabric.

The tools we use are the graph database Neo4j, its query language Cypher, which also provides comprehensive import and data transformation abilities, and several tools and libraries from the Neo4j ecosystem that provide visual and textual graph search and exploration.

Storyline Draft: http://gist.asciidoctor.org/?dropbox-14493611%2Ftalk_javazone.txt

Published in: Data & Analytics

  1. Discovering the Power of Dark Data. Or: The Value of Relationships. Javazone Oslo, Sept 2015
  2. Michael Hunger, Caretaker, Neo4j Community. Ask me anything. michael@neo4j.com, @mesirii | @neo4j
  3. The world is a graph – everything is connected • people, places, events • companies, markets • countries, history, politics • sciences, art, teaching • technology, networks, machines, applications, users • software, code, dependencies, architecture, deployments
  4. Meet Jan
  5. Kontext is King
  6. Community Graph, Neo4j
  7. Let's have a look.
  8. [Image] by Mike Nolan, CC BY-SA 3.0
  9. Property Graph Model. Nodes: • The entities in the graph • Can have name-value properties • Can be labeled. Relationships: • Relate nodes by type and direction • Can have name-value properties. [Diagram: NODE elements connected by a RELATIONSHIP, each carrying key: "value" properties]
  10. Property Graph Model & Example. Nodes: • The entities in the graph • Can have name-value properties • Can be labeled. Relationships: • Relate nodes by type and direction • Can have name-value properties. [Diagram: two PERSON nodes (name: "Jan", last: Hansson, born: 1978; name: "Anne", last: Olsen) and a CAR node (brand: "Volvo", model: "V70"), with relationships LOVES, LOVES and LIVES WITH; one relationship property reads since: Jan 10, 2011]
  11. Your friend Neo4j: an open-source graph database • Manage and store your connected data as a graph • Query relationships easily and quickly • Evolve model and applications to support new requirements and insights • Built to solve relational pains
  12. Your friend Neo4j: an open-source graph database • Built for Relationships • Open Source • Java & Scala • High Performance • ACID Database
  13. Whiteboard Friendly Graph Modeling
  14. Graph Query Language: Cypher
      MATCH (jan:Person {name: "Jan"})-[:LOVES]->(anne:Person {name: "Anne"})
      [Diagram annotations: Jan and Anne are NODEs, each with a LABEL and a PROPERTY, connected by LOVES]
  15. Cypher: Clauses • CREATE pattern • MERGE pattern • ADD • DELETE
  16. Cypher: Clauses • MATCH pattern WHERE pred • ORDER BY expr SKIP ... LIMIT ... • RETURN expr AS alias ...
  17. Cypher: Clauses • WITH expr AS alias, ... • UNWIND coll AS item • LOAD CSV FROM "url" AS row
  18. Getting Data into Neo4j: Cypher-Based "LOAD CSV" • Transactional (ACID) writes • Initial and incremental loads of up to 10 million nodes and relationships
      LOAD CSV WITH HEADERS FROM "url" AS row
      MERGE (:Person {name: row.name, age: toInt(row.age)});
  19. Getting Data into Neo4j: Load JSON with Cypher • Send JSON as parameter • Deconstruct the document • Into a non-duplicated graph model
      UNWIND {json}.items AS item
      MERGE (:Person {name: item.name, age: toInt(item.age)});
  20. Getting Data into Neo4j: CSV Bulk Loader neo4j-import • For initial database population • For loads with 10B+ records • Up to 1M records per second
      bin/neo4j-import --into people.db
        --nodes:Person people.csv
        --relationships:FRIEND_OF friendship.csv
  21. Let's LOAD
  22. Core Model [Diagram: relationships USED, SHARED]
  23. Full Model [Diagram: relationship POSTED]
  24. Twitter • Run a Twitter search, exclude Neo4j sources • „neo4j OR #neo4j OR @neo4j -from:@neo4j -from:@neotechnology“ • Pass resulting JSON directly to Cypher
      (:Person)-[:TWEETED]->(:Tweet:Content)-[:TAGGED]->(:Tag)
      (:Tweet)-[:MENTIONS]->(:Person)
      (:Tweet)-[:RETWEET]->(:Tweet)
      (:Tweet)-[:REPLY]->(:Tweet)
      (:Tweet)-[:CONTAINS]->(:Link)
  25. Twitter
      UNWIND {tweets} AS t
      WITH t, t.entities AS e, t.user AS u
      MERGE (tweet:Tweet:Content {id: t.id})
      SET tweet.text = t.text, tweet.created_at = t.created_at, ...
      MERGE (p:Person {name: u.name})
      SET p.handle = u.screen_name, p.followers = u.followers_count, ...
      MERGE (p)-[:POSTED]->(tweet)
      FOREACH (h IN e.hashtags |
        MERGE (tag:Tag {name: toLower(h.text)})
        MERGE (tweet)-[:TAGGED]->(tag))
      FOREACH (url IN e.urls |
        MERGE (link:Link {url: url.expanded_url})
        MERGE (tweet)-[:CONTAINS]->(link))
      ...
  26. Data imported
  27. Connect! • Twitter handle • email • website • name • tags • url
  28. Software Analytics: jqassistant.org
  29. Software Analytics. Software is connected information • Source -> AST • Inheritance, Composition, Delegation • Call Trees • Runtime Memory • Dependencies • Modules, Libraries • Tests • ... https://jqassistant.org
  30. jQAssistant • GeekCruise: My first Neo4j project • Software deteriorates • Develop rules and enforce them • Commercial Tools too inflexible • Open Source Software ... • Scanner -> Enhancer -> Analyzer • Enrichment, Concepts and Rulez in Cypher • Scanner Plugins • Integrate in Build Process • Fail, Generate Reports, ... http://jqassistant.org
  31. Let's explore ... http://jqassistant.org/demo/java8 ... The JDK
  32. A tale of French silos: SFR France
  33. Background | Business problem | Solution & Benefits. © All Rights Reserved 2014 | Neo Technology, Inc. Industry: Communications. Use case: Network Management. Paris, France • Large French communications company • Tons of various networking infrastructure • Has to shut down equipment for maintenance • Who is impacted? How to compensate? • Data lives in 30+ systems • Took a week to print, research, inform ... • Until ...
  34. Domain
  35. Industry: Communications. Use case: Network Management. Paris, France • Second largest communications company in France • Part of Vivendi Group, partnering with Vodafone
  36. Business Problem • Infrastructure maintenance took one full week to plan, because of the need to model network impacts • Needed rapid, automated "what if" analysis to ensure resilience during unplanned network outages • Identify weaknesses in the network to uncover the need for additional redundancy • Network information spread across > 30 systems, with daily changes to network infrastructure • Business needs sometimes changed very rapidly
  37. Solution & Benefits • Flexible network inventory management system, to support modeling, aggregation & troubleshooting • Single source of truth (Neo4j) representing the entire network • Dynamic system loads data from 30+ systems, and allows new applications to access network data • Modeling efforts greatly reduced because of the near 1:1 mapping between the real world and the graph • Flexible schema highly adaptable to changing business requirements
  38. Visual Graph Search for Non-Developers
  39. Popoto.js • Neat JavaScript library based on d3.js • Uses Graph Metadata to offer visual search • Categories to filter Instances • Component based extensions • Graph • Zero Config with Web Extension. http://www.popotojs.com/
  40. Popoto.js
  41. Popoto.js
  42. Visual Search Bar • Based on visualsearch.js • Uses graph metadata for parametrization • Limit results by selected items • By Max de Marzi. maxdemarzi.com/2013/07/03/the-last-mile/
  43. Facebook Graph Search • Natural Language to Cypher • Ruby TreeTop Gem for NLP • Convert phrases to Cypher Fragments. maxdemarzi.com/2013/01/28/facebook-graph-search-with-cypher-and-neo4j/
  44. Users Love Neo4j. "We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10 to 100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible." - Volker Pacher, Senior Developer. Performance: "The Neo4j graph database gives us drastically improved performance and a simple language to query our connected data" - Sebastian Verheugher, Telenor. Scale and Availability: "As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands." - Marcos Wada, Walmart
  45. Summary - Use the Right Database for the Job. Graph Databases are designed for data relationships. [Diagram: a range from Discrete Data (minimally connected data) to Connected Data (focused on data relationships), with Other NoSQL, Relational DBMS and Graph DBMS placed along it]. Development Benefits: easy model maintenance, easy query. Deployment Benefits: ultra high performance, minimal resource usage
  46. There Are Lots of Ways to Easily Learn Neo4j: neo4j.com/developer
  47. Users Love Neo4j – Will you too?
  48. Thanks! Questions? Free O'Reilly Book: graphdatabases.com. Find Me: @neo4j
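
Once the Twitter data from slide 25 is in the graph, discovery becomes a matter of asking questions of the pattern. The following is a minimal sketch, not taken from the talk, of one such question: which tags are used most, and by how many distinct people. It assumes the POSTED and TAGGED relationship types and the Person, Tweet and Tag labels exactly as written in the slide 25 import (slide 24 lists TWEETED instead of POSTED, so adjust to whichever was actually loaded). The cut-off of more than one tweet and the limit of 10 are arbitrary.

```cypher
// Follow the imported pattern from authors through tweets to tags.
MATCH (p:Person)-[:POSTED]->(t:Tweet)-[:TAGGED]->(tag:Tag)
// Aggregate per tag: how many distinct tweets and distinct authors use it.
WITH tag, count(DISTINCT t) AS tweets, count(DISTINCT p) AS authors
// Drop tags that only appear once (assumed, arbitrary cut-off).
WHERE tweets > 1
RETURN tag.name AS tag, tweets, authors
ORDER BY tweets DESC
LIMIT 10;
```

Because the import lower-cases tag names before the MERGE, variants such as #Neo4j and #neo4j collapse into a single Tag node, which is what makes this count meaningful.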
