There and Back Again, A Developer's Tale

Jennifer Reif, Developer Relations Engineer, Neo4j

Published in: Software

  1. 1. There And Back Again… A Developer’s Tale By: Jennifer Reif Neo4j Developer Relations Engineer jennifer.reif@neo4j.com @JMHReif
  2. 2. Who Am I? • Developer Relations Engineer for Neo4j • Continuous learner, developer, blogger • Conference speaker • Survivor of financial industry development Email: jennifer.reif@neo4j.com Twitter: @JMHReif
  3. 3. We have a problem…
  4. 4. We want to know… • What actors played which characters in Lord of the Rings movies • Other scenarios: • What employees have which skills for job openings in the company • What customers purchased which products and the suppliers • What patient was prescribed which medications from which doctors • What customers bought which vehicles from what dealerships/people
  5. 5. Existing solutions are painful • Thousands of actors, employees, customers, patients, doctors, dealerships, skills, etc • Relational: • Great for reports and simple JOINs, but too many JOINs to go across 3 core tables and lookup tables with endless rows each • Document: • Great for pulling information about individual components, but linking properties across substructures is complicated • Key-value: • Great for bits of information very quickly, but aggregating and compiling lots of related data is arduous
  6. 6. What is a Graph Database?
  7. 7. GraphChart
  8. 8. Database - specifically graph • Database: a structured set of data held in a computer, especially one that is accessible in various ways. • Relational? NoSQL? Graph? • Graph database: uses graph structures for semantic queries with nodes, edges, and properties to represent and store data.
  9. 9. What can graph do?
  10. 10. –Wikipedia, “Graph Database”, Performance section “Execution of queries within a graph database is localized to a portion of the graph. It does not search through irrelevant data, making it advantageous for real-time big data analytical queries. Consequently, graph database performance is proportional to the size of the data needed to be traversed, staying relatively constant despite the growth of data stored.”
  11. 11. Relational Graph
  12. 12. Other NoSQL Graph
  13. 13. What is it used to accomplish? Use Cases • Social networks • Impact analysis • Logistics and routing • Recommendations • Access control • Fraud analysis • …and many, many more!
  14. 14. Lots of graph options… Why choose Neo4j?
  15. 15. Neo4j is a database Neo4j Fast Reliable No size limit Binary & HTTP protocol ACID transactions 2-4 M ops/s per core Clustering scale & availability Official Drivers
  16. 16. Neo4j is a graph database Neo4j Property Graph Model Native GraphDB Schema Free Graph Storage Cypher Query Language Developer Workbench Extensible Procedures & Functions Graph Visualization
  17. 17. What’s the data model?
  18. 18. Whiteboard Friendliness • Easy to design and model • Direct representation of the model
  19. 19. Property Graph Data Model • 2 Main Components: • Nodes • Relationships • Additional Components: • Labels • Properties
  20. 20. Property Graph Data Model • Nodes: • Represent the objects in the graph • Can be categorized using Labels Car Person Person
  21. 21. Property Graph Data Model • Nodes: • Represent the objects in the graph • Can be categorized using Labels • Relationships: • Relate nodes by type and direction Car DRIVES LOVES LOVES LIVES WITH OWNS Person Person
  22. 22. Property Graph Data Model • Nodes: • Represent the objects in the graph • Can be categorized using Labels • Relationships: • Relate nodes by type and direction • Properties: • Name-value pairs that can be applied to nodes or relationships Car DRIVES LOVES LOVES LIVES WITH OWNS Person Person name: “Dan” born: May 29, 1970 twitter: “@dan” name: “Ann” born: Dec 5, 1975 since: Jan 10, 2011 brand: “Volvo” model: “V70”
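The property graph model on slides 19 to 22 can be sketched in plain Python. This is only an illustration of the four components (nodes, relationships, labels, properties), not how Neo4j stores data. The class and function names are hypothetical, and because the flattened slide text does not show which relationship carries `since` or who drives or owns the car, those assignments are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An object in the graph, categorized by labels and described by properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection between two nodes, with optional properties."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Nodes and property values taken from the slide's example.
dan = Node({"Person"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

# Assumption: directions and the placement of `since` are illustrative only.
relationships = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
    Relationship(dan, "LIVES_WITH", ann),
    Relationship(dan, "DRIVES", car),
    Relationship(ann, "OWNS", car, {"since": "Jan 10, 2011"}),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` by following `rel_type` forward."""
    return [r.end for r in relationships
            if r.start is node and r.rel_type == rel_type]
```

For example, `outgoing(dan, "LOVES")` returns a list containing only the `ann` node, which is the same question the Cypher `MATCH` example later in the deck asks.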
  23. 23. Tools for data modeling… • Arrows tool: • http://www.apcjones.com/arrows/ • Developer guides: • https://neo4j.com/developer/data-modeling/ • GraphGists: • https://neo4j.com/graphgists/ • Community Site: • https://community.neo4j.com/ • Training - Data Modeling course: • https://neo4j.com/graphacademy/
  24. 24. Applied to our scenario
  25. 25. Whiteboard friendliness
  26. 26. Whiteboard friendliness title: The Lord of the Rings… released: 2003 Movie Cast name: Orlando Bloom name: Frodo Baggins Character PLAYED APPEARS_IN name: Elijah Wood Cast Character name: Legolas Character name: Aragorn name: Viggo Mortensen Cast PLAYED PLAYED APPEARS_IN APPEARS_IN
  27. 27. Whiteboard friendliness
  28. 28. Importing Data
  29. 29. Options for Importing Data • Cypher statements / script: write individual statements to load data manually • LOAD CSV: for small and medium data sets; imports local or online CSV files into the graph • ETL Tool: imports from a relational database and maps the relational data model to a graph • Kettle: imports massive amounts of data from a variety of sources • APOC: standard library that includes several import procedures for different data formats • Neo4j-admin import tool: command-line interface for large amounts of data • Import programmatically from drivers: interact via your preferred programming language
  30. 30. Cypher Query Language…. SQL for graphs
  31. 31. Tools for Cypher… • Cypher quick-reference: • https://neo4j.com/docs/cypher-refcard/current/ • Developer guides: • https://neo4j.com/developer/cypher/ • Cypher manual: • https://neo4j.com/docs/cypher-manual/current/ • Community Site: • https://community.neo4j.com/ • Resources list: • https://neo4j.com/developer/cypher-resources/
  32. 32. Cypher: Powerful and Expressive CREATE (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"}) LOVES Dan Ann LABEL PROPERTY NODE NODE LABEL PROPERTY
  33. 33. Cypher: Powerful and Expressive LOVES Dan Ann MATCH (:Person {name: "Dan"})-[:LOVES]->(whom) RETURN whom
  34. 34. Cypher in 20 sec… • Nodes look like this: • (var:Label) OR (var:Label { propKey: propValue }) • Relationships look like this: • -[var:REL_TYPE]-> or -[var:REL_TYPE { propKey: propValue }]- • Using Cypher is just looking for particular patterns of those nodes/rels • (var1:Label)-[var2:REL_TYPE]->(var3:Label)
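To make "looking for particular patterns" concrete, here is a minimal Python sketch of what the `MATCH (:Person {name: "Dan"})-[:LOVES]->(whom) RETURN whom` query from slide 33 asks for. It is a toy in-memory matcher, not how Cypher is executed; the node ids and the function names are made up for the example.

```python
# Each node is a dict with a label and properties; each relationship is a
# (start_id, type, end_id) triple. The ids 1 and 2 are invented for this sketch.
nodes = {
    1: {"label": "Person", "props": {"name": "Dan"}},
    2: {"label": "Person", "props": {"name": "Ann"}},
}
rels = [(1, "LOVES", 2)]


def node_matches(node, label=None, props=None):
    """True if the node has the label (when given) and every listed property."""
    if label is not None and node["label"] != label:
        return False
    return all(node["props"].get(k) == v for k, v in (props or {}).items())


def match(start_label, start_props, rel_type):
    """Mimic (:start_label {start_props})-[:rel_type]->(whom) RETURN whom."""
    return [
        nodes[end]
        for start, rtype, end in rels
        if rtype == rel_type
        and node_matches(nodes[start], start_label, start_props)
    ]


whom = match("Person", {"name": "Dan"}, "LOVES")
print([n["props"]["name"] for n in whom])  # prints ['Ann']
```

Because the relationship is directed, the same call starting from Ann returns an empty list: the data contains no `LOVES` relationship going out of her node.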
  35. 35. How do we want to import our data?
  36. 36. Cypher statements/script MERGE (m:Movie {id: 100}) ON CREATE SET m.title = "The Lord of the Rings: The Fellowship of the Ring", m.releaseDate = date('2001-12-19')… MERGE (c:Character {id: 300}) ON CREATE SET c.name = "Legolas"… MERGE (c)-[:APPEARED_IN]->(m) ….
  37. 37. LOAD CSV LOAD CSV WITH HEADERS FROM "file:///movies.csv" AS row MERGE (m:Movie {id: row.movieId}) ON CREATE SET m.title = row.title, m.releaseDate = date(row.released)… …. LOAD CSV WITH HEADERS FROM "file:///movieCharacters.csv" AS row MATCH (m:Movie {id: row.movieId}) WITH m, row MERGE (c:Character {id: row.id}) ON CREATE SET c.name = row.name … MERGE (c)-[:APPEARED_IN]->(m) ….
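The `MERGE … ON CREATE SET` pattern in the two slides above is a get-or-create step, which is what makes an import safe to re-run. The Python sketch below imitates that behavior with a dictionary in place of the graph. The CSV text is a hypothetical stand-in for `movies.csv`: the column names follow the slide, the rows are illustrative, and the function names are invented.

```python
import csv
import io

# Hypothetical stand-in for movies.csv; the second row repeats the same id.
MOVIES_CSV = """movieId,title,released
100,The Lord of the Rings: The Fellowship of the Ring,2001-12-19
100,A duplicate row with the same id,2001-12-19
"""

movies = {}  # plays the role of all (:Movie) nodes, keyed by id


def merge_movie(row):
    """MERGE on id; set the other properties only when the node is created."""
    movie_id = row["movieId"]
    if movie_id not in movies:  # corresponds to ON CREATE SET ...
        movies[movie_id] = {
            "title": row["title"],
            "releaseDate": row["released"],
        }
    return movies[movie_id]


def load_csv(text):
    """Rough analogue of LOAD CSV WITH HEADERS ... AS row."""
    for row in csv.DictReader(io.StringIO(text)):
        merge_movie(row)


load_csv(MOVIES_CSV)
load_csv(MOVIES_CSV)  # re-running the load creates no duplicates
```

After both loads `movies` holds a single entry, and its title is the one from the first row, since later rows with the same id only match the existing entry. As with `LOAD CSV`, every field arrives as a string, so the key here is `"100"` rather than the integer `100`.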
  38. 38. ETL Tool
  39. 39. Kettle
  40. 40. APOC WITH "https://bestmovies.com/" as url CALL apoc.load.json(url) YIELD value UNWIND value.results AS results WITH results MERGE (m:Movie {id: results.id}) ON CREATE SET m.title = results.title, m.releaseDate = date(results.released)… ….
  41. 41. APOC fave procs • apoc.load.json(url) / apoc.load.csv(file) / apoc.load.xml(file) / apoc.load.jdbc(url) • Procedures to load various kinds of data • Can handle flat files or URL paths (local or remote) • Excellent when you need transformations with data load • apoc.periodic.iterate('cypher1', 'cypher2', {params}) • For each result of the cypher1 statement, run the cypher2 statement on it • Helpful for selecting a segment for update • apoc.do.when(condition, query, else, {params}) • Runs one query when the condition is true and the else query otherwise • Used for a variety of functions, but here is good for cleaning data • apoc.date.format(dateType, "precision", 'format') • Can output date in a variety of formats for display or querying • Very helpful pulling or pushing date/time value into/out of Neo4j
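The idea behind `apoc.periodic.iterate` (take the results of one statement and run a second statement on each of them, in batches) can be sketched in a few lines of Python. This is an analogy under simplifying assumptions, not the procedure's implementation: `periodic_iterate` and its arguments are hypothetical names, and the "commit" is only a comment.

```python
def periodic_iterate(items, action, batch_size):
    """Apply `action` to every item, one batch at a time; return the batch count."""
    batches = 0
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        for item in batch:
            action(item)
        # In a database, each batch would be committed here as its own transaction.
        batches += 1
    return batches


# Usage: the list stands in for the rows returned by the first statement,
# and appending to `updated` stands in for the second statement.
updated = []
batch_count = periodic_iterate(["a", "b", "c"], updated.append, batch_size=2)
```

With three items and a batch size of two, the work is split into two batches, and every item is still processed exactly once.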
  42. 42. Tools for APOC… • Docs: • https://neo4j-contrib.github.io/neo4j-apoc-procedures/ • Developer guides: • https://neo4j.com/developer/neo4j-apoc/ • Community Site: • https://community.neo4j.com/ • YouTube videos: • https://www.youtube.com/watch?v=V1DTBjetIfk&list=PL9Hl4pk2FsvXEww23lDX_owoKoqqBQpdq
  43. 43. Getting an instance of Neo4j
  44. 44. Free Tools for Running Neo4j… • Sandbox: • https://neo4j.com/sandbox-v2/ • Neo4j Desktop (local instance): • https://neo4j.com/download/ • Server install (open source): • https://neo4j.com/download-center/#community • In the Cloud: • https://neo4j.com/developer/guide-cloud-deployment/ • Docker: • https://hub.docker.com/_/neo4j
  45. 45. Getting our data imported…
  46. 46. //Load Movie objects that are wanted WITH 'https://api.themoviedb.org/3/search/movie?api_key='+ $apiKey+'&query=Lord%20of%20the%20Rings' as url CALL apoc.load.json(url) YIELD value UNWIND value.results AS results WITH results MERGE (m:Movie {movieId: results.id}) ON CREATE SET m.title = results.title, m.desc = results.overview, m.poster = results.poster_path, m.reviewStars = results.vote_average, m.reviews = results.vote_count WITH results, m CALL apoc.do.when(results.release_date = "", 'SET m.releaseDate = null', 'SET m.releaseDate = date(results.release_date)', {m:m, results:results}) YIELD value RETURN m
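The `apoc.do.when` call in the statement above is a data-cleaning step: an empty `release_date` string becomes `null`, and anything else is converted to a date. The same rule, written as a small Python function for illustration (the function name is hypothetical, and the sketch assumes non-empty dates arrive as `yyyy-mm-dd` strings):

```python
from datetime import date


def clean_release_date(raw):
    """Return None for an empty string, otherwise parse an ISO yyyy-mm-dd date."""
    if raw == "":
        return None
    return date.fromisoformat(raw)
```

Handling the empty string before conversion matters because parsing `""` as a date raises an error, both here and in the Cypher `date()` call that the slide guards.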
  47. 47. //For Movie objects just loaded, pick out trilogy and retrieve cast of those movies WITH 'https://api.themoviedb.org/3/movie/' as prefix, '/credits?api_key='+$apiKey as suffix, ["The Lord of the Rings: The Fellowship of the Ring", "The Lord of the Rings: The Two Towers", "The Lord of the Rings: The Return of the King"] as movies CALL apoc.periodic.iterate('MATCH (m:Movie) WHERE m.title IN $movies RETURN m', 'WITH m CALL apoc.load.json($prefix+m.movieId+$suffix) YIELD value UNWIND value.cast AS cast MERGE (c:Cast {id: cast.id}) ON CREATE SET c.name = cast.name MERGE (ch:Character {name: cast.character}) MERGE (ch)-[r:APPEARS_IN]->(m) MERGE (c)-[r1:PLAYED]->(ch)', {batchSize: 1, iterateList:false, params:{movies:movies, prefix:prefix, suffix:suffix}});
  48. 48. Writing Queries
  49. 49. Let’s Query Our Data!
  50. 50. Other ways to query and explore • Make calls from an application • Neo4j drivers for almost any programming language • Java, Python, JavaScript, Go, Ruby, PHP • Visualization tools • Open source and proprietary • Neovis, Browser, Bloom, 3d-force-graph, Kineviz, yWorks
  51. 51. Fitting into Architecture
  52. 52. Will it play nice? • Integrations, integrations, integrations! • Out-of-the-box plugins (APOC, GraphQL, graph algorithms) • Custom extensions possible • Tons of options for feeding data to existing tools/systems • Tableau, Kettle, Kafka, Elasticsearch, other DBs, Spark, and many more
  53. 53. Sharing the value
  54. 54. What can I show to others? • Neo4j Bloom (or partner/open source visualization tools) • Exploration tool for business users to query with natural language • Basic reports and query performance • Build according to specs and compare solutions, just as you would with any technology evaluation • Use cases and success stories • https://neo4j.com/resources • Possible integrations and minimal interruption of existing systems • What tools are you using today? Does our integration fit neatly? • Community and support network! • Support agreement or fabulous expert community answers to questions
  55. 55. Recap! @JMHReif jennifer.reif@neo4j.com
