
Graph Gurus Episode 1: Enterprise Graph



Building an Enterprise Knowledge Graph

Published in: Software


  1. Graph Gurus Episode 1: Building an Enterprise Knowledge Graph with Resource Description Framework (RDF) and a Graph Database
  2. Welcome
     ● Attendees are muted, but you can talk to us via Chat in Zoom
     ● We will have 10 min for Q&A at the end
     ● Send questions at any time using the Q&A tab in the Zoom menu
     ● The webinar will be recorded
     ● A link to the presentation and reproducible steps will be emailed
     Developer Edition Download: https://www.tigergraph.com/developer/
     © 2018 TigerGraph. All Rights Reserved
  3. Speaking Today: Victor Lee, Director of Product Management
     ● BS in Electrical Engineering and Computer Science from UC Berkeley
     ● MS in Electrical Engineering from Stanford University
     ● PhD in Computer Science from Kent State University, focused on graph data mining
     ● 15+ years in the tech industry
  4. Speaking Today: Xinyu Chang, Solutions Engineering Team Leader
     ● Master's in Computer Science from Kent State University
     ● Ph.D. candidate at Kent State University
     ● With TigerGraph for 4 years; co-author of the GSQL query language; developed production solutions for major customers
     ● Expertise in graph algorithms, big data analytics, anti-fraud, and AML
  5. Agenda
     ● Welcome to Graph Gurus!
     ● Episode 1: Building an Enterprise Knowledge Graph from an RDF Dataset
     ● Q&A
  6. ● Topics: How to solve many important data-centric problems using the right graph and the right queries
     ● Goal: To make you graph-smart!
     ● Format: Series of short webinars
       ○ Aimed at developers and data scientists
       ○ Presented by developers
       ○ Code and data used in demos available for download
  7. Episode 1: Building an Enterprise Knowledge Graph from RDF
     Knowledge Graph
     ● Network of interconnected entities in a given domain
     ● Semantic implications for entities and relationships
     Enterprise Knowledge Graph
     ● The domain is an enterprise
  8. Why use TigerGraph (a Property Graph) for an RDF-based Knowledge Graph?
     ● Motivation
       ○ RDF stores usually handle graph traversal with relational joins, which is too slow
       ○ RDF/SPARQL is not designed for finely controlled graph analytics, which makes it hard to use
     ● Takeaways from this episode
       ○ A generic property graph schema applicable to any RDF data set
       ○ GSQL multi-hop queries with real-time response speed and intuitive visualization out of the box
  9. Getting Value from a Knowledge Graph
     I’ve got my data in a knowledge graph. Now what? Query the graph to obtain information:
     ● Search: Are there any ____ ?
     ● Reasoning: If THIS and THAT, then __
     ● Discover trends and norms
     ● Discover outliers
     ● Explore relationships to see what you find...
  10. Demonstration
      1. Example Data Set
      2. Defining a Generic Knowledge Graph Schema
      3. Mapping the RDF Data to the Graph Database
      4. Extracting Knowledge from the Graph
  11. Demonstration step 1: Example Data Set - DBpedia (https://wiki.dbpedia.org/)
      ● Structured content extracted from the information created in various Wikimedia projects
      ● April 2016 Data Set
        ○ https://wiki.dbpedia.org/dbpedia-version-2016-04
        ○ Size: 150 GB
        ○ 9.5 billion pieces of information (RDF triples)
        ○ 6.0M entities, of which 4.6M have abstracts, 1.53M have geo coordinates, and 1.6M have depictions
        ○ Download: http://fragments.dbpedia.org/hdt/dbpedia2016-04en.hdt
  12. Data Format - RDF
      ● RDF is an abstract data model with several serializations.
      ● Our DBpedia data is in N-Triples format: <subject> <predicate> <object> .
      Example lines (Subject, Predicate, Object):
        <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/accessdate> "2010-04-21"^^<http://www.w3.org/2001/XMLSchema#date> .
        <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/date> "1937-01-04"^^<http://www.w3.org/2001/XMLSchema#date> .
        <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/format> "PDF" .
        <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/isCitedBy> <http://dbpedia.org/resource/List_of_Attorneys_General_of_
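The N-Triples layout above can be split into its three parts with a short regular expression. The following is a minimal sketch, not the parser used in the webinar: `parse_ntriple` is a hypothetical helper, and it deliberately ignores blank nodes and escape decoding, which the full N-Triples grammar allows.

```python
import re

# Simplified N-Triples pattern (an assumption, not the full W3C grammar):
# subject and predicate are IRIs in angle brackets; the object is either an
# IRI or a quoted literal with an optional ^^<datatype> or @language suffix.
TRIPLE_RE = re.compile(
    r'^\s*<(?P<s>[^>]*)>\s+<(?P<p>[^>]*)>\s+'
    r'(?:<(?P<o_iri>[^>]*)>|"(?P<o_lit>(?:[^"\\]|\\.)*)"'
    r'(?:\^\^<(?P<dtype>[^>]*)>|@(?P<lang>[A-Za-z0-9-]+))?)'
    r'\s*\.\s*$'
)


def parse_ntriple(line):
    """Return (subject, predicate, object, datatype), or None if the line
    does not match the simplified pattern (for example, a truncated line)."""
    m = TRIPLE_RE.match(line)
    if m is None:
        return None
    obj = m.group("o_iri") if m.group("o_iri") is not None else m.group("o_lit")
    return m.group("s"), m.group("p"), obj, m.group("dtype")


if __name__ == "__main__":
    example = (
        '<http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> '
        '<http://dbpedia.org/property/format> "PDF" .'
    )
    print(parse_ntriple(example))
```

Returning `None` for malformed input lets a loader skip or log incomplete lines, such as the truncated fourth example above, instead of failing.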
  13. Demonstration step 2: Defining a Knowledge Graph Schema
      Traditional approach
      ● Numerous vertex and edge types
        ○ predicate value ←→ edge type
        ○ “is_a_” predicate ←→ vertex type
      ● Pro: Clear semantics
      ● Con: Schema can grow with every new data triple
  14. A Universal Knowledge Graph Schema
      Alternate approach: Universal Schema
      ● Only 2 vertex types and 3 edge types
  15. Demonstration step 3: Mapping RDF Data to the Graph Database
      Example data line:
        <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/isCitedBy> <http://dbpedia.org/resource/List_of_Attorneys_General_of_
      [Diagram: a Subject vertex (<http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853>), an Object vertex (<http://dbpedia.org/resource/List_of_Attorneys_General_of_), and a Predicate vertex (isCitedBy); edge labels: “isCitedBy”, SubHasPred, ObjHasPred]
      ● Example: 1 data line → up to 3 vertices and 3 edges
      Each vertex will serve as an index node, including the Predicate vertex, speeding up searches.
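To make the "up to 3 vertices and 3 edges" rule concrete, here is a small in-memory sketch of the mapping. The edge names `HasObject`, `SubHasPred`, and `ObjHasPred` come from the loading job in this deck; the vertex type names `Entity` and `Predicate` and the `UniversalGraph` class are assumptions made for illustration, since the slides do not name the two vertex types.

```python
class UniversalGraph:
    """Toy model of the universal schema: two vertex types, three edge types."""

    def __init__(self):
        self.vertices = set()  # (vertex_type, id)
        self.edges = set()     # (edge_type, endpoint ids and attributes...)

    def add_triple(self, subject, predicate, obj):
        # Sets ignore duplicates, so a vertex or edge that already exists is
        # reused. This is why one line adds *up to* three vertices and edges.
        self.vertices.update({
            ("Entity", subject),        # assumed vertex type name
            ("Entity", obj),            # assumed vertex type name
            ("Predicate", predicate),
        })
        self.edges.update({
            ("HasObject", subject, obj, predicate),  # predicate kept as edge attribute
            ("SubHasPred", subject, predicate),
            ("ObjHasPred", obj, predicate),
        })


if __name__ == "__main__":
    g = UniversalGraph()
    g.add_triple("doc1", "isCitedBy", "listA")
    g.add_triple("doc1", "isCitedBy", "listB")
    print(len(g.vertices), len(g.edges))
```

The second triple shares its subject and predicate with the first, so it contributes only one new vertex and two new edges.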
  16. Predicate Semantics in the Universal Schema
      [Diagram: the same Subject, Object, and Predicate (isCitedBy) vertices with the “isCitedBy”, SubHasPred, and ObjHasPred edges, for the example line <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/isCitedBy> <http://dbpedia.org/resource/List_of_Attorneys_General_of_]
      ● Searching from Predicate to Subject accesses all the subjects that have a specific predicate (Country, date…)
      ● Searching from Predicate to Object accesses all the objects associated with a specific predicate (Country, date…)
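The two predicate searches described above amount to using the Predicate vertex as an index. A minimal sketch of that idea, using plain dictionaries rather than a graph database, might look like this; `PredicateIndex` and its method names are hypothetical.

```python
from collections import defaultdict


class PredicateIndex:
    """Maps each predicate to the subjects and objects connected to it,
    mirroring the SubHasPred and ObjHasPred edges of the universal schema."""

    def __init__(self):
        self._subjects = defaultdict(set)
        self._objects = defaultdict(set)

    def add_triple(self, subject, predicate, obj):
        self._subjects[predicate].add(subject)  # SubHasPred
        self._objects[predicate].add(obj)       # ObjHasPred

    def subjects_with(self, predicate):
        """All subjects that have the given predicate."""
        return set(self._subjects.get(predicate, ()))

    def objects_of(self, predicate):
        """All objects associated with the given predicate."""
        return set(self._objects.get(predicate, ()))


if __name__ == "__main__":
    idx = PredicateIndex()
    idx.add_triple("doc1", "date", "1937-01-04")
    idx.add_triple("doc2", "date", "2010-04-21")
    idx.add_triple("doc1", "format", "PDF")
    print(sorted(idx.subjects_with("date")))
    print(sorted(idx.objects_of("date")))
```

Both lookups start from the predicate and follow one hop, so their cost depends on how many entities use that predicate rather than on the size of the whole data set.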
  17. Loading RDF using GSQL
      Example input line (Subject, Predicate, Object):
        <http://0-access.newspaperarchive.com.topcat.switchinc.org/Viewer.aspx?img=7578853> <http://dbpedia.org/property/isCitedBy> <http://dbpedia.org/resource/List_of_Attorneys_General_of_
      Loading job:
        CREATE LOADING JOB loadRDF FOR GRAPH rdf {
          LOAD "/home/ubuntu/dbpedia2016-04.rdf"
            TO EDGE HasObject VALUES (subject($0), object($2), predicate($1)),
            TO EDGE SubHasPred VALUES (subject($0), predicate($1)),
            TO EDGE ObjHasPred VALUES (object($2), predicate($1))
            USING Separator=">", Header="false";
        }
      $0, $1, and $2 are the tokens obtained by splitting each input line on the ">" separator. subject(), object(), and predicate() are user-defined functions that parse the RDF syntax.
  18. Demonstration step 4: Extracting Knowledge from the Graph
      1. getMostRelatedPerson: Given a person, output the top k most related persons via common objects.
         • http://dbpedia.org/resource/Limas_Sweed
         • http://dbpedia.org/resource/Tony_Hills_(American_football)
      2. getTheTopPublisher: Given a place, output the top k most productive publishers who were born in that place.
         • http://dbpedia.org/resource/New_York
         • http://dbpedia.org/resource/Pittsburgh
      3. getRelatedTopics: Given a Predicate, output the top k most visited Entities during the traversal.
  19. Extracting Knowledge from the Graph, query 1: getMostRelatedPerson
      [Diagram: Input Person → Common Objects → Related Person; node labels: Person, Natural Person]
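The traversal pattern in this diagram (input person, to common objects, to related persons) can be sketched as a two-hop count over parsed triples. This is an illustration of the idea, not the GSQL query from the webinar: `most_related` is a hypothetical function, the sample identifiers are invented, and it ranks every other subject without checking that it is actually a person.

```python
from collections import Counter, defaultdict


def most_related(triples, person, k):
    """Rank other subjects by the number of objects they share with `person`.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Returns a list of (subject, shared_object_count), highest count first.
    """
    objects_of = defaultdict(set)   # subject -> objects it points to
    subjects_of = defaultdict(set)  # object  -> subjects pointing to it
    for s, _p, o in triples:
        objects_of[s].add(o)
        subjects_of[o].add(s)

    shared = Counter()
    for o in objects_of[person]:          # hop 1: person -> common object
        for other in subjects_of[o]:      # hop 2: common object -> other subject
            if other != person:
                shared[other] += 1
    return shared.most_common(k)


if __name__ == "__main__":
    sample = [
        ("person:a", "team", "team:x"),
        ("person:a", "college", "college:y"),
        ("person:b", "team", "team:x"),
        ("person:b", "college", "college:y"),
        ("person:c", "team", "team:x"),
    ]
    print(most_related(sample, "person:a", 2))
```

A production query would likely restrict the second hop to vertices typed as persons, which the diagram's Person and Natural Person labels suggest.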
  20. Extracting Knowledge from the Graph, query 2: getTheTopPublisher
      [Diagram: Input Place ← Born in ← Person → Published → Article]
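The diagram's pattern (place, to persons born there, to the articles they published) can be sketched as a filter followed by a count. The function `top_publishers` and the predicate names `bornIn` and `published` are placeholders chosen for this example; the actual DBpedia predicates used in the webinar are not given in the slides.

```python
from collections import Counter


def top_publishers(triples, place, k, born_in="bornIn", published="published"):
    """Return the k persons born in `place` with the most published items.

    `triples` is an iterable of (subject, predicate, object) tuples.
    `born_in` and `published` are assumed predicate names.
    """
    triples = list(triples)  # allow two passes even if given a generator
    # Hop 1: place -> persons born there.
    natives = {s for s, p, o in triples if p == born_in and o == place}
    # Hop 2: person -> published items, counted per person.
    counts = Counter(s for s, p, _o in triples if p == published and s in natives)
    return counts.most_common(k)


if __name__ == "__main__":
    sample = [
        ("p1", "bornIn", "NY"), ("p2", "bornIn", "NY"), ("p3", "bornIn", "LA"),
        ("p1", "published", "a1"), ("p1", "published", "a2"),
        ("p2", "published", "a3"),
        ("p3", "published", "a4"), ("p3", "published", "a5"), ("p3", "published", "a6"),
    ]
    print(top_publishers(sample, "NY", 2))
```

Persons born in the place who have published nothing do not appear in the result, because only `published` edges are counted.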
  21. Extracting Knowledge from the Graph, query 3: getRelatedTopics
      Given a Predicate, output the top k most visited Entities during the traversal.
      [Diagram: Input Predicate → Related Entities → ... → Related Entities]
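The slide does not spell out the traversal, so the sketch below shows one possible reading: start from every subject that carries the input predicate, expand one hop along all of those subjects' edges, and count how often each entity is reached. `related_topics` is a hypothetical function and the single-hop depth is an assumption.

```python
from collections import Counter


def related_topics(triples, predicate, k):
    """Return the k entities visited most often when expanding one hop
    from all subjects that have `predicate`.

    `triples` is an iterable of (subject, predicate, object) tuples.
    """
    triples = list(triples)  # allow two passes even if given a generator
    # Predicate -> subjects (the SubHasPred direction).
    starts = {s for s, p, _o in triples if p == predicate}
    # Subjects -> objects over every predicate; each arrival is one visit.
    visits = Counter(o for s, _p, o in triples if s in starts)
    return visits.most_common(k)


if __name__ == "__main__":
    sample = [
        ("a", "genre", "rock"),
        ("b", "genre", "rock"),
        ("a", "label", "emi"),
        ("c", "label", "emi"),
    ]
    print(related_topics(sample, "genre", 2))
```

In the sample, `c` never carries the `genre` predicate, so its edge to `emi` is not counted.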
  22. Q&A: Please send your questions via the Q&A menu in Zoom
  23. Upcoming episodes
      ● Episode 2: Building the Next Generation Recommendation Engine with a Graph DB: https://info.tigergraph.com/graph-gurus-2
      ● Episode 3: Detecting Fraud and Money Laundering in Real-Time with a Graph DB: https://info.tigergraph.com/graph-gurus-3
      Register for more webinars at https://www.tigergraph.com/webinars-and-events/
  24. Additional Resources
      ● Compare the Developer Edition and Enterprise Free Trial: https://www.tigergraph.com/download/
      ● Guru Scripts: https://github.com/tigergraph/ecosys/tree/master/guru_scripts
      ● Join our Developer Forum: https://groups.google.com/a/opengsql.org/forum/#!forum/gsql-users
      ● Take the Developer Survey: https://www.tigergraph.com/developer-edition-feedback-survey/
      ● Social: @TigerGraphDB, youtube.com/tigergraph, facebook.com/TigerGraphDB, linkedin.com/company/TigerGraph
