Introduction to Graph Databases

Josh Adell

Who am I?

The Solution?

Find Every Actor at Each Degree

The Real Problem

Some Graph Use Cases

Modelling a Domain with Graphs

Graph Mining

New Solution to the Bacon Problem

    $keanu = $actorIndex->find('name', 'Keanu Reeves');
    $kevin = $actorIndex->find('name', 'Kevin Bacon');
    $path = $keanu->findPathTo($kevin);

Cypher

    // Find all the directors who have directed a movie scored by John Williams
    // that starred Kevin Bacon
    START actor=(actors, 'Kevin Bacon'), composer=(composers, 'John Williams')
    MATCH (actor)-[:IN]->(movie)<-[:DIRECTED]-(director), (movie)<-[:SCORED]-(composer)
    RETURN director

Are RDBs Useful At All?

Resources

1. Introduction to Graph Databases Josh Adell <josh.adell@gmail.com> 20110806

3. The Problem

8. The Real Solution

9. Graph Examples

10. Relational Databases are Graphs!

14. New Solution to the Bacon Problem

18. Questions?


Editor's notes

* Graph DB usage poll

* Six degrees game
* Relational databases can't easily answer certain types of questions

* First pass using a relational database
* cast table: actor_name, movie_title
* Hard to visualize the solution
* In order to do this, you need to do multiple passes or joins

* Each degree adds a join
** Increases complexity
** Decreases performance
* Stop when the actor you're looking for is in the list
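
The degree-by-degree approach described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name movie_cast, the sample rows, and the helper names are all hypothetical, and this particular formulation adds two joins per degree (the previous actor's movies, then the casts of those movies) rather than one.

```python
import sqlite3

# Hypothetical sample rows for the cast table (actor_name, movie_title).
SAMPLE_CAST = [
    ("Actor A", "Movie 1"),
    ("Actor B", "Movie 1"),
    ("Actor B", "Movie 2"),
    ("Actor C", "Movie 2"),
    ("Actor C", "Movie 3"),
    ("Actor D", "Movie 3"),
]


def build_degree_query(degree):
    """Build SQL listing every actor within `degree` co-star hops of a start actor."""
    parts = [f"SELECT DISTINCT a{degree}.actor_name FROM movie_cast a0"]
    for d in range(1, degree + 1):
        # Every extra degree appends more joins, so the query keeps growing.
        parts.append(f"JOIN movie_cast m{d} ON m{d}.actor_name = a{d - 1}.actor_name")
        parts.append(f"JOIN movie_cast a{d} ON a{d}.movie_title = m{d}.movie_title")
    parts.append("WHERE a0.actor_name = ?")
    return "\n".join(parts)


def degrees_between(conn, start, target, max_degree=6):
    """Return the first degree at which `target` shows up in the list, or None."""
    for degree in range(1, max_degree + 1):
        rows = conn.execute(build_degree_query(degree), (start,)).fetchall()
        if (target,) in rows:
            return degree  # stop as soon as the actor is in the list
    return None


def make_sample_db():
    """Create an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movie_cast (actor_name TEXT, movie_title TEXT)")
    conn.executemany("INSERT INTO movie_cast VALUES (?, ?)", SAMPLE_CAST)
    return conn
```

Calling degrees_between(make_sample_db(), "Actor A", "Actor D") runs one larger query per degree until the target appears, which is the growth in complexity the notes point out.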
* This problem highlights the ugly truth about RDBs
* They weren't designed to handle these types of problems
* Arbitrary path query
* RDB relationships join data, but are not data in themselves
* Set math
* Gather everything in the set that matches these criteria, then tell me if this thing is in the set
* 1 set, no problem
* 2nd set, no problem
* 3rd set not related to 1st
* 4th not related to 2nd
* 5th related to 1st and 4th
* etc.
* Relationships are only available between overlapping sets

* Disjoint sets

* Graphs
* Not X-Y
* Computer Science definition of graphs
* A graph is an ordered pair G = (V, E), where V is a set of vertices and E is a set of edges, which are pairs of vertices
* Node: vertex
* Relationship: edge
* Property: meta-datum attached to a node or relationship
* Nodes can have arbitrary properties
* Relationships are first-class citizens
** Have a type
** Have properties
** Have a direction
** Domain semantics
** Traversable in any direction
* This is how graph DBs solve the problems that RDBs can't
* Path: an ordered list of nodes and relationships
* Paths are found using traversal algorithms
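
As a rough illustration of the vocabulary above, the following Python sketch models nodes, typed and directed relationships, and properties. The class and method names (Node, Relationship, Graph, add_node, relate, neighbors) are assumptions made for this example and do not correspond to any particular graph database's API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A vertex carrying arbitrary properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """An edge with a direction (start -> end), a type, and its own properties."""
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


class Graph:
    """G = (V, E): a set of nodes plus a set of relationships between them."""

    def __init__(self):
        self.nodes = []
        self.relationships = []

    def add_node(self, **properties):
        node = Node(properties)
        self.nodes.append(node)
        return node

    def relate(self, start, rel_type, end, **properties):
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def neighbors(self, node):
        """Yield (relationship, other_node) pairs, following edges in either direction."""
        for rel in self.relationships:
            if rel.start is node:
                yield rel, rel.end
            elif rel.end is node:
                yield rel, rel.start
```

For example, after actor = g.add_node(name="Actor A") and movie = g.add_node(title="Movie 1"), the call g.relate(actor, "IN", movie, role="Lead") stores the relationship as data in its own right, and g.neighbors(movie) can walk it backwards.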
* Tree data structures
* Networks
* Maps
* Vehicles on streets == packets through network
* Relational databases are graphs!

* Make each record a node
* Make every foreign key a relationship
* RDB indexes are usually stored in a tree structure
* Trees are graphs
* Why not use RDBs?
* The trouble with RDBs is how they are stored in memory and queried
* Require a translation step from memory blocks to graph structure
* Relationships not first-class citizens
* Many problem domains map poorly to rows/tables
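
The "record is a node, foreign key is a relationship" idea can be sketched as a small conversion function. The input layout is an assumption for illustration: every table is a list of row dictionaries with an "id" primary key, and foreign keys are declared in a separate mapping.

```python
def tables_to_graph(tables, foreign_keys):
    """Turn relational rows into nodes and foreign keys into relationships.

    tables:       {table_name: [row_dict, ...]}, each row having an "id" key
    foreign_keys: {(table_name, column): referenced_table_name}
    Returns (nodes, edges), where nodes maps (table, id) to the row and
    edges is a list of (from_node_key, column_name, to_node_key) tuples.
    """
    nodes = {}
    for table, rows in tables.items():
        for row in rows:
            nodes[(table, row["id"])] = row  # each record becomes a node

    edges = []
    for (table, column), target_table in foreign_keys.items():
        for row in tables[table]:
            target_id = row.get(column)
            if target_id is not None:  # a NULL foreign key yields no relationship
                edges.append(((table, row["id"]), column, (target_table, target_id)))
    return nodes, edges
```

With a hypothetical orders table whose customer_id column references customers, each order row becomes a node and each non-null customer_id becomes an edge pointing at the matching customer node.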
* Big Data
** billions of nodes and relationships in a single instance
* "Internet of Things" buzzword
* Social networking - friends of friends of friends of friends
* Assembly/Manufacturing - 1 widget contains 3 gadgets, each of which contains 2 gizmos
* Map directions - starting at my house, find a route to the office that goes past the pub
* Multi-tenancy - root node per tenant
** all queries start at root
** No overlap between graphs = no accidental data spillage
* Fraud: track transactions back to origination
* Pretty much anything that can be drawn on a whiteboard

* Example: retail system
* Customer makes Order
* Store sells Order
* Order contains Items
* Supplier supplied Items
* Customer rates Items
* Did this customer rank supplier X highly?
* Which suppliers sell the highest rated items?
* Does item A get rated higher when ordered with Item B?
* All can be answered with RDBs as well
* Not as elegant
* Not as performant
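
To make the retail model above concrete, here is a minimal Python sketch that stores the relationships as plain tuples and answers two of the questions by following them. The identifiers, the "stars" property, and the sample data are hypothetical; a real graph database would traverse from a node instead of scanning a list.

```python
from collections import defaultdict

# (start, relationship type, end, properties); all values are made-up sample data.
RELATIONSHIPS = [
    ("customer:1", "MAKES", "order:1", {}),
    ("store:1", "SELLS", "order:1", {}),
    ("order:1", "CONTAINS", "item:A", {}),
    ("order:1", "CONTAINS", "item:B", {}),
    ("supplier:X", "SUPPLIED", "item:A", {}),
    ("supplier:Y", "SUPPLIED", "item:B", {}),
    ("customer:1", "RATES", "item:A", {"stars": 5}),
    ("customer:1", "RATES", "item:B", {"stars": 2}),
    ("customer:2", "RATES", "item:A", {"stars": 4}),
]


def ratings_for_supplier(relationships, customer, supplier):
    """Follow customer -RATES-> item <-SUPPLIED- supplier and collect the stars."""
    supplied = {end for start, kind, end, _ in relationships
                if kind == "SUPPLIED" and start == supplier}
    return [props["stars"] for start, kind, end, props in relationships
            if kind == "RATES" and start == customer and end in supplied]


def average_rating_by_supplier(relationships):
    """Average the stars of every item each supplier supplied."""
    stars_by_item = defaultdict(list)
    for start, kind, end, props in relationships:
        if kind == "RATES":
            stars_by_item[end].append(props["stars"])
    stars_by_supplier = defaultdict(list)
    for start, kind, end, props in relationships:
        if kind == "SUPPLIED":
            stars_by_supplier[start].extend(stars_by_item[end])
    return {s: sum(v) / len(v) for s, v in stars_by_supplier.items() if v}
```

Here ratings_for_supplier(RELATIONSHIPS, "customer:1", "supplier:X") addresses "Did this customer rank supplier X highly?", and average_rating_by_supplier(RELATIONSHIPS) gives one possible reading of "Which suppliers sell the highest rated items?".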
* This is where the power of graph DBs comes from
* Paths - find any relationship chain between A and B
* Kevin Bacon example, known start and end
* Traversal - filter out paths that don't meet criteria
* Complex path finding, base next decision on existing path from start to current position
* Define path-finding (prune) and result filtering functions
* Queries - Here is what I want, find it however you can
* SPARQL, Gremlin, Cypher
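
The traversal idea above, with separate prune and result-filter functions, might look like the following Python sketch. The function name traverse, its parameters, and the adjacency-dictionary graph format are assumptions for illustration, not the API of any specific product.

```python
def traverse(adjacency, start, prune, include):
    """Depth-first traversal that yields paths (lists of nodes) starting at `start`.

    prune(path)   -> True means "do not expand beyond this path"
    include(path) -> True means "this path belongs in the result"
    Both callbacks see the whole path so far, so the next decision can be
    based on everything between the start and the current position.
    """
    stack = [[start]]
    while stack:
        path = stack.pop()
        if include(path):
            yield path
        if prune(path):
            continue
        for neighbor in adjacency.get(path[-1], ()):
            if neighbor not in path:  # never revisit a node on the current path
                stack.append(path + [neighbor])
```

For instance, with adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}, passing prune=lambda p: len(p) >= 3 and include=lambda p: p[-1] == "d" returns every path from "a" to "d" that visits at most three nodes.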
* Actors are nodes
* Movies are nodes
* Relationship: Actor is IN a movie
* Pseudo-code shortened for brevity
* Compare to degree selection join queries
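
The slide's findPathTo call is pseudo-code, so what it does internally is not shown. One common way to find such a path is a breadth-first search over the actor and movie nodes, sketched below in Python. The function name find_path and the sample data are hypothetical, and the sketch assumes actor and movie names never collide.

```python
from collections import deque


def find_path(in_relationships, start, goal):
    """Breadth-first search for a shortest path between two nodes.

    in_relationships: iterable of (actor, movie) pairs, one per IN relationship.
    Returns the path as a list alternating actors and movies, or None.
    """
    adjacency = {}
    for actor, movie in in_relationships:
        # Relationships are traversable in both directions.
        adjacency.setdefault(actor, set()).add(movie)
        adjacency.setdefault(movie, set()).add(actor)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in sorted(adjacency.get(current, ())):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None
```

The degree of separation is then (len(path) - 1) // 2, since every hop between two actors passes through one movie node. Unlike the join-based approach, the code does not change as the number of degrees grows.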
* Cypher is "what to find"
* Describe the "shape" of the thing you're looking for
* Very white-board friendly
* Pros: easy to understand, query looks like domain model
* Cons: not as powerful, not fully featured (YET)
* Result set is an array of arrays
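
To show what matching a "shape" means, this Python sketch evaluates the same pattern as the Cypher slide (an actor IN a movie that a director DIRECTED and a composer SCORED) against a list of relationship tuples. It is a hand-written stand-in, not how Cypher is implemented, and the function name and sample data are hypothetical.

```python
def directors_for(relationships, actor, composer):
    """Return directors of movies that `actor` was IN and `composer` SCORED.

    relationships: iterable of (start, relationship_type, end) tuples.
    The result is a list of single-element rows, mirroring the
    "array of arrays" result shape mentioned in the notes.
    """
    rels = list(relationships)
    acted_in = {end for start, kind, end in rels if kind == "IN" and start == actor}
    scored = {end for start, kind, end in rels if kind == "SCORED" and start == composer}
    movies = acted_in & scored  # movies matching both halves of the pattern
    return sorted([start] for start, kind, end in rels
                  if kind == "DIRECTED" and end in movies)
```

Given rows such as ("Actor K", "IN", "Movie 1"), ("Composer J", "SCORED", "Movie 1") and ("Director R", "DIRECTED", "Movie 1"), the call directors_for(rows, "Actor K", "Composer J") returns one row per matching director.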
* RDBs are really good at data aggregation
* Set math, duh
* A graph DB has to traverse the whole graph in order to do aggregation
* Truly tabular means not a lot of relationships between the data types

* Billions of nodes and relationships in a single instance
* Cluster replication
* Transactions
* Native bindings for Ruby, Python, and any language that can run on the JVM
* Licensing