How to build a scalable graph database
The smart way
Bryn Cooke
In this talk
1. What does it take to build a graph database?
2. Why you shouldn't do this at home.
3. What do you use this for?
Graph family tree
Graph database recipe
1. Model
2. Language
3. Storage
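Model
Property Graph: vertices bob and steph, each with an age property (30 and 34 on the slide), joined by a "knows" edge that carries the property since: 2001.
RDF: the same relationship written as a triple, bob: knows :steph.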
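The two models can be sketched in a few lines of Python. This is only an illustration: the class and field names are made up here, and assigning age 30 to bob and 34 to steph is an assumption based on the order the values appear on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: Vertex
    in_v: Vertex
    properties: dict = field(default_factory=dict)


# Property graph: properties live on vertices and on edges.
bob = Vertex("bob", {"age": 30})      # which age belongs to whom is assumed
steph = Vertex("steph", {"age": 34})
knows = Edge("knows", bob, steph, {"since": 2001})

# RDF: every fact is a (subject, predicate, object) triple.
triples = {
    (":bob", ":knows", ":steph"),
    (":bob", ":age", 30),
    (":steph", ":age", 34),
}
```

Note that the edge property since: 2001 has no direct place in a plain triple; in RDF it would need an extra node or reification, which is one practical difference between the two models.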
Language
g.V().has('name', 'marko').out('knows').values('name')
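To make the traversal above concrete, here is a rough Python equivalent over a tiny in-memory graph. The data and the helper functions has, out and values are illustrative stand-ins written for this sketch, not the Gremlin implementation.

```python
# Tiny in-memory graph; the data is made up for illustration.
vertices = {
    1: {"name": "marko"},
    2: {"name": "vadas"},
    3: {"name": "josh"},
}
edges = [  # (out vertex id, edge label, in vertex id)
    (1, "knows", 2),
    (1, "knows", 3),
]


def has(ids, key, value):
    """Keep only the vertices whose property `key` equals `value`."""
    return [v for v in ids if vertices[v].get(key) == value]


def out(ids, label):
    """Follow outgoing edges with the given label."""
    return [dst for v in ids for (src, lbl, dst) in edges
            if src == v and lbl == label]


def values(ids, key):
    """Read one property from each remaining vertex."""
    return [vertices[v][key] for v in ids if key in vertices[v]]


# g.V().has('name', 'marko').out('knows').values('name')
names = values(out(has(list(vertices), "name", "marko"), "knows"), "name")
print(names)  # prints ['vadas', 'josh']
```

Each step takes the output of the previous one, which is the same pipeline shape the Gremlin query expresses with method chaining.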
Storage
The adjacency list
Vertex   Adjacent to
A        B, D, E
B        -
C        B
D        C
E        D, F
F        -
[Diagram: the same graph drawn with vertices A to F.]
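The table above maps directly onto a dictionary of lists. A minimal sketch in Python, where the function names neighbours and incoming are chosen here for illustration:

```python
# Adjacency list from the slide: each vertex maps to the vertices it points at.
adjacency = {
    "A": ["B", "D", "E"],
    "B": [],
    "C": ["B"],
    "D": ["C"],
    "E": ["D", "F"],
    "F": [],
}


def neighbours(vertex):
    """Outgoing neighbours: a single dictionary lookup."""
    return adjacency[vertex]


def incoming(vertex):
    """Incoming neighbours: needs a scan of every list."""
    return [src for src, targets in adjacency.items() if vertex in targets]
```

Outgoing lookups are cheap, while incoming lookups scan the whole structure. That asymmetry is one reason a real graph store usually keeps extra indexes or stores edges in both directions.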
//TODO
• Storage
• Indexing
• Commit log
• Drivers
• Caching
• Schema
• Metrics
• Backup/Restore
• Logging
• Security
• Testing
• Support
• Failover
• QoS
• Paging
• Partitioning
• Sorting
• Compaction
• Repair
• Community
• Bug fixing
• Optimisation
Storage - Cassandra
• Fast
• Distributed
• Scalable
• Reliable
• 11 years of development
• 54 committers (listed on apache)
• 274 contributors (listed on github)
The adjacency list (in Cassandra)
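The slide's own table layout is not in this transcript, so the following is only one plausible reading: the source vertex acts as the partition key and each adjacent vertex is a row inside that partition, kept sorted by a clustering key. The Python below imitates that layout in memory; the edge label "linked" and the function names are hypothetical.

```python
from bisect import insort
from collections import defaultdict

# partition key (source vertex) -> rows sorted by clustering key (label, target)
partitions = defaultdict(list)


def add_edge(source, label, target):
    """Insert one row into the source vertex's partition, keeping rows sorted."""
    row = (label, target)
    if row not in partitions[source]:
        insort(partitions[source], row)


def out_edges(source, label=None):
    """Read one partition; optionally filter by edge label."""
    rows = partitions.get(source, [])
    return [target for (lbl, target) in rows if label is None or lbl == label]


# Same graph as the adjacency list slide, inserted in arbitrary order.
for src, dst in [("A", "E"), ("A", "B"), ("A", "D"), ("C", "B"),
                 ("D", "C"), ("E", "F"), ("E", "D")]:
    add_edge(src, "linked", dst)
```

Under this assumed layout, reading all neighbours of one vertex touches a single partition, which is the access pattern a partitioned store handles well.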
Here's what you could do
[Diagram: five Cassandra (C*) nodes, "My Graph Database", and five clients.]
Here's what you could do
[Diagram: the same five C* nodes and "My Graph Database", shown without the clients.]
Here's what you should do
[Diagram: five C* nodes, DS Graph, and four clients.]
Deep integration with DataStax Enterprise
DataStax Enterprise
• DataStax Enterprise scalability > Cassandra scalability.
• Analytics integration.
• Search integration.
• Thread optimisation.
• Continuous paging.
• Prefetching.
• First class schema integration.
Today’s Graph Database Market
Graph Problems > Graph Databases
Typical customer 360 queries
[Chart: response time (machine fast, human fast, offline fast) against query complexity (simple to complex), with regions labelled CQL, Search, Analytics, DSE and "No go zone".]
• Find me Jenny.
• Find me all people with similar names to 'Jenny'.
• Tell me if there are duplicate Jennys.
• Find how Jenny and John are connected.
• Find how influential Jenny is in my application.
Find me Jenny
How complex?
• Simple
How fast?
• Machine fast
What?
• CQL
Why?
• Single partition lookup
• Single iteration
Find me all people with similar names to 'Jenny'
How complex?
• Medium
How fast?
• Human fast
What?
• Search
• Graph
Why?
• Single index lookup
• Single iteration
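As a rough illustration of a "similar names" lookup, the sketch below uses Python's standard difflib as a stand-in for a real search index. The name list and the 0.75 cutoff are assumptions chosen for the example; a production search engine would use an index rather than comparing against every name.

```python
from difflib import get_close_matches

# Hypothetical set of names that a search index would cover.
names = ["Jenny", "Jenni", "Jeny", "John", "Jennifer"]


def similar_names(query, candidates, cutoff=0.75):
    """Return candidates whose similarity ratio to `query` is at least `cutoff`."""
    return get_close_matches(query, candidates, n=len(candidates), cutoff=cutoff)


matches = similar_names("Jenny", names)
```

The result is ordered from most to least similar, so an exact match comes first.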
Tell me if there are duplicate Jennys
How complex?
• Medium
How fast?
• Offline
What?
• Analytics
• Graph
Why?
• Aggregation
• Multiple iteration
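The aggregation step behind duplicate detection can be sketched as a group-by on a normalised key. The records and the choice of key (trimmed, lower-cased name and email) are hypothetical; this shows only a single aggregation pass, not the full multi-iteration analytics job.

```python
from collections import defaultdict

# Hypothetical customer records.
people = [
    {"id": 1, "name": "Jenny", "email": "jenny@example.com"},
    {"id": 2, "name": "jenny ", "email": "JENNY@example.com"},
    {"id": 3, "name": "Jenny", "email": "jenny.b@example.com"},
    {"id": 4, "name": "John", "email": "john@example.com"},
]


def duplicate_groups(records):
    """Group record ids by a normalised (name, email) key; keep groups of 2+."""
    groups = defaultdict(list)
    for record in records:
        key = (record["name"].strip().lower(), record["email"].strip().lower())
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


print(duplicate_groups(people))  # prints [[1, 2]]
```

Because every record has to be read, this kind of query belongs in the offline, analytics part of the chart rather than in a low-latency lookup.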
Find how Jenny and John are connected
How complex?
• Complex
How fast?
• Machine fast
What?
• Graph
Why?
• Multiple partition lookup
• Multiple iteration
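"How are two people connected" is a path query. A minimal sketch using breadth-first search over a hypothetical "knows" graph follows; each step outward reads the neighbours of more vertices, which corresponds to the multiple partition lookups and iterations listed above.

```python
from collections import deque

# Hypothetical undirected "knows" relationships.
knows = {
    "Jenny": ["Ana", "Raj"],
    "Ana": ["Jenny", "Li"],
    "Raj": ["Jenny"],
    "Li": ["Ana", "John"],
    "John": ["Li"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in graph.get(current, []):
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None


print(shortest_path(knows, "Jenny", "John"))  # prints ['Jenny', 'Ana', 'Li', 'John']
```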
Find how influential Jenny is in my application
How complex?
• Complex
How fast?
• Offline
What?
• Spark Analytics
• Graph via PageRank
Why?
• Full scan
• Unknown iterations
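PageRank illustrates both points above: every iteration scans the whole graph, and the number of iterations depends on when the scores stop changing. Below is a small single-machine sketch in plain Python, not the Spark job the slide refers to; the graph, the damping factor of 0.85 and the tolerance are assumptions for the example.

```python
def pagerank(graph, damping=0.85, tolerance=1e-9, max_iterations=100):
    """Power-iteration PageRank. Every edge target must also be a key in `graph`."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iterations):
        # Rank held by vertices with no out-edges is spread evenly over all vertices.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:                      # full scan of the graph
            for target in graph[v]:
                new_rank[target] += damping * rank[v] / len(graph[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tolerance:               # iteration count is not known up front
            break
    return rank


# Hypothetical "follows" graph: an edge points at the person being followed.
follows = {
    "Jenny": [],
    "John": ["Jenny"],
    "Ana": ["Jenny", "John"],
    "Raj": ["Jenny"],
}
ranks = pagerank(follows)
most_influential = max(ranks, key=ranks.get)
```

In this made-up graph everyone follows Jenny, so she ends up with the highest score.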
Summary
1. What it takes to create a graph database
a. Model
b. Language
c. Storage
2. How you can leverage an existing storage engine, and why Cassandra is a great choice.
3. Solving graph problems requires more than just the basics. Search and Analytics are essential tools, especially alongside a graph database.
Don't try this at home
Do not try to replicate 100 person-years of dev effort by creating your own storage engine.
Creating a graph database that scales is tough enough.
Try it now
https://downloads.datastax.com/#labs
Labs
Thank You

Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 

Graph in Apache Cassandra. The World’s Most Scalable Graph Database

  • 1. How to build a scalable graph database: the smart way. Bryn Cooke
  • 2. In this talk: 1. What does it take to build a graph database? 2. Why you shouldn’t do this at home. 3. What do you use this for?
  • 4. Graph database recipe: 1. Model 2. Language 3. Storage
  • 8. The adjacency list (vertex: adjacent to): A: B, D, E • B: (none) • C: B • D: C • E: D, F • F: (none)
  • 9. //TODO: • Storage • Indexing • Commit log • Drivers • Caching • Schema • Metrics • Backup/Restore • Logging • Security • Testing • Support • Failover • QoS • Paging • Partitioning • Sorting • Compaction • Repair • Community • Bug fixing • Optimisation
  • 10. Storage - Cassandra: • Fast • Distributed • Scalable • Reliable • 11 years of development • 54 committers (listed on Apache) • 274 contributors (listed on GitHub)
  • 11. The adjacency list (in Cassandra)
  • 12. Here's what you could do (diagram labels: four Clients, My Graph Database, five C* nodes)
  • 13. Here's what you could do (diagram labels: one Client, My Graph Database, five C* nodes)
  • 14. Here's what you should do (diagram labels: four Clients, DS Graph, five C* nodes)
  • 15. Deep integration with DataStax Enterprise: • DataStax Enterprise scalability > Cassandra scalability. • Analytics integration. • Search integration. • Thread optimisation. • Continuous paging. • Prefetching. • First class schema integration.
  • 16. Today’s Graph Database Market: Graph Problems > Graph Databases
  • 17. Typical customer 360 queries (chart: response time against query complexity, Simple to Complex; response-time levels Machine fast, Human fast, Offline fast; regions CQL, Search, Analytics, DSE and a "No go zone"): • Find me Jenny. • Find me all people with similar names to 'Jenny'. • Tell me if there are duplicate Jennys. • Find how Jenny and John are connected. • Find how influential Jenny is in my application.
  • 18. Find me Jenny. How complex? • Simple. How fast? • Machine fast. What? • CQL. Why? • Single partition lookup • Single iteration
  • 19. Find me all people with similar names to 'Jenny'. How complex? • Medium. How fast? • Human fast. What? • Search • Graph. Why? • Single index lookup • Single iteration
  • 20. Tell me if there are duplicate Jennys. How complex? • Medium. How fast? • Offline fast. What? • Analytics • Graph. Why? • Aggregation • Multiple iterations
  • 21. Find how Jenny and John are connected. How complex? • Complex. How fast? • Machine fast. What? • Graph. Why? • Multiple partition lookups • Multiple iterations
  • 22. Find how influential Jenny is in my application. How complex? • Complex. How fast? • Offline fast. What? • Spark Analytics • Graph via PageRank. Why? • Full scan • Unknown number of iterations
  • 23. Typical customer 360 queries: • Find me Jenny. • Find me all people with similar names to 'Jenny'. • Tell me if there are duplicate Jennys. • Find how Jenny and John are connected. • Find how influential Jenny is in my application.
  • 24. Summary: 1. What it takes to create a graph database: a. Model b. Language c. Storage. 2. How you can leverage an existing storage engine, and why Cassandra is a great choice. 3. Solving graph problems requires more than just the basics. Search and Analytics are essential tools, especially for a graph database.
  • 25. Don't try this at home. Do not try to replicate 100 person-years of dev effort by creating your own storage engine. Creating a graph database that scales is tough enough.
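
The adjacency list on slide 8 and the "find how Jenny and John are connected" query on slide 21 can be illustrated with a small in-memory sketch. This is not how DS Graph or Cassandra stores or traverses data. It only shows the idea of an adjacency list plus a multi-step traversal. The edges are assumed to be directed (A points to B, but B points to nothing), and the names `ADJACENCY` and `shortest_path` are hypothetical.

```python
from collections import deque

# Adjacency list from slide 8: each vertex maps to the vertices it is
# adjacent to. Treating the edges as directed is an assumption.
ADJACENCY = {
    "A": ["B", "D", "E"],
    "B": [],
    "C": ["B"],
    "D": ["C"],
    "E": ["D", "F"],
    "F": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of vertices on a shortest
    path from start to goal, or None if goal cannot be reached."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # vertex -> vertex we arrived from
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            # Walk back through `previous` to rebuild the path.
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = previous[vertex]
            return path[::-1]
        # Each hop reads one vertex's adjacency entry. In a partitioned
        # store this would be one lookup per visited vertex.
        for neighbour in graph.get(vertex, []):
            if neighbour not in previous:
                previous[neighbour] = vertex
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    print(shortest_path(ADJACENCY, "A", "C"))
    print(shortest_path(ADJACENCY, "B", "A"))
```

Every vertex visited costs one adjacency lookup, which is why slide 21 classifies this kind of query as multiple partition lookups over multiple iterations.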