SlideShare une entreprise Scribd logo
1  sur  30
NEO4J, TITAN &
CASSANDRA
Comparisons
JOHN JENSON
12 YEARS BUILDING SOFTWARE FOR THE WEB
Salt Lake City
 Software consultant, developer, project lead, on dozens of projects
 Extensive experience developing Java apps and using Oracle
 Two largest relational databases, banking database, PHI Database
 Massive relational databases, tables in excess of 500 million rows
Boston
 Cengage Learning – Principal Engineer
 MEAN stack development
 A year of Experience developing with Mongo, and Neo4J
 Researched Cassandra and Titan
 TandemSeven – Principal Architect
 Software Consulting
GRAPH DATABASES
 Arcs and Nodes
 Objects and the relationships between them
 Objects (nodes)
 Schemaless
 Can have arbitrary attributes
 Relationships (arcs)
 Have a type
 Can also have arbitrary attributes
GRAPH DATABASES
 Anything that can be modeled in a relational database, could also be
modeled in relational database.
 Nothing new here
 Querying a tree in SQL sucks
 The power of a graph database comes from the query language
 Oracle provides “connect by” feature for trees, but it only works for trees and you
have to use Oracle
 What if your data is highly connected and breaks the rules of a tree?
 Good luck bringing all of your data into memory and writing your own algorithm to
traverse your data
NEO4J
 Cypher Query
 Powerful query language (this is what sets Neo4J apart from other graph DBs)
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)
WHERE movie.title =~ "T.*"
RETURN movie.title as title, collect(actor.name) as cast
ORDER BY title ASC LIMIT 10;
 http://docs.neo4j.org/chunked/stable/examples-from-sql-to-
cypher.html (sql to cypher reference)
 http://docs.neo4j.org/refcard/2.0/ (useful cypher reference)
COLUMN DATABASES
 Rows are organized into individual tables
 Columns are represented as rows in those tables
 Designed to reduce IO and seek times when accessing data
CASSANDRA
 Massively distributed
 Can support massive clusters with 75,000+ machines
 CQL – Cassandra Query Language
 No joins or subqueries
SELECT *
FROM users
WHERE last_name = ”smith”;
 MapReduce
 Hadoop is all you get
CASSANDRA - PERFORMANCE
"In terms of scalability, there is a clear winner throughout our
experiments. Cassandra achieves the highest throughput for the
maximum number of nodes in all experiments" although "this comes
at the price of high write and read latencies.” – Toronto University
 Absolutely amazing throughput
 Not so amazing response times for each individual query
TITAN
 Runs on Cassandra or Hbase
 Oracle Berkeley DB
 Can store massive graphs
 Doesn’t support Cypher Query
 Supports Gremlin
 http://sql2gremlin.com/ (useful gremlin reference)
CASSANDRA VS NEO4J
15 points of comparison
CASSANDRA
Cassandra is a non-relational data store that
stores data in tables. Cassandra organizes
columns into rows and rows into tables.
Neo is a graph database that organizes data
into arcs and nodes.
NEO4J
Point 1
CASSANDRA
Because columns are stored as rows, tables
can have a huge number of columns
(maximum of 2 billion columns).
Neo can house at most 34 billion nodes, 34
billion relationships, and 68 billion properties in
total.
NEO4J
Point 2
CASSANDRA
All tables must have an index which is used as
a basis for sharding the data.
Indexes can be added and removed wherever
desired.
NEO4J
Point 3
CASSANDRA
Cassandra has impressive HA capabilities that
can span multiple data centers with little effort.
Neo uses master slave replication.
NEO4J
Point 4
CASSANDRA
Cassandra can elegantly run on huge clusters
that replicate and shard data effortlessly.
Neo doesn’t shard your data.
NEO4J
Point 5
CASSANDRA
Cassandra scales linearly by adding more
hardware. There is pretty much no limit to the
hardware that you can add.
Neo read throughput scales linearly with the
number of servers, but the number of servers
in a cluster has to stay relatively small.
NEO4J
Point 6
CASSANDRA
The dataset can grow virtually endlessly while
still getting the same performance.
The dataset size is limited to at most 34 billion
nodes, 34 billion relationships, and 68 billion
properties in total.
NEO4J
Point 7
CASSANDRA
Cassandra does not use a master/slave
paradigm, so there is no down-time when a
machine dies.
There is a brief window of downtime while a
new master is elected.
NEO4J
Point 8
CASSANDRA
Cannot do traversal queries
Traversal queries that have exponential cost
on traditional RDBMS have linear cost on Neo.
NEO4J
Point 9
CASSANDRA
Write performance is just as good as read
performance.
Write performance is slower than read
performance.
NEO4J
Point 10
CASSANDRA
Every query has additional latency due to
cluster overhead
Individual queries can be serviced much faster
with far less latency.
NEO4J
Point 11
CASSANDRA
ACID transactions are mostly supported, but
with tunable consistency.
ACID transactions are fully supported and
completely consistent, but there is a
performance hit for the consistency.
NEO4J
Point 12
CASSANDRA
Cassandra can perform operations completely
synchronously or alternatively at variously
levels of consistency with corresponding
performance on an operation by operation
basis.
Consistency is not tunable
NEO4J
Point 13
CASSANDRA
Cassandra uses it’s own query language
(CQL) that has similar syntax to SQL (no joins)
Neo uses Cypher and also supports Gremlin
NEO4J
Point 14
CASSANDRA
Instead of performing joins at runtime data
must be de-normalized before hand
Graphs are normalized and highly connected.
Traversals are very fast.
NEO4J
Point 15
CASSANDRA
“Unlike in relational databases, it’s not easy to tune or introduce new
query patterns in Cassandra by simply creating secondary indexes
or building complex SQLs (using joins, order by, group by?) because
of its high-scale distributed nature. So think about query patterns up
front, and design column families accordingly.” –Ebay
http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-
best-practices-part-1/#.U-j6fICwIeY
NEO4J
“A single instance of Neo4j can house at most 34 billion nodes, 34
billion relationships, and 68 billion properties, in total. Businesses like
Google obviously push these limits, but in general, this does not pose
a limitation in practice. It is also important to understand that these
limits were chosen purely as a storage optimization, and do not
indicate any particular shortcoming of the product. They are easily,
and are in fact being, increased.”
http://info.neotechnology.com/rs/neotechnology/images/Understanding
%20Neo4j%20Scalability(2).pdf
Gemlin
g.V('customerId','ALFKI').as('customer')
.out('ordered').out('contains').out('is').as('products')
.in('is').in('contains').in('ordered').except(‘customer')
.out('ordered').out('contains').out('is').except('products')
.groupCount().cap().orderMap(T.decr)[0..<5].productName
Cypher
MATCH (c1)-[:ordered]->(o1)-[:contains]->(p1)<-[:contains]-(o2)<-[:ordered]-(c2)-
[:ordered]->(o3)-[:contains]->(p2)
WHERE c1.customerId = "ALFKI" AND c1 != c2 AND p1 != p2
RETURN p2.productName, count(p2) num
SQL
SELECT TOP (5) [t14].[ProductName]
FROM (SELECT COUNT(*) AS [value],
[t13].[ProductName]
FROM [customers] AS [t0]
CROSS APPLY (SELECT [t9].[ProductName]
FROM [orders] AS [t1]
CROSS JOIN [order details] AS [t2]
INNER JOIN [products] AS [t3]
ON [t3].[ProductID] = [t2].[ProductID]
CROSS JOIN [order details] AS [t4]
INNER JOIN [orders] AS [t5]
ON [t5].[OrderID] = [t4].[OrderID]
LEFT JOIN [customers] AS [t6]
ON [t6].[CustomerID] = [t5].[CustomerID]
CROSS JOIN ([orders] AS [t7]
CROSS JOIN [order details] AS [t8]
INNER JOIN [products] AS [t9]
ON [t9].[ProductID] = [t8].[ProductID])
WHERE NOT EXISTS(SELECT NULL AS
[EMPTY]
FROM [orders] AS [t10]
CROSS JOIN [order details] AS [t11]
INNER JOIN [products] AS [t12]
ON [t12].[ProductID] =
[t11].[ProductID]
WHERE [t9].[ProductID] =
[t12].[ProductID]
AND [t10].[CustomerID] =
[t0].[CustomerID]
AND [t11].[OrderID] =
[t10].[OrderID])
AND [t6].[CustomerID] <> [t0].[CustomerID]
AND [t1].[CustomerID] = [t0].[CustomerID]
AND [t2].[OrderID] = [t1].[OrderID]
AND [t4].[ProductID] = [t3].[ProductID]
AND [t7].[CustomerID] = [t6].[CustomerID]
AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
WHERE [t0].[CustomerID] = N'ALFKI'
GROUP BY [t13].[ProductName]) AS [t14]
ORDER BY [t14].[value] DESC
CONCLUSION
If one plans on writing a recommendation
queries, a graph db is a more elegant fit than a
relational DB.
Only use Titan if you need it
 You have an insanely large graph
 Or you expect an insanely high load
Neo4J
 Faster queries
 A more straightforward and powerful query
language

Contenu connexe

Tendances

Cassandra advanced data modeling
Cassandra advanced data modelingCassandra advanced data modeling
Cassandra advanced data modelingRomain Hardouin
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexesDaniel Lemire
 
Time series database by Harshil Ambagade
Time series database by Harshil AmbagadeTime series database by Harshil Ambagade
Time series database by Harshil AmbagadeSigmoid
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Brian O'Neill
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Spark Summit
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Jelena Zanko
 
Koalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache SparkKoalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache SparkDatabricks
 
Revealing the Power of Legacy Machine Data
Revealing the Power of Legacy Machine DataRevealing the Power of Legacy Machine Data
Revealing the Power of Legacy Machine DataDatabricks
 
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...jaxLondonConference
 
Full stack analytics with Hadoop 2
Full stack analytics with Hadoop 2Full stack analytics with Hadoop 2
Full stack analytics with Hadoop 2Gabriele Modena
 
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Deep Dive : Spark Data Frames, SQL and Catalyst OptimizerDeep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Deep Dive : Spark Data Frames, SQL and Catalyst OptimizerSachin Aggarwal
 
Spark Summit East 2015 Advanced Devops Student Slides
Spark Summit East 2015 Advanced Devops Student SlidesSpark Summit East 2015 Advanced Devops Student Slides
Spark Summit East 2015 Advanced Devops Student SlidesDatabricks
 
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...Аліна Шепшелей
 
Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...jaxLondonConference
 
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...Spark Summit
 
Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5Haoyuan Li
 
Game Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupGame Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupJelena Zanko
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkDatabricks
 

Tendances (20)

Spark - Philly JUG
Spark  - Philly JUGSpark  - Philly JUG
Spark - Philly JUG
 
Cassandra advanced data modeling
Cassandra advanced data modelingCassandra advanced data modeling
Cassandra advanced data modeling
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexes
 
Time series database by Harshil Ambagade
Time series database by Harshil AmbagadeTime series database by Harshil Ambagade
Time series database by Harshil Ambagade
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20
 
Working with the Scalding Type -Safe API
Working with the Scalding Type -Safe APIWorking with the Scalding Type -Safe API
Working with the Scalding Type -Safe API
 
Koalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache SparkKoalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache Spark
 
Revealing the Power of Legacy Machine Data
Revealing the Power of Legacy Machine DataRevealing the Power of Legacy Machine Data
Revealing the Power of Legacy Machine Data
 
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...
Designing and Building a Graph Database Application - Ian Robinson (Neo Techn...
 
Full stack analytics with Hadoop 2
Full stack analytics with Hadoop 2Full stack analytics with Hadoop 2
Full stack analytics with Hadoop 2
 
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Deep Dive : Spark Data Frames, SQL and Catalyst OptimizerDeep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
 
Spark Summit East 2015 Advanced Devops Student Slides
Spark Summit East 2015 Advanced Devops Student SlidesSpark Summit East 2015 Advanced Devops Student Slides
Spark Summit East 2015 Advanced Devops Student Slides
 
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
 
Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...
 
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
 
Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5
 
Game Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupGame Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid Meetup
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache Spark
 

En vedette

Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Matthias Broecheler
 
Intro to Graph Databases Using Tinkerpop, TitanDB, and Gremlin
Intro to Graph Databases Using Tinkerpop, TitanDB, and GremlinIntro to Graph Databases Using Tinkerpop, TitanDB, and Gremlin
Intro to Graph Databases Using Tinkerpop, TitanDB, and GremlinCaleb Jones
 
Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBMohamed Taher Alrefaie
 
Titan: Big Graph Data with Cassandra
Titan: Big Graph Data with CassandraTitan: Big Graph Data with Cassandra
Titan: Big Graph Data with CassandraMatthias Broecheler
 
Benchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionBenchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionSymeon Papadopoulos
 
OrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityOrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityCurtis Mosters
 
Titan: The Rise of Big Graph Data
Titan: The Rise of Big Graph DataTitan: The Rise of Big Graph Data
Titan: The Rise of Big Graph DataMarko Rodriguez
 
Graph Processing with Apache TinkerPop
Graph Processing with Apache TinkerPopGraph Processing with Apache TinkerPop
Graph Processing with Apache TinkerPopJason Plurad
 
OrientDB vs Neo4j - and an introduction to NoSQL databases
OrientDB vs Neo4j - and an introduction to NoSQL databasesOrientDB vs Neo4j - and an introduction to NoSQL databases
OrientDB vs Neo4j - and an introduction to NoSQL databasesCurtis Mosters
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXAndrea Iacono
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j InternalsTobias Lindaaker
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use CasesMax De Marzi
 
Starting from Scratch with the MEAN Stack
Starting from Scratch with the MEAN StackStarting from Scratch with the MEAN Stack
Starting from Scratch with the MEAN StackMongoDB
 
Mongo db model relationships with documents
Mongo db model relationships with documentsMongo db model relationships with documents
Mongo db model relationships with documentsDr. Awase Khirni Syed
 
TinkerPop: a story of graphs, DBs, and graph DBs
TinkerPop: a story of graphs, DBs, and graph DBsTinkerPop: a story of graphs, DBs, and graph DBs
TinkerPop: a story of graphs, DBs, and graph DBsJoshua Shinavier
 
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersNear-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersSymeon Papadopoulos
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics EngineMarko Rodriguez
 
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)Apigee | Google Cloud
 

En vedette (20)

Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3
 
Intro to Graph Databases Using Tinkerpop, TitanDB, and Gremlin
Intro to Graph Databases Using Tinkerpop, TitanDB, and GremlinIntro to Graph Databases Using Tinkerpop, TitanDB, and Gremlin
Intro to Graph Databases Using Tinkerpop, TitanDB, and Gremlin
 
Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DB
 
Titan: Big Graph Data with Cassandra
Titan: Big Graph Data with CassandraTitan: Big Graph Data with Cassandra
Titan: Big Graph Data with Cassandra
 
Benchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detectionBenchmarking graph databases on the problem of community detection
Benchmarking graph databases on the problem of community detection
 
OrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityOrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionality
 
Titan: The Rise of Big Graph Data
Titan: The Rise of Big Graph DataTitan: The Rise of Big Graph Data
Titan: The Rise of Big Graph Data
 
Graph Processing with Apache TinkerPop
Graph Processing with Apache TinkerPopGraph Processing with Apache TinkerPop
Graph Processing with Apache TinkerPop
 
OrientDB vs Neo4j - and an introduction to NoSQL databases
OrientDB vs Neo4j - and an introduction to NoSQL databasesOrientDB vs Neo4j - and an introduction to NoSQL databases
OrientDB vs Neo4j - and an introduction to NoSQL databases
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphX
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j Internals
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
MongoDB vs OrientDB
MongoDB vs OrientDBMongoDB vs OrientDB
MongoDB vs OrientDB
 
Starting from Scratch with the MEAN Stack
Starting from Scratch with the MEAN StackStarting from Scratch with the MEAN Stack
Starting from Scratch with the MEAN Stack
 
Mongo db model relationships with documents
Mongo db model relationships with documentsMongo db model relationships with documents
Mongo db model relationships with documents
 
TinkerPop: a story of graphs, DBs, and graph DBs
TinkerPop: a story of graphs, DBs, and graph DBsTinkerPop: a story of graphs, DBs, and graph DBs
TinkerPop: a story of graphs, DBs, and graph DBs
 
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersNear-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
 
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
Building a Mobile Data Platform with Cassandra - Apigee Under the Hood (Webcast)
 

Similaire à Neo, Titan & Cassandra

NoSQL Options Compared
NoSQL Options ComparedNoSQL Options Compared
NoSQL Options ComparedSergey Bushik
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, HowIgor Moochnick
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandraBrian Enochson
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Scaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlScaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlDavid Daeschler
 
Minnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraMinnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraJeff Bollinger
 
If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.Lukas Smith
 
Nosql Introduction
Nosql IntroductionNosql Introduction
Nosql IntroductionAnju Singh
 
An efficient data mining solution by integrating Spark and Cassandra
An efficient data mining solution by integrating Spark and CassandraAn efficient data mining solution by integrating Spark and Cassandra
An efficient data mining solution by integrating Spark and CassandraStratio
 
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack Clustrix
 
Webcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond HadoopWebcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond HadoopImpetus Technologies
 
Codemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labCodemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labUgo Landini
 
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...Olivier DASINI
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGijiert bestjournal
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...DataStax
 

Similaire à Neo, Titan & Cassandra (20)

NoSQL Options Compared
NoSQL Options ComparedNoSQL Options Compared
NoSQL Options Compared
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, How
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Scaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlScaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosql
 
Minnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with CassandraMinnebar 2013 - Scaling with Cassandra
Minnebar 2013 - Scaling with Cassandra
 
If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.If NoSQL is your answer, you are probably asking the wrong question.
If NoSQL is your answer, you are probably asking the wrong question.
 
Stratio big data spain
Stratio   big data spainStratio   big data spain
Stratio big data spain
 
NoSql Databases
NoSql DatabasesNoSql Databases
NoSql Databases
 
Nosql Introduction
Nosql IntroductionNosql Introduction
Nosql Introduction
 
An efficient data mining solution by integrating Spark and Cassandra
An efficient data mining solution by integrating Spark and CassandraAn efficient data mining solution by integrating Spark and Cassandra
An efficient data mining solution by integrating Spark and Cassandra
 
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
 
Webcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond HadoopWebcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond Hadoop
 
No sql
No sqlNo sql
No sql
 
Codemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labCodemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech lab
 
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
MySQL JSON Document Store - A Document Store with all the benefits of a Trans...
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
No sql database
No sql databaseNo sql database
No sql database
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 

Dernier

%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...masabamasaba
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfkalichargn70th171
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is insideshinachiaurasa2
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in sowetomasabamasaba
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxAnnaArtyushina1
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburgmasabamasaba
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 

Dernier (20)

%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 

Neo, Titan & Cassandra

  • 2. JOHN JENSON 12 YEARS BUILDING SOFTWARE FOR THE WEB Salt Lake City  Software consultant, developer, project lead, on dozens of projects  Extensive experience developing Java apps and using Oracle  Two largest relational databases, banking database, PHI Database  Massive relational databases, tables in excess of 500 million rows Boston  Cengage Learning – Principal Engineer  MEAN stack development  A year of Experience developing with Mongo, and Neo4J  Researched Cassandra and Titan  TandemSeven – Principal Architect  Software Consulting
  • 3. GRAPH DATABASES  Arcs and Nodes  Objects and the relationships between them  Objects (nodes)  Schemaless  Can have arbitrary attributes  Relationships (arcs)  Have a type  Can also have arbitrary attributes
  • 4. GRAPH DATABASES  Anything that can be modeled in a relational database, could also be modeled in relational database.  Nothing new here  Querying a tree in SQL sucks  The power of a graph database comes from the query language  Oracle provides “connect by” feature for trees, but it only works for trees and you have to use Oracle  What if your data is highly connected and breaks the rules of a tree?  Good luck bringing all of your data into memory and writing your own algorithm to traverse your data
  • 5. NEO4J  Cypher Query  Powerful query language (this is what sets Neo4J apart from other graph DBs) MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie) WHERE movie.title =~ "T.*" RETURN movie.title as title, collect(actor.name) as cast ORDER BY title ASC LIMIT 10;  http://docs.neo4j.org/chunked/stable/examples-from-sql-to- cypher.html (sql to cypher reference)  http://docs.neo4j.org/refcard/2.0/ (useful cypher reference)
  • 6. COLUMN DATABASES  Rows are organized into individual tables  Columns are represented as rows in those tables  Designed to reduce IO and seek times when accessing data
  • 7. CASSANDRA  Massively distributed  Can support massive clusters with 75,000+ machines  CQL – Cassandra Query Language  No joins or subqueries SELECT * FROM users WHERE last_name = ”smith”;  MapReduce  Hadoop is all you get
  • 8. CASSANDRA - PERFORMANCE "In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments" although "this comes at the price of high write and read latencies.” – Toronto University  Absolutely amazing throughput  Not so amazing response times for each individual query
  • 9. TITAN  Runs on Cassandra or Hbase  Oracle Berkeley DB  Can store massive graphs  Doesn’t support Cypher Query  Supports Gremlin  http://sql2gremlin.com/ (useful gremlin reference)
  • 10. CASSANDRA VS NEO4J 15 points of comparison
  • 11. CASSANDRA Cassandra is a non-relational data store that stores data in tables. Cassandra organizes columns into rows and rows into tables. Neo is a graph database that organizes data into arcs and nodes. NEO4J Point 1
  • 12. CASSANDRA Because columns are stored as rows, tables can have a huge number of columns (maximum of 2 billion columns). Neo can house at most 34 billion nodes, 34 billion relationships, and 68 billion properties in total. NEO4J Point 2
  • 13. CASSANDRA All tables must have an index which is used as a basis for sharding the data. Indexes can be added and removed wherever desired. NEO4J Point 3
  • 14. CASSANDRA Cassandra has impressive HA capabilities that can span multiple data centers with little effort. Neo uses master slave replication. NEO4J Point 4
  • 15. CASSANDRA Cassandra can elegantly run on huge clusters that replicate and shard data effortlessly. Neo doesn’t shard your data. NEO4J Point 5
  • 16. CASSANDRA Cassandra scales linearly by adding more hardware. There is pretty much no limit to the hardware that you can add. Neo read throughput scales linearly with the number of servers, but the number of servers in a cluster has to stay relatively small. NEO4J Point 6
  • 17. CASSANDRA The dataset can grow virtually endlessly while still getting the same performance. The dataset size is limited to at most 34 billion nodes, 34 billion relationships, and 68 billion properties in total. NEO4J Point 7
  • 18. CASSANDRA Cassandra does not use a master/slave paradigm, so there is no down-time when a machine dies. There is a brief window of downtime while a new master is elected. NEO4J Point 8
  • 19. CASSANDRA Cannot do traversal queries Traversal queries that have exponential cost on traditional RDBMS have linear cost on Neo. NEO4J Point 9
  • 20. CASSANDRA Write performance is just as good as read performance. Write performance is slower than read performance. NEO4J Point 10
  • 21. CASSANDRA Every query has additional latency due to cluster overhead Individual queries can be serviced much faster with far less latency. NEO4J Point 11
  • 22. CASSANDRA ACID transactions are mostly supported, but with tunable consistency. ACID transactions are fully supported and completely consistent, but there is a performance hit for the consistency. NEO4J Point 12
  • 23. CASSANDRA Cassandra can perform operations completely synchronously or alternatively at variously levels of consistency with corresponding performance on an operation by operation basis. Consistency is not tunable NEO4J Point 13
  • 24. CASSANDRA Cassandra uses it’s own query language (CQL) that has similar syntax to SQL (no joins) Neo uses Cypher and also supports Gremlin NEO4J Point 14
  • 25. CASSANDRA Instead of performing joins at runtime data must be de-normalized before hand Graphs are normalized and highly connected. Traversals are very fast. NEO4J Point 15
  • 26. CASSANDRA “Unlike in relational databases, it’s not easy to tune or introduce new query patterns in Cassandra by simply creating secondary indexes or building complex SQLs (using joins, order by, group by?) because of its high-scale distributed nature. So think about query patterns up front, and design column families accordingly.” –Ebay http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling- best-practices-part-1/#.U-j6fICwIeY
  • 27. NEO4J “A single instance of Neo4j can house at most 34 billion nodes, 34 billion relationships, and 68 billion properties, in total. Businesses like Google obviously push these limits, but in general, this does not pose a limitation in practice. It is also important to understand that these limits were chosen purely as a storage optimization, and do not indicate any particular shortcoming of the product. They are easily, and are in fact being, increased.” http://info.neotechnology.com/rs/neotechnology/images/Understanding %20Neo4j%20Scalability(2).pdf
  • 29. SQL SELECT TOP (5) [t14].[ProductName] FROM (SELECT COUNT(*) AS [value], [t13].[ProductName] FROM [customers] AS [t0] CROSS APPLY (SELECT [t9].[ProductName] FROM [orders] AS [t1] CROSS JOIN [order details] AS [t2] INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID] CROSS JOIN [order details] AS [t4] INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID] LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID] CROSS JOIN ([orders] AS [t7] CROSS JOIN [order details] AS [t8] INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID]) WHERE NOT EXISTS(SELECT NULL AS [EMPTY] FROM [orders] AS [t10] CROSS JOIN [order details] AS [t11] INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID] WHERE [t9].[ProductID] = [t12].[ProductID] AND [t10].[CustomerID] = [t0].[CustomerID] AND [t11].[OrderID] = [t10].[OrderID]) AND [t6].[CustomerID] <> [t0].[CustomerID] AND [t1].[CustomerID] = [t0].[CustomerID] AND [t2].[OrderID] = [t1].[OrderID] AND [t4].[ProductID] = [t3].[ProductID] AND [t7].[CustomerID] = [t6].[CustomerID] AND [t8].[OrderID] = [t7].[OrderID]) AS [t13] WHERE [t0].[CustomerID] = N'ALFKI' GROUP BY [t13].[ProductName]) AS [t14] ORDER BY [t14].[value] DESC
  • 30. CONCLUSION If one plans on writing a recommendation queries, a graph db is a more elegant fit than a relational DB. Only use Titan if you need it  You have an insanely large graph  Or you expect an insanely high load Neo4J  Faster queries  A more straightforward and powerful query language