** An update to the 2012 MongoUK presentation, given at NoSQL Birmingham/London meetup **
This presentation charts how Talis implemented tripod, a library that runs over the top of MongoDB, to provide access to large scale graph datasets with very high performance query access.
As Talis' own applications became web-scale, the company used tripod as a replacement for its earlier, general purpose RDF triple store, and maintained the graph-model in the code line whilst swapping in MongoDB underneath.
By prioritising what really mattered to those applications, and discarding what did not, the company was able to extract extreme performance from graph based datasets using MongoDB running on commodity hardware.
https://github.com/talis/tripod-php
https://github.com/talis/tripod-node
24. Over time…
• Our apps became popular. Last week, average 4M
requests per day and at peak times 600k+ per hour
• Our dataset is growing in size - about 350M triples
this week
• Our apps needed more queries and more expensive
queries
• Our in-house triple store was EoL and out of date
26. System characteristics
• 99:1 read:write
• Well shared, tenant based system. Our largest
single customer has 35M triples
• Graph data structures and operations (merges, sub-graphs
etc.) well entrenched in the codebase, over
2M lines of code (inc. libraries)
• Actually not that many distinct query shapes
28. DESCRIBE <http://example.com/John>
Give me all the triples about John as a graph
SELECT ?name ?age
WHERE {
<http://example.com/John> <foaf:name> ?name .
<http://example.com/John> <foaf:age> ?age .
}
Give me properties name, age of John as tabular data
33. {
_id: "example:John",
"foaf:knows": { u: "example:Jane" },
"rdf:type": { u: "foaf:Person" },
"foaf:name": { l: "John" }
}
_id is the unique primary key. There can only be one John
35. {
_id: "example:John",
"foaf:knows": { u: "example:Jane" },
"rdf:type": { u: "foaf:Person" },
"foaf:name": { l: "John" }
}
u means value is a
uri, or another node.
l means value is a
literal text value
_id is the unique primary key. There can only be one John
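A minimal sketch of how triples could be grouped into documents of the shape shown on the slide. The function name and the choice to store repeated predicates as a list are assumptions for illustration, not tripod's actual API or storage format.

```python
def triples_to_cbds(triples):
    """Group (subject, predicate, value, kind) tuples into one document per subject.

    kind is "u" for a URI / other node, or "l" for a literal, as on the slide.
    Storing repeated predicates as a list is an assumption of this sketch.
    """
    docs = {}
    for subject, predicate, value, kind in triples:
        # One document per subject; the subject becomes the primary key.
        doc = docs.setdefault(subject, {"_id": subject})
        entry = {kind: value}
        if predicate not in doc:
            doc[predicate] = entry
        elif isinstance(doc[predicate], list):
            doc[predicate].append(entry)
        else:
            doc[predicate] = [doc[predicate], entry]
    return list(docs.values())


john_triples = [
    ("example:John", "foaf:knows", "example:Jane", "u"),
    ("example:John", "rdf:type", "foaf:Person", "u"),
    ("example:John", "foaf:name", "John", "l"),
]
john_docs = triples_to_cbds(john_triples)
```

Because every triple about John lands in the same document, there is exactly one document with _id "example:John".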
52. What about tabular data?
• We also have tables and table specs
• Conceptually the same as views
• Instead of an array of graphs we have computed
columns for complex tabular queries
• You can page, limit, offset results just like you’d
expect
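A rough sketch of the table idea: flatten matching documents into rows using a small spec, then page through the rows. The spec format, the column names and both function names are hypothetical, and the sketch assumes a single-valued rdf:type; tripod's real table specs are not shown in these slides.

```python
def build_rows(cbds, spec):
    """Build one flat row per document whose rdf:type matches the spec.

    spec is a hypothetical dict: {"type": <curie>, "columns": {column: predicate}}.
    """
    rows = []
    for doc in cbds:
        if doc.get("rdf:type", {}).get("u") != spec["type"]:
            continue
        row = {"_id": doc["_id"]}
        for column, predicate in spec["columns"].items():
            value = doc.get(predicate)
            # Prefer the literal, fall back to the URI, None when absent.
            row[column] = None if value is None else value.get("l", value.get("u"))
        rows.append(row)
    return rows


def page(rows, offset=0, limit=10):
    """Return one page of rows, like offset/limit on a tabular query."""
    return rows[offset:offset + limit]


people = [
    {"_id": "example:John", "rdf:type": {"u": "foaf:Person"}, "foaf:name": {"l": "John"}},
    {"_id": "example:Jane", "rdf:type": {"u": "foaf:Person"}, "foaf:name": {"l": "Jane"}},
    {"_id": "example:Acme", "rdf:type": {"u": "foaf:Organization"}, "foaf:name": {"l": "Acme"}},
]
person_spec = {"type": "foaf:Person", "columns": {"name": "foaf:name"}}
person_rows = build_rows(people, person_spec)
```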
56. Tripod save()
• Based on change sets, you supply the old and new
graphs
• CBDs updated immediately. Write ahead transaction
log for multi-CBD writes
• Choice per save on whether to update views/tables
sync or async (eventually consistent)
• Async adds jobs to a Mongo based queue
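A small sketch of the change set idea, treating a graph as a set of (subject, predicate, object) tuples. The function names and the stale-write check are assumptions for illustration; the slides do not show how tripod computes or applies its change sets.

```python
def change_set(old_graph, new_graph):
    """Diff two graphs into the triples to remove and the triples to add."""
    old, new = set(old_graph), set(new_graph)
    return {"removals": old - new, "additions": new - old}


def apply_change_set(stored, changes):
    """Apply a change set to the stored graph.

    Assumption of this sketch: refuse the write if a triple we expect to
    remove is no longer stored, because the caller's old graph was stale.
    """
    stored = set(stored)
    missing = changes["removals"] - stored
    if missing:
        raise ValueError("stale old graph, missing: %r" % sorted(missing))
    return (stored - changes["removals"]) | changes["additions"]


old_graph = {
    ("example:John", "foaf:name", "John"),
    ("example:John", "foaf:knows", "example:Jane"),
}
new_graph = {
    ("example:John", "foaf:name", "Johnny"),
    ("example:John", "foaf:knows", "example:Jane"),
}
changes = change_set(old_graph, new_graph)
```

Supplying both graphs means only the triples that actually differ are touched. Unchanged triples, such as the foaf:knows one here, never appear in the change set.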
61. Hardware
• Real tin, 2x Dell low-end rack mount servers
• 96GB RAM, 24 cores
• RAID-10 disks, non-SSD
• Keep ‘em on the same LAN as your app servers
• About the same to lease per month as a couple of
c3.4xlarge (30GB, 32vCPU)
• We’re about to add a similar second cluster, 144GB
62. Why Mongo?
RTFM, not HN comment feeds.
But seriously it could have been n other document DBs
Not just mongodb
Specific to our circumstances
YMMV
The theory part - remember I’m not a data scientist ;-)
Ball and stick diagrams
The balls are nodes and the sticks are named relationships between the nodes
This is an undirected graph
This is a directed graph
directional relationship
Doesn’t tell us Jane knows John
A toolset to work with graph data.
Directed graphs
Values can be other Entities
This is a triple
The same node can be a subject or an object.
In RDF subjects and properties are actually URIs that can be dereferenced
Here the predicate is part of a public vocabulary called FOAF
Billions of triples out there on the public internets defined using FOAF
Namespacing - makes URIs shorter
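To make the namespacing point concrete, here is a minimal sketch of expanding and compacting prefixed names. The prefix table and function names are assumptions for illustration. The FOAF namespace URI is the real one, and the example prefix is assumed to map to http://example.com/ as in the earlier query slide.

```python
# Hypothetical prefix table for this sketch.
NAMESPACES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "example": "http://example.com/",
}


def expand(curie, namespaces=NAMESPACES):
    """Turn 'foaf:name' into a full URI; leave unknown prefixes untouched."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix] + local
    return curie


def compact(uri, namespaces=NAMESPACES):
    """Turn a full URI into its short prefixed form when a namespace matches."""
    # Try the longest namespace first so the most specific prefix wins.
    for prefix, base in sorted(namespaces.items(), key=lambda kv: len(kv[1]), reverse=True):
        if uri.startswith(base):
            return prefix + ":" + uri[len(base):]
    return uri
```

With this table, expand("foaf:name") gives the full FOAF URI and compact("http://example.com/John") gives "example:John", which is the short form used as _id in the documents.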
Here it is in ball and stick
Yes, you can!
Data schema only makes sense to you
Not graph data
Complex graphs quickly end in renormalisation hell, or many, many follow your nose queries
Real data graphs quickly get complicated
Really easy to merge datasets from different sources that talk ABOUT THE SAME THING
Global identifiers via URIs
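A tiny sketch of why merging is easy when identifiers are global: with graphs held as sets of triples, a merge is just a set union. The two sources and the function name are made up for illustration.

```python
def merge_graphs(*graphs):
    """Merge any number of graphs (iterables of triples) into one set."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


JOHN = "http://example.com/John"
FOAF = "http://xmlns.com/foaf/0.1/"

# Two hypothetical sources describing the same URI.
source_a = {
    (JOHN, FOAF + "name", "John"),
    (JOHN, FOAF + "knows", "http://example.com/Jane"),
}
source_b = {
    (JOHN, FOAF + "name", "John"),
    (JOHN, FOAF + "age", "30"),
}
merged = merge_graphs(source_a, source_b)
```

The triple both sources agree on appears once in the result, and no key mapping between the sources is needed because they already use the same URI for John.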
W3 standard
SQL-like, to an extent.
WHERE is Pattern matching, essentially joins
UNIONS, Geo extensions, etc.
4 main query types
We started working on our first application in 2008
Talis was 3 companies back then. One built a general purpose graph store, part of technical strategy to build on it
RDF based, integrates other data sources from around the web
We did caching for performance. Complicated!
Data size outgrew our existing general purpose technology stack, became hard to operate
Complex SPARQL queries expensive on large data sets
In 2008 even low hundreds of ms from the DB was acceptable (with caching). Today we do 20 queries a page and expect single-digit ms or better.
Our graph store was end-of-lifed
2012 - project to replace generalised triple store with something more specific to our app
FIND A NEW POD FOR OUR TRIPLES
It’s a library. Currently implemented in php and parts ported to node. Sorry, our apps are php.
We didn’t consider moving from the graph. You can’t just refactor the whole codebase to relational and flip a switch overnight and expect it to work. This was a moving target.
Lots of very simple data
These can be satisfied very easily and cheaply if you group all the immediate properties of a subject together
“Concise Bound Description”
Earlier example
Graph theory concept: CBD
Our data model: One document per CBD
In more detail
_id indexed by default
Mega fast queries with single docs returned, no cursors.
Microseconds on decent hardware.
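A sketch of why one document per CBD makes the simple queries cheap. A plain dict keyed by _id stands in for the MongoDB collection, and the function names are hypothetical rather than tripod's API. In a real deployment each of these would be a single lookup on the _id index.

```python
# In-memory stand-in for the CBD collection, keyed by _id.
COLLECTION = {
    "example:John": {
        "_id": "example:John",
        "foaf:knows": {"u": "example:Jane"},
        "rdf:type": {"u": "foaf:Person"},
        "foaf:name": {"l": "John"},
    },
}


def describe(collection, subject):
    """DESCRIBE: everything known about one subject is one document."""
    return collection.get(subject)


def select(collection, subject, predicates):
    """SELECT a few properties of one subject from the same single document."""
    doc = collection.get(subject)
    if doc is None:
        return None
    row = {}
    for predicate in predicates:
        value = doc.get(predicate)
        # Prefer the literal, fall back to the URI, None when absent.
        row[predicate] = None if value is None else value.get("l", value.get("u"))
    return row
```

Both query shapes touch exactly one document, so there is no cursor to iterate and no join to perform.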
Contrast that with most triple stores, which traditionally model the individual triple. Cayley is one of them.
Makes queries expensive. Have to deal with cursors with 1..n documents. Have to pluck values via multiple or complex queries.
Gets worse when you want to find matches by value
JOINS
Typical complex query
9 “joins”
Document databases don’t generally like joins. map reduce?!
Only thing that changes in this query is the URI
9 joins in this query = expensive
In the whole system we probably only had 20 queries that required joins
A revelation from the data gods!
Flexibility of SPARQL great for the developer but simply put hard to scale
Thousands of hours of optimisation have gone into relational DB query engines over decades
This is why in the old design we hid everything behind a cache
Pre compute all possible answers to the query
data storage cheap
Without pre-computed views
This is just a single join. Very messy.
What if John knows 50 people? n+1 queries.
We discard and re-generate views at write time
There’s only about 20 of them in our whole app.
Pre-computed typed views
1 query, ultra fast
When we do a save we do a lookup to see which views might be impacted
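A rough sketch of the pre-computed view idea: follow one join at write time, store the result as an array of graphs, and remember which documents the view depends on so a later save can find it. The field name impacted_by and both function names are assumptions for illustration, not tripod's actual view format.

```python
def as_list(value):
    """Treat a single value object and a list of them uniformly."""
    return value if isinstance(value, list) else [value]


def build_view(cbds, root_id, join_predicate):
    """Pre-compute the root document plus every document it links to."""
    root = cbds[root_id]
    graphs = [root]
    for entry in as_list(root.get(join_predicate, [])):
        target = cbds.get(entry.get("u"))
        if target is not None:
            graphs.append(target)
    return {
        "_id": root_id,
        "graphs": graphs,
        # Hypothetical field: ids whose change should regenerate this view.
        "impacted_by": [graph["_id"] for graph in graphs],
    }


def impacted_views(views, changed_id):
    """On save, find the views that include the changed document."""
    return [view["_id"] for view in views if changed_id in view["impacted_by"]]


cbds = {
    "example:John": {"_id": "example:John", "foaf:knows": {"u": "example:Jane"}},
    "example:Jane": {"_id": "example:Jane", "foaf:name": {"l": "Jane"}},
}
john_view = build_view(cbds, "example:John", "foaf:knows")
```

Reading the view is then one lookup regardless of how many people John knows. The cost of the join is paid once at write time instead of on every read.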
In our config
Simple config lang with a few keywords
This means we have to specify queries up front, not send them at run time
Most complex in our system. 11 joins!
This is a table row from our system
Instead of graphs, key/value pairs
Note you can have multi-value cells (type). This was a limitation of SPARQL select for us.
CBD collections are read-write for the developer
table_rows, views read only, tripod driver manages regeneration
In our system:
50M distinct CBDs
34M distinct views
23M distinct table rows
roughly 800MB per MT inc. indexes and views etc.
This brings us nicely onto saves.
Trade speed with eventual consistency.
Mongo doesn’t have transactions. TLog is a separate mongo cluster used to control transactions + rollback. Also allows us to update a nightly backup to the last applied transaction in the case of total data loss.
TLog is in Mongo but a poor choice. Moving to Postgres.
Async faster, but not consistent. Depends on situation
Queue implemented in Mongo, moving to redis + resque (probably)
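A minimal sketch of the sync versus async choice per save, with an in-process deque standing in for the queue. The class and method names are hypothetical. The real queue lives in Mongo, as noted above, and this sketch ignores the transaction log entirely.

```python
from collections import deque


class ViewUpdater:
    """Regenerate views either immediately or later via a queue."""

    def __init__(self, regenerate):
        self.regenerate = regenerate  # callable taking a subject id
        self.queue = deque()

    def on_save(self, subject_id, sync=True):
        if sync:
            # Consistent straight away, but the save waits for regeneration.
            self.regenerate(subject_id)
        else:
            # Faster save; views are stale until a worker drains the queue.
            self.queue.append(subject_id)

    def drain(self):
        """What a background worker would do: process all queued jobs."""
        processed = 0
        while self.queue:
            self.regenerate(self.queue.popleft())
            processed += 1
        return processed


regenerated = []
updater = ViewUpdater(regenerated.append)
updater.on_save("example:John", sync=True)
updater.on_save("example:Jane", sync=False)
```

At this point only John's views have been regenerated. Jane's are updated when drain() runs, which is the eventually consistent window the async option trades for speed.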
Tripod has built in ability to collect stats. We use statsD+graphite
A lot fewer tabular queries
Scale on left is ms.
This includes database, network to web server and the time marshalling into php objects. This is where the extra time is spent for views!
Cost wise cloud just didn’t stack up for us, esp in 2012.
Like for like, cloud costs about 2x tin today; in 2012 it was 8x
RAM is king here
We’re adding a second cluster with 144GB shortly
PaaS is prohibitively expensive at scale. We had a support contract on the first cluster but on the second we’re going it alone.
Don’t mention write preference or I will shoot you in the head.
Looking for a document database not a graph database
Evaluated Couch, Riak and Postgres
- Couchbase was a new product, just merged with Membase. Felt risky although map/reduce queries fitted well with views/tables
- Riak. Features we liked were commercially licensed
- Postgres - JSON datatype was primitive at the time, worth a second look today tho
ServerDensity David Swiss Army Knife NoSQL
Community
- commercial
- & developer
Friendly API
Ultimately not bound to it - swapping out parts as we scale
There’s a lot of shit written about Mongo. Don’t read HN. Instead RTFM.
But not enough time
FINALLY before we go, a sneak peek of a project we’ve been working on to hide the graph entirely…
Our app already worked natively with graphs
But the model in most apps is not a graph
Aperture is our new project built on top of tripod allowing you to hide the complexity of graphs
Plain old php object