I’ve outgrown my basic stack. Now what?

Hivereader.com
Thoughts and feelings about growing with Django and NoSQL

Our common stack is built on:
PostgreSQL: super awesome and fast (once you learn which knobs to turn).
Lots of cool features and tools: pgtune, pg_top, PgBouncer
Django: a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
Memcached: in-memory key-value store for small chunks of data.
Super simple and awesome

People are signing up and using your
app/game/site/bread maker/whatever

Things start to get slow

Single server
[Diagram: app, DB, and other services all on one server]

Find your bottlenecks:
Unless your app is doing something crazy
you’re mostly abusing the DB
Add more caching or, better yet, smarter caching
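A minimal sketch of what "smarter caching" can look like: cache-aside reads plus explicit invalidation on writes. The `TTLCache` class is a tiny in-process stand-in for memcached, and `load_profile_from_db`, `get_profile`, and `rename_user` are hypothetical names invented for this illustration.

```python
import time

class TTLCache:
    """Tiny in-process stand-in for memcached (illustrative only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired entries behave like a cache miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)

    def delete(self, key):
        self._store.pop(key, None)

cache = TTLCache()
stats = {"db_hits": 0}

def load_profile_from_db(user_id):
    """Hypothetical expensive query; counts how often the DB is touched."""
    stats["db_hits"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_profile(user_id):
    """Cache-aside read: try the cache first, fall back to the DB."""
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl=300)
    return profile

def rename_user(user_id, new_name):
    """After a write (DB update omitted here), drop the stale cache entry."""
    cache.delete(f"profile:{user_id}")
```

Invalidating on write, rather than waiting for the TTL to expire, is one way to cache aggressively without serving stale data for long.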

Welcome to the Postgres config file
pgtune is here to help*
*kind of
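As a rough illustration of the kind of starting values such a tool suggests, here is a sketch based on two commonly quoted rules of thumb: about 25% of RAM for `shared_buffers` and about 75% for `effective_cache_size`. This is not pgtune's actual algorithm, and `suggest_settings` and `to_conf_lines` are hypothetical helpers. Treat the output as a starting point to measure against, not a recommendation.

```python
def suggest_settings(total_ram_mb):
    """Rule-of-thumb starting values (an assumption, not pgtune's logic)."""
    return {
        # Roughly a quarter of RAM for Postgres' own buffer cache.
        "shared_buffers": f"{total_ram_mb // 4}MB",
        # Planner hint: how much memory the OS plus Postgres can cache.
        "effective_cache_size": f"{total_ram_mb * 3 // 4}MB",
    }

def to_conf_lines(settings):
    """Render the settings as postgresql.conf style lines."""
    return [f"{name} = {value}" for name, value in settings.items()]

if __name__ == "__main__":
    for line in to_conf_lines(suggest_settings(8192)):
        print(line)
```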

[Diagram: app, DB, and other services split across multiple servers]

Need a solution for lots of data that is growing quickly.
The solution needs to be targeted at my problem.

NoSQL to the rescue?

But wait, can Postgres handle this?
Most likely, with partitioning and sharding
Sharding can mean lots of app code.
What about when you outgrow your shard key?
“Don’t shard until you have to”
- every single talk I’ve seen
Slony?
“master to multiple slaves” replication
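To make the "lots of app code" point concrete, here is a toy sketch of application-level shard routing. It assumes the simplest possible scheme (hash of the key modulo the number of shards); `shard_for` and `moved_fraction` are hypothetical helpers. It also shows one way a sharding scheme gets outgrown: changing the shard count remaps most keys.

```python
import hashlib

def shard_for(key, num_shards):
    """Pick a shard by hashing the shard key (stable across processes)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def moved_fraction(keys, old_n, new_n):
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)

if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10000)]
    print("user:42 lives on shard", shard_for("user:42", 4))
    print("fraction moved going from 4 to 5 shards:", moved_fraction(keys, 4, 5))
```

With plain modulo routing, growing the cluster means migrating most of the data, which is part of why "don't shard until you have to" keeps coming up.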

Lots of options
+ tons more

Nearly all of them are based on two papers
Built on the CAP theorem
The theorem began as a conjecture made by University of California, Berkeley computer
scientist Eric Brewer at the 2000 Symposium on Principles of Distributed Computing.
In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's
conjecture, rendering it a theorem.

Redis
Super fast
Advanced key-value store
Think of it as super memcached, with set math (unions, intersections, and so on).
All data must fit in RAM
Often referred to as a data structure server
Values can be strings, hashes, lists, sets and sorted sets
Not a DB solution but more of a helper.
This is now part of our basic stack for most apps.
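The set math mentioned above corresponds to Redis commands such as SUNION and SINTER. The sketch below mimics their semantics with plain Python sets so it runs without a Redis server; the `store` dict, the key names, and the `sunion` and `sinter` helpers are illustrative stand-ins, not a client API.

```python
# A dict of sets standing in for Redis keys that hold set values.
store = {
    "followers:alice": {"bob", "carol", "dave"},
    "followers:erin": {"carol", "dave", "frank"},
}

def sunion(store, *keys):
    """Like SUNION: members present in any of the given sets."""
    result = set()
    for key in keys:
        result |= store.get(key, set())
    return result

def sinter(store, *keys):
    """Like SINTER: members present in all of the given sets."""
    sets = [store.get(key, set()) for key in keys]
    return set.intersection(*sets) if sets else set()

if __name__ == "__main__":
    print(sorted(sunion(store, "followers:alice", "followers:erin")))
    print(sorted(sinter(store, "followers:alice", "followers:erin")))
```

A missing key is treated as an empty set here, which matches how Redis handles non-existent keys in these commands.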

MongoDB
Document store
JSON-like documents with dynamic schemas
Ad-hoc queries
Indexing
Load balancing: MongoDB scales horizontally using sharding
Global write lock*
Uncompressed field names (stored in every document)
Safe mode (acknowledged writes) off by default
Just google “mongo problems” or “moving off mongodb”
*MongoDB uses a readers-writer lock that allows concurrent read access to a database but gives exclusive access to a single write operation. While a write lock is held, no other read or write operations may share the lock.

Hadoop / HBase
Uses a pre-defined column family format
MapReduce
Used by pretty much everyone with ‘big data’ problems
Amazing workhorse for data
You need a sizable cluster
Cluster setup can be difficult
I personally need to spend more time with this
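Since MapReduce comes up here and on several later slides, here is a minimal single-process sketch of the idea, using the classic word count. It only illustrates the map, shuffle, and reduce phases; a real cluster distributes each phase across machines, and the function names are invented for this example.

```python
from collections import defaultdict

def map_phase(doc):
    """Emit a (word, 1) pair for every word in one document."""
    for word in doc.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse the values for one key into a single result."""
    return key, sum(values)

def map_reduce(docs):
    pairs = (pair for doc in docs for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

if __name__ == "__main__":
    print(map_reduce(["big data big cluster", "big problems"]))
```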

CouchDB
JSON to store data, JavaScript for MapReduce and HTTP for an API.
Views: embedded map/reduce
Multi-master replication
BigCouch, Couchbase, Membase?
Kind of in a dev rut...
...but just pushed a huge new upgrade

Riak
Heavily based on Amazon's Dynamo paper
Key-value store
Very good at handling failure, “no downtime”
MapReduce / secondary indexes
Built-in full-text search
Link walking
2 types of MapReduce:
JavaScript - can be slow as hell
Erlang - super fast
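One idea from the Dynamo paper that helps explain the "no downtime" story is consistent hashing: keys and nodes are placed on a hash ring, so adding or removing a node only moves a slice of the keys. The toy `HashRing` below is a sketch of that idea under simplified assumptions (MD5 positions, 64 virtual nodes per node, no replication), not Riak's implementation.

```python
import bisect
import hashlib

def _hash(value):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=64):
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node, vnodes)

    def add(self, node, vnodes=64):
        """Place several virtual points for the node around the ring."""
        for i in range(vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        """Walk clockwise from the key's position to the next node point."""
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]

if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"user:{i}" for i in range(10000)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add("node-d")
    moved = [k for k in keys if ring.node_for(k) != before[k]]
    print("fraction moved after adding a node:", len(moved) / len(keys))
```

Only keys that now belong to the new node move; every other key stays where it was, unlike simple modulo-based placement.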

Cassandra
Key-value + row-oriented = column family
Linear scalability and fault tolerance on commodity hardware or cloud infrastructure
Built by Facebook for Inbox Search
Has CQL3 - think SQL, kind of
Pre-baked auto-clustering AMI
Super fast writes
Compresses data that’s not accessed a lot
Can tie in to Hadoop for big MapReduce jobs

Things to think about:
Is eventual consistency ok for you?
Do you know the queries you need right now?
Is your data complicated or simple?
How fast does it grow?
How long do you want that data to hang around?
Really think about trade-offs.
Every system has its good and bad sides
There is no “winner”, so stop searching “which is best”
Think about which fits your use case
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
http://docs.basho.com/riak/1.2.1/references/appendices/comparisons/
Tons of links out there, just make sure they are relatively new
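On the eventual consistency question above: Dynamo-style stores such as Riak and Cassandra let you tune how many replicas must answer a read (R) or acknowledge a write (W) out of N copies. The sketch below checks the usual rule that choosing R + W > N forces every read to overlap the latest write. It is a brute-force illustration over replica subsets, not any database's API, and both function names are invented here.

```python
from itertools import combinations

def quorum_overlaps(n, r, w):
    """The rule of thumb: reads see the latest write when R + W > N."""
    return r + w > n

def always_overlap(n, r, w):
    """Brute force: does every R-sized read set share a replica
    with every W-sized write set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )

if __name__ == "__main__":
    for r, w in [(1, 1), (2, 2), (1, 3)]:
        print(f"N=3 R={r} W={w} -> overlap guaranteed:", quorum_overlaps(3, r, w))
```

Lower R and W values give faster, more available operations at the cost of possibly reading stale data, which is exactly the trade-off to decide on per use case.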
