RELATIONAL
SQL
ACID
Relational algebra
Optimal for ad-hoc queries
Tables, Columns, Rows
Sharding can be difficult
Metadata separate from data
Normalized data
Optimized storage
POPULAR RDBMS
MySQL
SQL Server
Oracle
Postgres
DB2
Interbase, Firebird
Informix
Progress
Pervasive
Sybase
Access
…
SQL
Unified language to create and query both data and metadata
Similar to English
Verbose(!)
Can get complex for non-trivial queries
Does not expose the execution plan – you say what you want returned, not how to get it
SQL EXAMPLES
If you can say what you mean, you can query the existing data
Results are near-instant when querying based on primary key
select * from valute where id=1 and sid=42
Results are fast when querying based on non-unique index
select valuta from valute where ((id=1 and sid=42)) and (valute.firma_id=123 and valute.firma__sid=1)
Very readable for trivial queries
select r.customer, sum(rs.iznos) sveukupno from racuni r
join racuni_stavke rs on r.id=rs.racun_id
where r.id=5
group by r.customer
SQL EXAMPLES
Not so readable for non-trivial queries
select "MP" tip_prometa, mprac.broj broj_racuna, mprac_stavke.kolicina kolicina, (mprac.tecaj*mprac_stavke.kolicina*mprac_stavke.rabat_iznos)
rabat_iznos, (round(mprac_stavke.cijena - mprac_stavke.rabat_iznos - mprac_stavke.rabat2_iznos - mprac_stavke.rabat3_iznos - mprac_stavke.porez1 -
mprac_stavke.porez2 - mprac_stavke.porez_potrosnja,6)*mprac_stavke.kolicina) iznos, (mprac_stavke.kolicina* ifnull((select
sum(pn_cijena*kolicina)/sum(kolicina) from mprac_skl left join skl_stavke on mprac_skl.skl_id=skl_stavke.skl_id and
mprac_skl.skl__sid=skl_stavke.skl__sid where mprac_skl.mprac_id=mprac.id and mprac_skl.mprac__sid=mprac.sid and
skl_stavke.artikl_id=mprac_stavke.artikl_id and skl_stavke.artikl__sid=mprac_stavke.artikl__sid ),0) ) iznos_nabavno, ifnull( (select
sum(mprac_stavke.kolicina*ambalaze.naknada_kom) from artikli_ambalaze left join ambalaze on ambalaze.id=artikli_ambalaze.ambalaza_id and
ambalaze.sid=artikli_ambalaze.ambalaza__sid where artikli_ambalaze.artikl_id=artikli.id and artikli_ambalaze.artikl__sid=artikli.sid and
ambalaze.kalkulacija="N" ),0) naknada, radnici_komercijalisti.ime racun_komercijalist_ime, (select naziv from skladista where skladista.tip_skladista="M"
and pj_id=mprac.pj_id limit 1) skladiste_naziv , pj.naziv pj_naziv, mprac.datum,
cast(concat("(",if(DayOfWeek(mprac.datum)=1,7,DayOfWeek(mprac.datum)-1),") ", if(DayOfWeek(mprac.datum)=1,"1 Nedjelja",
if(DayOfWeek(mprac.datum)=2,"2 Ponedjeljak", if(DayOfWeek(mprac.datum)=3,"3 Utorak", if(DayOfWeek(mprac.datum)=4,"4 Srijeda",
if(DayOfWeek(mprac.datum)=5,"5 Èetvratk", if(DayOfWeek(mprac.datum)=6,"6 Petak", if(DayOfWeek(mprac.datum)=7,"7 Subota","")))))))) as char(15))
dan_u_tjednu, cast(month(mprac.datum) as unsigned) mjesec, cast(week(mprac.datum) as unsigned) tjedan, cast(quarter(mprac.datum) as unsigned) kvartal,
cast(year(mprac.datum) as unsigned) godina, cast(if(tipovi_komitenata.tip="F",trim(concat(partneri.ime," ",partneri.prezime)),partneri.naziv) as char(200))
kupac_naziv, partneri_mjesta.postanski_broj kupac_mjesto, partneri_mjesta.mjesto kupac_mjesto_naziv, partneri_grupe_mjesta.naziv …
RDBMS SCALING
Vertical scaling
• Better CPU, more CPUs
• More RAM
• More disks
• SAN
Partitioning
Sharding
PARTITIONING
With many rows and heavy usage, partitioning is a must
What to partition
• Tables
• Indexes
• Views
Typical cases
• Monthly data
• Alphabetical keys
RDBMS SHARDING
Sharding means using several databases, each holding part of the data (e.g. 500 clients on one server, another 500 on another)
Requires changing application code (see the sketch below)
connect(calculate_server_from(sharding_key))
Joining data from different databases is impossible, so choose your sharding key wisely
Very difficult to repartition your databases based on a new key
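To make connect(calculate_server_from(sharding_key)) concrete, here is a minimal Python sketch of application-side shard routing; the host list and the connect() stub are illustrative assumptions, not a real driver:

SHARDS = ["db1.example.com", "db2.example.com"]  # hypothetical hosts

def calculate_server_from(sharding_key):
    # e.g. clients 0-499 on the first server, 500-999 on the second;
    # here keys are simply spread across the shard list
    return SHARDS[sharding_key % len(SHARDS)]

def connect(host):
    print("connecting to", host)  # stand-in for a real database connection

connect(calculate_server_from(sharding_key=123))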
RDBMS METADATA
Metadata: data describing other data
RDBMS structures are explicitly defined, and each data type is optimized for storage
Lots of constraints
Can get slow with a lot of data
NOSQL
“Not SQL”, “Not only SQL”
Core NoSQL databases were invented mostly because RDBMSes made life very hard for huge, heavy-traffic web databases
NoSQL databases are those that differ significantly from relational databases
NOSQL TYPES
Wide Column Store / Column Families
Document Store
Key Value / Tuple Store
Graph Databases
Object Databases
XML Databases
Multivalue Databases
4 MAIN DATA MODELS
Key-Value Stores
BigTable Clones (aka "ColumnFamily")
Document Databases
Graph Databases
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
KEY/VALUE STORES
Lineage: Amazon's Dynamo paper and Distributed HashTables.
Data model: A global collection of key-value pairs.
Example: Voldemort, Dynomite, Tokyo Cabinet
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
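As a rough Python analogy (purely illustrative, not any particular product's API), this data model behaves like one huge distributed dict:

store = {}                             # a global collection of key-value pairs
store["user:42:name"] = "Ana"          # put
print(store.get("user:42:name"))       # get
store.pop("user:42:name", None)        # delete
# no joins, no ad-hoc queries: the only access path is key -> value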
BIGTABLE CLONES
Lineage: Google's BigTable paper.
Data model: Column family, i.e. a tabular model where each row at
least in theory can have an individual configuration of columns.
Example: HBase, Hypertable, Cassandra
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
DOCUMENT DATABASES
Lineage: Inspired by Lotus Notes.
Data model: Collections of documents, which contain key-value
collections (called "documents").
Example: CouchDB, MongoDB, Riak
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
GRAPH DATABASES
Lineage: Draws from Euler and graph theory.
Data model: Nodes & relationships, both which can hold key-value
pairs
Example: AllegroGraph, InfoGrid, Neo4j
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
NOSQL CHARACTERISTICS
Almost infinite horizontal scaling
Very fast
Performance doesn’t deteriorate with growth (much)
No fixed table schemas
No join operations
Ad-hoc queries difficult or impossible
Structured storage
Almost everything happens in RAM
REAL-WORLD USE
Cassandra
• Facebook (original developer, used it till late 2010)
• Twitter
• Digg
• Reddit
• Rackspace
• Cisco
BigTable
• Google (open-source version is HBase)
MongoDB
• Foursquare
• Craigslist
• Bit.ly
• SourceForge
• GitHub
WHY NOSQL?
Handles huge databases (I know, I said it before)
Redundancy, data is pretty safe on commodity hardware
Super flexible queries using map/reduce
Rapid development (no fixed schema, yeah!)
Very fast for common use cases
PERFORMANCE
An RDBMS uses buffering to ensure ACID properties
NoSQL does not guarantee ACID and is therefore much faster
We don’t need ACID everywhere!
I used MySQL and switched to MongoDB for my analytics app
• Data processing (every minute) is 4x faster with MongoDB, despite being a lot more detailed (due to much simpler development)
SCALING
Simple web application with not much traffic
• Application server and database server all on one machine
SCALING
More traffic comes in
• Load balancer
• Application server x2
• Database server
SCALING
Even more traffic comes in
• Load balancer x N
• easy
• Application server x N
• easy
• Database server x N
• hard for SQL databases
SQL SLOWDOWN
Not linear!
http://www.slideshare.net/rightscale/scaling-sql-and-nosql-databases-in-the-cloud
NOSQL SCALING
Need more storage?
• Add more servers!
Need higher performance?
• Add more servers!
Need better reliability?
• Add more servers!
SCALING SUMMARY
You can scale SQL databases (Oracle, MySQL, SQL Server…)
• This will cost you dearly
• If you don’t have a lot of money, you will reach limits quickly
You can scale NoSQL databases
• Very easy horizontal scaling
• Lots of open-source solutions
• Scaling was one of the basic design goals, so it is handled well
• Scaling brings trade-offs, such as having to use map/reduce
RAM
Why map/reduce? I just need some simple queries. Tomorrow I will need some other queries…
SQL databases are optimized for very efficient disk access, but significant scaling requires RAM caching (MySQL+memcached – sketched below)
NoSQL databases are designed to keep the whole working set in RAM
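A sketch of that MySQL+memcached pattern in Python: check the RAM cache first, fall back to the database on a miss. The pymemcache client and the query_database() stub are assumptions for illustration:

from pymemcache.client.base import Client

cache = Client(("localhost", 11211))   # assumes a local memcached

def query_database(key):
    return b"rows from MySQL"          # hypothetical stand-in for the real SQL query

def get_report(key):
    value = cache.get(key)
    if value is None:                  # cache miss: go to the slow, disk-based database
        value = query_database(key)
        cache.set(key, value, expire=300)
    return value

print(get_report("report:30d"))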
WORKING SET
In real-world use, the working set is much smaller than the complete database
• For analytics, 99% of queries will concern the last 30 days
As you need RAM only for the working set, you can use commodity servers or VPSes and just add more as your app becomes more popular
WORKING SET WOES
Foursquare has millions of users and a working set as large as the entire database
They used a single 66 GB Amazon EC2 High-Memory Quadruple Extra Large Instance (with cheese) for millions of users
When their RAM usage reached 65 GB, they decided to shard
Too late: the server started swapping to disk
Disk is much slower than RAM – a 100x slowdown
The server could not keep up due to swapping
An 11-hour outage (ouch!)
MAP/REDUCE
Google's framework for processing highly distributable problems across huge datasets using a large number of computers
Let's define "large number of computers"
• Cluster if all of them have the same hardware
• Grid otherwise (if !Cluster, for old-style programmers)
MAP/REDUCE
Process split into two phases
• Map
• Take the input, partition it, and delegate parts to other machines
• Other machines can repeat the process, leading to a tree structure
• Each machine returns results to the machine that gave it the task
• Reduce
• Collect results from the machines you gave tasks to
• Combine the results and return them to the requester
• Slower than sequential data processing, but massively parallel
• Can sort a petabyte of data in a few hours
• Input, Map, Shuffle, Reduce, Output (sketched below)
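A minimal single-process Python sketch of the phases above (a toy word count; a real framework runs map and reduce on many machines):

from collections import defaultdict

def mapper(chunk):
    # Map: turn one chunk of input into (key, value) pairs
    return [(word, 1) for word in chunk.split()]

def reducer(word, counts):
    # Reduce: combine all values emitted for one key
    return word, sum(counts)

chunks = ["to be or not to be", "to do is to be"]

groups = defaultdict(list)             # Shuffle: group intermediate pairs by key
for chunk in chunks:                   # each machine would map its own chunk
    for word, one in mapper(chunk):
        groups[word].append(one)

print(dict(reducer(w, c) for w, c in groups.items()))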
MONGODB
Data is stored as BSON (binary JSON)
• Makes it very well suited for languages with native JSON support
Map/Reduce written in JavaScript
• Slow! There is only a single thread of JavaScript execution
Master/slave replication (auto failover with replica sets)
Sharding built-in
Uses memory mapped files for data storage
Performance over features
On 32-bit systems, limited to ~2.5 GB
An empty database takes up 192 MB
GridFS to store big data + metadata (not actually an FS)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
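A small pymongo sketch of the schemaless BSON documents described above (assumes a local mongod; database and collection names are made up):

from pymongo import MongoClient

client = MongoClient()                 # assumes mongod on localhost:27017
db = client.analytics                  # hypothetical database name

# no fixed schema: documents in one collection may differ in shape
db.events.insert_one({"user": "ana", "action": "login"})
db.events.insert_one({"user": "ivan", "action": "buy", "amount": 42.0})

print(db.events.find_one({"user": "ana"}))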
CASSANDRA
Written in: Java
Protocol: Custom, binary (Thrift)
Tunable trade-offs for distribution and replication (N, R, W)
Querying by column, range of keys
BigTable-like features: columns, column families
Writes are much faster than reads (!)
• Constant write time regardless of database size
Map/reduce possible with Apache Hadoop
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
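A hedged sketch using pycassa, a Thrift-based Python client from this era; the "Demo" keyspace and "Users" column family are assumed to already exist:

import pycassa

pool = pycassa.ConnectionPool("Demo", ["localhost:9160"])
users = pycassa.ColumnFamily(pool, "Users")

# each row can carry its own set of columns
users.insert("ana", {"email": "ana@example.com", "city": "Zagreb"})
print(users.get("ana"))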
HBASE
Written in: Java
Main point: Billions of rows X millions of columns
Modeled after BigTable
Map/reduce with Hadoop
Query predicate push-down via server-side scan and get filters
Optimizations for real time queries
A high performance Thrift gateway
HTTP gateway supports XML, Protobuf, and binary
Cascading, Hive, and Pig source and sink modules
No single point of failure
While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs. HBase is a column-oriented key/value store and allows for low-latency reads and writes.
Random access performance is on par with MySQL
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
REDIS
Written in: C/C++
Main point: Blazing fast
Disk-backed in-memory database
Master-slave replication
Simple values or hash tables by keys
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Sorted sets (high score table, good for range queries)
Has transactions (!)
Values can be set to expire (as in a cache)
Pub/Sub lets one implement messaging (!)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
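A quick tour of those data types with the redis-py client (assumes a local redis-server and redis-py >= 3.0 for the zadd signature):

import redis

r = redis.Redis()

r.set("page:home:hits", 1)             # simple value
r.expire("page:home:hits", 3600)       # values can expire, as in a cache
r.sadd("online", "ana", "ivan")        # set (union/diff/inter supported)
r.lpush("jobs", "send-email")          # list, usable as a queue
r.zadd("highscores", {"ana": 920})     # sorted set: a high-score table

pipe = r.pipeline()                    # MULTI/EXEC transaction
pipe.incr("page:home:hits")
pipe.execute()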
COUCHDB
Written in: Erlang
Main point: DB consistency, ease of use
Bi-directional (!) replication, continuous or ad-hoc, with conflict detection, thus, master-master replication. (!)
MVCC - write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via _changes (!)
Attachment handling
CouchApps (standalone JS apps)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
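Since CouchDB speaks plain HTTP+JSON, Python's requests library is enough for a sketch (assumes a local CouchDB that accepts unauthenticated writes; names are made up):

import requests

base = "http://localhost:5984"         # default CouchDB address

requests.put(base + "/demo")           # create a database
requests.put(base + "/demo/user42",    # create a document
             json={"name": "Ana", "city": "Zagreb"})

doc = requests.get(base + "/demo/user42").json()
print(doc["_rev"])                     # the revision MVCC uses for conflict detection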
HADOOP
Apache project
A framework that allows for the distributed processing of large
data sets across clusters of computers
Designed to scale up from single servers to thousands of machines
Designed to detect and handle failures at the application layer,
instead of relying on hardware for it
HADOOP
Created by Doug Cutting, who named it after his son's toy elephant
Hadoop subprojects
• Cassandra
• HBase
• Pig
Hive was a Hadoop subproject, but is now a top-level Apache project
Used by many large & famous organizations
• http://wiki.apache.org/hadoop/PoweredBy
Scales to hundreds or thousands of computers, each with several processor cores
Designed to efficiently distribute large amounts of work across a set of machines
Hundreds of gigabytes of data constitute the low end of Hadoop-scale
Built to process "web-scale" data on the order of hundreds of gigabytes to terabytes or petabytes
HADOOP
Uses distributed file system (HDFS)
• Designed to hold very large amounts of data (terabytes or even
petabytes)
• Files are stored redundantly across multiple machines to ensure durability against failure and high availability to highly parallel applications
• Data organized into directories and files
• Files are divided into blocks (64 MB by default) and distributed across nodes
Design of HDFS is based on the design of the Google File System
HIVE
A petabyte-scale data warehouse system for Hadoop
Easy data summarization, ad-hoc queries
Query the data using a SQL-like language called HiveQL
Hive compiler generates map-reduce jobs for most queries
PIG
Platform for analyzing large data sets
High-level language for expressing data analysis programs
Compiler produces sequences of Map-Reduce programs
Textual language called Pig Latin
• Ease of programming
• System optimizes task execution automatically
• Users can create their own functions
PIG LATIN
Pig Latin – high-level Map/Reduce programming
Equivalent to SQL for RDBMS systems
Pig Latin can be extended using Java User Defined Functions
“Word Count” script in Pig Latin
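The classic script, reproduced from memory (file paths are placeholders):

lines  = LOAD 'input.txt' AS (line:chararray);
words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grpd   = GROUP words BY word;
counts = FOREACH grpd GENERATE group AS word, COUNT(words) AS cnt;
STORE counts INTO 'wordcount_output';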
SUMMARY
NoSQL is a great problem solver if you need it
Choose your NoSQL platform carefully, as each is designed for a specific purpose
Get used to Map/Reduce
It’s not a sin to use NoSQL alongside (yes)SQL database
I am really happy to work with MongoDB instead of MySQL