1. Databases and the cloud
Henrik Ingo
TTKK
2012-02-27
Please share and re-use this presentation,
licensed under Creative Commons Attribution license.
2012-02-27 TTKK 1
5. 1994
USSR fell 3 years ago
Finland yet to win first gold medal in hockey
Windows 3.0 replacing MS-DOS
Windows 95 did not yet exist
I learn how to write .bat scripts from my dad
Word 6.0 replacing WordPerfect 5.1
Excel 5.0 replacing Lotus
I had not yet used the Internet
6. 1994
Client - Server architecture (C++ & Visual Basic, LAN, Printer, Database)
Stored procedures rule
DBA is king of business logic
SQL
Standard Interpreted
Flexible and expressive command line env
"SQL for secretaries"
Good for English speakers
2000: Bad for IDE w IntelliSense
9. Learning SQL
NoSQL advocates say: | Henrik says:
SQL is hard to learn | MS Access easy to learn
To really scale, you must de-normalize | Darn, I always got normalization
Most people don't get normalization right | I know, teaching n:n relations was always fun
Impedance mismatch between oo and sql | INSERT INTO t ... serialize($obj)
I just need a simple key-value store | SELECT value FROM t WHERE key=?;
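The "serialize the object into a table and read it back by key" idea on this slide can be sketched as follows. This is a minimal illustration, not the author's code: it assumes Python's built-in sqlite3 in place of MySQL and JSON in place of PHP serialize(), and the column names k and v and the helpers kv_open, kv_put and kv_get are made up for the example.

```python
import json
import sqlite3


def kv_open(path=":memory:"):
    """Open a database and create the single key-value table (assumed layout)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS t (k TEXT PRIMARY KEY, v BLOB)")
    return conn


def kv_put(conn, key, obj):
    """Serialize obj and store it under key, replacing any previous value."""
    conn.execute(
        "INSERT OR REPLACE INTO t (k, v) VALUES (?, ?)",
        (key, json.dumps(obj)),
    )
    conn.commit()


def kv_get(conn, key):
    """Return the deserialized object for key, or None if the key is absent."""
    row = conn.execute("SELECT v FROM t WHERE k = ?", (key,)).fetchone()
    return None if row is None else json.loads(row[0])
```

The schema is one table with a primary key, so there is nothing to normalize. The trade-off is that the database can no longer query inside the stored value.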
10. 2000
MS Access not supported in Linux
PostgreSQL and MySQL
Learn SQL for real
phpMyAdmin to ease the pain
11. 2005
Graduate & do websites for a living.
Spend 3-6 days creating and re-creating properly normalized DB schema
In MS Access I just clicked next, next, next and ok.
Time is money
Invent NoSQL:
PHP serialize()
MySQL BLOB
12. 2008
Join MySQL AB
Confused about synchronous vs asynchronous replication
Learn a lot about MySQL NDB Cluster
13. High Availability
...and why it is more difficult
for databases.
2011-10-25 13
14. Performance, Durability, High Availability
Performance
Transactions / second (throughput)
Response time (latency)
Percentiles (95% - 99%)
Durability
Speaking of databases
Committed data is not lost
D in ACID
Replicas, snapshots, point in time, backups
High Availability
Get any response at all (tps > 0)
Measured as percentile (99.999%)
Clustering
Replication
Monitoring
Redundancy
Failover
15. Uptime
Percentile target | Max downtime per year
90% | 36.5 days
99% | 3.65 days
99.5% | 1.83 days
99.9% | 8.76 hours
99.99% | 52.56 minutes
99.999% | 5.26 minutes
99.9999% | 31.5 seconds
Beyond system availability: Average downtime per user.
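The downtime figures in the table follow from simple arithmetic: allowed downtime = (1 - availability) x length of the period. A small sketch, assuming a 365-day year (the function name max_downtime_seconds is chosen here for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_downtime_seconds(availability_percent, period_days=365):
    """Seconds of downtime allowed per period at the given availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


if __name__ == "__main__":
    for target in (90, 99, 99.5, 99.9, 99.99, 99.999, 99.9999):
        seconds = max_downtime_seconds(target)
        print(f"{target}%: {seconds / 3600:.2f} hours ({seconds:.1f} seconds)")
```

Each extra "nine" divides the downtime budget by ten, which is why the jump from 99.9% to 99.999% turns hours into minutes.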
16. Clustering frameworks - general
Heartbeat
Corosync
VM of choice
Red Hat Cluster
Solaris Cluster
...
Failover
17. Clustering frameworks - DB
Heartbeat
Corosync
MMM
VM of choice
MHA
Tungsten Enterprise
Solaris Cluster
...
Failover
18. Sounds simple. What could possibly go wrong?
Old Master must stop service (VIP, os, DB). But it is not responding, so how do you make it stop?
Polling from the outside. Interval = 1 sec, 10 sec, 60 sec!
What if replication fails first and client transactions don't?
Polling connectivity of DB nodes but not client p.o.v.
Failover can be expensive (SAN, DRBD) -> false positives costly
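The tension between polling interval and false positives can be made concrete with a small sketch. It assumes a monitor that triggers failover only after a number of consecutive failed polls. The names detect_failure, worst_case_detection_seconds and max_missed are hypothetical and do not come from any of the clustering frameworks listed earlier.

```python
def detect_failure(probe_results, max_missed=3):
    """Return the index of the poll that would trigger failover, or None.

    probe_results is a sequence of booleans, one per poll (True = node answered).
    Failover triggers only after max_missed consecutive failed polls, so a
    single lost probe does not cause an expensive false-positive failover.
    """
    missed = 0
    for index, ok in enumerate(probe_results):
        missed = 0 if ok else missed + 1
        if missed >= max_missed:
            return index
    return None


def worst_case_detection_seconds(interval_seconds, max_missed):
    """Rough upper bound on detection time, ignoring probe timeouts."""
    return interval_seconds * max_missed
```

With a 10 second interval and three required misses, an outage can go unnoticed for roughly 30 seconds. Shortening the interval or lowering max_missed detects failures faster but makes false positives more likely.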
19. Active-Active Shared disk clustering. (State of the art?)
Oracle RAC
(ScaleDB?)
Failover
Disk Disk
20. Sounds simple. What could possibly go wrong?
Well, actually it's pretty good.
Data integrity protection is good.
But...
SAN is considered the biggest SPOF of all.
Recovery time on single node failure is +60 sec
Recovery time? Because internally each node will lock some pages and process them locally.
(Bloody expensive)
Disk
22. Sounds simple. What could possibly go wrong?
Synchronous Multi Master replication rocks :-)
Failure detection inherent in replication protocol.
Instant failovers.
Bonus: Both Galera and NDB provision new nodes automatically.
Problem is solved. Time for new problems...
23. Performance
SAN has "some" overhead compared to local disk
DRBD = 50% performance penalty
Replication, when implemented correctly, has 0 performance penalty.
Galera and NDB = more performance
24. Is a clustering solution part of the solution or part of the problem?
"Causes of Downtime in Production MySQL Servers"
by Baron Schwartz:
#1: Human error
#2: SAN
Complex clustering framework + SAN =
More problems, not less!
Galera (and NDB) =
Replication based, no SAN or DRBD
No "failover moment", no false positives
No clustering framework needed
No load balancer needed (JDBC loadbalance)
Simple and elegant!
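The "no load balancer needed" point relies on the client choosing a node itself, which works when every node accepts reads and writes. The sketch below illustrates that idea in Python. It is not how the JDBC driver implements its loadbalance mode, and the NodePool class and the node names are hypothetical.

```python
import random


class NodePool:
    """Client-side node selection for a multi-master cluster (illustrative)."""

    def __init__(self, nodes, rng=None):
        self.nodes = list(nodes)
        self.down = set()
        self.rng = rng or random.Random()

    def mark_down(self, node):
        """Record that a connection attempt to node failed."""
        self.down.add(node)

    def mark_up(self, node):
        """Record that node is reachable again."""
        self.down.discard(node)

    def pick(self):
        """Return a random node that is not currently marked down."""
        candidates = [n for n in self.nodes if n not in self.down]
        if not candidates:
            raise RuntimeError("no database node available")
        return self.rng.choice(candidates)


if __name__ == "__main__":
    pool = NodePool(["galera1", "galera2", "galera3"])  # hypothetical hosts
    pool.mark_down("galera2")
    print(pool.pick())
```

Because any surviving node can serve the next transaction, a failed node is simply skipped and there is no separate "failover moment".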
26. Scale-out
Invented by MySQL / LAMP stack.
Laughed at by other RDBMSes
Now everyone does it. Because the Internet is too big.
Originally with read-only replicas. Then sharding.
Easy for http, inconvenient for databases.
NoSQL systems do it really well.
MySQL NDB does it really well and Galera pretty well.
27. DBA's life is more interesting!
HTTP | RDBMS
Stateless | Where everyone else stores their state
Usually can partition/shard | Needs expertise to partition/shard
Scale-out = boot more servers | Scale-out = Boot more servers. Backup DB. Restore DB. Setup replication. Tweak application code...
Writes = Write to the database | Write = Which partition/node? Beware of read-only slaves. Beware of eventual consistency...
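The "which partition/node?" question is usually answered in application code. One common approach is to hash the key and take it modulo the number of shards. The sketch below assumes that approach. The function names and shard names are hypothetical, and real deployments often use consistent hashing or a lookup table instead.

```python
import hashlib


def shard_for_key(key, shards):
    """Map a key to one shard by hashing it (simple modulo placement)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard list changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys
        if shard_for_key(k, old_shards) != shard_for_key(k, new_shards)
    )
    return moved / len(keys)


if __name__ == "__main__":
    old = ["db1", "db2", "db3"]
    new = old + ["db4"]
    keys = [f"user:{i}" for i in range(1000)]
    print(shard_for_key("user:42", old))
    print(f"keys that must move after adding db4: {moved_fraction(keys, old, new):.0%}")
```

The second function shows why scaling out a database is more than booting a server: with modulo placement, adding one shard relocates most keys, and that data has to be copied before the new node is useful.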
28. But it can be done
Automating DB deployments is more complex. But not impossible.
NDB and Galera handle data provisioning really well. But deploying the empty nodes still manual labor.
Scale-out happens because you must. Scale-down will never happen if it's too much work.
MySQL
Amazon RDS, others.
Severalnines (also supports Galera and NDB)
Scalr
PostgreSQL
Heroku
EnterpriseDB
NoSQL
Usually do this relatively well.
29. NoSQL
...and why it is more difficult
for relational databases.
33. Things NoSQL guys do really well
Reading tip: Original Amazon Dynamo paper
No SQL (parsing)
Schemaless = Win! for agile development
HA with quorum consistency: R + W > N
Transparent sharding
Graph databases
N:N relationships, what's the big deal?
Actually makes sense to give up on SQL!
Map Reduce
Bypass ETL, get clean data
Implemented in Java or Python (or Erlang)
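The quorum rule R + W > N from this slide says that with N replicas, reading from R and writing to W of them guarantees that every read overlaps the latest write. A short sketch that states the rule and checks it by brute force for small clusters (function names are chosen for illustration):

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """The rule from the slide: read and write quorums overlap if R + W > N."""
    return r + w > n


def always_intersect(n, r, w):
    """Brute force: does every possible read set share a replica with every write set?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 2), (5, 3, 3), (5, 1, 5)]:
        print(n, r, w, quorums_overlap(n, r, w), always_intersect(n, r, w))
```

For N = 3, choosing R = 2 and W = 2 satisfies the rule while still tolerating one failed replica for both reads and writes.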
34. Best of both worlds
NoSQL | MySQL
No SQL | HandlerSocket; Memcache API, NDB API
Simple key-value store | BLOB; SELECT v FROM ... WHERE k=?
...and secondary indexes | Functional indexes, virtual columns, etc...
Quorum consistency | Synchronous replication: Galera, NDB. We have it too
Transparent sharding | NDB, Spider + proprietary solutions
Graph databases | Damn N:N relations!
Map Reduce against text files | Map Reduce against RDBMS
Java and Python | C++ can be done right: Drizzle
35. Cloud
...and why it is more difficult
for databases.
36. 4 different DB deployments
Server HW
VM
DB process
User (schema)
37. Consider memory utilization
"All of computation is just different layers of caching."
Dedicated HW
Great performance
€€€€
Not cloud
VM
Virtualization overhead
"Safety margin" of unallocated memory per VM.
Memory allocation per VM is fixed (without reboot) -> cache of idle DB instances is wasted.
DB process
No virtualization overhead
Memory allocation per DB instance is fixed (without restart) -> cache of idle DB instances is wasted.
Schema
No virtualization overhead.
Busy schema can use more cache and idle schema evicted from cache.
38. Multi-tenancy
Web hosting w MySQL
Cpanel = hack
Not "cloud". Users expect "dedicated" database instance.
Drizzle
True multi-tenancy, "virtualization for databases"
Ready but experimental