This document provides an agenda and overview for a presentation on using MySQL in AWS. The presentation covers RDS and EC2/MySQL options, managing performance, availability, implementation choices, DDL, common failures, and costs. It also introduces the presenters from PalominoDB and encourages attendees to ask questions.
4. Interactivity
Ask away; we've got time. Ben will be glad to
try and solve your problems.
AWS tutorial?
• "Click on the replica button and come back
in 30 minutes"
• "PIOPs <-> EBS. Uncheck that box and
come back in 2 hours"
7. RDS un-benefits
Fully managed
• No binlog access
• No SUPER
• No flexible topology
The more experienced a DBA you are, the
crankier you will be.
8. RDS improves!
Like all AWS properties, RDS features continue
to improve all the time.
It's perfect for developers, proofs of concept,
one-offs, absorbing temporary load.
(Tungsten supports replication into RDS from
MySQL).
10. Why RDS or EC2?
RDS
1. You can tolerate ~99% uptime (which many
people can)
2. You don't have lots of DBAs and need to
optimize for operational ease
EC2
1. Multi-region availability
2. Vertical scaling
16. CLI pain
It's written in Java right now*. The JVM
overhead makes it painfully slow for large-
scale automation.
* The future is the Redshift CLI (Python,
coherent interface)
17. CLI output
Verbose and clunky
DBINSTANCE,scp01-replica2,2010-05-22T01:53:47.372Z,db.m1.large,mysql,50,(nil),master,available,scp01-replica2.cdysvrlmdirl.us-east-1.rds.amazonaws.com,3306,us-east-1b,(nil),0,(nil),(nil),(nil),(nil),(nil),(nil),(nil),(nil),sun:05:00-sun:09:00,23:00-01:00,(nil),n,5.1.50,n,simcoprod01,general-public-license
SECGROUP,Name,Status
SECGROUP,default,active
PARAMGRP,Group Name,Apply Status
PARAMGRP,default.mysql5.1,in-synch
Combining the worst features of machine- and human-readable text
formats.
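One way to cope with this output is to filter on the record-type prefix and pick fields by position. A minimal sketch, assuming the field order seen in the sample record above (identifier in field 2, instance class in 4, status in 9, endpoint in 10, port in 11). These positions are inferred from that single line, not from a documented schema, and `parse_dbinstance` is a hypothetical helper name.

```shell
# Sample record copied from the slide; field positions are inferred from it.
sample='DBINSTANCE,scp01-replica2,2010-05-22T01:53:47.372Z,db.m1.large,mysql,50,(nil),master,available,scp01-replica2.cdysvrlmdirl.us-east-1.rds.amazonaws.com,3306,us-east-1b,(nil),0,(nil),(nil),(nil),(nil),(nil),(nil),(nil),(nil),sun:05:00-sun:09:00,23:00-01:00,(nil),n,5.1.50,n,simcoprod01,general-public-license'

# Hypothetical helper: keep only DBINSTANCE rows and print
# identifier, instance class, status, and endpoint:port.
parse_dbinstance() {
    awk -F, '$1 == "DBINSTANCE" { printf "%s %s %s %s:%s\n", $2, $4, $9, $10, $11 }'
}

printf '%s\n' "$sample" | parse_dbinstance
```

Header rows such as `SECGROUP,Name,Status` are skipped because only lines whose first field is `DBINSTANCE` are printed.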
21. EC2 Region SLA
99.95% SLA
“Annual Uptime Percentage” is calculated by
subtracting from 100% the percentage of 5
minute periods during the Service Year in which
Amazon EC2 was in the state of “Region
Unavailable.”
("Region unavailable" == "multiple AZs are toast")
Implies you've got to go multi-region
22. EC2 Region SLA
~99.2% Reality
The previous definition is very strict: 2 or more
AZs unavailable; can't launch replacements; and so on.
Expect multi-AZ degradation once or twice a year
(EBS, network, who knows).
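To put the two figures side by side, an uptime percentage can be converted into hours of downtime per year. A rough sketch, assuming a 365-day service year; `downtime_hours` is a hypothetical helper, and ~99.2% is the presenters' estimate rather than a published number.

```shell
# Hypothetical helper: hours of downtime per 365-day year
# for a given uptime percentage.
downtime_hours() {
    LC_ALL=C awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 365 * 24 }'
}

downtime_hours 99.95   # the SLA figure
downtime_hours 99.2    # the "reality" estimate from the slide
```

Under these assumptions the SLA allows about 4.4 hours of regional unavailability a year, while ~99.2% works out to roughly 70 hours.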
28. Instance sizing
Dynamicity == reduced cost
(Now, in general, $$ isn't why you go to the
cloud; it's operational efficiency & reduced
friction).
Have a spreadsheet and do capacity analyses
frequently.
30. Provisioned IOPs
Really, really nice!
• Drastically lower failure rate (order of
magnitude)
• Guaranteed throughput
Not so nice.
• Costs $$
31. Provisioned IOPs
Masters and replicas can be different.
You can convert PIOPs <-> EBS back and
forth.
Consider multi-AZ PIOPs master for the best in
durability.
32. VPCs
Go VPC from the beginning for production.
• Hard to convert
• Use ELBs for internal load-balancing
• You aren't sharing the 10.0.0.0/8 address space with everybody
33. Cluster compute
Placement groups are available for CC
instances.
"Placement group" means "physically close
hardware".
Very low-latency 10GbE with full bisection bandwidth
40. Dumping Users
mysql --host=olddatabasehost -BNe "select concat('\'',user,'\'@\'',host,'\'') from mysql.user where user not like 'rds%' and user != 'master'" |
while read -r uh; do
  mysql --host=olddatabasehost -BNe "show grants for $uh" | sed 's/$/;/; s/\\\\/\\/g'
done > user_grants.sql
http://www.villescorner.com/2012/11/mysqldump-from-amazon-rds-headaches-of.html
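The sed step in that pipeline does two things to each SHOW GRANTS line: it appends a semicolon so the file can be replayed, and it collapses the doubled backslashes that mysql's batch mode emits. A small sketch on a made-up grant line, assuming the second sed expression is `s/\\\\/\\/g` (the backslashes appear to have been lost from the slide text); `fix_grants`, the `app` user, and the `my\_db` schema are hypothetical.

```shell
# Hypothetical helper wrapping the sed step from the pipeline:
#   s/$/;/       append a semicolon to every line
#   s/\\\\/\\/g  turn each doubled backslash back into a single one
fix_grants() {
    sed 's/$/;/; s/\\\\/\\/g'
}

# Made-up batch-mode output line; note the doubled backslash.
fix_grants <<'EOF'
GRANT SELECT ON `my\\_db`.* TO 'app'@'%'
EOF
```

The result is a statement that can be fed straight back into mysql on the new host.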
43. Local failures
Database crashes
Human error
Localized EBS hang
How to mitigate?
Multi-AZ PIOPs master
Operational excellence
Throw away & rebuild replicas
44. Local failures redux
Local failures should be, at most, annoyances.
Runbooks*
Game days
Monitoring
* Process is a poor substitute for competence.
45. If you can't deal with
expected and desired
change, you'll never be
able to handle unexpected
and unwanted change.
46. Regional failures
A well-designed architecture will save you.
How quickly can your DNS flip?
How good is your replication?
Do you have a CDN?
Is your application going to run?
Not everybody can afford this.
47. Zones and Regions
A zone is analogous to a data center (for some
small number of buildings).
A region is a geographically dispersed
collection of zones that is distinct from any
other region.
48. Zones & Regions differ
Different instance types
Different features
Different provisioning capacity
OFA had ~40% of the US-East medium
instances at one point. We couldn't duplicate that
in US-West.
51. Reserved instances
• Substantial savings (how often do you turn
off production databases?)
• Secondary market
• Must match AZ and instance size
• Discount coupon
Heavy utilization instances charge the
hourly rate 24x7
53. Dynamicity
The only thing you can't do is downsize
storage.
Change instance size? Check.
Turn PIOPs off? Check.
Delete replicas? Check.
Up to meet need. Down to meet budget.
54. Upgrading
Minor upgrade (can be automatic during the maintenance
window; will reboot or fail over)*
* Disable automatic minor upgrades.
Upgrade from 5.5 to 5.6
1) Dump/load
2) Delta load
3) Switchover