Amazon DynamoDB is a distributed data store that delivers durable low latency at any scale. In this session, we will highlight the top four techniques that DynamoDB customers use to quickly build and scale new applications with minimal operations effort. We will also walk through recently announced features that customers have requested to further accelerate their application development on DynamoDB.
11. Faster Development
Customer Experiences
Weatherbug mobile app
• Lightning detection & alerting for 40M users/month
• Developed and tested in weeks, at “1/20th of the cost of the traditional DB approach”
Super Bowl promotion
• Millions of interactions over a relatively short period of time
• Built the app in 3 days, from design to production-ready
13. Design for Scale
Common Problem = inefficiently designed schemas
Hot spots create premature throttling
Excessive payloads cost more to move
Design Goals:
• Optimize the schema to the access patterns
• Minimize multi-table fetches (for high-scale patterns)
• Minimize payload size for each pattern
14. Design for Scale – Partitioning
• DynamoDB automatically partitions data by the hash key
Hash key spreads data (& workload) across partitions
• Auto-partitioning driven by:
Data set size
Throughput provisioned
• Large number of unique hash keys + uniform distribution of workload across hash keys = ready to scale!
[Diagram: a table split into partitions 1 .. N]
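The effect of hash key choice on partition load can be illustrated with a small simulation. This is only a sketch: DynamoDB's real hash function and partition count are internal, so MD5, the four partitions, and the `user-N` key names are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

def partition_for(hash_key: str, num_partitions: int) -> int:
    """Map a hash key to a partition index (MD5 is a stand-in, not DynamoDB's function)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def load_per_partition(requests, num_partitions):
    """Count how many requests land on each partition."""
    counts = Counter(partition_for(key, num_partitions) for key in requests)
    return [counts.get(p, 0) for p in range(num_partitions)]

# Many unique keys with a uniform workload: requests spread across partitions.
uniform = load_per_partition([f"user-{i}" for i in range(10_000)], 4)

# One hot key: every request lands on the same partition.
hot = load_per_partition(["user-1"] * 10_000, 4)

print(uniform)
print(hot)
```

With the hot key, one partition absorbs all 10,000 requests while the others sit idle, which is the hot-spot pattern that leads to premature throttling.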
15. Design for Scale – Efficient Schema Design
1. Identify the individual access patterns
2. Model each pattern to its own discrete data set
3. Consolidate data sets into tables and indexes
Abbreviated Example: File Sharing
Access Patterns
given userid…
• return all items by file name
• return all items by date created
• return all items by size
• return all items by type
• return all items by date updated
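One possible consolidation of the five access patterns above is a single table keyed by userid, with file name as the range key and one local secondary index for each of the other four sort orders. The sketch below only builds a CreateTable-style parameter dictionary. The table name, attribute names, index names, and throughput values are hypothetical, and using the file name as range key assumes file names are unique per user.

```python
def key_schema(hash_attr, range_attr):
    """Hash + range key schema in the shape CreateTable expects."""
    return [
        {"AttributeName": hash_attr, "KeyType": "HASH"},
        {"AttributeName": range_attr, "KeyType": "RANGE"},
    ]

def file_sharing_table_params(table_name="UserFiles"):
    # Hypothetical attribute names mapped to scalar types ("S" string, "N" number).
    index_attrs = {"datecreated": "S", "filesize": "N", "filetype": "S", "dateupdated": "S"}
    attrs = {"userid": "S", "filename": "S", **index_attrs}
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": name, "AttributeType": kind} for name, kind in attrs.items()
        ],
        # Pattern 1: all items for a user, sorted by file name.
        "KeySchema": key_schema("userid", "filename"),
        # Patterns 2-5: same hash key, alternative range keys.
        "LocalSecondaryIndexes": [
            {
                "IndexName": f"by_{name}",
                "KeySchema": key_schema("userid", name),
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
            for name in index_attrs
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 10, "WriteCapacityUnits": 10},
    }

params = file_sharing_table_params()
print([index["IndexName"] for index in params["LocalSecondaryIndexes"]])
```

All five patterns share the userid hash key, so each one can be answered by a single Query against the table or one of its indexes rather than by a multi-table fetch.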
16. Range Keys
• Enable modeling 1:M relationships
hash = userid, range = postdate, attributes = posttext
userid=“@mza”
postdate=“2013-09-12T20:59:28Z”
posttext=“New! Develop and test your apps with DynamoDB Local: http://aws.typepad.com/aws/2013/09/dynamodb-local-for-desktop-development.html … #aws”
userid=“@mza”
postdate=“2013-09-13T09:17:37Z”
posttext=“Also! Copy DynamoDB data between regions with Data Pipeline: http://aws.typepad.com/aws/2013/09/copy-dynamodb-data-between-regions-using-the-awsdata-pipeline.html … #aws”
userid=“@werner”
postdate=“2013-10-04T17:41:09Z”
posttext=“cool! RT @dialtone_: Worldwide DynamoDB replication for billions of rows a day? No problem! http://tech.adroll.com/blog/ops/2013/10/02/dynamodb-replication.html … @AdRoll can handle that!”
17. Range Keys – Simple API
• Currently 13 operations in total
Manage Tables
• CreateTable
• UpdateTable
• DeleteTable
• DescribeTable
• ListTables
Read and Write Items
• PutItem
• GetItem
• UpdateItem
• DeleteItem
Read and Write Multiple Items
• BatchGetItem
• BatchWriteItem
• Query
• Scan
18. Range Keys – Query
Query
• Available for hash+range primary key tables
• Retrieve all items by hash key
• Range key conditions: ==, <, >, >=, <=, begins with, between
• Sorted results, counts, top and bottom n values, paged responses
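The Query semantics listed above can be modeled in memory to make them concrete. This is a rough sketch, not the service API: the `query`, `begins_with`, and `between` helpers are hypothetical names, and the sample posts reuse the userid/postdate items from the range key example with shortened text.

```python
from operator import itemgetter

def query(items, hash_attr, hash_value, range_attr,
          condition=None, limit=None, descending=False):
    """Return items with the given hash key, filtered and sorted by range key."""
    matches = [
        item for item in items
        if item[hash_attr] == hash_value
        and (condition is None or condition(item[range_attr]))
    ]
    matches.sort(key=itemgetter(range_attr), reverse=descending)
    return matches if limit is None else matches[:limit]

def begins_with(prefix):
    """Range key condition: value starts with the prefix."""
    return lambda value: value.startswith(prefix)

def between(low, high):
    """Range key condition: low <= value <= high (inclusive on both ends)."""
    return lambda value: low <= value <= high

posts = [
    {"userid": "@mza", "postdate": "2013-09-13T09:17:37Z", "posttext": "Also! ..."},
    {"userid": "@werner", "postdate": "2013-10-04T17:41:09Z", "posttext": "cool! ..."},
    {"userid": "@mza", "postdate": "2013-09-12T20:59:28Z", "posttext": "New! ..."},
]

all_mza = query(posts, "userid", "@mza", "postdate")
on_the_13th = query(posts, "userid", "@mza", "postdate", condition=begins_with("2013-09-13"))
latest = query(posts, "userid", "@mza", "postdate", limit=1, descending=True)
in_september = query(posts, "userid", "@mza", "postdate",
                     condition=between("2013-09-01", "2013-09-30T23:59:59Z"))
```

Because ISO 8601 timestamps sort lexicographically in time order, string comparison on the range key is enough to get date ranges and "top n most recent" results.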
19. Range Keys – Query and Efficient Reads
• Query treats all items as a single read operation
Items share the same hash key = same partition
By contrast, BatchGetItem reads each item in the batch separately
• Example
Read 100 items in a table, all of which share the same hash key
Each item is 120 bytes in size
Operation       RCU Consumed
Query           3
BatchGetItem    100
Note: read capacity units are 4 KB in size
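The arithmetic behind the two figures can be checked with a short calculation. It is a sketch that assumes strongly consistent reads and the 4 KB read capacity unit noted above; the function names are hypothetical.

```python
import math

RCU_BYTES = 4 * 1024  # one read capacity unit covers up to 4 KB

def query_rcu(item_sizes):
    """Query: item sizes are summed first, then rounded up to the next 4 KB."""
    return math.ceil(sum(item_sizes) / RCU_BYTES)

def batch_get_rcu(item_sizes):
    """BatchGetItem: each item is rounded up to 4 KB on its own."""
    return sum(math.ceil(size / RCU_BYTES) for size in item_sizes)

sizes = [120] * 100  # 100 items of 120 bytes each

print(query_rcu(sizes))      # 3   (12,000 bytes / 4,096 rounds up to 3)
print(batch_get_rcu(sizes))  # 100 (each 120-byte item costs a full unit)
```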
20. Range Keys – Local Secondary Indexes
• Designed for high scale multi-tenant applications
• Index local to the hash key (= partition)
• Up to 5 indexes with no performance degradation
• LSIs are sparse objects
UserGames table
Hash Key | Range Key | Attributes
UserId = bob | GameId = Game1 | HighScore = 10500, ScoreDate = 2011-10-20, LastPlayed = 2013-02-11
UserId = fred | GameId = Game2 | HighScore = 12000, ScoreDate = 2012-01-10, LastPlayed = 2013-05-19
UserId = bob | GameId = Game3 | HighScore = 20000, ScoreDate = 2012-02-12, LastPlayed = 2012-11-07
UserGamesIdx index
Hash Key | Range Key | Projected Attributes
UserId = bob | LastPlayed = 2013-02-11 | GameId = Game1
UserId = fred | LastPlayed = 2013-05-19 | GameId = Game2
UserId = bob | LastPlayed = 2012-11-07 | GameId = Game3
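What "sparse" means for the UserGamesIdx index can be shown with an in-memory model: only items that carry the index range key attribute get an index entry. The `build_local_index` helper is a hypothetical illustration, and the fourth item (alice, with no LastPlayed) is invented here to demonstrate the sparse behavior.

```python
def build_local_index(table_items, hash_attr, index_range_attr, projected):
    """Build index entries sorted by (hash key, index range key)."""
    index = []
    for item in table_items:
        if index_range_attr not in item:
            continue  # sparse: items without the index range key are not indexed
        entry = {hash_attr: item[hash_attr], index_range_attr: item[index_range_attr]}
        entry.update({attr: item[attr] for attr in projected if attr in item})
        index.append(entry)
    return sorted(index, key=lambda e: (e[hash_attr], e[index_range_attr]))

user_games = [
    {"UserId": "bob", "GameId": "Game1", "HighScore": 10500,
     "ScoreDate": "2011-10-20", "LastPlayed": "2013-02-11"},
    {"UserId": "fred", "GameId": "Game2", "HighScore": 12000,
     "ScoreDate": "2012-01-10", "LastPlayed": "2013-05-19"},
    {"UserId": "bob", "GameId": "Game3", "HighScore": 20000,
     "ScoreDate": "2012-02-12", "LastPlayed": "2012-11-07"},
    # Hypothetical item with no LastPlayed attribute: it stays out of the index.
    {"UserId": "alice", "GameId": "Game4", "HighScore": 0},
]

user_games_idx = build_local_index(user_games, "UserId", "LastPlayed", ["GameId"])
for entry in user_games_idx:
    print(entry)
```

Within each UserId the entries come back ordered by LastPlayed, which is what lets a Query on the index answer "most recently played games for this user" without scanning the table.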
21. Use Libraries and Tools
Transactions
Atomic transactions across multiple items & tables
Tracks status of ongoing transactions via two tables
1. Transactions
2. Pre-transaction snapshots of modified items
Geolocation
Add location awareness to mobile applications
Find Yourself – sample app
https://github.com/awslabs
23. Develop and Test Locally – DynamoDB Local
• Disconnected development with full API support
No network
No usage costs
Note! DynamoDB Local does not have a durability or availability SLA
[Diagram labels: DynamoDB Local; m2.4xlarge; “do this instead!”]
24. Develop and Test Locally – DynamoDB Local
Some minor differences from Amazon DynamoDB
• DynamoDB Local ignores your provisioned throughput settings
The values that you specify when you call CreateTable and UpdateTable have no effect
• DynamoDB Local does not throttle read or write activity
• The values that you supply for the AWS access key and the Region are only used to name the database file
• Your AWS secret key is ignored but must be specified
Recommended: use a dummy string of characters
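The connection settings implied by these differences might look like the sketch below. It only builds a dictionary of client settings so that it runs without any SDK installed. Port 8000 is DynamoDB Local's default, while the region and the dummy credential strings are arbitrary placeholders chosen for illustration.

```python
def local_client_kwargs(port=8000):
    """Client settings for a DynamoDB Local instance on this machine."""
    return {
        "endpoint_url": f"http://localhost:{port}",  # local endpoint instead of the AWS service
        "region_name": "us-east-1",                  # only used to name the database file
        "aws_access_key_id": "dummy",                # only used to name the database file
        "aws_secret_access_key": "dummy",            # ignored, but must be specified
    }

kwargs = local_client_kwargs()
print(kwargs["endpoint_url"])

# With boto3 installed and DynamoDB Local running, these settings could be used as:
#   import boto3
#   client = boto3.client("dynamodb", **local_client_kwargs())
```

Pointing the same application code at AWS then comes down to dropping the endpoint override and supplying real credentials.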
26. Faster Development
Customer Experiences
“Since we had such a short time frame to build Digg Reader we had to lean heavily on some of the hosted AWS services, like DynamoDB, versus rolling our own.” – Digg CTO Mike Young
“If we used a different product we would have spent a lot of development time to reach parity with DynamoDB instead of developing our business.” – Peter Bogunovich, Software Engineer, RightAction, Inc
28. Automated Operations
• As scalability increases, performance degrades
• Substantial effort is required to sustain high performance
Provision / Configure Servers and Storage
Monitor and Handle Hardware Failures
Repartition Data and Balance Clusters
Update Hardware and Software
Manage Cross-Availability Zone Replication
29. Predictable Performance
Provisioned Throughput
• Request-based capacity provisioning model
• Throughput is declared and updated via the API or the console
CreateTable (foo, reads/sec = 100, writes/sec = 150)
UpdateTable (foo, reads/sec=10000, writes/sec=4500)
• DynamoDB handles the rest
Capacity is reserved and available when needed
Scaling-up triggers repartitioning and reallocation
No impact to performance or availability
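The CreateTable and UpdateTable calls above are shorthand. As a sketch, the corresponding request parameters could be assembled as follows, using the parameter names from the DynamoDB API. The hash key attribute `id` is a hypothetical choice, since the slide does not name one.

```python
def throughput(reads_per_sec, writes_per_sec):
    """Provisioned throughput block shared by CreateTable and UpdateTable."""
    return {"ReadCapacityUnits": reads_per_sec, "WriteCapacityUnits": writes_per_sec}

def create_table_params(name, hash_attr, reads, writes):
    """Parameters for creating a hash-key-only table with declared throughput."""
    return {
        "TableName": name,
        "AttributeDefinitions": [{"AttributeName": hash_attr, "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": hash_attr, "KeyType": "HASH"}],
        "ProvisionedThroughput": throughput(reads, writes),
    }

def update_table_params(name, reads, writes):
    """Parameters for changing the throughput of an existing table."""
    return {"TableName": name, "ProvisionedThroughput": throughput(reads, writes)}

create = create_table_params("foo", "id", 100, 150)   # "id" is a hypothetical key name
update = update_table_params("foo", 10000, 4500)

print(create["ProvisionedThroughput"])
print(update["ProvisionedThroughput"])

# With boto3 installed, these could be passed as client.create_table(**create)
# and client.update_table(**update).
```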
30. Durable Low Latency
WRITES
Continuously replicated to 3 AZs
Always consistent
Persisted to disk (custom SSD)
READS
Strongly or eventually consistent
No trade-off in latency
31. Durable Low Latency – At Scale
32. Efficient design is cost effective
“Our previous NoSQL database required almost a full-time administrator to run. Now AWS takes care of it.”
Agility = time
Managed services reduce effort
Experiment
Optimize
33. Recommended Resources
AWS Mobile Development Blog
http://mobile.awsblog.com
• Geo Library for Amazon DynamoDB (series)
• Amazon DynamoDB on Mobile (series)
DynamoDB Best Practices, How-Tos, and Tools
http://aws.amazon.com/dynamodb/resources
• Local development and testing tools
• Backup and archive
• Autoscale