AWS Webcast - Four Tips for Faster Development With DynamoDB

Four Tips for Faster Development
with DynamoDB
David Pearson
Business Development Manager
AWS Database Services

AWS Database
Services
Amazon RDS

Amazon ElastiCache

Amazon DynamoDB

Amazon Redshift

Scalable High Performance
Application Storage in the Cloud
Deployment & Administration
Application Services

Compute

Storage

Database

Networking
AWS Global Infrastructure

Scaling Databases

RDBMS

infrastructure scaling
+ application scaling

• Read Replicas
• Data Sharding
• Denormalization

NoSQL

infrastructure
scaling only

Amazon’s Database Journey
DynamoDB
RDBMS

massively scalable

distributed

= key/value

database service
predictable performance
automated operations
durable low latency
cost effective

simple API
fast development

table
item
attribute

primary key is unique
1. hash only
2. hash + range

hash key: mandatory
range key: optional

attributes are associated with items rather than tables (as in RDBMS): sparse schema
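The data model above can be sketched with plain Python dicts standing in for items. This is an illustration only, not the DynamoDB API; the attribute names (`userid`, `postdate`, `tags`, `location`) and the in-memory `table` are assumptions made for the example.

```python
# Illustrative only: plain dicts stand in for DynamoDB items.
HASH_KEY = "userid"     # mandatory part of the primary key
RANGE_KEY = "postdate"  # optional part; used here to form a hash + range key

items = [
    {"userid": "alice", "postdate": "2013-09-12", "posttext": "hello"},
    {"userid": "alice", "postdate": "2013-09-13", "posttext": "again", "tags": ["aws"]},
    {"userid": "bob", "postdate": "2013-09-12", "location": "Seattle"},
]


def primary_key(item):
    """Return the (hash, range) pair that must be unique within the table."""
    return (item[HASH_KEY], item[RANGE_KEY])


def put_item(table, item):
    """Store an item under its primary key, replacing any item with the same key."""
    table[primary_key(item)] = item


table = {}
for item in items:
    put_item(table, item)

# Sparse schema: each item carries only the attributes it needs.
attribute_sets = [sorted(set(item) - {HASH_KEY, RANGE_KEY}) for item in items]
```

Two items may share a hash key as long as their range keys differ, and no two items need the same set of non-key attributes.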

Faster Development
Customer Experiences
Weatherbug mobile app
Lightning detection & alerting for 40M users/month
Developed and tested in weeks, at “1/20th of the cost of the traditional DB approach”

Super Bowl promotion
Millions of interactions over a relatively short period of time
Built the app in 3 days, from design to production-ready

Faster Development

Four Tips
design for scale
leverage range keys
use libraries & tools
develop & test locally

Design for Scale
Common Problem = inefficiently designed schemas
  - Hot spots create premature throttling
  - Excessive payloads cost more to move

Design Goals:
• Optimize the schema to the access patterns
• Minimize multi-table fetches (for high-scale patterns)
• Minimize payload size for each pattern

Design for Scale – Partitioning
• DynamoDB automatically partitions data by the hash key
  - Hash key spreads data (& workload) across partitions

• Auto-partitioning driven by:
  - Data set size
  - Throughput provisioned

[Figure: a table split into partitions 1 .. N]
large number of unique hash keys + uniform distribution of workload across hash keys: ready to scale!
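A rough simulation of why key distribution matters. MD5 is used here only as a stand-in hash; DynamoDB's internal hash function and partition count are not stated in these slides, so `partition_for` and the choice of 4 partitions are assumptions for illustration.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, num_partitions):
    """Map a hash key to a partition. MD5 is a stand-in, not DynamoDB's real function."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def load_per_partition(requested_keys, num_partitions):
    """Count how many requests land on each partition."""
    counts = Counter(partition_for(key, num_partitions) for key in requested_keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Many unique hash keys with a uniform workload: requests spread out.
uniform_requests = [f"user-{i}" for i in range(10_000)]
# One hot hash key: every request lands on the same partition.
hot_requests = ["user-1"] * 10_000

uniform_load = load_per_partition(uniform_requests, 4)
hot_load = load_per_partition(hot_requests, 4)
```

With the hot key, a single partition absorbs all 10,000 requests while the others sit idle, which is the hot-spot pattern that leads to premature throttling.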

Design for Scale – Efficient Schema Design
1. Identify the individual access patterns
2. Model each pattern to its own discrete data set
3. Consolidate data sets into tables and indexes
Abbreviated Example: File Sharing
Access Patterns
given userid…
• return all items by file name
• return all items by date created
• return all items by size
• return all items by type
• return all items by date updated

Range Keys
• Enable modeling 1:M relationships
hash: userid=“@mza”
range: postdate=“2013-09-12T20:59:28Z”
attributes: posttext=“New! Develop and test your apps with DynamoDB Local:
http://aws.typepad.com/aws/2013/09/dynamodb-local-for-desktop-development.html … #aws”

hash: userid=“@mza”
range: postdate=“2013-09-13T09:17:37Z”
attributes: posttext=“Also! Copy DynamoDB data between regions with Data Pipeline:
http://aws.typepad.com/aws/2013/09/copy-dynamodb-data-between-regions-using-the-awsdata-pipeline.html … #aws”

hash: userid=“@werner”
range: postdate=“2013-10-04T17:41:09Z”
attributes: posttext=“cool! RT @dialtone_: Worldwide DynamoDB replication for billions of rows a day? No problem!
http://tech.adroll.com/blog/ops/2013/10/02/dynamodb-replication.html … @AdRoll can handle that!”

Range Keys – Simple API
• Currently 13 operations in total

Manage Tables
• CreateTable
• UpdateTable
• DeleteTable
• DescribeTable
• ListTables

Read and Write Items
• PutItem
• GetItem
• UpdateItem
• DeleteItem

Read and Write Multiple Items
• BatchGetItem
• BatchWriteItem
• Query
• Scan

Range Keys – Query
Query
• Available for hash+range primary key tables
• Retrieve all items by hash key
• Range key conditions: ==, <, >, >=, <=, begins with, between
• Sorted results. Counts. Top and bottom n values. Paged responses
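A minimal sketch of how these Query options might be expressed as request parameters for the boto3 low-level client (an assumption; the slides do not name an SDK). The helper `build_query` and the table name `posts` are hypothetical. Note that the expression syntax writes equality as `=` rather than `==`.

```python
def build_query(table, hash_name, hash_value, range_name=None, between=None,
                prefix=None, descending=False, limit=None, count_only=False):
    """Build keyword arguments for a DynamoDB Query (string keys only, for brevity)."""
    names = {"#h": hash_name}
    values = {":h": {"S": hash_value}}
    condition = "#h = :h"  # the hash key condition is always an equality

    if range_name and between:
        names["#r"] = range_name
        values[":lo"] = {"S": between[0]}
        values[":hi"] = {"S": between[1]}
        condition += " AND #r BETWEEN :lo AND :hi"
    elif range_name and prefix:
        names["#r"] = range_name
        values[":p"] = {"S": prefix}
        condition += " AND begins_with(#r, :p)"

    params = {
        "TableName": table,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ScanIndexForward": not descending,  # False returns the highest range keys first
    }
    if limit is not None:
        params["Limit"] = limit  # with the sort order, gives top or bottom n
    if count_only:
        params["Select"] = "COUNT"  # return a count instead of the items
    return params


# Ten most recent posts for one user, and all of that user's posts in a date range.
latest = build_query("posts", "userid", "@mza", descending=True, limit=10)
september = build_query("posts", "userid", "@mza", range_name="postdate",
                        between=("2013-09-01", "2013-09-30"))

# Assuming boto3 is installed and credentials are configured:
#   import boto3
#   response = boto3.client("dynamodb").query(**latest)
```

For paged responses, a response that includes `LastEvaluatedKey` can be continued by passing that value as `ExclusiveStartKey` on the next call.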

Range Keys – Query and Efficient Reads
• Query treats all items as a single read operation
  - Items share the same hash key = same partition
  - By contrast, BatchGetItem reads each item in the batch separately

• Example
  - Read 100 items in a table, all of which share the same hash key
  - Each item is 120 bytes in size

Operation       RCU Consumed
Query           3
BatchGetItem    100

note: read capacity units are 4K in size
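The arithmetic behind the two figures, as a small sketch. It assumes strongly consistent reads and the 4 KB unit size from the note; the function names are made up for the example.

```python
import math

RCU_BYTES = 4 * 1024  # one read capacity unit covers up to 4 KB


def query_rcus(item_count, item_bytes):
    """Query adds up the item sizes and rounds up once for the whole result."""
    return math.ceil(item_count * item_bytes / RCU_BYTES)


def batch_get_rcus(item_count, item_bytes):
    """BatchGetItem rounds each item up to a full unit separately."""
    return item_count * math.ceil(item_bytes / RCU_BYTES)


query_cost = query_rcus(100, 120)      # 12,000 bytes in total, rounded up to 3 units
batch_cost = batch_get_rcus(100, 120)  # 100 items at 1 unit each
```

The 100 small items total 12,000 bytes, which fits in three 4 KB units when read together, but each 120-byte item costs a full unit when read on its own.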

Range Keys – Local Secondary Indexes
• Designed for high scale multi-tenant applications
• Index local to the hash key (= partition)
• Up to 5 indexes with no performance degradation
• LSIs are sparse objects

UserGamesIdx index
Hash Key        Range Key                 Projected Attributes
UserId = bob    LastPlayed = 2013-02-11   GameId = Game1
UserId = fred   LastPlayed = 2013-05-19   GameId = Game2
UserId = bob    LastPlayed = 2012-11-07   GameId = Game3

UserGames table
Hash Key        Range Key        Attributes
UserId = bob    GameId = Game1   HighScore = 10500, ScoreDate = 2011-10-20, LastPlayed = 2013-02-11
UserId = fred   GameId = Game2   HighScore = 12000, ScoreDate = 2012-01-10, LastPlayed = 2013-05-19
UserId = bob    GameId = Game3   HighScore = 20000, ScoreDate = 2012-02-12, LastPlayed = 2012-11-07
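One way the UserGames table and its UserGamesIdx index could be declared, sketched as CreateTable parameters for the boto3 low-level client (an assumption; the slides do not name an SDK). The string attribute types, the `KEYS_ONLY` projection, and the capacity values are assumptions made for the example.

```python
def user_games_table_definition(read_units=5, write_units=5):
    """Build CreateTable parameters for UserGames with one local secondary index."""
    return {
        "TableName": "UserGames",
        # Only key attributes are declared; all other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": "UserId", "AttributeType": "S"},
            {"AttributeName": "GameId", "AttributeType": "S"},
            {"AttributeName": "LastPlayed", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},
            {"AttributeName": "GameId", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "UserGamesIdx",
                # Same hash key as the table, alternate range key.
                "KeySchema": [
                    {"AttributeName": "UserId", "KeyType": "HASH"},
                    {"AttributeName": "LastPlayed", "KeyType": "RANGE"},
                ],
                # Table keys (including GameId) are always projected into the index.
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


definition = user_games_table_definition()

# Assuming boto3 is installed and credentials are configured:
#   import boto3
#   boto3.client("dynamodb").create_table(**definition)
```

A Query against the index passes `IndexName="UserGamesIdx"` and a key condition on UserId and LastPlayed, returning a user's games ordered by when they were last played.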

Use Libraries and Tools
Transactions
  - Atomic transactions across multiple items & tables
  - Tracks status of ongoing transactions via two tables
    1. Transactions
    2. Pre-transaction snapshots of modified items

Geolocation
  - Add location awareness to mobile applications
  - Find Yourself – sample app
https://github.com/awslabs

Use Libraries and Tools
Community Contributions

Develop and Test Locally – DynamoDB Local
• Disconnected development with full API support
No network
No usage costs

Note! DynamoDB Local does not
have a durability or availability SLA
DynamoDB
Local

m2.4xlarge

do this instead!

Develop and Test Locally – DynamoDB Local
Some minor differences from Amazon DynamoDB
• DynamoDB Local ignores your provisioned throughput settings
  - The values that you specify when you call CreateTable and UpdateTable have no effect

• DynamoDB Local does not throttle read or write activity
• The values that you supply for the AWS access key and the Region are only used to name the database file
• Your AWS secret key is ignored but must be specified
  - A dummy string of characters is recommended
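A sketch of pointing a client at DynamoDB Local, written as keyword arguments for `boto3.client` (an assumption; the slides do not name an SDK). Port 8000 is DynamoDB Local's default; the helper name, region, and dummy credential strings are placeholders chosen for the example.

```python
def local_client_kwargs(port=8000, region="us-west-2", access_key="dummy"):
    """Build boto3 client arguments that target DynamoDB Local instead of AWS."""
    return {
        "endpoint_url": f"http://localhost:{port}",
        # Access key and region only influence the local database file name.
        "region_name": region,
        "aws_access_key_id": access_key,
        # Ignored by DynamoDB Local, but must still be supplied.
        "aws_secret_access_key": "dummy",
    }


kwargs = local_client_kwargs()

# Assuming boto3 is installed and DynamoDB Local is running on this machine:
#   import boto3
#   client = boto3.client("dynamodb", **kwargs)
#   client.list_tables()
```

Switching the same code to the hosted service is then a matter of dropping `endpoint_url` and supplying real credentials.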

Develop and Test Locally
Additional
Options

Faster Development
Customer Experiences
“Since we had such a short time frame to build
Digg Reader we had to lean heavily on some of
the hosted AWS services, like DynamoDB,
versus rolling our own.” – Digg CTO Mike Young

“If we used a different product we would have spent a lot of development
time to reach parity with DynamoDB instead of developing our business.”
– Peter Bogunovich, Software Engineer, RightAction, Inc.

automated operations

=

predictable performance

database service

durable low latency

cost effective

Automated Operations
• As scalability increases, performance degrades
• Substantial effort is required to sustain high performance

Provision / Configure
Servers and Storage

Monitor and Handle
Hardware Failures

Repartition Data
and Balance Clusters

Update Hardware
and Software

Manage Cross-Availability
Zone Replication

Predictable Performance
Provisioned Throughput
• Request-based capacity provisioning model
• Throughput is declared and updated via the API or the console
  - CreateTable (foo, reads/sec = 100, writes/sec = 150)
  - UpdateTable (foo, reads/sec = 10000, writes/sec = 4500)

• DynamoDB handles the rest
  - Capacity is reserved and available when needed
  - Scaling-up triggers repartitioning and reallocation
  - No impact to performance or availability
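The two pseudo-calls above might translate into request parameters like these, sketched for the boto3 low-level client (an assumption; the slides do not name an SDK). The hash key name `id` and its string type are hypothetical, since the slide does not give a key schema for `foo`.

```python
def provisioned_throughput(reads_per_sec, writes_per_sec):
    """Express reads/sec and writes/sec as provisioned capacity units."""
    return {"ReadCapacityUnits": reads_per_sec, "WriteCapacityUnits": writes_per_sec}


def create_table_params(table, hash_name, reads_per_sec, writes_per_sec):
    """Build CreateTable parameters for a hash-only table with a string key."""
    return {
        "TableName": table,
        "AttributeDefinitions": [{"AttributeName": hash_name, "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": hash_name, "KeyType": "HASH"}],
        "ProvisionedThroughput": provisioned_throughput(reads_per_sec, writes_per_sec),
    }


def update_table_params(table, reads_per_sec, writes_per_sec):
    """Build UpdateTable parameters that change only the provisioned throughput."""
    return {
        "TableName": table,
        "ProvisionedThroughput": provisioned_throughput(reads_per_sec, writes_per_sec),
    }


create_foo = create_table_params("foo", "id", 100, 150)
scale_foo = update_table_params("foo", 10000, 4500)

# Assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("dynamodb")
#   client.create_table(**create_foo)
#   client.update_table(**scale_foo)
```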

Durable Low Latency

WRITES
Continuously replicated to 3 AZs
Always consistent
Persisted to disk (custom SSD)

READS
Strongly or eventually consistent
No trade-off in latency

Durable Low Latency – At Scale

WRITES
Continuously replicated to 3 AZs
Always consistent
Persisted to disk (custom SSD)

READS
Strongly or eventually consistent
No trade-off in latency

efficient design
is cost effective

“Our previous NoSQL database required
almost a full-time administrator to run.
Now AWS takes care of it.”

agility = time

managed services
reduce effort

Experiment
Optimize

Recommended Resources
AWS Mobile Development Blog
http://mobile.awsblog.com
• Geo Library for Amazon DynamoDB (series)
• Amazon DynamoDB on Mobile (series)
DynamoDB Best Practices, How-Tos, and Tools
http://aws.amazon.com/dynamodb/resources
• Local development and testing tools
• Backup and archive
• Autoscale

Questions?
David Pearson
Business Development Manager
AWS Database Services
