SlideShare une entreprise Scribd logo
1  sur  64
Télécharger pour lire hors ligne
Amazon DynamoDB Deep Dive 
Matt Yanchyshyn: Solutions Architect, AWS 
Shekhar Deshkar: Chief Architect, Druva
Amazon DynamoDB Deep Dive 
• What’s new 
• JSON Document Support 
• Flexible access control 
• Partitions & Scaling 
• Optimizing Data Access Cost 
• Customer use case: Druva
Fast & flexible NoSQL database 
Predictable performance 
Seamless & massive scalability 
Fully managed; zero admin Amazon 
DynamoDB
What’s new? 
• JSON Document Support 
• Larger items: 400KB max item size 
• Additional Scaling Options 
• Expanded Free Tier: 25GB & 200M requests per month 
4
Amazon DynamoDB Deep Dive 
 What’s new 
• JSON Document Support 
• Flexible access control 
• Partitions & Scaling 
• Optimizing Data Access Cost 
• Customer use case: Druva
New Data Types 
• Scalars: Number, String, Binary, Boolean, Null 
• Multi-value: String Set, Number Set, Binary Set 
• Document: List and Map
New Data Types 
{ 
Day: "Monday", 
FavoriteDay: false, 
ItemsOnMyDesk: [ 
"Telephone", 
{ 
Pens: { Quantity : 3}, 
Pencils: { Quantity : 2}, 
Erasers: null 
} 
] 
} 
Map 
Boolean 
Nested List 
Map 
Null
Document Paths 
{ 
Day: "Monday", 
FavoriteDay: false, 
ItemsOnMyDesk: [ 
"Telephone", 
{ 
Pens: { Quantity : 3}, 
Pencils: { Quantity : 2}, 
Erasers: null 
} 
] 
} 
FavoriteDay 
ItemsOnMyDesk[0] 
ItemsOnMyDesk[1].Pens.Quantity
Example: AWS SDK for Java Document API 
Item item = new Item() 
.withPrimaryKey("Day", "Monday") 
.withJSON("document", json); 
table.putItem(item); 
Item documentItem = table.getItem(new GetItemSpec() 
.withPrimaryKey("Day", ”Monday") 
.withAttributesToGet("document")); 
Item documentItem = table.getItem(new GetItemSpec() 
.withPrimaryKey("Day", "Monday") 
.withProjectionExpression("document.ItemsOnMyDesk[1].Pens.Quantity")); 
{ItemsOnMyDesk=[Telephone, 
{Erasers=null, Pencils={Quantity=2}, 
Pens={Quantity=3}}], 
FavoriteDay=false} 
{ItemsOnMyDesk=[{Pens= 
{Quantity=3}}]}
Example: AWS SDK for Java Document API 
Item item = new Item() 
.withPrimaryKey("Day", "Friday") 
.withMap("document", new ValueMap() 
.withBoolean("FavoriteDay", true) 
.withList("ItemsOnMyDesk", "Telephone", 
new ValueMap() 
.withMap("Pens", new ValueMap() 
.withInt("Quantity", 1)) 
.withMap("Pencils", new ValueMap() 
.withInt("Quantity", 0)) 
.withMap("Erasers", new ValueMap() 
.withInt("Quantity", 1)))); 
{ 
Day: "Monday", 
FavoriteDay: false, 
ItemsOnMyDesk: [ 
"Telephone", 
{ 
Pens: { Quantity : 3}, 
Pencils: { Quantity : 2}, 
Erasers: null 
} 
] 
}
Questions?
Amazon DynamoDB Deep Dive 
 What’s new 
 JSON Document Support 
• Flexible access control 
• Partitions & Scaling 
• Optimizing Data Access Cost 
• Customer use case: Druva
Flexible access control 
• Control access to individual 
items and attributes 
• Implement using AWS IAM 
roles that authenticated 
users assume 
• Leverage SAML or WIF
Flexible access control with AWS IAM 
... 
"Effect": "Allow", 
"Action": [ "dynamodb:Query”, "dynamodb:Scan” ], 
"Resource": "arn:aws:dynamodb:REGION:abcde12345:users/example", 
"Condition": { 
"ForAllValues:StringEquals": { 
"dynamodb:LeadingKeys": ["${saml:sub}"], 
"dynamodb:Attributes": [ "userId", "firstName", "lastName” ] 
}, 
"StringEqualsIfExists": { 
"dynamodb:Select": "SPECIFIC_ATTRIBUTES" 
} 
... 
Authenticated user ID / 
DynamoDB table key 
Item attributes that 
user can access 
DynamoDB API 
actions allowed
Questions? 
…and if you want to take DynamoDB’s flexible security 
model to the next level, check-out client-side encryption 
in Java with DynamoDBMapper: 
http://bit.ly/1zerqW2
Amazon DynamoDB Deep Dive 
 What’s new 
 JSON Document Support 
 Flexible access control 
• Partitions & Scaling 
• Optimizing Data Access Cost 
• Customer use case: Druva
Partitions & Scaling 
Votes Table 
Voter 
Candidate A 
Votes: 21 
Candidate B 
Votes: 30 
UpdateItem 
ADD 1 to “Candidate A” 
(aka Atomic Increment)
Partitions & Scaling 
Votes Table 
You 
Provision 1200 Write Capacity Units 
Need to scale 
for the election
Partitions & Scaling 
Votes Table 
Partition 1 Partition 2 
You 
600 Write Capacity Units (each) 
Provision 1200 Write Capacity Units
Partitions & Scaling 
Votes Table 
Partition 1 Partition 2 
You 
(no sharing) 
Provision 1200 Write Capacity Units
Partitions & Scaling 
Votes Table 
You 
Provision 200,000 Write Capacity Units 
Partition 1 
(600 WCU) 
Partition K 
(600 WCU) 
Partition M 
(600 WCU) 
Partition N 
(600 WCU)
Partitions & Scaling 
Votes Table 
Partition 1 
(600 WCU) 
Candidate A 
Partition K 
(600 WCU) 
Partition M 
(600 WCU) 
Partition N 
(600 WCU) 
Candidate B 
Voters
Scaling Writes 
Votes Table 
Candidate A_2 
Candidate B_1 
Candidate B_2 
Candidate B_3 
Candidate B_5 
Candidate B_4 
Candidate B_7 
Candidate B_6 
Candidate A_1 
Candidate A_3 
Candidate A_4 
Candidate A_7 Candidate B_8 
Voter 
UpdateItem: “CandidateA_” + rand(0, 10) 
ADD 1 to Votes 
Candidate A_6 Candidate A_8 
Candidate A_5
Scaling Writes 
Votes Table 
Candidate A_2 
Candidate B_1 
Candidate B_2 
Candidate B_3 
Candidate B_5 
Candidate B_4 
Candidate B_7 
Candidate B_6 
Candidate A_1 
Candidate A_3 
Candidate A_4 
Candidate A_5 
Candidate A_6 Candidate A_8 
Candidate A_7 Candidate B_8 
Periodic 
Process 
Candidate A 
Total: 2.5M 
1. Sum 
2. Store Voter
Scaling Reads 
Votes Table 
Partition 1 
(600 WCU) 
Candidate A_Total 
Votes: 2.5M 
Partition K 
(600 WCU) 
Partition M 
(600 WCU) 
Partition N 
(600 WCU) 
Candidate B_Total 
Votes: 2.1M 
Voters
Scaling Reads 
Votes Table 
Partition 1 
(600 WCU) 
Candidate A_Total 
Votes: 2.5M 
Partition K 
(600 WCU) 
Partition M 
(600 WCU) 
Partition N 
(600 WCU) 
Candidate B_Total 
Votes: 2.1M 
Voters
Scaling Best Practices 
• Design for uniform data access 
• Distribute write activity 
• Understand access patterns 
• Be careful not to over-provision
Questions?
Amazon DynamoDB Deep Dive 
 What’s new 
 JSON Document Support 
 Flexible access control 
 Partitions & Scaling 
• Optimizing Data Access Cost 
• Customer use case: Druva
Optimizing Query Cost
Provisioned Throughput Capacity Units 
• 1 Write Capacity Unit (WCU) = Up to 1 KB Item 
• 1 Read Capacity Unit (RCU) = Up to 4 KB of Items 
• 0.5 RCU = Up to 4 KB of Items (Eventually Consistent)
How much throughput do I need? 
• Investigate customer workload: 
– How many requests per second per hash key? 
– How much will those operations cost? 
• Investigate customer data size: 
– How big will the table be initially? 
– How large will the table grow? 
– Will the throughput grow proportionally?
Best practices: Query vs. BatchGetItem 
• Query treats all items as a single read operation 
– Items share the same hash key = same partition 
– BatchGetItem reads each item in the batch separately 
• Example 
– Read 100 items in a table, all with the same hash key. 
Each item is 120 bytes in size 
– Query RCU = 3 
– BatchGetItem RCU = 100
Filter cost calculation 
Query UserId=Alice, Filter where OpponentId=Bob 
UserId GameId Date OpponentId 
Carol e23f5a 2013-10-08 Charlie 
Alice d4e2dc 2013-10-01 Bob 
Alice e9cba3 2013-09-27 Charlie 
Alice f6a3bd 2013-10-08 Bob 
Filters do not reduce 
the throughput cost 
Query/Scan: 1 MB max 
returned per operation. 
Limit applies before 
filtering
Example: Query & Read Capacity Units 
Page Time Views 
index.html 2014-06-25T12:00 120 
index.html 2014-06-25T12:01 122 
index.html 2014-06-25T12:02 125 
search.html 2014-06-25T12:00 200 
Query PageViews 
WHERE Page=index.html 
AND Time >= 2014-06-25T12:00 
LIMIT 10 42 bytes each X 3 items 
= 126 bytes 
= 1 Read Capacity Unit
Query cost calculation 
UserId GameId Date OpponentId … 
Carol e23f5a 2013-10-08 Charlie … 
Alice d4e2dc 2013-10-01 Bob … 
Alice e9cba3 2013-09-27 Bob … 
Alice f6a3bd 2013-10-08 
(1 item = 600 bytes) 
(397 more games for Alice) 
400 X 600 / 1024 / 4 = 60 Read Capacity Units 
(bytes per item) (KB per byte) 
(Items evaluated by Query) (KB per Read Capacity Unit)
Local Secondary Indexes (LSI) 
• Same hash key + alternate (scalar) range key 
• Index and table data is co-located (same partition) 
• Read and write capacity units consumed from the table 
• Increases WCUs per write 
• 10GB max
A1 
(hash) 
A3 
(range) 
A2 
(table key) 
A1 
(hash) 
A2 
(range) 
A3 A4 A5 
LSIs 
A1 
(hash) 
A4 
(range) 
A2 
(table key) 
A3 
(projected) 
Table 
KEYS_ONLY 
INCLUDE A3 
A1 
(hash) 
A4 
(range) 
A2 
(table key) 
A3 
(projected) 
A5 
(projected) ALL 
Local Secondary Index Projections 
• Items with fewer projected attributes will be smaller: lower cost 
• Project with care: LSI queries auto-fetch non-projected attributes 
from the table …at a cost: index matches + table matches
Global Secondary Indexes (GSI) 
• Any attribute indexed as new hash and/or range key 
• GSIs get their own provisioned throughput 
• Asynchronously updated, so eventual consistency only
A5 
(hash) 
A3 
(range) 
A1 
(table key) 
A1 
(hash) 
A2 A3 A4 A5 
GSIs 
A5 
(hash) 
A4 
(range) 
A1 
(table key) 
A3 
(projected) 
Table 
KEYS_ONLY 
INCLUDE A3 
A4 
(hash) 
A5 
(range) 
A1 
(table key) 
A2 
(projected) 
A3 
(projected) ALL 
A2 
(hash) 
A1 
(table key) KEYS_ONLY 
Global Secondary Index Projections 
GSI queries can only 
request projected 
attributes
Reduce cost with Sparse Indexes 
• DynamoDB only writes to an index if the index key value 
is present in the item 
• Use sparse indexes to efficiently locate table items that 
have an uncommon attribute 
• Example: find hockey teams that have won the Stanley 
Cup 
– GSI Hash Key: StanleyCupWinner 
– GSI Range Key: TeamId
Optimizing Scans
What about non-indexed querying? 
• Sometimes you need to scan the full table, 
e.g. bulk export 
• Use Parallel Scan for faster retrieval, if… 
– Your table is >= 20 GB 
– Provisioned read throughput has room 
– Sequential Scan operations are too slow 
• Consider cost and performance implications
Bulk Export: Scan 
Your Table 
Partition 1 Partition 2 Partition 3 Partition 4 
• Slow: only uses one 
partition at a time 
• Non-uniform utilization 
• Consumes lots of 
RCUs
Bulk Export: Parallel Scan 
Your Table 
Partition 1 Partition 2 Partition 3 Partition 4 
• Divide the work 
• Use threads, non-blocking 
IO, processes, 
or multiple servers to 
distribute the work 
• Can still consume lots of 
RCUs!
Bulk Export: Rate-limited Parallel Scan 
Your Table 
Partition 1 Partition 2 Partition 3 Partition 4 
• Limit page size and rate 
(1 MB default page size 
/ 4 KB item size) 
/ 2 eventually consistent reads 
= 128 RCUs per page (default) 
• Rate-limit each segment
Many best practices are implemented already 
• Amazon Elastic Map Reduce 
– dynamodb.throughput.read/write.percent 
• Amazon Redshift 
– readratio 
• DynamoDB Mapper (Java SDK) 
• RateLimiter library (Google Guava)
Questions?
Amazon DynamoDB Deep Dive 
 What’s new 
 JSON Document Support 
 Flexible access control 
 Partitions & Scaling 
 Optimizing Data Access Cost 
• Customer use case: Druva
Shekhar Deshkar 
Chief Architect, Druva
Reinventing Data Protection for a Mobile World 
• Druva inSync Ranked #1 by Gartner, 2 successive years 
• Top amongst 7 products against 11 critical capabilities 
• Protect & govern data at the edge of enterprises 
– Extend eDiscovery & corporate data governance to endpoints 
– Data loss prevention in case of lost devices 
• Sync-share for enterprise-wide collaboration 
• Radically simple Cloud backup and Archival 
– Eliminates Complexity of Legacy Backup and Tape Vaulting
Data management Challenges in a Mobile World 
• Cost of managing ever-growing number of 
endpoints or remote servers 
• Controlling associated data growth 
– Should I backup everything? 
– Did I miss backing up something critical? 
• Cloud? 
– How do I manage prime-time WAN bandwidth
Data backup for endpoints and at the Edge 
• Backup: Data & metadata 
• Incremental backup 
• Data deduplication 
– Global 
– Source-side
Druva Storage nCube Architecture 
• Distributed Cloud File System 
• Store: Object storage, single instance, self-healing 
• Organize: Time-indexed views, metadata server 
• Serve: Uncompromised throughput 
• End-to-end Security 
• At the ends: 2-factor encryption; At device 
• Everything in between: Data in transit, Data access 
• Designed for the Edge 
• Intelligent, global de-duplication at the edge 
• Flexible deployment 
• Why we moved away from Cassandra: 
http://aws.amazon.com/solutions/case-studies/druva/
Druva Storage on AWS
NoSQL Database & Distributed Systems 
Scenario: Collaborators trying to create same file in shared work-space 
• Need to simulate atomic operations that ensures only one of them succeeds 
• Each collaborator creates an object with a unique temporary name 
• Perform atomic rename (from unique temporary to possibly-conflicting 
name) to ensure only one succeeds. 
item = self.db.new_item(self.DRSTK_FOLDER_NAMEVER, 
key, {self.AT_DIRENT: val}) 
try: 
item.put(expected_value={self.AT_DIRENT: False}) 
except adb.ConditionCheckFailedError: 
# someone else raced against us and 
# already created a folder entry we’re hoping to create
Transactional semantics for References 
• How do we implement transaction-like semantics 
while updating two independent DynamoDB items? 
• Thread #A is trying to add reference to an existing 
item, stored as an object in S3, has checksum 
‘csum’, and is identified by ‘handle’ 
• Thread #B, is in process of dropping it’s reference to 
the same object and if no more references, removes 
the item and associated S3 object too.
Transactional semantics for References 
• Thread A: Optimistically, add a new reference to the handle 
hkey, rkey = self.handle2key(handle) 
item = self.db.new_item(self.id_ref + hkey, rkey + reverselink,{}) 
item.put() 
chkey, crkey = self.sum2key(csum) 
try: 
citem = self.db.get_item(chkey, crkey + rkey) 
except adb.KeyNotFoundError: 
# we lost race & handle vanished, drop our reference 
item.delete() 
else: # success 
Label #1 thread B 
may be forced to 
take Label #5 path 
Label #2
Transactional semantics for References 
• Thread B Drop a reference to the handle 
item = self.db.new_item(self.id_ref + hkey, rkey + reverselink) 
item.delete() 
items = self.db.query(self.id_ref + hkey, query, limit=1) 
try: 
item = items.next() 
except StopIteration: # no more references to the handle 
else: # Additional references exist, can't purge 
return 0 
# No one possibly using this block 
chkey, crkey = self.sum2key(csum) 
item = self.db.new_item(chkey, crkey + rkey) 
item.delete() 
items = self.db.query(self.id_ref + hkey, query, limit=1) 
try: 
item = items.next() 
except StopIteration: # safe to purge the block from S3 now 
else: # Someone is now using this. Don't drop block Label #5 
Label #4 
Label #3 thread A 
may be forced to 
take Label #2 path
Designing for Scale-Out Performance 
• DynamoDB hot partitions 
• Lock-free concurrent access 
– Minimize shared state 
– Identify and eliminate false-sharing 
– Distinct hash keys 
• Relax semantics for shared state 
– Shard hash key to eliminate DynamoDB hotspots 
– Collate shard state to determine global state
Provisioned Throughput Optimization 
• Application IOPs to sustain 
– DynamoDB IOPs needed to support it 
• Shield applications from Throttling errors 
• Optimize Provisioned Throughput based on History
Questions?
Poll
Thank you! 
https://aws.amazon.com/dynamodb/developer-resources/ 
http://druva.com/

Contenu connexe

Tendances

(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014Amazon Web Services
 
DynamoDB In-depth & Developer Drill Down
DynamoDB In-depth & Developer Drill Down DynamoDB In-depth & Developer Drill Down
DynamoDB In-depth & Developer Drill Down Amazon Web Services
 
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...Amazon Web Services
 
AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...
AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...
AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...Amazon Web Services
 
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...randyguck
 
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Amazon Web Services
 
Map reduce: beyond word count
Map reduce: beyond word countMap reduce: beyond word count
Map reduce: beyond word countJeff Patti
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryDatabricks
 
Cassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series Example
Cassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series ExampleCassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series Example
Cassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series ExampleDataStax Academy
 
Spark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. JyotiskaSpark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. JyotiskaSigmoid
 
Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14Julian Hyde
 
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data ModelingCassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data ModelingDataStax Academy
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionPatrick McFadin
 
Streaming SQL with Apache Calcite
Streaming SQL with Apache CalciteStreaming SQL with Apache Calcite
Streaming SQL with Apache CalciteJulian Hyde
 

Tendances (20)

Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Amazon DynamoDB Workshop
Amazon DynamoDB WorkshopAmazon DynamoDB Workshop
Amazon DynamoDB Workshop
 
DynamoDB Deep Dive
DynamoDB Deep DiveDynamoDB Deep Dive
DynamoDB Deep Dive
 
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
(BDT203) From Zero to NoSQL Hero: Amazon DynamoDB Tutorial | AWS re:Invent 2014
 
Amazon DynamoDB Design Workshop
Amazon DynamoDB Design WorkshopAmazon DynamoDB Design Workshop
Amazon DynamoDB Design Workshop
 
DynamoDB In-depth & Developer Drill Down
DynamoDB In-depth & Developer Drill Down DynamoDB In-depth & Developer Drill Down
DynamoDB In-depth & Developer Drill Down
 
DynamoDB Design Workshop
DynamoDB Design WorkshopDynamoDB Design Workshop
DynamoDB Design Workshop
 
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
 
AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...
AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...
AWS Webcast - Optimize your database for the cloud with DynamoDB – A Deep Div...
 
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
 
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
 
Map reduce: beyond word count
Map reduce: beyond word countMap reduce: beyond word count
Map reduce: beyond word count
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love Story
 
Cassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series Example
Cassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series ExampleCassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series Example
Cassandra Day Atlanta 2015: Data Modeling In-Depth: A Time Series Example
 
Indexed Hive
Indexed HiveIndexed Hive
Indexed Hive
 
Spark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. JyotiskaSpark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. Jyotiska
 
Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14
 
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data ModelingCassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
Cassandra Day SV 2014: Fundamentals of Apache Cassandra Data Modeling
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long version
 
Streaming SQL with Apache Calcite
Streaming SQL with Apache CalciteStreaming SQL with Apache Calcite
Streaming SQL with Apache Calcite
 

En vedette

AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)Amazon Web Services
 
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDBAmazon Web Services
 
ORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerDataWorks Summit
 
Scaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMRScaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMRIsrael AWS User Group
 
Masterclass Webinar: Amazon DynamoDB July 2014
Masterclass Webinar: Amazon DynamoDB July 2014Masterclass Webinar: Amazon DynamoDB July 2014
Masterclass Webinar: Amazon DynamoDB July 2014Amazon Web Services
 
Compression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of TradeoffsCompression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of TradeoffsDataWorks Summit
 
AWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAmazon Web Services
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Wojciech Biela
 
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...Amazon Web Services
 
Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...
Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...
Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...Amazon Web Services
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hoodAndriy Rymar
 
Amazon Aurora: Amazon’s New Relational Database Engine
Amazon Aurora: Amazon’s New Relational Database EngineAmazon Aurora: Amazon’s New Relational Database Engine
Amazon Aurora: Amazon’s New Relational Database EngineAmazon Web Services
 
(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep DiveAmazon Web Services
 
Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...
Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...
Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...Amazon Web Services
 
(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs
(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs
(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIsAmazon Web Services
 

En vedette (18)

AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
AWS re:Invent 2016: Deep Dive on Amazon DynamoDB (DAT304)
 
Deep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDBDeep Dive: Amazon DynamoDB
Deep Dive: Amazon DynamoDB
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
(GAM401) Build a Serverless Mobile Game w/ Cognito, Lambda & DynamoDB
 
ORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, Smaller
 
Scaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMRScaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMR
 
Masterclass Webinar: Amazon DynamoDB July 2014
Masterclass Webinar: Amazon DynamoDB July 2014Masterclass Webinar: Amazon DynamoDB July 2014
Masterclass Webinar: Amazon DynamoDB July 2014
 
Compression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of TradeoffsCompression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of Tradeoffs
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
AWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDBAWS Webcast - Build high-scale applications with Amazon DynamoDB
AWS Webcast - Build high-scale applications with Amazon DynamoDB
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.
 
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...
Amazon DynamoDB Design Patterns for Ultra-High Performance Apps (DAT304) | AW...
 
Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...
Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...
Securing Serverless Workloads with Cognito and API Gateway Part I - AWS Secur...
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hood
 
Amazon Aurora: Amazon’s New Relational Database Engine
Amazon Aurora: Amazon’s New Relational Database EngineAmazon Aurora: Amazon’s New Relational Database Engine
Amazon Aurora: Amazon’s New Relational Database Engine
 
(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive(DAT401) Amazon DynamoDB Deep Dive
(DAT401) Amazon DynamoDB Deep Dive
 
Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...
Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...
Securing Serverless Workloads with Cognito and API Gateway Part II - AWS Secu...
 
(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs
(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs
(DEV203) Amazon API Gateway & AWS Lambda to Build Secure APIs
 

Similaire à AWS Webcast - Dynamo DB

February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBAmazon Web Services
 
(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming(WRK302) Event-Driven Programming
(WRK302) Event-Driven ProgrammingAmazon Web Services
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSAmazon Web Services
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAmazon Web Services
 
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayGetting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayAmazon Web Services Korea
 
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016Amazon Web Services Korea
 
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...Amazon Web Services
 
Getting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBGetting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBAmazon Web Services
 
Amazon Dynamo DB for Developers (김일호) - AWS DB Day
Amazon Dynamo DB for Developers (김일호) - AWS DB DayAmazon Dynamo DB for Developers (김일호) - AWS DB Day
Amazon Dynamo DB for Developers (김일호) - AWS DB DayAmazon Web Services Korea
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDBAWS Germany
 
AWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDBAWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDBAmazon Web Services
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivAmazon Web Services
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBAmazon Web Services
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL ServicesAmazon Web Services
 

Similaire à AWS Webcast - Dynamo DB (20)

February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDB
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
AWS Data Collection & Storage
AWS Data Collection & StorageAWS Data Collection & Storage
AWS Data Collection & Storage
 
(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming(WRK302) Event-Driven Programming
(WRK302) Event-Driven Programming
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDBAWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
AWS December 2015 Webinar Series - Design Patterns using Amazon DynamoDB
 
Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討
 
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayGetting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
 
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
 
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billio...
 
Getting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBGetting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDB
 
Data Collection and Storage
Data Collection and StorageData Collection and Storage
Data Collection and Storage
 
Amazon Dynamo DB for Developers (김일호) - AWS DB Day
Amazon Dynamo DB for Developers (김일호) - AWS DB DayAmazon Dynamo DB for Developers (김일호) - AWS DB Day
Amazon Dynamo DB for Developers (김일호) - AWS DB Day
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDB
 
AWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDBAWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDB
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
SRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDBSRV404 Deep Dive on Amazon DynamoDB
SRV404 Deep Dive on Amazon DynamoDB
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
 

Plus de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Plus de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Dernier

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 

Dernier (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 

AWS Webcast - Dynamo DB

  • 1. Amazon DynamoDB Deep Dive Matt Yanchyshyn: Solutions Architect, AWS Shekhar Deshkar: Chief Architect, Druva
  • 2. Amazon DynamoDB Deep Dive • What’s new • JSON Document Support • Flexible access control • Partitions & Scaling • Optimizing Data Access Cost • Customer use case: Druva
  • 3. Fast & flexible NoSQL database Predictable performance Seamless & massive scalability Fully managed; zero admin Amazon DynamoDB
  • 4. What’s new? • JSON Document Support • Larger items: 400KB max item size • Additional Scaling Options • Expanded Free Tier: 25GB & 200M requests per month 4
  • 5. Amazon DynamoDB Deep Dive  What’s new • JSON Document Support • Flexible access control • Partitions & Scaling • Optimizing Data Access Cost • Customer use case: Druva
  • 6. New Data Types • Scalars: Number, String, Binary, Boolean, Null • Multi-value: String Set, Number Set, Binary Set • Document: List and Map
  • 7. New Data Types { Day: "Monday", FavoriteDay: false, ItemsOnMyDesk: [ "Telephone", { Pens: { Quantity : 3}, Pencils: { Quantity : 2}, Erasers: null } ] } Map Boolean Nested List Map Null
  • 8. Document Paths { Day: "Monday", FavoriteDay: false, ItemsOnMyDesk: [ "Telephone", { Pens: { Quantity : 3}, Pencils: { Quantity : 2}, Erasers: null } ] } FavoriteDay ItemsOnMyDesk[0] ItemsOnMyDesk[1].Pens.Quantity
  • 9. Example: AWS SDK for Java Document API Item item = new Item() .withPrimaryKey("Day", "Monday") .withJSON("document", json); table.putItem(item); Item documentItem = table.getItem(new GetItemSpec() .withPrimaryKey("Day", ”Monday") .withAttributesToGet("document")); Item documentItem = table.getItem(new GetItemSpec() .withPrimaryKey("Day", "Monday") .withProjectionExpression("document.ItemsOnMyDesk[1].Pens.Quantity")); {ItemsOnMyDesk=[Telephone, {Erasers=null, Pencils={Quantity=2}, Pens={Quantity=3}}], FavoriteDay=false} {ItemsOnMyDesk=[{Pens= {Quantity=3}}]}
  • 10. Example: AWS SDK for Java Document API Item item = new Item() .withPrimaryKey("Day", "Friday") .withMap("document", new ValueMap() .withBoolean("FavoriteDay", true) .withList("ItemsOnMyDesk", "Telephone", new ValueMap() .withMap("Pens", new ValueMap() .withInt("Quantity", 1)) .withMap("Pencils", new ValueMap() .withInt("Quantity", 0)) .withMap("Erasers", new ValueMap() .withInt("Quantity", 1)))); { Day: "Monday", FavoriteDay: false, ItemsOnMyDesk: [ "Telephone", { Pens: { Quantity : 3}, Pencils: { Quantity : 2}, Erasers: null } ] }
  • 12. Amazon DynamoDB Deep Dive  What’s new  JSON Document Support • Flexible access control • Partitions & Scaling • Optimizing Data Access Cost • Customer use case: Druva
  • 13. Flexible access control • Control access to individual items and attributes • Implement using AWS IAM roles that authenticated users assume • Leverage SAML or WIF
  • 14. Flexible access control with AWS IAM ... "Effect": "Allow", "Action": [ "dynamodb:Query”, "dynamodb:Scan” ], "Resource": "arn:aws:dynamodb:REGION:abcde12345:users/example", "Condition": { "ForAllValues:StringEquals": { "dynamodb:LeadingKeys": ["${saml:sub}"], "dynamodb:Attributes": [ "userId", "firstName", "lastName” ] }, "StringEqualsIfExists": { "dynamodb:Select": "SPECIFIC_ATTRIBUTES" } ... Authenticated user ID / DynamoDB table key Item attributes that user can access DynamoDB API actions allowed
  • 15. Questions? …and if you want to take DynamoDB’s flexible security model to the next level, check-out client-side encryption in Java with DynamoDBMapper: http://bit.ly/1zerqW2
  • 16. Amazon DynamoDB Deep Dive  What’s new  JSON Document Support  Flexible access control • Partitions & Scaling • Optimizing Data Access Cost • Customer use case: Druva
  • 17. Partitions & Scaling Votes Table Voter Candidate A Votes: 21 Candidate B Votes: 30 UpdateItem ADD 1 to “Candidate A” (aka Atomic Increment)
  • 18. Partitions & Scaling Votes Table You Provision 1200 Write Capacity Units Need to scale for the election
  • 19. Partitions & Scaling Votes Table Partition 1 Partition 2 You 600 Write Capacity Units (each) Provision 1200 Write Capacity Units
  • 20. Partitions & Scaling Votes Table Partition 1 Partition 2 You (no sharing) Provision 1200 Write Capacity Units
  • 21. Partitions & Scaling Votes Table You Provision 200,000 Write Capacity Units Partition 1 (600 WCU) Partition K (600 WCU) Partition M (600 WCU) Partition N (600 WCU)
  • 22. Partitions & Scaling Votes Table Partition 1 (600 WCU) Candidate A Partition K (600 WCU) Partition M (600 WCU) Partition N (600 WCU) Candidate B Voters
  • 23. Scaling Writes Votes Table Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_7 Candidate B_8 Voter UpdateItem: “CandidateA_” + rand(0, 10) ADD 1 to Votes Candidate A_6 Candidate A_8 Candidate A_5
  • 24. Scaling Writes Votes Table Candidate A_2 Candidate B_1 Candidate B_2 Candidate B_3 Candidate B_5 Candidate B_4 Candidate B_7 Candidate B_6 Candidate A_1 Candidate A_3 Candidate A_4 Candidate A_5 Candidate A_6 Candidate A_8 Candidate A_7 Candidate B_8 Periodic Process Candidate A Total: 2.5M 1. Sum 2. Store Voter
  • 25. Scaling Reads Votes Table Partition 1 (600 WCU) Candidate A_Total Votes: 2.5M Partition K (600 WCU) Partition M (600 WCU) Partition N (600 WCU) Candidate B_Total Votes: 2.1M Voters
  • 26. Scaling Reads Votes Table Partition 1 (600 WCU) Candidate A_Total Votes: 2.5M Partition K (600 WCU) Partition M (600 WCU) Partition N (600 WCU) Candidate B_Total Votes: 2.1M Voters
  • 27. Scaling Best Practices • Design for uniform data access • Distribute write activity • Understand access patterns • Be careful not to over-provision
  • 29. Amazon DynamoDB Deep Dive  What’s new  JSON Document Support  Flexible access control  Partitions & Scaling • Optimizing Data Access Cost • Customer use case: Druva
  • 31. Provisioned Throughput Capacity Units • 1 Write Capacity Unit (WCU) = Up to 1 KB Item • 1 Read Capacity Unit (RCU) = Up to 4 KB of Items • 0.5 RCU = Up to 4 KB of Items (Eventually Consistent)
  • 32. How much throughput do I need? • Investigate customer workload: – How many requests per second per hash key? – How much will those operations cost? • Investigate customer data size: – How big will the table be initially? – How large will the table grow? – Will the throughput grow proportionally?
  • 33. Best practices: Query vs. BatchGetItem • Query treats all items as a single read operation – Items share the same hash key = same partition – BatchGetItem reads each item in the batch separately • Example – Read 100 items in a table, all with the same hash key. Each item is 120 bytes in size – Query RCU = 3 – BatchGetItem RCU = 100
  • 34. Filter cost calculation Query UserId=Alice, Filter where OpponentId=Bob UserId GameId Date OpponentId Carol e23f5a 2013-10-08 Charlie Alice d4e2dc 2013-10-01 Bob Alice e9cba3 2013-09-27 Charlie Alice f6a3bd 2013-10-08 Bob Filters do not reduce the throughput cost Query/Scan: 1 MB max returned per operation. Limit applies before filtering
  • 35. Example: Query & Read Capacity Units Page Time Views index.html 2014-06-25T12:00 120 index.html 2014-06-25T12:01 122 index.html 2014-06-25T12:02 125 search.html 2014-06-25T12:00 200 Query PageViews WHERE Page=index.html AND Time >= 2014-06-25T12:00 LIMIT 10 42 bytes each X 3 items = 126 bytes = 1 Read Capacity Unit
  • 36. Query cost calculation UserId GameId Date OpponentId … Carol e23f5a 2013-10-08 Charlie … Alice d4e2dc 2013-10-01 Bob … Alice e9cba3 2013-09-27 Bob … Alice f6a3bd 2013-10-08 (1 item = 600 bytes) (397 more games for Alice) 400 X 600 / 1024 / 4 = 60 Read Capacity Units (bytes per item) (KB per byte) (Items evaluated by Query) (KB per Read Capacity Unit)
  • 37. Local Secondary Indexes (LSI) • Same hash key + alternate (scalar) range key • Index and table data is co-located (same partition) • Read and write capacity units consumed from the table • Increases WCUs per write • 10GB max
  • 38. A1 (hash) A3 (range) A2 (table key) A1 (hash) A2 (range) A3 A4 A5 LSIs A1 (hash) A4 (range) A2 (table key) A3 (projected) Table KEYS_ONLY INCLUDE A3 A1 (hash) A4 (range) A2 (table key) A3 (projected) A5 (projected) ALL Local Secondary Index Projections • Items with fewer projected attributes will be smaller: lower cost • Project with care: LSI queries auto-fetch non-projected attributes from the table …at a cost: index matches + table matches
  • 39. Global Secondary Indexes (GSI) • Any attribute indexed as new hash and/or range key • GSIs get their own provisioned throughput • Asynchronously updated, so eventual consistency only
  • 40. A5 (hash) A3 (range) A1 (table key) A1 (hash) A2 A3 A4 A5 GSIs A5 (hash) A4 (range) A1 (table key) A3 (projected) Table KEYS_ONLY INCLUDE A3 A4 (hash) A5 (range) A1 (table key) A2 (projected) A3 (projected) ALL A2 (hash) A1 (table key) KEYS_ONLY Global Secondary Index Projections GSI queries can only request projected attributes
  • 41. Reduce cost with Sparse Indexes • DynamoDB only writes to an index if the index key value is present in the item • Use sparse indexes to efficiently locate table items that have an uncommon attribute • Example: find hockey teams that have won the Stanley Cup – GSI Hash Key: StanleyCupWinner – GSI Range Key: TeamId
  • 43. What about non-indexed querying? • Sometimes you need to scan the full table, e.g. bulk export • Use Parallel Scan for faster retrieval, if… – Your table is >= 20 GB – Provisioned read throughput has room – Sequential Scan operations are too slow • Consider cost and performance implications
  • 44. Bulk Export: Scan Your Table Partition 1 Partition 2 Partition 3 Partition 4 • Slow: only uses one partition at a time • Non-uniform utilization • Consumes lots of RCUs
  • 45. Bulk Export: Parallel Scan Your Table Partition 1 Partition 2 Partition 3 Partition 4 • Divide the work • Use threads, non-blocking IO, processes, or multiple servers to distribute the work • Can still consume lots of RCUs!
  • 46. Bulk Export: Rate-limited Parallel Scan Your Table Partition 1 Partition 2 Partition 3 Partition 4 • Limit page size and rate (1 MB default page size / 4 KB item size) / 2 eventually consistent reads = 128 RCUs per page (default) • Rate-limit each segment
  • 47. Many best practices are implemented already • Amazon Elastic Map Reduce – dynamodb.throughput.read/write.percent • Amazon Redshift – readratio • DynamoDB Mapper (Java SDK) • RateLimiter library (Google Guava)
  • 49. Amazon DynamoDB Deep Dive  What’s new  JSON Document Support  Flexible access control  Partitions & Scaling  Optimizing Data Access Cost • Customer use case: Druva
  • 50. Shekhar Deshkar Chief Architect, Druva
  • 51. Reinventing Data Protection for a Mobile World • Druva inSync Ranked #1 by Gartner, 2 successive years • Top amongst 7 products against 11 critical capabilities • Protect & govern data at the edge of enterprises – Extend eDiscovery & corporate data governance to endpoints – Data loss prevention in case of lost devices • Sync-share for enterprise-wide collaboration • Radically simple Cloud backup and Archival – Eliminates Complexity of Legacy Backup and Tape Vaulting
  • 52. Data management Challenges in a Mobile World • Cost of managing ever-growing number of endpoints or remote servers • Controlling associated data growth – Should I backup everything? – Did I miss backing up something critical? • Cloud? – How do I manage prime-time WAN bandwidth
  • 53. Data backup for endpoints and at the Edge • Backup: Data & metadata • Incremental backup • Data deduplication – Global – Source-side
  • 54. Druva Storage nCube Architecture • Distributed Cloud File System • Store: Object storage, single instance, self-healing • Organize: Time-indexed views, metadata server • Serve: Uncompromised throughput • End-to-end Security • At the ends: 2-factor encryption; At device • Everything in between: Data in transit, Data access • Designed for the Edge • Intelligent, global de-duplication at the edge • Flexible deployment • Why we moved away from Cassandra: http://aws.amazon.com/solutions/case-studies/druva/
  • 56. NoSQL Database & Distributed Systems Scenario: Collaborators trying to create same file in shared work-space • Need to simulate atomic operations that ensures only one of them succeeds • Each collaborator creates an object with a unique temporary name • Perform atomic rename (from unique temporary to possibly-conflicting name) to ensure only one succeeds. item = self.db.new_item(self.DRSTK_FOLDER_NAMEVER, key, {self.AT_DIRENT: val}) try: item.put(expected_value={self.AT_DIRENT: False}) except adb.ConditionCheckFailedError: # someone else raced against us and # already created a folder entry we’re hoping to create
  • 57. Transactional semantics for References • How do we implement transaction-like semantics while updating two independent DynamoDB items? • Thread #A is trying to add reference to an existing item, stored as an object in S3, has checksum ‘csum’, and is identified by ‘handle’ • Thread #B, is in process of dropping it’s reference to the same object and if no more references, removes the item and associated S3 object too.
  • 58. Transactional semantics for References • Thread A: Optimistically, add a new reference to the handle hkey, rkey = self.handle2key(handle) item = self.db.new_item(self.id_ref + hkey, rkey + reverselink,{}) item.put() chkey, crkey = self.sum2key(csum) try: citem = self.db.get_item(chkey, crkey + rkey) except adb.KeyNotFoundError: # we lost race & handle vanished, drop our reference item.delete() else: # success Label #1 thread B may be forced to take Label #5 path Label #2
  • 59. Transactional semantics for References • Thread B Drop a reference to the handle item = self.db.new_item(self.id_ref + hkey, rkey + reverselink) item.delete() items = self.db.query(self.id_ref + hkey, query, limit=1) try: item = items.next() except StopIteration: # no more references to the handle else: # Additional references exist, can't purge return 0 # No one possibly using this block chkey, crkey = self.sum2key(csum) item = self.db.new_item(chkey, crkey + rkey) item.delete() items = self.db.query(self.id_ref + hkey, query, limit=1) try: item = items.next() except StopIteration: # safe to purge the block from S3 now else: # Someone is now using this. Don't drop block Label #5 Label #4 Label #3 thread A may be forced to take Label #2 path
  • 60. Designing for Scale-Out Performance • DynamoDB hot partitions • Lock-free concurrent access – Minimize shared state – Identify and eliminate false-sharing – Distinct hash keys • Relax semantics for shared state – Shard hash key to eliminate DynamoDB hotspots – Collate shard state to determine global state
  • 61. Provisioned Throughput Optimization • Application IOPs to sustain – DynamoDB IOPs needed to support it • Shield applications from Throttling errors • Optimize Provisioned Throughput based on History
  • 63. Poll