Maximizing Amazon S3 Performance
Craig Carl, AWS
November 15, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Trillions
Of Unique Customer Objects
1.5 Million+
Peak Transactions Per Second
Architecture
Optimizing PUTs
Choosing a region
Multipart upload
Building a naming scheme
Considering LISTs
Optimizing GETs
Using CloudFront
Range-based GETs
Choosing a Region
• Performance
– Proximity to your users
– Co-locating with compute, other AWS resources

• Other things to think about
– Legal and regulatory requirements
– Costs vary by region
Pay Attention to Your Naming Scheme If:
• You want consistent performance from a bucket
• You want a bucket capable of routinely
exceeding 100 TPS
http://amzn.to/18oF5LC
Transactions Per Second (TPS)

100/8 = 12.5 events/sec
100,000 users @ 10 events an hour = 224 TPS
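
The arithmetic behind a TPS estimate can be sketched as follows. This is a minimal illustration, not something from the slides: the function name is hypothetical, the load is assumed to be spread evenly, and "100/8" is read here as 100 events over 8 seconds. Under those assumptions a straight average of the 100,000-user example comes to roughly 278 TPS, so the 224 TPS figure above presumably rests on assumptions the slide does not state.

```python
def average_tps(events: float, seconds: float) -> float:
    """Average transactions per second, assuming load is spread evenly."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return events / seconds


# 100 events over 8 seconds (assumed reading of the slide's "100/8").
print(average_tps(100, 8))  # 12.5

# 100,000 users, each producing 10 events per hour, spread evenly (assumption).
print(round(average_tps(100_000 * 10, 3600), 1))  # 277.8
```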
Distributing Key Names
• Don’t do this
<my_bucket>/2013_11_13-164533125.jpg
<my_bucket>/2013_11_13-051033564.jpg
<my_bucket>/2013_11_13-061133789.jpg
<my_bucket>/2013_11_13-051033458.jpg
<my_bucket>/2013_11_12-063433125.jpg
<my_bucket>/2013_11_12-021033564.jpg
<my_bucket>/2013_11_12-065533789.jpg
<my_bucket>/2013_11_12-011033458.jpg
<my_bucket>/2013_11_11-022333125.jpg
<my_bucket>/2013_11_11-153433564.jpg
<my_bucket>/2013_11_11-065233789.jpg
<my_bucket>/2013_11_11-065633458.jpg
Distributing Key Names
• Add randomness to the beginning of the key name
<my_bucket>/521335461-2013_11_13.jpg
<my_bucket>/465330151-2013_11_13.jpg
<my_bucket>/987331160-2013_11_13.jpg
<my_bucket>/465765461-2013_11_13.jpg
<my_bucket>/125631151-2013_11_13.jpg
<my_bucket>/934563160-2013_11_13.jpg
<my_bucket>/532132341-2013_11_13.jpg
<my_bucket>/565437681-2013_11_13.jpg
<my_bucket>/234567460-2013_11_13.jpg
<my_bucket>/456767561-2013_11_13.jpg
<my_bucket>/345565651-2013_11_13.jpg
<my_bucket>/431345660-2013_11_13.jpg
Other Techniques for Distributing Key Names
• Store objects as a hash of their name
– add the original name as metadata
• “deadmau5_mix.mp3” -> 0aa316fb000eae52921aab1b4697424958a53ad9
– watch for duplicate names!

– prepend keyname with short hash
• 0aa3-deadmau5_mix.mp3

• Epoch time (reverse)
– 5321354831-deadmau5_mix.mp3
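
The three techniques above might be sketched like this. The slides do not name a hash function; SHA-1 is assumed here only because the example digest is 40 hex digits long, and all function names are hypothetical. "Reverse" epoch time is read as reversing the digits of the timestamp, which is consistent with the slide's example.

```python
import hashlib


def hashed_key(name: str) -> str:
    """Full hex digest of the name; keep the original name as object metadata."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


def short_hash_key(name: str, length: int = 4) -> str:
    """Prepend the first few hex digits of the hash to the original name."""
    return f"{hashed_key(name)[:length]}-{name}"


def reversed_epoch_key(name: str, epoch_seconds: int) -> str:
    """Prepend the epoch timestamp with its digits reversed, so the
    fastest-changing digit comes first."""
    return f"{str(epoch_seconds)[::-1]}-{name}"


print(reversed_epoch_key("deadmau5_mix.mp3", 1384531235))
# 5321354831-deadmau5_mix.mp3
```

Because `hashed_key` depends only on the name, two objects with the same name map to the same key, which is the duplicate-name hazard the slide warns about.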
Randomness in a Key Name Can Be an Anti-Pattern

• Lifecycle policies
• LISTs with prefix filters
• Maintaining thumbnails of images
– craig.jpg -> stored as orig-09329jed0fc
– thumb-09329jed0fc

• When you need to recover a file with its original
name
Solving for the Anti-Pattern
• Add additional prefixes to help sorting
<my_bucket>/images/521335461-2013_11_13.jpg
<my_bucket>/images/465330151-2013_11_13.jpg
<my_bucket>/movies/293924440-2013_11_13.jpg
<my_bucket>/movies/987331160-2013_11_13.jpg
<my_bucket>/thumbs-small/838434842-2013_11_13.jpg
<my_bucket>/thumbs-small/342532454-2013_11_13.jpg
<my_bucket>/thumbs-small/345233453-2013_11_13.jpg
<my_bucket>/thumbs-small/345453454-2013_11_13.jpg

• Amazon S3 maintains keys lexicographically in its
internal indices
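
A small sketch of why the extra prefix helps: with keys ordered lexicographically, a category prefix keeps related objects adjacent and lets a prefix filter select them, while the hash after the prefix still spreads keys within the category. The key layout, the SHA-1 choice, and the function name are assumptions for illustration.

```python
import hashlib


def categorized_key(category: str, name: str, hash_len: int = 4) -> str:
    """Build a key of the form <category>/<short hash>-<name>."""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    return f"{category}/{digest[:hash_len]}-{name}"


keys = [
    categorized_key("thumbs-small", "craig.jpg"),
    categorized_key("images", "craig.jpg"),
    categorized_key("movies", "intro.mp4"),
    categorized_key("images", "logo.png"),
]

# Lexicographic order groups the keys by their category prefix.
ordered = sorted(keys)
for key in ordered:
    print(key)

# A prefix filter then selects a single category.
images = [k for k in ordered if k.startswith("images/")]
```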
Distributing Your Key Names Is Always a Good Idea!

It can take some time for improvements to manifest

Open a support case if you need an immediate bump
or if you’ve got any questions!

http://amzn.to/18oF5LC
Amazon CloudFront
Using Amazon CloudFront for Distribution
• Caches objects from Amazon S3
• Reduces the number of Amazon S3 GETs
• Low latency with multiple endpoints
• High transfer rate

• Two flavors:
– Web distribution (static content)
– RTMP distribution (on-demand streaming of media)
Multipart Upload Provides Parallelism
• Allows faster, more flexible uploads
• Allows you to upload a single object as a set of parts
• Upon upload, Amazon S3 then presents all parts as
a single object
• Enables parallel uploads, pausing and resuming
an object upload, and beginning uploads before
you know the total object size
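
A parallel multipart upload could look roughly like the sketch below. It assumes `s3` is a boto3 S3 client (for example `boto3.client("s3")`), which the slides do not mention; the function name, the default part size, and the worker count are illustrative. The whole payload is held in memory for simplicity, which a real uploader would likely avoid.

```python
from concurrent.futures import ThreadPoolExecutor


def multipart_upload(s3, bucket: str, key: str, data: bytes,
                     part_size: int = 25 * 1024 * 1024,
                     workers: int = 4) -> None:
    """Upload `data` as one object by sending fixed-size parts in parallel."""
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

    def send(part_number: int) -> dict:
        # Part numbers start at 1; each part is a slice of the payload.
        start = (part_number - 1) * part_size
        response = s3.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=part_number, Body=data[start:start + part_size],
        )
        return {"PartNumber": part_number, "ETag": response["ETag"]}

    part_count = max(1, -(-len(data) // part_size))  # ceiling division
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map preserves input order, so parts stay in ascending order.
            parts = list(pool.map(send, range(1, part_count + 1)))
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so parts that were already uploaded are not left behind.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

Passing the client in as a parameter keeps the function easy to exercise with a stand-in object before pointing it at a real bucket.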
Choose the Right Part Size
• Strike a balance between part size and number of parts
– Lots of small parts increase connection overhead, invalidating the benefits
of parallelism
– Too few large parts give you little of the benefit of multipart and little
resiliency to network errors

• We recommend parts of 25–50 MB on higher-bandwidth
networks and parts of 10 MB on mobile networks
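
The trade-off between part size and part count can be made concrete with a small planner. This is a sketch under stated assumptions: the function name is hypothetical, sizes are in MiB where the slide says MB, and the two constants reflect the S3 limits of a 5 MiB minimum for every part except the last and at most 10,000 parts per upload.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def plan_parts(total_size: int, part_size: int) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part of an object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    count = -(-total_size // part_size)  # ceiling division
    if count > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return [
        (n + 1, n * part_size, min(part_size, total_size - n * part_size))
        for n in range(count)
    ]


MIB = 1024 * 1024
parts = plan_parts(1024 * MIB, 25 * MIB)
print(len(parts))            # 41
print(parts[-1][2] // MIB)   # 24 (the last part is shorter)
```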
You Can Parallelize Your GETs, Too
• Use range-based GETs to get multithreaded
performance when downloading objects
• Compensates for unreliable networks
• Benefits of multithreaded parallelism
• Align your ranges with your parts!
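
Range-based GETs can be parallelized in much the same way as uploads. The sketch below assumes `s3` is a boto3 S3 client and that the object size is already known (for instance from a HEAD request); the function names are hypothetical. Choosing `part_size` equal to the size used at upload time keeps the ranges aligned with the parts.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(total_size: int, part_size: int) -> list[str]:
    """HTTP Range header values covering the object in part_size chunks."""
    return [
        f"bytes={start}-{min(start + part_size, total_size) - 1}"
        for start in range(0, total_size, part_size)
    ]


def parallel_get(s3, bucket: str, key: str, total_size: int,
                 part_size: int, workers: int = 4) -> bytes:
    """Download an object with concurrent ranged GETs and reassemble it."""
    def fetch(byte_range: str) -> bytes:
        response = s3.get_object(Bucket=bucket, Key=key, Range=byte_range)
        return response["Body"].read()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in request order, so joining is safe.
        return b"".join(pool.map(fetch, byte_ranges(total_size, part_size)))
```

A failed range can be retried on its own, which is where the resilience on unreliable networks comes from; retry logic is left out of this sketch.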
If you’re using SSL and parallelizing…
• You’re likely to become CPU-constrained
because encryption is CPU-intensive
• Amazon S3 recommends using AES-256 to
optimize for security and performance
• You can leverage AES-NI hardware on your host
to improve your performance
If Your Application Relies on LIST…
• Getting the objects your customers have stored
• Seeing sets of files (all animations, videos)
• Getting logs
• Viewing inventories
• Sorting keys based on metadata
What Should You Do?
• Parallelize LIST when you need a sequential list of
your keys
• You should build a secondary index of your keys,
such as with Amazon DynamoDB, to get a faster
alternative to LIST when a sequential list isn’t
sufficient
– Sorting by metadata
– Looking up by category
– Objects by time stamp
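
One way to parallelize LIST is to split the keyspace into disjoint prefixes and list them concurrently. The sketch assumes `s3` is a boto3 S3 client and uses its `list_objects_v2` call, which postdates this talk; the function names and the choice of prefixes are illustrative. With hash-prefixed keys, the sixteen hex digits would be a natural set of prefixes.

```python
from concurrent.futures import ThreadPoolExecutor


def list_prefix(s3, bucket: str, prefix: str) -> list[str]:
    """List every key under one prefix, following continuation tokens."""
    keys, kwargs = [], {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        keys.extend(item["Key"] for item in page.get("Contents", []))
        if not page.get("IsTruncated"):
            return keys
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def parallel_list(s3, bucket: str, prefixes: list[str],
                  workers: int = 8) -> list[str]:
    """List several disjoint prefixes concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = pool.map(lambda p: list_prefix(s3, bucket, p), prefixes)
        return sorted(key for page in pages for key in page)
```

The prefixes must not overlap and must together cover every key of interest, otherwise keys are duplicated or missed.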
LIST Operations with Amazon DynamoDB
• Maintain metadata in DynamoDB
– Keep data about what’s in your buckets in DynamoDB

• On PUTs, enter data about your objects in DynamoDB
• On GETs, use DynamoDB to assist in your search for
specific objects
• You can use DynamoDB to give you “LIST” based on
specific criteria
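
The PUT-then-index pattern might be sketched as below. It assumes boto3 S3 and DynamoDB clients and a hypothetical table whose partition key is `category` and whose sort key is `s3_key`; none of these names come from the slides. The two writes are not atomic, so a real system would need to handle a failure between them.

```python
def put_with_index(s3, dynamodb, bucket: str, table: str,
                   key: str, body: bytes, category: str) -> None:
    """Store the object in S3, then record its key and category in DynamoDB."""
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    dynamodb.put_item(
        TableName=table,
        Item={"category": {"S": category}, "s3_key": {"S": key}},
    )


def keys_in_category(dynamodb, table: str, category: str) -> list[str]:
    """Return the S3 keys recorded for one category, without calling LIST."""
    keys, kwargs = [], {
        "TableName": table,
        "KeyConditionExpression": "#c = :c",
        "ExpressionAttributeNames": {"#c": "category"},
        "ExpressionAttributeValues": {":c": {"S": category}},
    }
    while True:
        page = dynamodb.query(**kwargs)
        keys.extend(item["s3_key"]["S"] for item in page["Items"])
        if "LastEvaluatedKey" not in page:
            return keys
        # More results remain; continue from where the last page stopped.
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

Other lookups mentioned above, such as objects by time stamp, would follow the same shape with a different key schema or a secondary index on the table.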
Wrap up: Maximizing Amazon S3 Performance
Architecture

Optimizing PUTs

Choosing a region

Multipart upload

Building a naming scheme

Considering LISTs

Optimizing GETs
Using CloudFront
Range-based GETs
Please give us your feedback on this
presentation

STG304
As a thank you, we will select prize
winners daily for completed surveys!

Maximizing Amazon S3 Performance (STG304) | AWS re:Invent 2013