AWS Summit Auckland 2014 | Why Scale Matters and How the Cloud Really is Different

© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Why Scale Matters and How the Cloud Really Is Different
Rodney Haywood, Manager, Solutions Architecture, Amazon Web Services

Agenda
Redefining Scale at AWS
AWS Designed Hardware & Infrastructure
Multi-AZ Design Point & Why it Works

Perspective on Scaling
On average, AWS adds enough new server capacity every day to support Amazon’s global infrastructure when it was a $7B business (2004).

AWS Global Infrastructure
9 regions
25 availability zones
51 edge locations

Amazon S3 Growth
Chart: Total Number of S3 Objects, Q4 2006 to Q4 2013
Data labels in ascending order: 2.9 Billion, 14 Billion, 40 Billion, 102 Billion, 262 Billion, 762 Billion, >1.7 Trillion, >3 Trillion
Peak request annotations on the chart: 1.5M/sec and 2,000,000+ per second

DynamoDB: Eventual consistency
When 11 becomes 10!

DynamoDB: Requests Served/Month

DynamoDB: Consistent Performance at Scale

“AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers.”

Pace of Innovation
Infrastructure pace of innovation increasing
–  Driven by cloud service providers and high-scale internet applications
–  Cost of datacenter and H/W infrastructure dominates
–  Infrastructure more than just a cost center
High focus on innovation
–  Driving down cost
–  Increasing aggregate reliability
–  Reducing resource consumption footprint

AWS Custom Server Designs
OEM Server Ecosystem
–  Optimized for 10s to 100s of thousands of customers
–  Broadly applicable servers can run a variety of workloads
Cloud Server Ecosystem
–  Optimized for single customer
–  Highly specialized servers optimized for specific workload
–  Large scale deployments allow hardware specialization
–  Move hot s/w kernels to hardware implementations
–  Datacenters, servers, networking, & storage designed to an integrated spec

AWS Custom Storage Designs
Commercial high-density storage:
•  Quanta M4600H 4U Disk Enclosure
•  Impressive best in class general purpose design
•  We use custom design with still higher density
OEM storage & servers must target vast workload diversity
High scale supports AWS-specific optimizations
–  More space, power, & cost efficient

Networking Equipment
•  Relative cost of networking increasing quickly
•  Profit margins high
•  Ecosystem vertically integrated
Chart: Monthly Costs (3 year server & 10 year infrastructure amortization), with 8% called out

Get the Network Out of the Way
Current Networks Over-Subscribed
Mainframe Model Goes Commodity
•  Forces workload placement restrictions
•  Goal: Make all points in datacenter equidistant
•  Amazon custom routers & protocol stacks

Power Infrastructure
Negotiated power purchasing agreements
AWS custom high-voltage sub-stations in some regions
–  Lower power cost
–  Build faster

Procurement & Supply Chain Optimization
Procurement
Global demand allows purchasing power at volume
Direct component purchasing
–  Precise inventory control
–  Better pricing
–  Optimized designs
Supply Chain
Demand-driven supply chain
Shorter cycle time drives higher utilization
–  Predicting next week easier than 4 to 6 months out
Less overbuy & less capacity risk yielding lower costs

Utilization & Economics
Server Utilization Problem
–  On premise 30% utilization VERY good & 10% to 20% more common
–  Solution: Pool number of heterogeneous services
Pay as You Go, Pay as You Grow
–  Don’t block the business
–  Don’t over-buy
–  Transfers capital expense to variable expense
–  Apply capital for business investments rather than infrastructure
Chargeback Models Drive Good Behavior
–  Cost encourages prioritization of work by application developers
–  High scale needed to make a spot market for low priority work
Amazon Cycle of Innovation
15+ years of operational excellence
Cycle: Reduce Prices, Innovate, Listen to Customers, Lower Costs, Improve Processes, Re-invest in Features
42 AWS price reductions since 2006

AWS Pace of Innovation
Chart: New Service Announcements & Updates per year
2007: 9
2008: 24
2009: 48
2010: 61
2011: 82
2012: 159
2013: 280
2014: 88

Conventional Design: Cross-Region Replication
5th app availability “9” only via multi-datacenter replication
Conventional approach:
–  Two datacenters in distant locations
–  Replicate all data to both datacenters
The industry-wide dominant multi-DC availability approach
–  Looks rock solid but performs remarkably poorly in practice
Acid Test: Are you willing to pull the plug on the primary server?
99.999%
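To make the fifth "9" concrete, the availability percentage can be converted into allowed downtime per year. This is a minimal Python sketch; the function name is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year, in minutes, at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for nines in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(nines)
        print(f"{nines}% available: {minutes:.2f} minutes of downtime per year")
```

At 99.999% the budget is a little over five minutes per year, which leaves no room for a slow, manual fail-over decision.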

What is wrong with inter-regional replication?
Asynchronous replication between datacenters
–  Committing to an SSD takes on the order of 1 to 2 msec
–  LA to New York 74 msec round trip
On failure, a difficult & high skill decision:
–  Fail-over & lose transactions, or
–  Don’t fail-over & lose availability
I’ve been on these calls in the past
–  No win situation
–  Very hard to get right

What Else is Wrong with X-Country Replication?
Fragile: Active/Passive Doesn’t Work
–  Failover to a system that hasn’t been taking operational load
–  Passive secondary not recently tested
–  Secondary config or S/W version different, incorrect load balancer config, incorrect network ACLs, latent hardware problem, router problem, resource shortage under load
–  Can’t test without negative customer impact
–  If you don’t test it, it won’t work
2-Way Redundancy Expensive:
–  More than ½ capacity reserved to handle failure
–  3 datacenters much less expensive but impractical w/o high scale

AWS Multi-Availability Zone Model
Choose Region to be close to user, close to data, or meeting jurisdictional requirements
Synchronous replication to 2 (or better 3) Availability Zones
–  Easy when less than 2 to 3 msec away
–  Can failover w/o customer impact
ELB over EC2 instances in different AZs
Stateless EC2 apps easy
For persistent state use
–  DynamoDB
–  Simple Storage Service
–  Multi-AZ RDS

New Research: Customers Improve Availability by Migrating Apps to AWS
32% reduction in total application downtime
2013 AWS Customer Survey
Research Note: Benchmarking availability and reliability in the cloud: Amazon Web Services, Nucleus Research, November 2013, Document N168

Is Hosting On-premises Less Expensive?
Utilization fundamentally higher in cloud
–  Aggregating non-correlated workloads, scale, spot market
Amazon specific H/W designs
–  ODM acquisition of custom servers & net gear
–  Direct purchasing of disk, memory, & CPU
–  AWS controlled hypervisor & net protocol layers
Deep R&D: Many new data centers built each year
Immense scale
–  Volume purchasing, highly automated, specialists in all areas
Amazon margins are tiny compared to enterprise margins

Summary
AWS Economics driven by scale & singular focus
–  Economies of scale
–  Increased availability through multiple-datacenter deployment
–  Steadily declining price
Mega-scale advantages available to all customers regardless of size
–  Datacenter presence near all customers world-wide
–  Multiple datacenters in each region for high availability
–  Deeper R&D investment & operational focus in datacenter, server, storage, & networking than any IT organization in the world
–  Buying power that rivals the biggest in the world
Cloud Model Fundamentally different from the last 30 years
–  Even if rebranded as “cloud enabled”, “private cloud”, “cloud-like”
