Technical 101: AWS Innovation at Scale
This session gives an insider view of some of the innovations that help make the AWS Cloud unique. Rodney will show examples of AWS networking innovations, from the interregional network backbone, through custom routers and the networking protocol stack, all the way down to individual servers. He will show examples from AWS server hardware, storage, and power distribution and then, up the stack, in high-scale streaming data processing. He will also dive into fundamental database work AWS is delivering to open up scaling and performance limits, reduce costs, and eliminate much of the administrative burden of managing databases. Join this session and walk away with a deeper understanding of the underlying innovations powering the cloud.
Speaker: Rodney Haywood, Manager Solutions Architecture, Amazon Web Services
1. AWS Innovation at Scale
Learn about some of the underlying innovations that help make the AWS Cloud unique.
Rodney Haywood, Solutions Architecture Manager
Amazon Web Services
2. The Pace Quickens
• Industry generational changes rare
– Only when economics far superior
– Mainframes to UNIX Super Servers
– UNIX Super Servers to x86 Servers
• It’s happening again
– x86 on-premises servers to the cloud
– Past transitions have taken a decade+
– What’s different this time is speed of change
• Bigger customer gains drive faster industry transitions
4. Perspective on Scaling
Every day, AWS adds enough new server capacity
to support all of Amazon’s global infrastructure
when it was a $7B annual revenue enterprise
5. Get Networks Out of the Way
• Relative cost of networking increasing quickly
– Server & storage prices falling fast
– Network costs trending to dominate
• Networking frozen in time
– Vertically integrated ecosystem
– Indefensible profit margins
• AWS solution
– Custom net H/W & protocol stack
– Private long haul links
[Chart: monthly costs, with 3-year server & 10-year infrastructure amortization]
6. AWS Worldwide Network Backbone
• 11 AWS regions world-wide
• Compute & storage close to customers & users, or within required jurisdictional boundaries
• Private AWS fiber links interconnect all major regions
– Increased availability, higher performance, lower jitter, & reduced costs
7. Example AWS Region
[Diagram: a region with five AZs connected to two transit centers]
• 1 of 11 AWS regions world-wide
• Redundant paths to transit centers
• Transit centers connect to:
– Private links to other AWS regions
– Private links to AWS Direct Connect
customers
– Internet through peering & paid transit
• Metro-area DWDM links between AZs
• 82,864 fiber strands in region
• AZs <2ms apart & usually <1ms
• 25Tbps peak inter-AZs traffic
8. Why Does AWS Offer AZs?
[Diagram: AZs, transit centers, & a 20ms link]
• Asynchronous replication between distant data centers
• Committing to an SSD takes on the order of 1 to 2ms
• But, Sydney to Melbourne is ~20ms round trip
• You can’t wait 20ms to commit a transaction
• On failure, difficult & high skill decision
• Fail-over & lose transactions
• Or don’t fail-over & lose availability
• Difficult choice
• Use AZs for no-admin failover
• Sync works when <2ms
• Can be combined with regional replication
for very high availability
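The latency argument on this slide can be made concrete with a little arithmetic. The sketch below uses a deliberately simplified model, assuming a synchronous commit costs one local SSD commit plus one network round trip to the replica. The function name and the model are illustrative assumptions, not an AWS formula; the 1.5ms commit time is the midpoint of the slide's 1 to 2ms range, and the "<2ms apart" figure is treated as a round-trip upper bound.

```python
def sync_commit_ms(local_commit_ms: float, round_trip_ms: float) -> float:
    """Simplified model: commit locally, then wait one round trip for the replica's ack."""
    return local_commit_ms + round_trip_ms


LOCAL_SSD_COMMIT_MS = 1.5   # slide: SSD commit is on the order of 1 to 2ms (midpoint assumed)
INTER_AZ_RTT_MS = 2.0       # slide: AZs are <2ms apart (assumed to be a round-trip bound)
INTER_CITY_RTT_MS = 20.0    # slide: Sydney to Melbourne is ~20ms round trip

within_region = sync_commit_ms(LOCAL_SSD_COMMIT_MS, INTER_AZ_RTT_MS)
between_cities = sync_commit_ms(LOCAL_SSD_COMMIT_MS, INTER_CITY_RTT_MS)

print(f"sync commit across AZs:    {within_region:.1f} ms")
print(f"sync commit across cities: {between_cities:.1f} ms")
print(f"slowdown: {between_cities / within_region:.1f}x")
```

Under these assumptions the network round trip dominates the inter-city commit, which is why distant data centers fall back to asynchronous replication while AZs can replicate synchronously.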
9. Example AWS Availability Zone
[Diagram: a region's AZs & transit centers, with one AZ expanded into four data centers]
• 1 of 28 AZs world-wide
• All regions have 2 or more AZs
• Each AZ is 1 or more DC
– No data center is in two AZs
– Some AZs have as many as 6 DCs
• DCs in AZ less than ¼ ms apart
– Don’t need inter-AZ independence
– Do require low latency & full B/W
10. Example AWS Data Center
• Single DC
– Typically over 50,000 servers & often over 80,000
– Typically 25 to 30 MW
• Larger DCs undesirable
– Diminishing returns
– Blast radius
• Up to 102Tbps provisioned to
a single DC
• AWS custom network equipment
– Multi-ODM sourced
– Amazon custom network protocol
stack
11. Example Rack, Server & NIC
• First network hop must virtualize
all network traffic
– Needed for security, isolation,
metering, DDoS Protection, Capacity
Limits …
• Remove the “virtualization tax”
• Supports Single Root I/O
Virtualization (SR-IOV)
– Each guest VM gets its own H/W
virtualized NIC
– Much lower latency & less latency
jitter
12. Network Latency & Variability
[Chart: network latency, old vs. new, logarithmic scale]
>10x latency variability improvement
>2x average latency reduction
13. AWS Custom Server & Storage Designs
• OEM server ecosystem:
– Very general designs able to run wide variety of workloads
– Vast, expensive, world-wide distribution network
• AWS custom servers & storage:
– Specialised servers optimized for a specific workload
– Move hot s/w kernels to hardware implementations
– Custom Intel procs beyond commercially available clock rates
– DCs, servers, networking, storage designed to integrated specs
Example Storage Rack
864 disks, 1,066kg
14. Relational Database Expensive & Hard
• Relational Database dominated by “big 3”
– Oracle, SQL Server, & DB2
• Expensive, hard to administer, don’t scale, & just about
impossible to switch
• “No SQL” scales & relieves some administrative burden
– e.g. MongoDB
• Cloud NoSQL both scales & virtually eliminates classic DB
admin issues
– e.g., Amazon DynamoDB
15. Amazon DynamoDB
• Cloud NoSQL database optimized for latency & scale
• 3x request growth last year
– Single digit ms response times
– Still same low & predictable jitter
– 4x storage growth over same period
• Key new features
– JSON Support
– Up to 400KB items
– Global Secondary Indexes
– DynamoDB Streams
– Cross-region replication
Single Region DynamoDB Requests
Request in Trillions/month
16. Addressing RDB Administrative Challenge
• Relational easy to use, feature rich, but admin intensive
• RDBs still the core of many applications
• Also the single largest driver of downtime & lost sleep
• Amazon RDS
• Addresses the administrative complexity issue
• Amazon RDS MySQL, Oracle, SQL Server, & PostgreSQL
• Cloud managed alone doesn’t address RDBMS cost,
availability, performance or scaling limitations
17. Multi-AZ RDS % of all RDS
[Chart: Multi-AZ RDS as a percentage of all RDS over time; y-axis from 25% to 41%]
RDS Multi-AZ Relational Availability
[Diagram: Application, Primary DB (AZ1), Secondary DB (AZ2), synchronous replication]
• Hard to reliably get beyond 3 9s in single
building deployments:
– RDS MySQL Multi-AZ Synchronous replication
• Not new technology
– EMC SRDF/S & Oracle Fast Start Failover
– But all come with “enterprise” pricing
• RDS MySQL Multi-AZ makes
sync replication inexpensive &
easy
– More application 9s & way more sleep
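To put "3 9s" in perspective, the following minimal sketch converts a number of nines into allowed downtime per year. It is plain arithmetic rather than anything AWS-specific; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (leap years ignored)


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year at an availability of `nines` nines (3 means 99.9%)."""
    unavailability = 10 ** (-nines)
    return MINUTES_PER_YEAR * unavailability


for n in (2, 3, 4, 5):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nines: {minutes:8.1f} minutes/year ({minutes / 60:6.2f} hours)")
```

Three nines allows roughly 8.8 hours of downtime a year, and each additional nine cuts that budget by a factor of ten.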
18. Amazon Aurora
• Custom AWS MySQL Storage Engine
– Enterprise DB features at cloud pricing
– Drop in compatible with MySQL apps
– Storage engine separate from relational
• Triple AZ storage engine handles faults
W/O read or write pause
– e.g. Entire DC can go down at same time
as a disk or server failure
• “Impossible” faults such as loss of 2
DCs still don’t lose data
– Synchronous multi-DC replication
[Diagram: SQL, transactions, & caching layers above stores in AZ 1, AZ 2, & AZ 3; each AZ transforms data from write-optimized to read-optimized format]
19. Amazon Aurora Performance
• MySQL updates combined with new storage manager
– 3x write performance
– 5x read performance
• Supports up to 16-way read replicas (RDS MySQL: 5-way)
• 400x less lag (2,000ms vs 5ms)
• Supports up to 64TB tables (RDS MySQL: 3TB)
• Near instant fail-over (no database crash recovery time)
• Auto-recovery from storage faults
– Auto-data page patch or full disk loss recovery without operational impact
20. Amazon Redshift Parallel SQL Data Warehouse
• Up to 128 server parallel SQL DB
– Columnar data warehouse
• Disruptive cost USD$1,000/TB/Year
– Fastest growing AWS service
– Already 1000s of customers
– Multiple PB+ clusters in production
• Automated provisioning, patching,
security, resize, backup/restore
• Massive data scaling
– DW1: HDD; scale from 2TB to 2PB
– DW2: SSD; scale from 160GB to 326TB
[Diagram: SQL clients/BI tools in the customer VPC connect over JDBC/ODBC to a leader node; compute nodes (each 16 cores, 128GB RAM, 16TB disk) sit in an internal VPC on full 10Gbps links; ingestion, backup, & restore via S3 / EMR / DynamoDB / SSH]
21. Amazon EBS at 20,000 IOPS
• Provisioned IOPS (SSD)
– Max volume to 16TB (From:1TB)
– Max I/O rate to 20,000 IOPS (From:4k IOPS)
– Max throughput to 320MB/s (From:180MB/s)
• General Purpose (SSD)
– Max volume size to 16TB (From:1TB)
– Max I/O rate to 10,000 IOPS (From:3k IOPS)
– Max throughput to 160MB/s (From:128 MB/s)
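The IOPS and throughput ceilings above are linked by the size of each I/O, since throughput equals IOPS times I/O size. The sketch below works out the I/O size at which a volume would reach both limits at once. The helper name is hypothetical and decimal units (1 MB = 1000 KB) are assumed; the limits are the new figures from the slide.

```python
def io_size_kb_at_both_limits(max_mb_per_sec: float, max_iops: int) -> float:
    """I/O size (KB) at which the throughput and IOPS ceilings are reached together."""
    return max_mb_per_sec * 1000 / max_iops  # assumes 1 MB = 1000 KB


# (max throughput in MB/s, max IOPS), taken from the slide
limits = {
    "Provisioned IOPS (SSD)": (320, 20_000),
    "General Purpose (SSD)": (160, 10_000),
}

for name, (mb_per_sec, iops) in limits.items():
    size_kb = io_size_kb_at_both_limits(mb_per_sec, iops)
    print(f"{name}: {size_kb:.0f} KB per I/O")
```

With smaller I/Os a volume runs into the IOPS ceiling first; with larger I/Os it runs into the throughput ceiling first.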
22. Internal Challenge to External Service
AWS Metering:
• Tens of millions records/sec
• Multiple TB per hour
• 100,000s of internal sources
• Scales, low-cost, auditable,
with real time alerting
Amazon Kinesis:
• Producers call put
• Sequence # returned
• Distributed over shards
• Scales per shard at 1
MB/s & 1000 TPS
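Because a stream scales per shard, sizing one comes down to checking both per-shard limits and taking the larger requirement. The sketch below is a rough estimate under the slide's figures (1 MB/s and 1000 TPS per shard); the function name and the example workload are hypothetical, and it covers writes only.

```python
import math

SHARD_MB_PER_SEC = 1.0         # slide: each shard scales at 1 MB/s
SHARD_RECORDS_PER_SEC = 1000   # slide: and 1000 TPS


def shards_needed(mb_per_sec: float, records_per_sec: float) -> int:
    """Smallest shard count that satisfies both the byte and the record limit."""
    by_bytes = math.ceil(mb_per_sec / SHARD_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


# Hypothetical workload: 50,000 records/s of 200 bytes each
records_per_sec = 50_000
mb_per_sec = records_per_sec * 200 / 1_000_000  # 10 MB/s

print(shards_needed(mb_per_sec, records_per_sec))
```

In this example the record rate, not the byte rate, is the binding limit: many small records need more shards than their total volume alone would suggest.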
23. Power Infrastructure
• Some DCs with custom power sub-stations
• Negotiated power purchasing agreements
• Custom switchgear firmware
• 3 100% carbon neutral regions:
– US West (Oregon)
– AWS GovCloud (US)
– EU (Frankfurt)
• 150 megawatt wind farm in Benton County,
expected to start generating approximately
500,000 megawatt hours (MWh) of wind power
annually as early as January 2016
• 4.8 megawatt hour pilot of Tesla’s energy
storage batteries in US West (Northern
California)
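The wind farm figures imply a capacity factor, meaning the share of nameplate output actually delivered over a year. The sketch below derives it from the two numbers on the slide. The function name is hypothetical, a 365-day year is assumed, and the result is an arithmetic consequence of the slide's figures rather than a published AWS number.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours (leap years ignored)


def capacity_factor(annual_mwh: float, nameplate_mw: float) -> float:
    """Fraction of the theoretical maximum annual output that is actually generated."""
    return annual_mwh / (nameplate_mw * HOURS_PER_YEAR)


# slide: 150 MW wind farm, approximately 500,000 MWh per year
cf = capacity_factor(500_000, 150)
print(f"implied capacity factor: {cf:.1%}")
```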
24. Rapid Pace of Innovation
[Chart: annual totals by year]
2007: 9
2008: 24
2009: 48
2010: 61
2011: 82
2012: 159
2013: 280
2014: 516
2015: 166 (as at 31 March)