This document discusses optimizing performance for EC2 instances and EBS volumes. It provides guidance on provisioning IOPS for different types of storage workloads and database software. The key recommendations are to use EBS-optimized instances with Provisioned IOPS (PIOPS) volumes for random I/O workloads like databases, size volumes appropriately based on the needed IOPS and throughput, and architect for consistent low latency by adjusting the queue depth.
6. Review: Provisioned IOPS Volumes
❶ Select the Provisioned IOPS volume type
❷ Specify the volume capacity
❸ Specify the number of I/O operations per second your application needs, up to 4000 IOPS per volume. The volume will deliver the specified I/O operations per second.
Minimum ratio of capacity to IOPS = 1:30 (1 GB for every 30 IOPS)

$ aws ec2 create-volume --availability-zone us-east-1a --size 134 --volume-type io1 --iops 4000
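The `--size 134` above follows from the ratio. A minimal shell sketch, assuming the 1 GB : 30 IOPS cap described above, that derives the volume size from a target IOPS figure:

```shell
# Derive the minimum io1 volume size from a target IOPS count,
# given the 1 GB : 30 IOPS capacity-to-IOPS ratio.
IOPS=4000
SIZE_GB=$(( (IOPS + 29) / 30 ))   # round up to a whole GB
echo "aws ec2 create-volume --availability-zone us-east-1a --size $SIZE_GB --volume-type io1 --iops $IOPS"
```

For 4000 IOPS this yields 134 GB, matching the command above.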
7. Amazon EBS Standard
• IOPS: ~100 IOPS steady state, with best-effort bursts to hundreds
• Throughput: variable by workload; best effort to tens of MB/s
• Latency: varies; reads typically <20 ms, writes typically <10 ms
• Capacity: as provisioned, up to 1 TB
8. EBS PIOPS
• IOPS: as provisioned, within 10% of up to 4000 IOPS, 99.9% of a given year
• Throughput: as provisioned, at 16 KB per I/O = up to 64 MB/s
• Latency: low and consistent at the recommended queue depth
• Capacity: as provisioned, up to 1 TB
19. Architecting for Performance
• IOPS consistency requires EBS-optimized instances
• Maximum throughput delivered by Amazon EBS is limited by Amazon EC2 bandwidth
• EBS throughput = EBS IOPS × block size
  – Ex: 64 MB/s = 4000 IOPS × 16 KB
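The throughput formula can be checked with shell arithmetic (decimal MB, matching the example above):

```shell
# EBS throughput = IOPS x block size; 4000 IOPS at 16 KB -> 64 MB/s
IOPS=4000
BLOCK_KB=16
THROUGHPUT_MBS=$(( IOPS * BLOCK_KB / 1000 ))   # KB/s -> MB/s (decimal)
echo "${THROUGHPUT_MBS} MB/s"
```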
Instance     vCPU  EBS-Optimized  Max MB/s  Max 16K IOPS
t1.micro     1     No             32 MB/s   2000
m1.small     1     No             64 MB/s   4000
m1.medium    1     No             64 MB/s   4000
m1.large     2     Yes            64 MB/s   4000
m1.xlarge    4     Yes            128 MB/s  8000
m3.xlarge    4     Yes            64 MB/s   4000
m3.2xlarge   8     Yes            128 MB/s  8000
c1.medium    2     No             32 MB/s   2000
c1.xlarge    8     Yes            128 MB/s  8000
cc2.8xlarge  32    N/A            800 MB/s  50,000
m2.xlarge    2     No             64 MB/s   4000
m2.2xlarge   4     Yes            64 MB/s   4000
m2.4xlarge   8     Yes            128 MB/s  8000
cr1.8xlarge  32    N/A            800 MB/s  50,000
hi1.4xlarge  16    N/A            800 MB/s  50,000
cg1.4xlarge  16    N/A            800 MB/s  50,000

Max 8K IOPS = 2×; max 4K = 4×*; max 2K = 8×*
*Maximum IOPS is also limited to ~100,000 per 32 vCPUs, irrespective of block size/throughput.
20. EBS-Optimized
Network interference tests: m3.2xlarge (EBS-optimized)
• EBS-optimized offers a “SAN-like” experience
• Network interference results: no impact on Amazon EBS IOPS or throughput

                                             AvgBW (KB/s)  AvgIOPs
no network load         random   read           57,542      3,596
                                 write          61,713      3,857
                                 rw (70/30)     66,997      4,186
                        sequential read         61,708      3,856
                                 write          61,651      3,853
                                 rw (70/30)     66,996      4,187
with network load-test1 random   read           59,835      3,739
                                 write          63,407      3,962
                                 rw (70/30)     68,859      4,303
                        sequential read         61,736      3,858
                                 write          63,360      3,959
                                 rw (70/30)     68,859      4,302
21. I/O Characteristics
• I/O type: read and write
  – PIOPS delivers the same number of IOPS for reads or writes
• I/O pattern: sequential and random
  – PIOPS delivers the same number of IOPS for sequential and random I/O
• I/O size: 4 KB to 64 MB
  – PIOPS always measures I/O in terms of 16 KB or smaller
• PIOPS is optimized for database workloads
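Because PIOPS measures I/O in 16 KB-or-smaller units, one larger request consumes several provisioned IOPS. A sketch of that accounting (the 16 KB unit is from the slide above; rounding up per request is my assumption):

```shell
# One logical I/O is charged as ceil(size / 16 KB) provisioned IOPS,
# so a 64 KB request draws 4 IOPS from the provisioned budget.
IO_KB=64
IOPS_CHARGED=$(( (IO_KB + 15) / 16 ))
echo "$IOPS_CHARGED"
```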
22. Smaller I/O (4 KB, 16 KB)
Results for a 400 GB volume with 4000 IOPS at QD 8; EBS-optimized instances
• Why are 4 KB I/O sizes in sequential operations driving greater than 4000 IOPS?
• Why are m1.large and m3.xlarge IOPS at 16 KB less than 4000 IOPS?
• A database needs 5000 ops/second. How many IOPS do I need to provision?
• What happens when customers want to burst beyond provisioned IOPS?

IOPS and BW performance at QD 8:

                          m1.large             m3.xlarge            m3.2xlarge
                      AvgIOPs  AvgBW (KB)   AvgIOPs  AvgBW (KB)   AvgIOPs  AvgBW (KB)
Write sequential 4K     4146    16,587        5997    23,990        7767    31,068
                 16K    3712    59,402        4157    55,461        4153    60,332
Write random     4K     4082    16,329        4433    17,733        4178    16,712
                 16K    3713    59,422        3743    53,813        4153    60,332
Read sequential  4K     5301    21,205        9232    36,929       13450    53,802
                 16K    3533    56,535        4796    56,824        4153    60,332
Read random      4K     4538    18,154        5864    23,457        4177    16,711
                 16K    3510    56,168        3583    51,246        4153    60,332
23. Larger I/O (128 KB, 512 KB)
Results for a 400 GB volume with 4000 IOPS at QD 8
• Why am I seeing only 462 IOPS on a volume?
• Why is there no difference in performance between random and sequential workloads?
• How should I configure 500 MB/s read or write throughput using PIOPS volumes?

IOPS and BW performance at QD 8:

                          m1.large             m3.xlarge            m3.2xlarge
                      AvgIOPs  AvgBW (KB)   AvgIOPs  AvgBW (KB)   AvgIOPs  AvgBW (KB)
Write sequential 128K    462    59,268         462    59,145         522    66,843
                 512K    115    59,292         115    59,278         130    66,804
Write random     128K    462    59,265         462    59,241         522    66,843
                 512K    115    59,291         115    59,272         130    66,843
Read sequential  128K    455    58,240         454    58,225         522    66,843
                 512K    113    58,003         114    58,589         130    66,843
Read random      128K    455    58,236         454    58,215         522    66,843
                 512K    113    57,960         114    58,496         130    66,805

A 4000 IOPS volume delivers 4000 16 KB reads/writes per second, or 2000 32 KB reads/writes per second, or 1000 64 KB reads/writes per second…
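One way to approach the 500 MB/s question: stripe multiple volumes, since each 4000-IOPS volume tops out at roughly 4000 × 16 KB = 64 MB/s. A sketch of the arithmetic (striping, e.g. via RAID 0, is my assumption; the slide does not prescribe a mechanism):

```shell
# Volumes needed to reach a target throughput when each PIOPS volume
# delivers at most 64 MB/s.
TARGET_MBS=500
PER_VOLUME_MBS=64
VOLUMES=$(( (TARGET_MBS + PER_VOLUME_MBS - 1) / PER_VOLUME_MBS ))  # round up
echo "$VOLUMES volumes"
```

For 500 MB/s this comes to 8 volumes striped together.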
24. Write Latency
16 KB random write, m3.2xlarge, EBS-optimized
• Database applications care about latency as much as IOPS delivered
• There is an interdependency among IOPS, queue depth, and latency
• Current guidance is a queue depth of 1 for every 200 IOPS, but if latency-bound and write-heavy, 1:500–1:1000 is better

QD             1     4     8     12    16    20    24    28    32
AvgIOPS        845   4152  4153  4177  4152  4176  4177  4177  4151
AvgTP90 (ms)   1.47  2.03  3.13  3.56  3.62  5.54  6.18  7.48  7.71
25. Read Latency
16 KB random read, m3.2xlarge, EBS-optimized
• Reads can take advantage of a deeper queue
• Current guidance is a queue depth of 1 for every 250 IOPS
• EBS-optimized provides predictable latency

QD             1     4     8     12    16    20    24     28     32
AvgIOPS        1864  4153  4153  4177  4120  2800  1965   1213   1089
AvgTP90 (ms)   0.68  1.46  2.15  3.43  3.88  5.18  91.14  93.18  93.70
26. Architecting for Performance: Latency
• Performance requirements may be driven by IOPS, latency, or both
• Recommendation is to start with a queue depth of 4 and tune based on IOPS and latency requirements
  – Some customers may need the lowest possible latency; this can be achieved at a queue depth of 1 or 2
• Very high queue depths (>24) may decrease IOPS count as well as increase latency
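The queue-depth guidance in these slides reduces to simple ratios. A sketch of the rules of thumb (1 per 200 IOPS write-heavy, 1 per 250 IOPS read-heavy, per the preceding slides):

```shell
# Starting queue depth from provisioned IOPS, per the rules of thumb above.
IOPS=4000
WRITE_QD=$(( IOPS / 200 ))   # 1 per 200 IOPS for write-heavy workloads
READ_QD=$(( IOPS / 250 ))    # 1 per 250 IOPS for read-heavy workloads
echo "write QD: $WRITE_QD, read QD: $READ_QD"
```

Either result should then be tuned downward if latency, rather than IOPS, is the binding requirement.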
27. Pre-warming EBS Volumes
• Typically 5%, extreme worst case of 50%, performance reduction in IOPS and latency when volumes are used without pre-warming
  – Performance is as provisioned once all the chunks have been accessed
• Recommendation if testing, or if you have spare setup time:
  – Write to every 4 MB block before using new volumes
    • Linux: dd
    • Windows: NTFS full format
  – Takes roughly an hour to pre-warm a 1 TB, 4000-PIOPS volume
  – Be warned: it can take up to a day for a 1 TB standard EBS volume
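The Linux dd pass described above can be sketched as follows. The device name is hypothetical (check yours with lsblk), and the command zeroes the volume, so it is only for brand-new volumes; this sketch builds and prints the command rather than running it:

```shell
# Pre-warm a new EBS volume by writing every 4 MB chunk once.
# /dev/xvdf is a hypothetical attachment point -- substitute your own.
DEVICE="${DEVICE:-/dev/xvdf}"
SIZE_GB="${SIZE_GB:-134}"             # volume size from the earlier example
CHUNKS=$(( SIZE_GB * 1024 / 4 ))      # number of 4 MB blocks to touch
PREWARM_CMD="dd if=/dev/zero of=$DEVICE bs=4M count=$CHUNKS"
echo "$PREWARM_CMD"                   # DESTRUCTIVE when actually run
```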
34. Performance – Extra-large Production Scale
• Leverage the SSD instance type (hi1.4xlarge)
  o 2 × 1 TB SSD storage (ephemeral storage)
  o Perfect for replicas
• If replicas are on SSD instance types, disable integrity features such as fsync and full_page_writes on those hosts to improve performance
36. What About Performance Cost?
hi1.4xlarge vs. cc2.8xlarge + 24 × 4K PIOPS volumes
• cc2.8xlarge + PIOPS: $11,773 on-demand, $10,589 effective 3-yr reserved
• hi1.4xlarge: $4,538 on-demand, $1,539 effective 3-yr reserved
• If >10K write IOPS: test, but probably choose PIOPS
• If >20K read IOPS: choose hi1
• If a 3-yr term and >8K IOPS: choose hi1
• On-demand, if <20K read IOPS: choose PIOPS
37. What About Capacity Cost?
hs1.8xlarge vs. cc2.8xlarge + 48 × 1 TB EBS volumes
• cc2.8xlarge + EBS: $7,312 on-demand, $6,128 effective 3-yr reserved
• hs1.8xlarge: $6,734 on-demand, $2,408 effective 3-yr reserved
• If >43 TB or >800 MB/s: choose hs1
• If a 3-yr term and >18 TB: choose hs1