3. Agenda
• What is the EC2 instance platform, and a virtualization primer
• How to make the most of your EC2 instance experience through the lens of three instance types
• How to think about the future of EC2 instances
11. Virtualization Primer: I/O and Devices
• Scheduling I/O requests between virtual devices and shared physical hardware
• Split driver model for shared devices; requires host resources
• Intel VT-d
– Direct pass-through and IOMMU for dedicated devices
12. Our Philosophy
• Bigger, faster, less expensive, consistent,
and flexible
• Our bar is bare metal performance
• Customers help us prioritize our
roadmap
• Look under the hood through the lens
of three recent platforms: C4, T2, and I2
15. Review: T2 Instances
• Lowest cost EC2 instance at $0.013 per hour
• Burstable vs. Fixed Performance

Name      | vCPUs | Baseline Performance | Platform         | RAM (GiB) | CPU Credits / Hour
t2.micro  | 1     | 10%                  | 32-bit or 64-bit | 1         | 6
t2.small  | 1     | 20%                  | 32-bit or 64-bit | 2         | 12
t2.medium | 2     | 40%                  | 32-bit or 64-bit | 4         | 24
t2.large  | 2     | 60%                  | 64-bit           | 8         | 36
16. Tip: Understand CPU credits
• http://aws.amazon.com/blogs/aws/low-cost-burstable-ec2-instances/
• http://aws.amazon.com/ec2/instance-types/t2/
• Attend this session
17. How Credits Work
• A CPU Credit provides the performance of a full CPU core for one minute
• An instance earns CPU credits at a steady rate
• An instance consumes credits when active
• Credits expire (leak) over time
[Diagram: credit balance filling at the baseline rate and draining at the burst rate]
18. CPU Credits
• A CPU Credit provides the performance of a full
CPU core for one minute
• Hefty initial CPU credit balance for good startup
experience
• Use credits when active, accrue credits when
idle
• Transparency on credit balances
21. Review: I2 Instances
• 16 vCPU: 3.2 TB SSD; 32 vCPU: 6.4 TB SSD
• 365K random read IOPS for 32 vCPU instance

Model      | vCPU | Memory (GiB) | Storage        | Read IOPS | Write IOPS
i2.xlarge  | 4    | 30.5         | 1 x 800 GB SSD | 35,000    | 35,000
i2.2xlarge | 8    | 61           | 2 x 800 GB SSD | 75,000    | 75,000
i2.4xlarge | 16   | 122          | 4 x 800 GB SSD | 175,000   | 155,000
i2.8xlarge | 32   | 244          | 8 x 800 GB SSD | 365,000   | 315,000
22. Tip: Use 3.8+ kernel
• Amazon Linux AMI 2013.09 or later
• Ubuntu 14.04 or later
• RHEL 7 or later
• Etc.
23. Pre-3.8.0 Kernels
• All I/O must pass through I/O Domain
• Requires “grant mapping” prior to 3.8.0
• Grant mappings are expensive operations due to TLB flushes
read(fd, buffer, BLOCK_SIZE)
24. 3.8.0+ Kernels – Persistent and Indirect
• Grant mappings are set up in a pool once
• Data is copied in and out of the grant pool
• Copying is significantly faster than remapping
read(fd, buffer, BLOCK_SIZE)
26. SSDs and Wear Leveling
• Flash has a limited number of writes per physical sector
– Sectors wear out; they must be erased before being rewritten
– Erasing actually sets all bits to 1, and is slow
• Modern drive firmware is very sophisticated in extending the life of the flash
• TRIM and/or over-provisioning helps avoid garbage collection and write amplification
27. Flash Translation Layer
• All new writes go to TAIL
• Random write == Sequential write
• Even sector wear (wear leveling)
[Diagram: circular write log with HEAD and TAIL pointers]
28. Garbage Collection
• New writes have to search for free space
• Results in garbage collection pauses
• Large writes may require defragmentation
[Diagram: circular write log with HEAD and TAIL pointers]
32. Why PV-HVM is faster than PV
• PV-HVM allows the application to call directly into the kernel
• PV requires going through the VMM
• Applications that are system-call bound are most affected
[Diagram: layer stacks – Bare Metal: Application → Kernel; PV-HVM: Application → Kernel → VMM; PV: Application → VMM → Kernel]
34. Time Keeping Explained
• Time keeping in an instance is deceptively hard
• gettimeofday(), clock_gettime(), QueryPerformanceCounter()
• The TSC
– CPU counter, accessible from userspace
– Requires calibration, vDSO
– Invariant on Sandy Bridge+ processors
• Xen pvclock; does not support vDSO
• On current generation instances, use TSC as clocksource