Amazon EC2 provides a broad selection of instance types to accommodate a diverse mix of workloads. In this session, we provide an overview of the Amazon EC2 instance platform, key platform features, and the concept of instance generations. We dive into the current-generation design choices of the different instance families: General Purpose, Compute Optimized, Storage Optimized, Memory Optimized, and GPU. We also give an overview of the newest instances announced at re:Invent, including the latest generation of Memory Optimized (R4) and Compute Optimized (C5) instances, the new Storage Optimized High I/O I3 instances, and new larger T2 sizes. Finally, we detail best practices and share performance tips for getting the most out of your Amazon EC2 instances.
Learning Objectives:
• Get an overview of the EC2 instance platform, key platform features, and the concept of instance generations
• Learn about the latest generation of Amazon EC2 Instances
• Learn best practices around instance selection to optimize performance
2. What to Expect from the Session
Understanding the factors that go into choosing an EC2 instance
Defining system performance and how it is characterized for different workloads
A look into our current generation instances and their features
How Amazon EC2 instances deliver performance while providing flexibility and agility
How to make the most of your EC2 instance experience through the lens of several instance types
7. Choices and Flexibility
Choice of Processor
Memory
Storage Options
Accelerated Graphics
Burstable Performance
8. Hiring a Server
Servers are hired to do jobs
Performance is measured differently depending on the job
9. Performance Factors
Resource           Performance factors                         Key indicators
CPU                Sockets, number of cores, clock             CPU utilization, run queue length
                   frequency, bursting capability
Memory             Memory capacity                             Free memory, anonymous paging, thread swapping
Network interface  Max bandwidth, packet rate                  Receive/transmit throughput over max bandwidth
Disks              I/O operations per second, throughput       Wait queue length, device utilization, device errors
Acceleration       FPGA or GPU offloading from CPU             Parallelism and code design
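A practical way to read the key indicators in the table above is to normalize a raw measurement against the resource's capacity. A minimal sketch, using a hypothetical 10 Gbps network interface (all metric values are made up for illustration):

```python
def utilization(measured, capacity):
    """Return a resource's utilization as a fraction of its capacity."""
    return measured / capacity

# Hypothetical measurements for an instance with a 10 Gbps network interface
rx_gbps, tx_gbps, max_gbps = 6.5, 2.0, 10.0

net_util = utilization(rx_gbps + tx_gbps, max_gbps)
print(f"Network utilization: {net_util:.0%}")  # Network utilization: 85%

# A resource near 100% utilization cannot absorb more work;
# consistently low utilization suggests the instance is oversized.
if net_util > 0.8:
    print("Network interface approaching saturation")
```

The same ratio applies to the other rows: device utilization for disks, CPU utilization for cores, and so on.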
10. Broad Set of Compute Instance Types
General purpose: M4
Compute optimized: C3, C4, C5
Memory optimized: R3, R4, X1
Storage and I/O optimized: I2, I3, D2
GPU or FPGA enabled: G2, P2, F1
11. Resource Utilization
For a given level of performance, how efficiently are resources being used?
A resource at 100% utilization can't accept any more work
Low utilization can indicate that more resources are being purchased than needed
12. Example: Web Application
MediaWiki installed on Apache with 140 pages of content
Load increased in intervals over time
17. Give back instances as easily as you can acquire new ones
Find an ideal instance type and workload combination
EC2 Instance Pages provide “Use Case” Guidance
With EBS, storage and instance size don’t need to be coupled
Instance Selection = Performance Tuning
18. “Launching new instances and running tests
in parallel is easy…[when choosing an
instance] there is no substitute for measuring
the performance of your full application.”
- EC2 Documentation
19. How not to choose an EC2 instance
Brute Force Testing
Ignoring Metrics
Favoring old generation instances
Guessing based on what you already have
21. Choosing the right size
Understand your unit of work
Web request
Database / Table
Batch Process
What are that unit's requirements?
CPU threads
Memory constraints
Disk & network
What are its availability requirements?
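Once the unit of work is understood, sizing becomes arithmetic. A back-of-the-envelope sketch (every number below is hypothetical, chosen only to show the method):

```python
# Suppose each web request needs ~50 ms of CPU time and ~30 MiB of memory,
# and we must sustain 400 requests per second at peak.
cpu_seconds_per_request = 0.05
peak_requests_per_second = 400
mib_per_concurrent_request = 30

# vCPUs needed, with ~30% headroom so we are not running at 100% utilization
vcpus_needed = cpu_seconds_per_request * peak_requests_per_second / 0.7

# Requests in flight at once (Little's law: concurrency = rate x duration)
concurrent = peak_requests_per_second * cpu_seconds_per_request
memory_needed_gib = concurrent * mib_per_concurrent_request / 1024

print(f"~{vcpus_needed:.1f} vCPUs, ~{memory_needed_gib:.1f} GiB for the workers")
# ~28.6 vCPUs, ~0.6 GiB for the workers
```

A result like this points at a compute-heavy shape (far more vCPUs than GiB needed), which is exactly the kind of signal to compare against the instance tables that follow.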
26. Review: M4 Instances
General Purpose Instance Family
Balance of Compute, Memory, and Network Resources
2.3 GHz Intel Xeon® E5-2686 v4 (Broadwell) processors or
2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors
Model        vCPU  Memory (GiB)  Storage   EBS Bandwidth (Mbps)
m4.large        2       8        EBS Only       450
m4.xlarge       4      16        EBS Only       750
m4.2xlarge      8      32        EBS Only     1,000
m4.4xlarge     16      64        EBS Only     2,000
m4.10xlarge    40     160        EBS Only     4,000
m4.16xlarge    64     256        EBS Only    10,000
Databases, Data Processing, Caching, SAP, SharePoint, and other enterprise applications.
27. Review: T2 Instances
Lowest cost EC2 instance at $0.0065 per hour
Burstable performance
Fixed allocation enforced with CPU credits
Model       vCPU  Baseline  CPU Credits / Hour  Memory (GiB)  Storage
t2.nano        1     5%             3               0.5       EBS Only
t2.micro       1    10%             6               1         EBS Only
t2.small       1    20%            12               2         EBS Only
t2.medium      2    40%**          24               4         EBS Only
t2.large       2    60%**          36               8         EBS Only
t2.xlarge      4    90%**          54              16         EBS Only
t2.2xlarge     8   135%**          81              32         EBS Only
General Purpose, Web Serving, Developer Environments, Small Databases
28. How Credits Work
A CPU credit provides the performance of a full CPU core for one minute
An instance earns CPU credits at a steady rate
An instance consumes credits when active
Credits expire (leak) after 24 hours
[Chart: credit balance over time, showing the baseline rate, the credit balance, and the burst rate]
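The earn/consume/expire rules above can be sketched as a small simulation. Note that the baseline column in the T2 table is just credits earned per hour divided by 60 (e.g. t2.small earns 12 credits/hour, a 20% baseline). This is a simplified model of the slide's rules, not EC2's exact accounting: it works in whole hours and stands in for 24-hour credit expiry with a balance cap.

```python
def simulate_t2(earn_per_hour, demand_cores_by_hour, max_balance):
    """Simplified T2 credit model: one credit = one full core for one minute.
    Each hour the instance earns credits at a steady rate and spends one
    credit per core-minute of actual CPU use; the balance is capped (the
    cap stands in for credits expiring after 24 hours)."""
    balance = 0.0
    for demand_cores in demand_cores_by_hour:
        balance += earn_per_hour       # steady accrual
        balance -= demand_cores * 60   # 60 core-minutes per core-hour of use
        balance = min(max(balance, 0.0), max_balance)
    return balance

# t2.small: 12 credits/hour (20% baseline), cap = 24 h x 12 = 288 credits.
# Idle for 10 hours, then a 1-hour full-core burst:
print(simulate_t2(12, [0] * 10 + [1.0], max_balance=288))  # 72.0
```

The idle hours bank 120 credits; the burst hour spends 60 while earning 12, leaving 72, which is why a T2 can absorb occasional spikes well above its baseline.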
31. Review: C4 Instances – “Compute”
Custom Intel E5-2666 v3 at 2.9 GHz, Turbo to 3.5 GHz
P-state and C-state controls
Model vCPU Memory (GiB) EBS (Mbps)
c4.large 2 3.75 500
c4.xlarge 4 7.5 750
c4.2xlarge 8 15 1,000
c4.4xlarge 16 30 2,000
c4.8xlarge 36 60 4,000
Batch & HPC workloads, Game Servers, Ad Serving, & High Traffic Web Servers
32. C5 Instance Preview
• Next Generation “Skylake” Intel® Xeon® Processor
family
• AVX 512 Instruction Set
• Up to 72 vCPUs in a single instance
• 144 GB of RAM
• Coming early 2017
42. Review: G2 Instances – “GPU”
• Up to 4 NVIDIA GRID K520 GPUs in a single instance
• Each with 1,536 CUDA cores and 4GB of video memory
• High-performance platform for graphics applications using
DirectX or OpenGL
Instance Size  GPUs  vCPUs  Memory (GiB)  SSD Storage
g2.2xlarge        1      8       15       1 x 60 GB
g2.8xlarge        4     32       60       2 x 120 GB
Video creation services, 3D visualizations, streaming graphics, server-side graphics workloads
43. Review: P2 GPU Instances – “Parallel”
• Up to 16 K80 GPUs in a single instance
• Supports CUDA 7.5 and above, OpenCL 1.2, and the
GPU Compute APIs
• Including peer-to-peer PCIe GPU interconnect
Model        GPUs  GPU Peer to Peer  vCPUs  Memory (GiB)  GPU Cores  GPU Memory  Network Bandwidth*
p2.xlarge       1         -             4        61          2,496     12 GiB     High
p2.8xlarge      8         Y            32       488         19,968     96 GiB     10 Gbps
p2.16xlarge    16         Y            64       732         39,936    192 GiB     20 Gbps
*In a placement group
Deep learning, HPC simulations, and Batch Rendering
44. Review: F1 Instances – “FPGA”
• Up to 8 Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance,
with four high-speed DDR4 channels per FPGA
• Largest size includes high performance FPGA interconnects via PCIe
Gen3 (FPGA Direct), and bidirectional ring (FPGA Link)
• Designed for hardware-accelerated applications including financial
computing, genomics, accelerated search, and image processing
Instance Size  FPGAs  FPGA Link  FPGA Direct  vCPUs  Memory (GiB)  NVMe Instance Storage  Network Bandwidth*
f1.2xlarge        1       -           -           8       122        1 x 480                 5 Gbps
f1.16xlarge       8       Y           Y          64       976        4 x 960                30 Gbps
*In a placement group
46. Next steps
Visit the Amazon EC2 documentation
Launch an instance and try your app!