Grid computing needs change constantly as business, market, and technology conditions shift. With AWS, you can create grid computing clusters running Microsoft HPC Pack 2012 R2 to meet these changing needs. This session covers architectural patterns and best practices for using Amazon EC2, Amazon S3, AWS Directory Service, and AWS CloudFormation to create on-demand Windows HPC clusters. We also review automation frameworks that make it easier to provision Windows HPC clusters on demand.
4. Why AWS for HPC?
• Low cost with flexible pricing
• Efficient clusters
• Unlimited infrastructure
• Faster time to results
• Concurrent clusters on demand
• Increased collaboration
5. Popular HPC workloads on AWS
• Genome processing
• Modeling and simulation
• Government and educational research
• Monte Carlo simulations
• Transcoding and encoding
• Computational chemistry
6. Benefits of Agility
Rigid On-Premises Resources: capacity follows predicted demand, so any gap from actual demand shows up as waste or customer dissatisfaction
Elastic Cloud-Based Resources: resources scaled to actual demand
7. Cost Benefits of HPC in the Cloud
Pay As You Go Model
• Use only what you need
• Multiple pricing models
On-Premises: Capital Expense Model
• High upfront capital cost
• High cost of ongoing support
8. AWS Journey for HPC Customer
Dev, Test, Eval
• Development and test
• Eval and training
True Production
• Build new production apps
• Migrate production apps
Mission Critical
• Build mission-critical apps
• Migrate mission-critical apps
All-in
• Corporate standard
• “Cloud First”
13. Auto Scaling and Amazon CloudWatch
Match demands of cluster queue with appropriate compute needs
[Diagram: Windows HPC Job Manager, CloudWatch, Auto Scaling group]
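As a rough illustration of matching queue demand to compute capacity, the sketch below turns a queue-depth figure into a desired node count. The function name, the "queued cores" metric, the 16 cores per node, and the group bounds are all assumptions made for the example; the deck does not say which metric or scaling policy is used.

```python
import math


def desired_compute_nodes(queued_cores, cores_per_node, min_nodes=0, max_nodes=8):
    """Return how many compute nodes are needed to serve the queued work.

    queued_cores:   cores requested by queued and running jobs (a hypothetical
                    metric, for example published to CloudWatch by the head node).
    cores_per_node: cores offered by one compute node (assumption).
    min_nodes/max_nodes: bounds of the Auto Scaling group (assumption).
    """
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    needed = math.ceil(max(queued_cores, 0) / cores_per_node)
    # Clamp the result to the size limits of the Auto Scaling group.
    return max(min_nodes, min(needed, max_nodes))


if __name__ == "__main__":
    for queued in (0, 10, 64, 500):
        print(queued, "queued cores ->", desired_compute_nodes(queued, 16), "nodes")
```

In practice the resulting number would be applied as the desired capacity of the Auto Scaling group, either directly or through a CloudWatch alarm on the same metric.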
14. Amazon Elastic Block Store
• Designed for five nines of availability
• Attaches to Amazon EC2 within the same Availability Zone
• Point-in-time snapshots to Amazon S3
• Checkbox-enabled encryption
Volume types: Magnetic, General Purpose (SSD), Provisioned IOPS (SSD)
When performance matters, use SSD-backed volumes!
Network-attached persistent block storage volumes for Amazon EC2
15. Amazon EBS
Windows Boot Volume
• Default 30 GB volume
• Gets an initial I/O credit balance of 5.4 million
• Bursts for up to 30 minutes at 3,000 IOPS
• Accumulates 90 I/O credits per second
Decrease launch time of instances by leveraging General Purpose SSD
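The burst figures on this slide can be checked with a little arithmetic. The sketch below uses only the numbers quoted above (5.4 million credits, 3,000 IOPS burst, 90 credits per second); the function name is made up for the example, and one credit is assumed to pay for one I/O operation.

```python
def burst_duration_seconds(credit_balance, burst_iops, refill_per_second=0.0):
    """Seconds a volume can sustain burst_iops before its credits run out.

    Assumes one credit pays for one I/O, and that credits keep accumulating
    at refill_per_second while the volume is bursting.
    """
    drain = burst_iops - refill_per_second
    if drain <= 0:
        return float("inf")  # credits never run out at this rate
    return credit_balance / drain


if __name__ == "__main__":
    no_refill = burst_duration_seconds(5_400_000, 3000)
    with_refill = burst_duration_seconds(5_400_000, 3000, refill_per_second=90)
    print(f"ignoring refill: {no_refill / 60:.1f} minutes")
    print(f"with 90 credits/s refill: {with_refill / 60:.1f} minutes")
```

Ignoring the refill, 5.4 million credits at 3,000 IOPS last 1,800 seconds, which is the 30 minutes quoted on the slide; counting the 90 credits per second earned during the burst extends that slightly.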
16. Amazon Simple Storage Service
Store input and result datasets for dynamic and transient Windows HPC clusters
Redundancy
• Durability: designed for 99.999999999%
• Availability: designed for 99.99%
Capacity
• Consumption-based storage model
• Virtually unlimited capacity
Security
• Encryption in transit: HTTPS/TLS
• Encryption at rest: SSE, SSE-C, SSE-KMS
Ease of use
• Storage classes: Standard, RRS, Glacier
• Lifecycle policies: archive, expiration
17. Amazon S3
Copy data to Amazon S3 and enable SSE
Write-S3Object -BucketName mybucket -Folder .\Scripts -KeyPrefix SampleScripts -ServerSideEncryption AES256
Copy data from Amazon S3 to a local folder
Read-S3Object -BucketName mybucket -KeyPrefix SampleScripts -Folder .
• Bucket: mybucket
• Key prefix: SampleScripts
• Local folder: .\Scripts
Migrate data to AWS and Windows HPC clusters with AWS Tools for PowerShell
18. AWS CloudFormation
Consistently and easily deploy Windows HPC clusters based on workflow needs
Templated resource provisioning
• Create templates to describe the AWS resources used to run your application
• Provision identical copies of a stack
Infrastructure as code
• Templates can be stored in a source control system
• Track all changes made to your infrastructure stack
• Modify and update resources in a controlled and predictable way
Declarative and flexible
• Just choose what resources and configurations you need
• Customize your template via parameters
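To make "customize your template via parameters" concrete, here is a minimal sketch that builds a one-resource CloudFormation template as a Python dictionary and prints it as JSON. The logical names (VpcCidr, ClusterVpc) and the default CIDR are assumptions chosen for illustration; this is not the template referenced later in the deck.

```python
import json


def build_template(default_cidr="10.0.0.0/16"):
    """Return a minimal CloudFormation template as a Python dict.

    The parameter and resource names are hypothetical; only the template
    sections and the AWS::EC2::VPC resource type come from CloudFormation.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative VPC for a Windows HPC cluster (sketch)",
        "Parameters": {
            "VpcCidr": {
                "Type": "String",
                "Default": default_cidr,
                "Description": "CIDR block for the cluster VPC",
            }
        },
        "Resources": {
            "ClusterVpc": {
                "Type": "AWS::EC2::VPC",
                # The CIDR is not hard-coded: it is read from the parameter.
                "Properties": {"CidrBlock": {"Ref": "VpcCidr"}},
            }
        },
        "Outputs": {"VpcId": {"Value": {"Ref": "ClusterVpc"}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2, sort_keys=True))
```

Because the template is plain text, the JSON output can be committed to source control and launched repeatedly with different parameter values to get identical stacks.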
20. AWS Architecture for HPC
Choose the right deployment architecture for the use case
                          Hybrid or “burst”    All-in AWS
Core infrastructure:
  Users directory         On-premises          AWS Directory Service
  Bastion host            AWS                  Amazon EC2
Cluster infrastructure:
  Head node               AWS                  Amazon EC2
  Compute node            AWS                  Amazon EC2
  Storage                 On-premises/AWS      Amazon S3
User workstations         On-premises          Amazon WorkSpaces
21. AWS Architecture for HPC
“Burst” to virtually unlimited compute capacity in AWS
[Diagram: on-premises users, workstations, core infrastructure, and HPC cluster connected to an Amazon VPC that hosts a bastion host, core infrastructure, and a cluster with a head node and compute nodes]
22. AWS Architecture for HPC
Deploy users, infrastructure, and cluster all in AWS
[Diagram: an Amazon VPC containing users' workstations, a bastion host, core infrastructure, and a cluster with a head node and compute nodes]
24. Windows Server on AWS
Easy Licensing
• OS $/Hr
• BYOL
Optimized AWS Software for Windows
• EC2Config, drivers
Experience
• October 2008
• Every use case
• Every industry
OS Choice
• 2003R2
• 2008, 2008R2
• 2012, 2012R2
Microsoft Portfolio
• SQL Server
• SharePoint
• Exchange, Lync
Customize Systems
• 50+ EC2 instances
• 32, 64 bits
• CPU, GPU
25. AWS Architecture for Windows HPC
Networking best practices for Windows HPC clusters
• Network design: Use both public and private subnets, and size them for the cluster
• Availability: Use a multi-AZ design
• Access control: Use a VPC endpoint and NAT for external access
[Diagram: Availability Zones A and B with public subnets 10.0.0.0/24 and 10.0.1.0/24, private subnets 10.0.10.0/24 and 10.0.11.0/24 (Private Subnet 2), core infrastructure, two NATs, and a VPC endpoint]
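The subnet layout on this slide (10.0.0.0/24, 10.0.1.0/24, 10.0.10.0/24, 10.0.11.0/24) can be sanity-checked with Python's standard ipaddress module. This is only a sketch: the enclosing VPC range 10.0.0.0/16 and the subnet labels are assumptions, since the slides do not state them.

```python
import ipaddress
from itertools import combinations

# Assumption: the slides do not give the VPC range; 10.0.0.0/16 is a guess
# that contains all four subnets.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")

# Subnet CIDRs as shown on the slide; the dictionary keys are made-up labels.
SUBNETS = {
    "public-1": ipaddress.ip_network("10.0.0.0/24"),
    "public-2": ipaddress.ip_network("10.0.1.0/24"),
    "private-1": ipaddress.ip_network("10.0.10.0/24"),
    "private-2": ipaddress.ip_network("10.0.11.0/24"),
}

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves five addresses in every subnet


def usable_addresses(net):
    """Addresses left for instances once the AWS-reserved ones are removed."""
    return net.num_addresses - AWS_RESERVED_PER_SUBNET


def validate(vpc, subnets):
    """Return a list of problems: subnets outside the VPC or overlapping."""
    problems = []
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (a, net_a), (b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} overlaps {b}")
    return problems


if __name__ == "__main__":
    for issue in validate(VPC_CIDR, SUBNETS):
        print("PROBLEM:", issue)
    for name, net in SUBNETS.items():
        print(f"{name}: {net} -> {usable_addresses(net)} usable addresses")
```

The usable-address count is the practical meaning of "manage sizing": a private subnet caps the number of compute nodes that can be launched into it.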
26. AWS Architecture for Windows HPC
Core infrastructure best practices for Windows HPC clusters
• Domain Controller: Highly available extension of your existing environment
• Remote Desktop Gateway: Increase security posture
[Diagram: Availability Zones A and B with public subnets 10.0.0.0/24 and 10.0.1.0/24 and private subnets 10.0.10.0/24 and 10.0.11.0/24, showing the core infrastructure: two domain controllers (DC) and a Remote Desktop Gateway (RDGW)]
27. AWS Architecture for Windows HPC
Cluster infrastructure best practices for Windows HPC clusters
• Head Node: Size independently of the compute nodes; General Purpose family
• Compute Nodes: Use Auto Scaling groups and cluster instances
• S3 Bucket: Persistent, secure, available storage of cluster input and results
[Diagram: Availability Zones A and B with public subnets 10.0.0.0/24 and 10.0.1.0/24 and private subnets 10.0.10.0/24 and 10.0.11.0/24, showing the core infrastructure, a cluster with a head node and compute nodes, an S3 bucket, and a VPC endpoint]
28. AWS Architecture for Windows HPC
All together: the complete Windows HPC infrastructure on AWS
[Diagram: Availability Zones A and B with public subnets 10.0.0.0/24 and 10.0.1.0/24 and private subnets 10.0.10.0/24 and 10.0.11.0/24, showing the core infrastructure (two DCs, RDGW, two NATs), the cluster (head node and compute nodes), the S3 bucket, and the VPC endpoint]
29. AWS Architecture for Windows HPC
Launch multiple clusters, each right-sized to complete its work in the time specified
[Diagram: the same VPC layout across Availability Zones A and B (public subnets 10.0.0.0/24 and 10.0.1.0/24, private subnets 10.0.10.0/24 and 10.0.11.0/24, DCs, RDGW, NATs, S3 bucket, VPC endpoint), now hosting three clusters, each with its own head node and a different number of compute nodes]
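One way to read "right-sized to complete its work in the time specified" is as a simple capacity calculation. The sketch below assumes the workload is measured in core-hours and scales perfectly across nodes, which real jobs rarely do; the function name and the example figures are hypothetical.

```python
import math


def nodes_for_deadline(total_core_hours, cores_per_node, deadline_hours):
    """Smallest node count that finishes the work within the deadline.

    Assumes perfect parallel scaling and ignores cluster start-up time,
    so treat the result as a lower bound.
    """
    if cores_per_node <= 0 or deadline_hours <= 0:
        raise ValueError("cores_per_node and deadline_hours must be positive")
    core_hours_per_node = cores_per_node * deadline_hours
    return max(1, math.ceil(total_core_hours / core_hours_per_node))


if __name__ == "__main__":
    # Same workload, three different deadlines, three differently sized clusters.
    for deadline in (8, 4, 1):
        print(f"{deadline} h deadline ->", nodes_for_deadline(1024, 16, deadline), "nodes")
```

Under pay-as-you-go pricing the total core-hours consumed stay roughly the same in each case, so a shorter deadline mainly changes how many nodes run at once.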
31. Secure Windows HPC Workloads on AWS
Leverage native AWS security features to enhance the security posture of Windows HPC
• AWS Resource Access: Enable access to AWS resources through policies in IAM roles
• Encryption at Rest: Enable encryption on EBS volumes and specify server-side encryption for objects in Amazon S3
• Create private access to input and output results stored in Amazon S3 via VPC endpoints
• Ensure auditability of the AWS account by enabling AWS CloudTrail
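As an illustration of private access to Amazon S3 through a VPC endpoint, the sketch below generates a bucket policy that denies requests that do not arrive through a given endpoint. The bucket name reuses "mybucket" from the earlier slide, the endpoint ID is a placeholder, and the function name is made up; treat it as a starting point rather than the policy used in the session.

```python
import json


def endpoint_only_bucket_policy(bucket, vpce_id):
    """Bucket policy that denies access unless the request uses vpce_id.

    Caution: a blanket Deny like this also blocks console and on-premises
    access to the bucket, so narrow it before using it for real.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # aws:SourceVpce is the condition key for the calling endpoint.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


if __name__ == "__main__":
    # "vpce-1a2b3c4d" is a placeholder endpoint ID, not a real resource.
    print(json.dumps(endpoint_only_bucket_policy("mybucket", "vpce-1a2b3c4d"), indent=2))
```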
32. Optimized network for Windows HPC
Get the most out of AWS networking for your HPC workloads
• Enhanced Networking: The SR-IOV feature provides higher packets-per-second (PPS) performance, lower latencies, and very low network jitter
• Placement Groups: All instances get low-latency, full-bisection, 10 Gbps bandwidth between instances
• EBS Optimization: Get up to 4,000 Mbps of additional throughput dedicated to your storage needs
• AWS PV Drivers / Intel Drivers: Make sure you stay current with the latest versions
33. Optimized processing with Windows HPC
Get the most out of your instances for your HPC workloads
• Hyper-threading: Most current-generation AWS instances provide hyper-threading; keep it or deactivate it based on your needs
• Turbo Boost: The latest generation of instances lets you control the C-state and P-state registers of your processors
• The right instance: Choose your constraints (price, CPU, GPU, RAM, network) and get the instance type that fits your use case
• The right storage: Choose the amount and type of instance storage or Amazon EBS storage required, and leverage storage services such as Amazon S3
34. Automated Windows HPC computing
Get your cluster as code, running in minutes from scratch
• Windows PowerShell®: Automate all installation and configuration of the instances
• AWS Tools for Windows PowerShell: Your cluster can become aware of the infrastructure it is running on
• Auto Scaling: Automate provisioning and scaling of your cluster so that your workloads finish when you need them
• AWS CloudFormation: Deploy your clusters in a few clicks and create test clusters in minutes
36. Windows HPC AWS CloudFormation Template
Enable automated deployments of clusters with a pre-built template
[Diagram: an Amazon VPC containing the core infrastructure (DC, RDGW) and a cluster with a head node and compute nodes]
37. AWS CloudFormation Templates: Prerequisites
Things to do before starting the template
Select your region and base image
• VPC + Subnet: Just input selected CIDR
• Instance Types: for all instances
• (Optional) Placement Group: Create a VPC placement group
Prepare installation media then snapshot
• Download Microsoft HPC Pack and unzip to HPCPack2012R2-Full
• Extract SQL Server installation to SQLInstall
• Download Intel SR-IOV drivers and extract to PROWinx64
• Download latest AWS PV drivers and extract to AWSPVDriverSetup
Select installation configuration:
• Define domain configuration and credentials
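Before snapshotting the installation media, it can help to confirm that the four folders named above are in place. The sketch below does only that; the idea of a single "media root" directory and the function name are assumptions, and the check says nothing about the folders' contents.

```python
from pathlib import Path

# Folder names taken from the prerequisites listed on the slide.
EXPECTED_FOLDERS = (
    "HPCPack2012R2-Full",
    "SQLInstall",
    "PROWinx64",
    "AWSPVDriverSetup",
)


def missing_media_folders(media_root):
    """Return the expected folders that are absent under media_root.

    media_root is a hypothetical directory (for example the drive that will
    be snapshotted) holding the extracted installation media.
    """
    root = Path(media_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_media_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All installation media folders are present.")
```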
38. AWS CloudFormation Template: Core
Building the core Windows infrastructure
Base Network
• VPC + Public Subnet: Select your CIDR
• DHCP Option Set: Configured to use DC
• Security Groups: For bastion and cluster
Core Infrastructure:
• Domain Controller in new forest
• Remote Desktop Bastion Host (outside of domain)
• Domain user with “Join Computer to Domain” privileges
39. AWS CloudFormation Template: Cluster
Building the Microsoft HPC cluster on AWS
Head Node:
• Multi-role: database, HPC head node, file share
• Monitored: Amazon CloudWatch custom metrics
Compute Nodes:
• Automated: Automatic configuration to join the cluster
• Scalable: Auto Scaling group resizes the cluster based on load
• Up-to-date: Automatic upgrade of AWS and Intel drivers
40. Windows HPC AWS CloudFormation Template
In < 30 minutes, your cluster will be ready to accept jobs.
41. Getting Started Collateral
QwikLAB: Launching Microsoft HPC Pack on AWS:
https://www.qwiklab.com/focuses/preview/1604?search=19103
Reference CloudFormation Template:
https://github.com/awslabs/aws-cfn-windows-hpc-template