Amazon EC2 Spot Instances Workshop

Workshop on Amazon EC2 Spot Instances, conducted at the August 2019 AWSUGBLR Meetup

Published in: Technology

Amazon EC2 Spot Instances Workshop

  1. Amazon EC2 Spot Instances
     Compute at up to 90% off. Scale more. Get faster results. Build resilient services.
     Chakravarthy Nagarajan, Specialist Solution Architect, EC2 Spot
     Sridhar Bharadwaj, Business Development Manager, EC2 Spot
     Sunday, 25 Aug 2019, ec2-spot-india@amazon.com
     © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. Agenda
     • EC2 Spot Instances overview
     • Pricing model
     • Major features and functionality
     • Interruption details
     • Spot orchestration options (Auto Scaling Groups, Fleet)
     • Console demo
     • Monitoring price and usage
     • Use cases: where to use Spot
     • Main takeaways for success with Spot
  3. EC2 Spot – cool Cost Savings
  4. Amazon EC2 purchase options
     • Spot Instances: spare EC2 capacity at up to 90% off On-Demand prices. Suited to fault-tolerant, flexible, stateless workloads.
     • Reserved Instances: make a 1- or 3-year commitment and receive a discount off On-Demand prices. Suited to committed, steady-state usage.
     • On-Demand: pay for compute capacity with no long-term commitments. Suited to spiky workloads and to workloads whose needs are still being defined.
  5. To optimize Amazon EC2, combine purchase options for fault-tolerant, flexible, stateless workloads
  6. Spare capacity at scale
     Clemson University – 1.1 million cores
     https://tinyurl.com/clemson-spot
  7. Spot Instances – basics
     • Price changes infrequently, based on long-term supply and demand of spare capacity in each pool independently
     • Just request capacity and pay the current rate. No bidding
     • Interruptions only happen when On-Demand needs the capacity. No outbidding
  8. Large customer base
  9. Amazon EC2 Spot integrations
  10. Flexibility is key to successful adoption
      • Instance flexible
      • Time flexible
      • AZ flexible
  11. EC2 Spot pools – instance type flexibility
      Each instance family (R5, M4, C5, I3, M5d, R4, D2, C4), each instance size, in each Availability Zone (60), in every region (20), is a separate Spot pool.
      Example for C4, with Spot prices in three Availability Zone pools (1a, 1b, 1c) next to the On-Demand price:
      Size   Spot price (three AZ pools)   On-Demand
      8XL    $0.27  $0.29  $0.50           $1.76
      4XL    $0.30  $0.16  $0.21           $0.88
      2XL    $0.07  $0.08  $0.08           $0.44
      XL     $0.05  $0.04  $0.04           $0.22
      L      $0.01  $0.04  $0.01           $0.11
  12. Time flexibility
      • Time-insensitive workloads, examples: model training, genomics, development, testing, one-time queries
      • Time-sensitive workloads, examples: web services, APIs, analytics, grid computing, containers
  13. Spot Blocks
      • Defined-duration workloads without interruptions
      • Commit for 1-6 hours. Instances terminate after the duration
      • Lower discounts compared to Spot
      • No Auto Scaling and no instance diversification
  14. What about interruptions?
  15. Interruption behaviors
      • Terminate
      • Stop
      • Hibernate
  16. Spot Instance Advisor
      https://aws.amazon.com/ec2/spot/instance-advisor/
  17. EC2 Spot pricing history
  18. EC2 Spot pricing history – API access

      aws ec2 describe-spot-price-history \
          --start-time 2018-05-06T07:08:09 \
          --end-time 2018-05-06T08:08:09 \
          --instance-types c4.2xlarge \
          --availability-zone eu-west-1a \
          --product-description "Linux/UNIX (Amazon VPC)"

      {
          "SpotPriceHistory": [
              {
                  "Timestamp": "2018-05-06T06:30:30.000Z",
                  "AvailabilityZone": "eu-west-1a",
                  "InstanceType": "c4.2xlarge",
                  "ProductDescription": "Linux/UNIX (Amazon VPC)",
                  "SpotPrice": "0.122300"
              }
          ]
      }

  19. Spot orchestration options comparison: Auto Scaling Groups and Spot Fleet
      Features compared:
      • Maintains target capacity (upon interruptions or failures)
      • Instance type diversification
      • Availability Zone diversification
      • Allocation strategy (N lowest pools, lowest price)
      • Autoscaling (target tracking, stepped with custom metrics)
      • ELB integration*
      • On-Demand capacity mixed with Spot
      • Lifecycle hooks, termination policies, protection, detach, processes
      • Weights
      *Detach on interruption notification requires automation
      Legend: Roadmap
  20. Monitoring Spot usage – Cost Explorer
      Year by month: long-term trends
  21. Monitoring Spot usage – Cost Explorer
      Month by day: short-term changes and anomaly detection
  22. Monitoring Spot usage
      • Filter and group by: account, instance type, region, tags
      • Data is available via API
      • If you need deeper insights, hour-level data resolution or one-hour data freshness, or resource IDs, use Cost and Usage Reports or the Spot Instance Data Feed (easily query with Athena or visualize with QuickSight / Tableau / Looker /…)
  23. Monitoring Spot usage – Savings Summary
  24. Is my workload Spot ready?
      • Stateless
      • Fault-tolerant
      • Flexible: multi-AZ and instance flexibility
      • Loosely coupled
      Looks familiar?
  25. Main takeaways for success with Spot
      • Be instance type agnostic and let ASG/Fleet provide the required capacity at the lowest price
      • Adopt Launch Templates to benefit from new ASG and Fleet features
      • New instance families generally have higher interruption rates
      • Architect for fault tolerance to be Spot compatible and increase your availability
  26. Thank you
      https://aws.amazon.com/ec2/spot
      ec2-spot-india@amazon.com
  27. Usage of EC2 Spot Instances with specific workloads
  28. Stateless web application or API frontend
      • Spot diversification before all else
      • To mitigate risk, launch On-Demand in a different pool if Spot capacity is insufficient
      • Availability and performance should not be impacted: ensure low bootstrap time and some over-provisioning
      • Mostly identical for queue workers
      https://tinyurl.com/SpotAppnextBlog
  29. Amazon EC2 Auto Scaling
  30. What is Amazon EC2 Auto Scaling?
  31. Auto Scaling Group (ASG) introduction
      • Logical group of instances for your service
      • Minimum and maximum bound for the number of instances that can be in the ASG
      • Launch or terminate instances to meet the desired capacity
      [Diagram: Min, Desired, Max]
  32. Amazon EC2 Auto Scaling
      • Scheduled scaling
      • Dynamic scaling
      • Predictive scaling
      • Manual scaling
  33. Predictive scaling in Amazon EC2 Auto Scaling
      Machine learning technology behind the scenes:
      • A machine learning model takes the load metric and forecasts the next two days based on the pre-trained model
      • Performs regression analysis between the load metric and the scaling metric
      • Schedules scaling actions for the next two days, hourly
      • Repeats every day
      [Figure: three charts of load/capacity over time, each comparing provisioned capacity with actual capacity demand: capacity provisioning on-premises, with dynamic scaling, and with predictive scaling plus dynamic scaling]
  34. Auto Scaling Groups with multiple purchase options and instance types
  35. Before: multiple ASGs to use Spot, On-Demand, and RIs together
      One ASG for each purchase option and instance type.
      [Diagram: an m4.large Spot ASG, an m5.large Spot ASG, and a c4.large On-Demand ASG across Availability Zones 1, 2, and 3]
  36. After: include Spot, On-Demand, and RIs in a single ASG
      A single ASG combines purchase options and instance types.
      [Diagram: m4.large Spot, m5.large Spot, and c4.large On-Demand instances across Availability Zones 1, 2, and 3]
  37. Save up to 90% using EC2 Auto Scaling and EC2 Fleet
      • Reduce operational overhead: automatically provision and scale instances across instance families and purchase models in a single Auto Scaling group
      • Lowest cost: specify what percentage of your Auto Scaling group capacity should be fulfilled by On-Demand Instances and by Spot Instances to optimize cost
      • Prioritized list: use a prioritized list of On-Demand Instance types to scale capacity during an urgent, unpredictable event to optimize performance
  38. Amazon EC2 Fleet and allocation strategies
      Amazon EC2 Fleet
      • Provisions capacity across multiple instance types according to allocation strategies
      Allocation strategies
      • On-Demand: prioritized list of instance types
      • Spot: instances across the N lowest-priced instance pools
      • Capacity optimized
      • Allocation strategies determine which instance types are launched and terminated
  39. API parameters

      "MixedInstancesPolicy": {
          "LaunchTemplate": {
              "LaunchTemplateSpecification": {
                  "LaunchTemplateName": "MyLaunchTemplate"
              },
              "Overrides": [
                  { "InstanceType": "c5.large" },
                  { "InstanceType": "c4.large" }
              ]
          },
          "InstancesDistribution": {
              "OnDemandAllocationStrategy": "prioritized",
              "OnDemandBaseCapacity": 10,
              "OnDemandPercentageAboveBaseCapacity": 50,
              "SpotAllocationStrategy": "capacity-optimized"
          }
      }

      [Diagram: an ASG across AZ1 and AZ2 with Min, Desired, and Max; an On-Demand base (minimum On-Demand of 10), and above it 50% On-Demand and 50% Spot]
  40. Instance type overrides and allocation strategies
      • The ASG adjusts to a new configuration as it scales up and down
      • As the ASG scales up: launch instances according to the new configuration
      • As the ASG scales down: prioritize terminating instances that do not match the new configuration
      • New termination policy: AllocationStrategy
      [Diagram: with instance type overrides m4.large, m5.large the group holds m4.large and m5.large instances; with overrides m5.large, c5.large it holds m4.large, m5.large, and c5.large instances]
  41. Recommendations on Mixed Instances Policy
      • Choose at least 2 instance type overrides: improves availability for On-Demand and Spot Instances
      • Diversify across at least N = 2 Spot Instance pools: reduces risk from fluctuations in Spot capacity and prices
      • Choose instance types of the same size across families: maintains stability as the group dynamically scales up and down
      • Use the default Spot max price: leverages Spot cost savings while defaulting to the On-Demand price as the maximum price to pay
  42. EC2 Spot with Amazon ECS
  43. Spot and containerized workloads
      • Best practices overlap: cattle, not pets
      • Cluster instances (worker nodes) are conceptually redundant and ephemeral in containerized workloads
      • Instance type flexibility is easy
  44. ECS – provisioning and scaling
      • Provisioning: integrated directly into the ECS console, or use CloudFormation or Terraform. ECS creates a Spot Fleet in your account, or you can use an Auto Scaling Group to bootstrap instances
      • Autoscaling: the ECS service scales on CPU/memory or HTTP requests; the Spot Fleet / ASG scales on a reserved metric (CPU, memory)
      • Interruptions are handled automatically via scripts installed in User Data. ECS automatically drains the instance; the Spot Fleet / ASG automatically replaces the instance
  45. EC2 Spot with Amazon EKS
  46. Kubernetes and EKS considerations
      • EKS is a managed control plane; Spot is only relevant for the worker nodes
      • CA (cluster-autoscaler) and ASG: use one ASG with a diversified set of instance types
      • Run a DaemonSet on every worker to catch the Spot interruption and drain the node
      • Use labels to identify Spot nodes (for the DaemonSet, and for other purposes, such as scheduling non-prod?)
  47. Kubernetes and EKS scaling
      • HPA (horizontal pod autoscaler)
        • Autoscales the number of pods in a Deployment/ReplicaSet
      • CA (cluster-autoscaler)
        • Autoscales the number of worker nodes in the cluster when:
          • Pods cannot be scheduled due to lack of compute resources
          • Nodes are underutilized and important pods can be rescheduled elsewhere
  48. EKS – master and etcd vs. worker nodes
      [Diagram: a managed master and etcd in each of Availability Zones 1, 2, and 3, not visible in your AWS account; worker nodes run in your AWS account]
  49. EC2 Spot with Amazon EMR
  50. EMR – provisioning and instance types
      • Use Spot in EMR, unless heavily time-constrained
      • Spot best practice: diversify and be instance type agnostic
        • Use Instance Fleets, with up to 5 instance types
        • Improved chances of getting capacity, decreased impact from interruptions
        • Spark will automatically recover from instance failures and interruptions
        • Enable dynamic allocation of executors (default in EMR)
      • Decouple storage from compute (from HDFS to S3 EMRFS)
      • Defined duration (Spot blocks) and fallback to On-Demand
      • If you need auto-scaling (e.g. for Presto), use uniform groups
      Instance types shown: r3.4xlarge, i3.4xlarge, r4.4xlarge, m4.10xlarge
  51. Parallelization with Spot Instances
      [Figure: two charts of the number of parallelized nodes over time, one with a job running time of 10 hours and one with a job running time of 1 hour]
  52. EMR nodes – long-running clusters
      • The master node must keep running
      • Core nodes can be added and removed gracefully
      • The cluster can tolerate the loss of task nodes
      [Diagram: EMR cluster with an On-Demand master node, core nodes with HDFS, and task nodes, using a mix of On-Demand and Spot]
  53. EMR nodes – data-driven workloads
      • On-Demand for the master node
      • On-Demand for core nodes, so no data is lost from these nodes
      • Spot for task nodes
  54. EMR nodes – cost-driven and transient workloads
      • All node types use EC2 Spot
      • Can use any EC2 Spot instance choices
  55. EMR choices for EC2 Spot – compute flexibility
      • Instance groups
      • Instance fleets
  56. EMR instance groups
  57. EMR instance fleets
  58. Instance fleets – target capacity
      • Target capacity is a mix of On-Demand and EC2 Spot
      • Can specify the number of units each of these contributes to the target capacity
      • If no Spot capacity is available, can specify a provisioning timeout and switch to On-Demand
  59. When to choose instance fleets
      • Widest variety of provisioning options
      • Choose a mix of instance types
      • Choose a mix of On-Demand and Spot
      • Specify target capacity as a mix of On-Demand and Spot
      • Auto-scaling not available
  60. Thank you
      https://aws.amazon.com/ec2/spot
      ec2-spot-india@amazon.com
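
The InstancesDistribution parameters on slide 39 can be read as simple arithmetic: fill an On-Demand base first, then split whatever lies above it by percentage. The following is a minimal Python sketch of that reading, not the service's actual implementation. The function name `split_capacity` is hypothetical, and rounding the On-Demand share up is an assumption that the slides do not confirm.

```python
def split_capacity(desired, on_demand_base, on_demand_pct_above_base):
    """Return (on_demand, spot) instance counts for a desired capacity.

    The arguments mirror OnDemandBaseCapacity and
    OnDemandPercentageAboveBaseCapacity from the slide. Rounding the
    On-Demand share up is an assumption made for this sketch.
    """
    base = min(desired, on_demand_base)   # the On-Demand base is filled first
    above_base = desired - base           # capacity left to split by percentage
    # Integer ceiling of above_base * pct / 100, avoiding float rounding.
    on_demand_above = (above_base * on_demand_pct_above_base + 99) // 100
    on_demand = base + on_demand_above
    return on_demand, desired - on_demand


if __name__ == "__main__":
    # Slide values: base of 10, then 50% On-Demand and 50% Spot above it.
    for desired in (8, 10, 15, 30):
        print(desired, split_capacity(desired, 10, 50))
```

With the slide's values, a desired capacity at or below 10 is served entirely by On-Demand, and Spot only appears once the group grows past the base.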

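
Slides 44 and 46 mention scripts and a DaemonSet that catch the Spot interruption before draining a node. The following is a rough Python sketch of the detection step only, under stated assumptions. It reads the `spot/instance-action` document from the instance metadata service, which returns HTTP 404 until a notice is issued. It uses plain IMDSv1-style access, so an instance that enforces IMDSv2 would also need a session token, which is not handled here. The function names are hypothetical, and the draining itself is left out.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

# Metadata path for the Spot interruption notice. Assumption: IMDSv1-style
# access without a session token.
INSTANCE_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

# Illustrative notice body and clock value, made up for this sketch.
SAMPLE_NOTICE = '{"action": "terminate", "time": "2019-08-25T08:22:00Z"}'
SAMPLE_NOW = datetime(2019, 8, 25, 8, 20, 0, tzinfo=timezone.utc)


def parse_instance_action(body, now):
    """Parse the notice JSON and return (action, seconds_until_action)."""
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    when = when.replace(tzinfo=timezone.utc)
    return notice["action"], (when - now).total_seconds()


def fetch_instance_action(timeout=1.0):
    """Return the raw notice body, or None when no interruption is pending."""
    try:
        with urllib.request.urlopen(INSTANCE_ACTION_URL, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no notice has been issued yet
            return None
        raise


if __name__ == "__main__":
    # Parsing only; fetch_instance_action() works only on an EC2 instance.
    print(parse_instance_action(SAMPLE_NOTICE, SAMPLE_NOW))
```

A poller built on this would call `fetch_instance_action()` every few seconds and start draining as soon as it returns a body.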
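
The price history response on slide 18 can be reduced to the latest price per Spot pool and compared with an On-Demand price. Here is a small Python sketch of that calculation, using the JSON from the slide as sample input. The function names are hypothetical. The On-Demand figure of $0.44 is the illustrative C4 2XL value from slide 11, used here as an assumption and not as a quoted price for eu-west-1.

```python
import json

# Response body copied from the slide's example output.
SAMPLE_RESPONSE = """
{
  "SpotPriceHistory": [
    {
      "Timestamp": "2018-05-06T06:30:30.000Z",
      "AvailabilityZone": "eu-west-1a",
      "InstanceType": "c4.2xlarge",
      "ProductDescription": "Linux/UNIX (Amazon VPC)",
      "SpotPrice": "0.122300"
    }
  ]
}
"""


def latest_spot_prices(response):
    """Map (instance type, AZ) to the most recent Spot price in the response."""
    latest = {}
    for entry in response["SpotPriceHistory"]:
        pool = (entry["InstanceType"], entry["AvailabilityZone"])
        seen = latest.get(pool)
        # Timestamps share one ISO 8601 format and zone, so string order
        # matches chronological order.
        if seen is None or entry["Timestamp"] > seen[0]:
            latest[pool] = (entry["Timestamp"], float(entry["SpotPrice"]))
    return {pool: price for pool, (_, price) in latest.items()}


def savings(spot_price, on_demand_price):
    """Fraction saved relative to the On-Demand price."""
    return 1.0 - spot_price / on_demand_price


if __name__ == "__main__":
    prices = latest_spot_prices(json.loads(SAMPLE_RESPONSE))
    assumed_on_demand = 0.44  # assumption: illustrative value, see lead-in
    for pool, price in prices.items():
        print(pool, price, f"{savings(price, assumed_on_demand):.1%}")
```

Each (instance type, Availability Zone) pair is kept as its own key because, as slide 11 puts it, each one is a separate Spot pool with its own price.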