Cloud Capacity Planning..an Oxymoron? - South Bay SRE Meetup Aug-09-2016

Coburn Watson, Director of Performance and Reliability Engineering at Netflix, discusses the differences between Cloud and traditional DC/On-Prem capacity planning models. He also covers some of the distinct methodologies applied at Netflix to improve the rate of innovation and overall reliability while keeping a pulse on efficiency.

Published in: Technology

  1. Cloud Capacity Planning South Bay SRE meetup - August 9th, 2016
  2. Presenting... ● Cloud Capacity Planning..an Oxymoron? ● Santa Cloud: How Netflix Does Holiday Capacity Planning ● The Data Behind the Planning
  3. Cloud Capacity Planning..an Oxymoron? South Bay SRE Meetup: August 9th, 2016
  4. ● > 83M households ● 190 Countries ● 35% of Internet traffic in US at peak ● Entirely on Cloud*, three regions ● Evacuate a region monthly...for 24 hours ● Capacity planning ~ 5 people! (in the room :-) * Content served from homegrown OpenConnect CDN
  5. Capacity Planning Concerns ● Facility considerations (Space, Power, Network, Cooling) ● Supply Chain Management Constraints and Relationships ● Hardware lifetime contour & failure rates (MTBF) ● Systems management staff ● Seasonal and unexpected burst considerations ● Workload colocation and performance demands ● Over-provisioning for reliability and rate of innovation ● Effective tooling ● Business continuity planning
  6. (Cloud) Capacity Planning Concerns ● Facility considerations (Power, Network, Cooling) ● Supply Chain Management Constraints and Relationships ● Hardware lifetime contour & failure rates (MTBF) ● Systems management staff ● Seasonal and unexpected burst considerations ● Workload colocation and performance demands ● Over-provisioning for reliability and rate of innovation ● Effective tooling ● Business continuity planning
  7. Cloud-specific CP Factors ● Capacity bounds..unknown (-) ● Vendor Decisions (-/+) ○ Hardware/Offering Evolution Timeline ○ Resource Demand (CPU/Mem/Disk/Net) Matrix ● On-Demand Capability (+)
  8. Netflix Model ● Depend on the AWS on-demand pool for elasticity ● Monitor insufficient capacity exceptions (ICEs) for boundaries ● Invest heavily in 3 year reservations ● Maintain relatively few, large reserved pools ● Cloud Capacity Analytics team develops tools for insight ● Leverage cross-account resource borrowing
  9. The Triad (diagram): Cloud Impact on Innovation, Reliability, Efficiency; labels: Default, Preferred
  10. Considerations of Scale ● Capacity required for critical footprint might require “guarantees” ● API-based observability has limits ● All resources have capacity limits/throttles ● Resource limits by default set for lowest common denominator ● Get creative with unused, but paid for capacity ● Billing file size!
  11. Summary Capacity Planning
  12. Coburn Watson ● Director of Performance and Reliability at Netflix ○ Site Reliability Engineering, Performance and OS Engineering, Traffic Management, Chaos Engineering, Capacity Planning, Cloud Network Engineering ● @coburnw, cwatson@netflix.com ● Looking for some great capacity planning-minded folks ● Performance and Reliability YouTube Channel
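
The "Netflix Model" slide pairs large reserved pools with the on-demand pool for elasticity, and the "Considerations of Scale" slide mentions unused but paid-for capacity. A minimal sketch of that trade-off follows. It is not Netflix's tooling: the names `CoverageReport` and `split_demand`, and the sample demand figures, are hypothetical and exist only to illustrate how hourly demand splits between a fixed reservation and on-demand spillover.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CoverageReport:
    """Hypothetical summary of how demand was served (illustrative only)."""
    reserved_hours_used: int   # instance-hours served from the reserved pool
    reserved_hours_idle: int   # paid-for reserved instance-hours left unused
    on_demand_hours: int       # instance-hours that spilled over to on-demand

    @property
    def reserved_utilization(self) -> float:
        total = self.reserved_hours_used + self.reserved_hours_idle
        return self.reserved_hours_used / total if total else 0.0


def split_demand(hourly_demand: Sequence[int], reserved_instances: int) -> CoverageReport:
    """Split each hour's instance demand between a fixed reserved pool
    and on-demand capacity (assumed unlimited here for simplicity)."""
    used = idle = spill = 0
    for demand in hourly_demand:
        covered = min(demand, reserved_instances)
        used += covered
        idle += reserved_instances - covered
        spill += demand - covered
    return CoverageReport(used, idle, spill)


# Made-up demand for four hours against a pool of 100 reserved instances.
report = split_demand([50, 80, 120, 100], reserved_instances=100)
print(report, report.reserved_utilization)
```

In this toy model the idle reserved hours are the "unused, but paid for" capacity, and the on-demand hours are where an insufficient capacity exception could appear in practice, since a real on-demand pool is not unlimited.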