
How we Auto Scale applications based on CPU with Kubernetes at M6Web?


I explain how to use Requests and the Horizontal Pod Autoscaler to autoscale an application, with a YAML example from our geolocation app. More at https://tech.m6web.fr/
This talk was given at our Last Friday Talk, Oct. 18.

Questions & Answers:
Q1: Is it relevant to set high values for the Requests?
The Requests value is what the HPA uses to decide when to trigger scaling.
If the app consumes a lot of resources, then yes.
If it consumes little, autoscaling will be triggered late, or not at all (the app will crash first).
In all cases, the application must hold load beyond the Requests value: it can consume more.

Q2: Is it relevant to set a very high HPA max?
Yes, if the app can consume those resources under normal circumstances.
On the other hand, an HPA max at 1000 times the application's peak usage has little value.
It is more of a safeguard in case a bug makes the app consume too much.

Q3: Are custom metrics defined at the Requests level?
No. Requests cover CPU and RAM, and are defined at the level of the app's containers.
Metrics, custom or not, are used to define the HPA target: they are therefore defined at the HPA level.

Q4: What is the cost of setting a very high HPA max?
None: those resources are not reserved until the pods are actually launched.
So it costs nothing; it is just protection.

Q5: How long does it take to launch an additional node?
It depends on the cloud provider.
On AWS, for the moment, it is between 3 and 5 minutes.
So it is not instantaneous, and that can be a problem under very sharp load peaks (we are looking at overprovisioning).

Q6: How long does it take to scale pods?
A few seconds: the new containers start very quickly.
We use Docker containers for the moment, but Kubernetes is not restricted to Docker.

Q7: Can we scale on a metric's history?
Not really. We scale according to a metric's current values.
The purpose of Kubernetes is to have an infrastructure that automatically scales according to the current load.
Predicting load is not one of its objectives.
However, it can still be done depending on the Prometheus query we write.
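As a hedged sketch of that last point: Prometheus's built-in `predict_linear` function can turn a metric's recent history into a forward-looking value, which a custom-metrics adapter could then expose as an HPA target. The metric and label names below are hypothetical:

```promql
# Hypothetical query: estimate per-pod CPU usage 10 minutes from now,
# extrapolated linearly from the last 30 minutes of 5-minute CPU rates.
predict_linear(
  rate(container_cpu_usage_seconds_total{pod=~"geolocation-.*"}[5m])[30m:1m],
  600
)
```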


  1. 1. Kubernetes Auto scaling On production at M6Web
  2. 2. Photo by frank mckenna on Unsplash
  3. 3. Kubernetes Container orchestrator Maximize capacity Adapt to demand Keep applications healthy
  4. 4. Kubernetes ● Handles Network ● Container Autoscaling ● Containers High Availability ● Given available resources: CPU, RAM, etc. ● YAML config
  5. 5. Kubernetes Objects 1. Pod 2. Node 3. Cluster 4. Horizontal Pod Autoscaler 5. Service
  6. 6.–12. 12. (Progressive-reveal repeats of the "Kubernetes Objects" slide, highlighting each object in turn.)
  13. 13. Example with our geolocation API
  14. 14. Pod 2 containers: ● Nginx ○ Rewrite rules ○ HTTP headers sanitation ○ HTTP enhancements (gzip, etc.) ● PHP-FPM ○ Executes PHP code
  15. 15. Deployment This Pod is executed inside a cluster
  16. 16. Production Cluster ● Master nodes ● Worker nodes
  17. 17. Pod in production ● 2 replicas of pod ● scheduled by Kubernetes on different worker nodes
  18. 18. Horizontal Pod Autoscaler ● Controls pod replicas ● Dynamically changed according to CPU usage
  19. 19. Service ● Unchanged over time ● No matter how many pods exist ● Round Robin over pods
  20. 20. It works !!
  21. 21. Auto scales depending on the load
  22. 22. THIS IS FUN !! Yeah really !!
  23. 23. Photo by Lightscape on Unsplash
  24. 24. It’s only YAML
  25. 25. Configure HPA inside your project
  26. 26. To scale on 80% of Requests
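The HPA configuration from these slides might look like the following minimal sketch; the resource names, namespace defaults, and replica bounds are assumptions, using the `autoscaling/v1` API that targets CPU as a percentage of Requests:

```yaml
# Hypothetical HPA manifest: scales the "geolocation" Deployment
# between 2 and 10 replicas, targeting 80% of the CPU Requests.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: geolocation
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: geolocation
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
```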
  27. 27. Requests Containers can request resources ● CPU ● RAM Those resources are guaranteed Helps Kubernetes to optimize the scheduling of pods Photo by Kaizen Nguyễn on Unsplash
  28. 28. Requests are the highest resources an app will normally consume They must not be considered as minimum or maximum usable resources Requests are not Limits
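As a sketch, Requests are declared per container in the pod spec. The CPU value below matches the 300m figure used in the deck's examples; the image and memory values are assumptions:

```yaml
# Hypothetical container spec: Requests are guaranteed resources,
# and the HPA target is computed as a percentage of the CPU request.
containers:
  - name: php-fpm
    image: php:fpm      # placeholder image
    resources:
      requests:
        cpu: 300m       # 300 millicores, as in the geolocation examples
        memory: 128Mi   # assumed value
```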
  29. 29. Autoscaling 1. Horizontal Pod Autoscaler 2. Requests
  30. 30. Scalability based on CPU
  31. 31. How does it work?
  32. 32. Horizontal Pod Autoscaler compares mean utilization of pods to Requests values
  33. 33. Example: geolocation at 16h30 ● Pod Requests: 300m (millicores) ● Horizontal Pod Autoscaler target: 80% ● HPA scales from 240m
  34. 34. Example: geolocation at 16h30 ● Pod Requests: 300m (millicores) ● Horizontal Pod Autoscaler target: 80% ● HPA scales from 240m CPU: 215m CPU: 205m Mean CPU: 210m 70% of Requests No need to scale
  35. 35. Example: geolocation at 18h30 ● Pod Requests: 300m (millicores) ● Horizontal Pod Autoscaler target: 80% ● HPA scales from 240m CPU: 282m CPU: 264m Mean CPU: 273m 91% of Requests need to scale up
  36. 36. Photo by Roman Schurte on Unsplash
  37. 37. By how much does it scale?
  38. 38. desiredReplicas = ceil[ currentReplicas * ( currentMetricValue / desiredMetricValue )]
  39. 39. Example A: 18h30 2 * ( 0.708 / 0.480 ) = 2.95 +1 pod
  40. 40. Example B: 20h47 10 * ( 3.361 / 2.4 ) = 14 +4 pods
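The formula above can be checked directly; here is a small Python sketch using the numbers from Example A (the function name is mine, the formula is the one on the slide):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     desired_metric: float) -> int:
    """HPA scaling formula: ceil(current * (currentMetric / desiredMetric))."""
    return math.ceil(current_replicas * (current_metric / desired_metric))

# Example A (18h30): 2 pods at a mean of 0.708 cores, target 0.480 cores
print(desired_replicas(2, 0.708, 0.480))  # → 3, i.e. +1 pod
```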
  41. 41. Photo by Willian Justen de Vasconcellos on Unsplash
  42. 42. How can a developer define requests?
  43. 43. Kubernetes metrics With Prometheus ● Pod usage ● Usage per container ● Observation over time ● Define pod limits ● Load test
  44. 44. Example of a dashboard that shows CPU/RAM utilization of a pod
  45. 45. It will evolve over time
  46. 46. Are metrics per container? Yes, and so are Requests. I lied earlier: metrics are not per pod.
  47. 47. Containers Requests inside Pods
  48. 48. Containers Requests inside Pods
  49. 49. Per-container metrics ● The Horizontal Pod Autoscaler compares container metrics across all pods ● It does the same calculation as above, but for each distinct container ● We can therefore scale up because of Nginx or PHP independently ● It always takes the highest value to define the number of replicas
  50. 50. Photo by rawpixel on Unsplash
  51. 51. cluster-autoscaler Node scaling: + Unschedulable pods - Underutilized nodes
  52. 52. autoscaling based on Metrics Custom metrics: ● Redis queue ● PHP-FPM listen queue ● etc.
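A hedged sketch of what scaling on such a custom metric could look like, assuming a metrics adapter (e.g. prometheus-adapter) exposes the queue length to the custom metrics API; the metric name, target value, and workload name are all hypothetical:

```yaml
# Hypothetical autoscaling/v2beta2 HPA targeting a custom per-pod metric
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: phpfpm_listen_queue   # assumed metric name
        target:
          type: AverageValue
          averageValue: "10"          # assumed target queue length
```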
  53. 53. Thanks Photo by Benjamin Voros on Unsplash
  54. 54. Photo by Kalle Kortelainen on Unsplash questions and answers available in comments
