How we Auto Scale applications based on CPU with Kubernetes at M6Web?

9 Dec 2018

Editor's notes

  1. We often talk about clouds and containers. The problem: make the devs' apps run without overprovisioning servers. The cloud makes it possible to keep our end-of-month invoice consistent with what we actually consume.
  2. This is one of the objectives of k8s: optimize resources, adapt the infrastructure to real usage, and keep applications healthy.
  3. Kubernetes achieves these objectives in several ways
  4. Reminder about Kubernetes: I will present the objects used in this presentation
  5. A `pod` is the smallest unit you handle with Kubernetes. For Kubernetes, it is an instance of the application. If you send a request to a pod, it knows how to answer it: it's an autonomous instance of an app.
  6. A pod can be composed of several containers. To meet the load or improve availability, several replicas of the pod can be created.
  7. A `node` is a machine, often virtual, that is part of the cluster.
  8. A `cluster` groups all the machines with which Kubernetes works. A cluster is composed of master nodes where Kubernetes' internal functionalities run and worker nodes where your applications run.
  9. To adapt to the load, the number of pods of an application is changed. A `HorizontalPodAutoscaler`, or HPA, dynamically controls the number of replicas of a pod, often according to their CPU consumption: if our application consumes a lot, we add pods; if it consumes little, we remove pods.
  10. Here, the load has increased. Kubernetes noticed this and added pods to absorb the extra load.
  11. A `service` exposes a pod on the network, whether on the cluster's internal network or on the Internet. It is a single entry point per application: we always have a single entry point, a service, even with 30 pods spread over 23 nodes. A minimal sketch follows.
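A minimal sketch of such a service; the name, labels, and ports here are illustrative, not the talk's actual manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app       # matches the pods' label, however many replicas exist
  ports:
  - port: 80          # the single, stable entry point
    targetPort: 8080  # the port the pods' containers listen on
```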
  12. That was quick, so let's revisit these notions with an example.
  13. The example is our geolocation API, which has been running in production for several months.
  14. The geo app is a pod composed of two containers, PHP and Nginx. A single pod is enough to respond to an HTTP request. A sketch of such a pod follows.
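As a sketch, such a two-container pod could be declared like this (image names and ports are illustrative, not the real geo manifest):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: geo
  labels:
    app: geo
spec:
  containers:
  - name: nginx            # terminates HTTP, forwards to PHP-FPM over localhost
    image: nginx:1.15
    ports:
    - containerPort: 80
  - name: php              # runs the application code
    image: php:7.2-fpm
```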
  15. We deployed this pod in a production cluster, so that it could be executed on a node.
  16. On the production cluster, composed of several worker nodes and several master nodes.
  17. We defined a minimum of 2 replicas of our application, to keep it running if one of the pods crashes. By deploying two replicas of our pod, Kubernetes scheduled them on nodes of its cluster.
  18. Depending on the load, the HPA will change the number of replicas of our geolocation application
  19. The geo application is reached through a single entry point on the network: the service. The evolution of the number of pods is therefore transparent for clients, whether there are 2 pods or 23.
  20. And it's working pretty well!
  21. Here is an example from our last football match: we started the evening with our minimum of 2 pods, and for the evening's peak load we climbed to 14 pods.
  22. It's super cool, we autoscale our application depending on the load! And it's quite new for us sysadmins.
  23. It looks beautiful, it sounds almost magical presented like that, but it isn't magic at all.
  24. It's YAML and it lives in the developers' repositories: you have full control over it.
  25. There are 2 things to configure. 1) The HPA: here it is configured on CPU consumption, with a target of 80% of the container requests, as sketched below.
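A minimal sketch of such an HPA, assuming the geo app runs as a Deployment named geo; only the 80% target and the minimum of 2 come from the talk, the maximum of 20 is illustrative:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: geo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: geo
  minReplicas: 2                      # the 2-replica floor from slide 17
  maxReplicas: 20                     # illustrative ceiling
  targetCPUUtilizationPercentage: 80  # scale when average CPU passes 80% of requests
```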
  26. 2) Requests: these are reserved resources. Kubernetes will not schedule the pod if these resources are not available in the cluster. For example, for the two containers of our pod:
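An excerpt of the pod spec sketched at slide 14, with requests added to both containers; the values are illustrative:

```yaml
  containers:
  - name: nginx
    image: nginx:1.15
    resources:
      requests:
        cpu: 100m      # reserved: the scheduler only places the pod where this fits
        memory: 64Mi
  - name: php
    image: php:7.2-fpm
    resources:
      requests:
        cpu: 200m      # the HPA's 100% mark for this container's CPU
        memory: 256Mi
```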
  27. To optimize its resources, Kubernetes needs to know the size of the apps: this is the purpose of requests. If k8s knows the size of an app, it can run it on the right server.
  28. Requests are reserved resources; the app can consume more or less. The requests represent the normal consumption of an app: they are its 100% mark under good conditions. However, the app must be able to handle more than 100%.
  29. So to be able to autoscale an app in k8s, you need these two elements.
  30. And it works very well! Here we see CPU usage alongside the number of pods: the curves evolve in the same way.
  31. More precisely, does the HPA take an average, the median of the pods, the usage of the nodes?
  32. It is an average: it compares the average consumption of the pods with its target, the 80% of requests.
  33. Let's go back to the previous football match to see the evolution. Here, at 4:30 p.m., so before the match and the peak load.
  34. At 4:30 p.m. we had 2 pods, each consuming CPU resources. They consumed 70% of the requests, which is below the HPA target, so there's no need to scale.
  35. The same example at 6:30 p.m.: we consume more and we're past the target, so there's a need to scale up.
  36. Resources are reserved through the requests. When the consumption of our application approaches the target, we scale. But not all peak loads are the same: how does the HPA adapt?
  37. We saw it during our football match: sometimes we scale by 1 pod, sometimes by 4 at a time. By how much should the HPA scale?
  38. The HPA follows a simple formula: it takes the current number of pods, looks at how much they consume compared to what we would like them to consume, and the result is the number of pods that would need to run to handle this load; see the formula below.
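This is the formula documented for the Kubernetes HorizontalPodAutoscaler:

```
desiredReplicas = ceil( currentReplicas × currentMetricValue / desiredMetricValue )
```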
  39. In our example from the football match at 6:30 p.m., we had 2 pods that started consuming a lot of resources. The result of the calculation is 3 pods, so it added 1.
  40. Same example at 8:47 p.m.: we used more resources, the HPA did its calculation, and the result was 14 pods. We had 10, so it added 4. The calculations are worked out below.
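Putting numbers on the three moments of the match: the 70% at 4:30 p.m. comes from slide 34, while the other utilization values are illustrative, chosen to reproduce the pod counts above:

```
4:30 p.m. —  2 pods at  70% of requests, target 80%:  ceil(2 × 70 / 80)   = 2   → no change
6:30 p.m. —  2 pods at ~120% (illustrative):          ceil(2 × 120 / 80)  = 3   → +1 pod
8:47 p.m. — 10 pods at ~110% (illustrative):          ceil(10 × 110 / 80) = 14  → +4 pods
```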
  41. That's how we scale our pods: Kubernetes gives us this autoscaling quite simply. We can scale until the cluster's resources are exhausted, and it is driven by YAML in the devs' projects, which they control.
  42. It's all CPU cores and so on, but how does a dev become autonomous with that?
  43. All the previous graphs come from our production Grafana, with real production metrics, accessible to the devs.
  44. Example of a dashboard that shows CPU/RAM utilization of a pod
  45. It's YAML: it's easy to change and it's taken into account right away. The previous values were in fact changed after the football match, because they were not optimal at all.
  46. Two last points before closing the subject.
  47. The cloud allows us to add more servers for a few hours. We use the cluster-autoscaler for this.
  48. The HPA is driven by metrics. By default we use CPU consumption, but the metric can be custom, as long as it is a metric that exists in Prometheus; a sketch follows.
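As a sketch of what this can look like, assuming a metrics adapter such as prometheus-adapter exposes a per-pod http_requests_per_second metric to the cluster; all names and values are illustrative, and the API group has changed across Kubernetes versions (autoscaling/v2beta1 at the time of this talk, autoscaling/v2 today):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: geo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: geo
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # served by the adapter, backed by Prometheus
      target:
        type: AverageValue
        averageValue: "100"              # scale to keep ~100 req/s per pod on average
```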
  49. You now know how application autoscaling works in Kubernetes, and you are able to do it on your own.
  50. Questions:

  Q: Is it relevant to put high values on the requests?
  A: The value of the requests is taken into account when triggering the HPA. If the app consumes a lot of resources, then yes. If it consumes little, autoscaling will be triggered late, or not at all (the app will crash first). In all cases, the application must hold the load beyond the value of the requests: it can consume more.

  Q: Is it relevant to have a very high HPA maximum?
  A: Yes, if the app can consume those resources under normal circumstances. On the other hand, an HPA maximum at 1000 times the app's maximum has little interest; it's more of a safeguard in case a bug ever makes you consume too much.

  Q: Are custom metrics defined at the requests level?
  A: No. Requests are CPU and RAM, notions defined at the level of the app's containers. Metrics, custom or not, are used to define the HPA target: they are therefore defined at the HPA level.

  Q: What is the price of a very high HPA maximum?
  A: None: the resources are not reserved until the pods are launched. So it costs nothing; it's just protection.

  Q: What is the waiting time to launch an additional node?
  A: It depends on the cloud provider. At AWS, for the moment, it's between 3 and 5 minutes. So it's not instantaneous, and it can be a problem during very sharp peak loads (we are looking at overprovisioning).

  Q: What is the waiting time to scale pods?
  A: A few seconds: we start new containers, which are created very quickly. We use Docker containers for the moment, but Kubernetes is not restricted to them.

  Q: Can we scale on a metric's history?
  A: Not really. We scale according to a metric's current values; the purpose of Kubernetes is to have an infrastructure that scales automatically with the current load, and predicting load is not one of its objectives. However, it can still be done, depending on the Prometheus query we make.