I explain how to use Requests and the Horizontal Pod Autoscaler (HPA) to autoscale an application, with a YAML example from our geolocation app, at https://tech.m6web.fr/

This talk was given at our Last Friday Talk, Oct. 18.

Questions & Answers:

Q1: Is it relevant to put high values on the requests?
The value of the requests is taken into account when triggering the HPA. If the app consumes a lot of resources, then yes. If it consumes little, autoscaling will be triggered late, or not at all (the app will crash first). In all cases, the application must be able to handle load beyond the value of its requests: it can consume more than it requests.

Q2: Is it relevant to have a very high max HPA?
Yes, if the app can consume these resources under normal circumstances. On the other hand, setting the max HPA to 1000 times what the application ever needs is of little use. It is more of a safeguard in case a bug makes the app consume too much.

Q3: Are Custom Metrics defined at the request level?
No. Requests concern CPU and RAM, and are defined at the level of the app's containers. Metrics, custom or not, are used to define the HPA target, so they are defined at the HPA level.

Q4: What is the price of putting a very high max HPA?
None: these requests are not reserved until the pods are launched. So it doesn't cost anything, it's just protection.

Q5: What is the waiting time to launch an additional node?
It depends on the cloud provider. On AWS, for the moment, it's between 3 and 5 minutes. So it's not instantaneous, and it can be a problem during very high load peaks (we are looking at overprovisioning).

Q6: What is the waiting time to scale pods?
A few seconds: we start new containers, which are created very quickly. We use Docker containers for the moment, but Kubernetes is not restricted to them.

Q7: Can we scale on a metric history?
Not really. We scale according to a metric, on its current values. The purpose of Kubernetes is to have an infrastructure that automatically scales according to the current load. Predicting load is not one of its objectives. However, it can still be done, depending on the Prometheus query we write.
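
To make the answers to Q1 and Q2 more concrete, here is a minimal Python sketch of the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1 around the target. The function names and all the numbers are hypothetical and chosen for illustration only; they are not taken from the talk or from the geolocation app. The real HPA also averages over pods, handles pods that are not ready and applies stabilization windows, all of which this sketch ignores.

```python
import math


def cpu_utilization(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization as a percentage of the container's request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     min_replicas: int,
                     max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA rule: scale by the ratio current/target, then clamp."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band around the target, no scaling happens.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # The min and max bounds of the HPA act as a floor and a safeguard.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical pod using 400m CPU, with an HPA target of 80% utilization.
low_request = cpu_utilization(usage_millicores=400, request_millicores=200)
high_request = cpu_utilization(usage_millicores=400, request_millicores=1000)

# Request of 200m: utilization is 200%, so the HPA scales out.
print(desired_replicas(2, low_request, 80, min_replicas=2, max_replicas=10))
# Request of 1000m: utilization is only 40%, so the HPA stays at the minimum.
print(desired_replicas(2, high_request, 80, min_replicas=2, max_replicas=10))
```

With the same real consumption, the pod with the low request is scaled from 2 to 5 replicas, while the pod with the high request never triggers a scale-out. This is the effect described in Q1. The `max_replicas` clamp is the safeguard discussed in Q2: it only matters when the computed value exceeds it.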