Kubernetes Introduction. The concepts you need to understand to effectively develop and run applications in a Kubernetes environment. It focuses primarily on application developers, but also provides an overview of managing applications from the operational perspective. It's meant for anyone interested in running and managing containerized applications on more than just a single server.
6. Containers are great, but…
• Isolation
• Immutability
• Efficient resource utilization
• Lightweight
• Portable
But…
• Dozens, even thousands of containers over time
• How to manage/deploy/connect/update them?
• Integrate and orchestrate these modular parts
• Provide communication across a cluster
• Make them fault tolerant
7. Kubernetes comes to help
• Desired state management
• Resilience
• Automated roll-out and roll-back
• Elasticity
• Cloud-agnostic
• Efficient resource management
• Abstracts the infrastructure layer
9. Pod
• The smallest and simplest unit in the Kubernetes object model.
• Containers in a pod share the network namespace and volumes.
• Pods are logical hosts and behave much like VMs.
10. Pod template
• 1 Descriptor conforms to version v1 of the Kubernetes API
• 2 You're describing a pod
• 3 The name of the pod
• 4 Container image to create the container from
• 5 Name of the container
• 6 The port the app is listening on
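The numbered callouts correspond to fields of a pod descriptor. A minimal sketch, with hypothetical pod and image names:

```yaml
apiVersion: v1                    # 1 Conforms to version v1 of the Kubernetes API
kind: Pod                         # 2 You're describing a pod
metadata:
  name: my-app                    # 3 The name of the pod (hypothetical)
spec:
  containers:
  - image: example/my-app:1.0     # 4 Container image to create the container from (hypothetical)
    name: my-app                  # 5 Name of the container
    ports:
    - containerPort: 8080         # 6 The port the app is listening on
```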
12. Liveness probes
Kubernetes can check if a container is still
alive through liveness probes. 3 mechanisms:
• HTTP GET probe.
• TCP socket probe.
• Exec probe.
13. ReplicaSet
A Kubernetes resource that ensures its pods are always kept running. Has 3 essential parts:
• A label selector, which determines what pods are in the ReplicaSet's scope
• A replica count, which specifies the desired number of pods that should be running
• A pod template, which is used when creating new pod replicas
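The three parts map directly onto the spec. A minimal sketch, with hypothetical names:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-app-rs           # hypothetical name
spec:
  replicas: 3               # replica count: desired number of running pods
  selector:
    matchLabels:
      app: my-app           # label selector: which pods are in the ReplicaSet's scope
  template:                 # pod template used when creating new replicas
    metadata:
      labels:
        app: my-app         # must match the selector above
    spec:
      containers:
      - name: my-app
        image: example/my-app:1.0   # hypothetical image
```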
15. DaemonSet
• A DaemonSet makes sure it creates as many pods as there are nodes and deploys each one on its own node.
• Examples:
• a log collector
• a resource monitor
• kube-proxy
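Note that a DaemonSet has no replica count, since one pod per node is implied. A minimal sketch for a hypothetical log collector:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector       # hypothetical name
spec:
  selector:
    matchLabels:
      app: log-collector
  template:                 # one pod from this template lands on every node
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: log-collector
        image: example/log-collector:1.0   # hypothetical image
```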
17. Job
• Perform a single completable
task.
• Useful for ad hoc tasks,
where it’s crucial that the
task finishes properly.
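A Job's pod template differs from a long-running workload's mainly in its restart policy, since the task is supposed to terminate. A minimal sketch, with hypothetical names:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-task          # hypothetical name
spec:
  template:
    spec:
      restartPolicy: OnFailure       # re-run the container on failure until it completes
      containers:
      - name: batch-task
        image: example/batch-task:1.0   # hypothetical image that runs to completion
```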
18. Service
• Pods are ephemeral.
• Horizontal scaling means multiple pods may provide the same service.
• A Service is an abstraction which defines a logical set of Pods.
• The set of Pods targeted by a Service is usually determined by a selector.
• Each service has an IP address and port that never change while the service exists.
19. Service
3 types:
• ClusterIP: used for communication inside the cluster, discoverable through DNS. Example: mdt-detector.redaction
• NodePort: each cluster node opens a port on the node itself (hence the name) and redirects traffic received on that port to the underlying service
• LoadBalancer: an extension of the NodePort type
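A minimal Service sketch, with hypothetical names; switching `type` is all it takes to move between the three variants:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app              # hypothetical; reachable in-cluster as my-app.<namespace> via DNS
spec:
  type: ClusterIP           # or NodePort / LoadBalancer for access from outside
  selector:
    app: my-app             # targets all pods carrying this label
  ports:
  - port: 80                # the service's stable port
    targetPort: 8080        # the port the backing pods listen on
```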
24. Deployment
• A higher-level resource meant for deploying applications and updating them declaratively.
• When you create a Deployment, a ReplicaSet resource is created underneath.
• In a Deployment, the actual pods are created and managed by the Deployment's ReplicaSets.
• Deployments provide the capability to upgrade without downtime.
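A minimal Deployment sketch, with hypothetical names; the pod template is handed down to the underlying ReplicaSet:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app              # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate     # replaces pods gradually, enabling upgrades without downtime
  selector:
    matchLabels:
      app: my-app
  template:                 # handed to the ReplicaSet created underneath
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: example/my-app:1.0   # changing this tag triggers a rolling upgrade
```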
27. StatefulSet
A StatefulSet makes sure pods are rescheduled in such a way that they retain their identity and state.
You can reach the pod through its fully qualified domain name, which is a-0.foo.default.svc.cluster.local.
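That FQDN follows the pattern <pod>.<service>.<namespace>.svc.cluster.local, so it implies a StatefulSet named `a` governed by a headless Service named `foo` in the `default` namespace. A minimal sketch of that pairing (container image is hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: foo
spec:
  clusterIP: None           # headless service: gives each pod its own DNS record
  selector:
    app: a
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: a                   # pods are named a-0, a-1, …
spec:
  serviceName: foo          # the governing headless service
  replicas: 1
  selector:
    matchLabels:
      app: a
  template:
    metadata:
      labels:
        app: a
    spec:
      containers:
      - name: app
        image: example/app:1.0   # hypothetical image
```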
33. Basic steps
• Dockerize.
• Write deployment/service.
• Define ConfigMap or environment variables.
• Resource usage.
• Liveness/readiness probes.
• Helm chart.
• How to structure the application into pods? Multiple containers in one pod, or multiple pods?
• How to integrate with other services?
• Does it need to communicate with the outside?
• Does it need to be stateful?
• How to integrate with Ecom?
34. Helm
• Helm helps you manage Kubernetes
applications — Helm Charts help you
define, install, and upgrade even the
most complex Kubernetes application.
• Charts are easy to create, version,
share, and publish — so start using
Helm and stop the copy-and-paste.
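A chart is a versioned, shareable bundle of templates plus metadata. A minimal Chart.yaml sketch, with a hypothetical chart name:

```yaml
# Chart.yaml — the chart's metadata file
apiVersion: v2
name: my-app                # hypothetical chart name
description: A Helm chart for my-app
version: 0.1.0              # chart version, bumped on every publish
appVersion: "1.0"           # version of the application being packaged
```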
39. API server
API server is the central component used by all other components and by clients, such
as kubectl. It provides a CRUD (Create, Read, Update, Delete) interface for querying
and modifying the cluster state over a RESTful API. It stores that state in etcd.
41. Scheduler
• Filtering the list of all nodes to obtain a list of
acceptable nodes the pod can be scheduled to
• Prioritizing the acceptable nodes and choosing the best
one. If multiple nodes have the highest score, round-
robin is used to ensure pods are deployed across all of
them evenly.
42. Controller Manager
• ReplicaSet, DaemonSet, and
Job controllers.
• Deployment controller.
• StatefulSet controller.
• Node controller.
• Service controller.
• Others
Controllers do many different things, but they all watch the
API server for changes to resources (Deployments,
Services, and so on) and perform operations for each
change, whether it’s a creation of a new object or an
update or deletion of an existing object.
44. kube-proxy
kube-proxy makes sure connections to the service IP and port end up at one of the pods backing that service.
Unlike the older userspace proxy mode, the iptables proxy mode doesn't pick pods round-robin; it selects them randomly. When only a few clients use a service, they may not be spread evenly across pods. For example, if a service has two backing pods but only five or so clients, don't be surprised if you see four clients connect to pod A and only one client connect to pod B. With a higher number of clients or pods, this problem isn't so apparent.
Developers are lazy and somewhere in the mid-late 80s they started abbreviating the words based on their first letter, last letter, and number of letters in between. This is why you’ll sometimes see i18n for internationalization and l10n for localization. There are also new numeronyms such as Andreessen Horowitz (a16z) and of course our favorite kubernetes (k8s).
i18n = internationalization
l10n = localization
k8s = Kubernetes
o11y = observability
Containers are great. They provide you with an easy way to package and deploy services, allow for process isolation, immutability, efficient resource utilization, and are lightweight in creation.
A container has its own filesystem, CPU, memory, process space, and more. As they are decoupled from the underlying infrastructure, they are portable across clouds and OS distributions.
Containers are only a low-level piece of the puzzle. The real benefits are obtained with tools that sit on top of containers, like Kubernetes. These tools are today known as container schedulers.
The basic idea of Kubernetes is to further abstract machines, storage, and networks away from their physical implementation. So it is a single interface to deploy containers to all kinds of clouds, virtual machines, and physical machines.
All pods in a Kubernetes cluster reside in a single flat, shared, network-address space which means every pod can access every other pod at the other pod’s IP address. No NAT (Network Address Translation) gateways exist between them. When two pods send network packets between each other, they’ll each see the actual IP address of the other as the source IP in the packet.
Pods are logical hosts and behave much like physical hosts or VMs in the non-container world. Processes running in the same pod are like processes running on the same physical or virtual machine, except that each process is encapsulated in a container.
Kubernetes can check if a container is still alive through liveness probes
Kubernetes can probe a container using one of the three mechanisms:
An HTTP GET probe performs an HTTP GET request on the container’s IP address, a port and path you specify. If the probe receives a response, and the response code doesn’t represent an error (in other words, if the HTTP response code is 2xx or 3xx), the probe is considered successful. If the server returns an error response code or if it doesn’t respond at all, the probe is considered a failure and the container will be restarted as a result.
A TCP Socket probe tries to open a TCP connection to the specified port of the container. If the connection is established successfully, the probe is successful. Otherwise, the container is restarted.
An Exec probe executes an arbitrary command inside the container and checks the command’s exit status code. If the status code is 0, the probe is successful. All other codes are considered failures.
By default, the container is probed every 10 seconds, and it is restarted after the probe fails three consecutive times (failureThreshold: 3).
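A minimal sketch of an HTTP GET liveness probe in a pod spec, with hypothetical names; `tcpSocket` and `exec` probes are declared in the same place:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probed-app          # hypothetical name
spec:
  containers:
  - name: probed-app
    image: example/probed-app:1.0   # hypothetical image
    livenessProbe:
      httpGet:              # HTTP GET probe against the container's IP
        path: /healthz      # hypothetical health endpoint
        port: 8080
      periodSeconds: 10     # the default probe interval
      failureThreshold: 3   # restart after three consecutive failures
```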
If the pod disappears for any reason, such as in the event of a node disappearing from the cluster or because the pod was evicted from the node, the ReplicaSet notices the missing pod and creates a replacement pod.
Service discovery and load balancing may be managed by a Service object. Services provide a single virtual IP address and DNS name, load balanced across a collection of Pods matching its labels.
Service is a resource you create to make a single, constant point of entry to a group of pods providing the same service.
Each service has an IP address and port that never change while the service exists.
Clients can open connections to that IP and port, and those connections are then routed to one of the pods backing that service.
This way, clients of a service don’t need to know the location of individual pods providing the service, allowing those pods to be moved around the cluster at any time.
Ingress (noun)—The act of going in or entering; the right to enter; a means or place of entering; entryway.
The controller determines which service the client is trying to access, looks up the pod IPs through the Endpoints object associated with the service, and forwards the client's request to one of the pods.
The Ingress controller doesn't forward the request to the service; it only uses the service to select a pod. Most, if not all, controllers work like this.
An Ingress controller is itself an application running in Kubernetes, with its own Deployment and Service.
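A minimal Ingress sketch, with hypothetical host and service names; the controller uses the referenced service only to find the backing pods:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app              # hypothetical name
spec:
  rules:
  - host: my-app.example.com   # hypothetical host; requests for it match this rule
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app    # controller looks up this service's Endpoints
            port:
              number: 80
```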
a volume is created when the pod is started and is destroyed when the pod is deleted. Because of this, a volume’s contents will persist across container restarts. After a container is restarted, the new container can see all the files that were written to the volume by the previous container. Also, if a pod contains multiple containers, the volume can be used by all of them at once.
volumes are a component of a pod and are thus defined in the pod’s specification—much like containers. They aren’t a standalone Kubernetes object and cannot be created or deleted on their own. A volume is available to all containers in the pod, but it must be mounted in each container that needs to access it. In each container, you can mount the volume in any location of its filesystem.
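A minimal sketch of a pod sharing an emptyDir volume between two containers, with hypothetical names; note the volume is declared once at the pod level but mounted separately in each container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod   # hypothetical name
spec:
  volumes:
  - name: shared            # defined once in the pod's specification
    emptyDir: {}            # created when the pod starts, destroyed when it's deleted
  containers:
  - name: writer
    image: example/writer:1.0   # hypothetical image
    volumeMounts:
    - name: shared
      mountPath: /data      # each container chooses its own mount location
  - name: reader
    image: example/reader:1.0   # hypothetical image
    volumeMounts:
    - name: shared
      mountPath: /input
```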
Components of the Control Plane
The Control Plane is what controls and makes the whole cluster function. To refresh your memory, the components that make up the Control Plane are
The etcd distributed persistent storage
The API server
The Scheduler
The Controller Manager
These components store and manage the state of the cluster, but they aren’t what runs the application containers.
Components running on the worker nodes
The task of running your containers is up to the components running on each worker node:
The Kubelet
The Kubernetes Service Proxy (kube-proxy)
The Container Runtime (Docker, rkt, or others)
Add-on components
Beside the Control Plane components and the components running on the nodes, a few add-on components are required for the cluster to provide everything discussed so far. This includes
The Kubernetes DNS server
The Dashboard
An Ingress controller
The API server doesn’t do anything else except what we’ve discussed. For example, it doesn’t create pods when you create a ReplicaSet resource and it doesn’t manage the endpoints of a service. That’s what controllers in the Controller Manager do.
But the API server doesn’t even tell these controllers what to do. All it does is enable those controllers and other components to observe changes to deployed resources. A Control Plane component can request to be notified when a resource is created, modified, or deleted. This enables the component to perform whatever task it needs in response to a change of the cluster metadata.
Clients watch for changes by opening an HTTP connection to the API server. Through this connection, the client will then receive a stream of modifications to the watched objects. Every time an object is updated, the server sends the new version of the object to all connected clients watching the object. Figure 11.4 shows how clients can watch for changes to pods and how a change to one of the pods is stored into etcd and then relayed to all clients watching pods at that moment.
• Can the node fulfill the pod's requests for hardware resources?
• Is the node running out of resources (is it reporting a memory or a disk pressure condition)?
• If the pod requests to be scheduled to a specific node (by name), is this the node?
• Does the node have a label that matches the node selector in the pod specification (if one is defined)?
• If the pod requests to be bound to a specific host port, is that port already taken on this node or not?
• If the pod requests a certain type of volume, can this volume be mounted for this pod on this node, or is another pod on the node already using the same volume?
• Does the pod tolerate the taints of the node?
• Does the pod specify node and/or pod affinity or anti-affinity rules? If yes, would scheduling the pod to this node break those rules?
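Several of these checks are driven by fields in the pod spec itself. A sketch showing the relevant fields, with hypothetical label and taint values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod          # hypothetical name
spec:
  nodeSelector:
    disktype: ssd           # only nodes labeled disktype=ssd pass the filter
  tolerations:
  - key: dedicated          # hypothetical taint key
    operator: Equal
    value: gpu
    effect: NoSchedule      # tolerates nodes tainted dedicated=gpu:NoSchedule
  containers:
  - name: app
    image: example/app:1.0  # hypothetical image
    resources:
      requests:
        cpu: 500m           # the node must have this much CPU unreserved
        memory: 256Mi
```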
HPA continuously checks the metric values you configure during setup, at a default interval of 30 seconds.
HPA attempts to increase the number of pods if the specified threshold is met.
HPA mainly updates the number of replicas inside the Deployment or replication controller.
The Deployment/replication controller would then roll out any additional needed pods.
Consider these as you roll out HPA:
The default HPA check interval is 30 seconds. This can be configured through the --horizontal-pod-autoscaler-sync-period flag of the controller manager.
The default HPA relative metrics tolerance is 10%.
HPA waits for 3 minutes after the last scale-up event to allow metrics to stabilize. This can also be configured through the --horizontal-pod-autoscaler-upscale-delay flag.
HPA waits for 5 minutes from the last scale-down event to avoid autoscaler thrashing. Configurable through the --horizontal-pod-autoscaler-downscale-delay flag.
HPA works best with Deployment objects as opposed to replication controllers. It does not work with rolling updates that directly manipulate replication controllers. It depends on the Deployment object to manage the size of the underlying ReplicaSets when you do a deployment.
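A minimal HPA sketch targeting a Deployment, with hypothetical names and a CPU threshold chosen for illustration:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa          # hypothetical name
spec:
  scaleTargetRef:           # HPA adjusts the replica count on this Deployment
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale up when average CPU passes 70%
```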
The CA checks for pods in pending state at a default interval of 10 seconds.
If one or more pods are in the pending state because there are not enough available resources on the cluster to accommodate them, the CA attempts to provision one or more additional nodes.
When the node is granted by the cloud provider, the node is joined to the cluster and becomes ready to serve pods.
Kubernetes scheduler allocates the pending pods to the new node. If some pods are still in pending state, the process is repeated and more nodes are added to the cluster.