To effectively manage multiple AKS, EKS, or GKE clusters in the public cloud, along with the multiple users or teams who need cluster access, you need a solid multi-tenant cluster management strategy.
To help you get started on the right track, this cheat sheet was created to drive multi-tenancy success. In it, you’ll learn how to deliver governance and standardization across your AKS, EKS, or GKE clusters.
In 2019, Gartner predicted that by 2022 more than 75% of global organizations
would be running containerized applications in production, up from less than
30% at that time. The CNCF 2019 survey, published roughly one year later in
2020, reports that more organizations than that are already running containers
in production.
Kubernetes enables the mobility of applications (from the data center, to
the cloud and out to the edge), improves the speed at which services can be
delivered, and improves the quality, uptime, and elastic scalability of those
services to meet the rapidly changing demands of consumers. As seen by its
growth, Kubernetes is a foundational technology that has accelerated digital
transformation.
Kubernetes adoption in an organization requires a multi-faceted approach.
First is designing to meet stakeholder requirements, which include availability,
agility, security and integration goals; second, organizing and enabling the
team for success in Day 2 operations; and third, building a culture that allows
for modern application development and deployment practices such as
GitOps.
Kubernetes is not just about orchestrating containers: platform teams must
architect, integrate, test, and manage both the core Kubernetes software
and several other projects needed for Day 2 operations. There are over
100 different certified Kubernetes distributions in the market, making it
challenging to determine what approach to take. Beyond getting started with
a single pre-production cluster, most organizations will need some form of
management and centralized governance, as even the smallest teams will have
more than one cluster.
Kubernetes is one of the fastest growing
open source projects in history.
This cheat sheet offers:
• Guidance on what to consider when planning your Kubernetes deployment
• Some commonly used Kubernetes commands to get started
Executive Summary
CHEATSHEET: KUBERNETES FOR OPERATIONS | 2
D2iQ
Table of Contents
Kubernetes for Operations
What is Kubernetes?
4 Main Approaches to Production Kubernetes
Kubernetes Platform Design Considerations
Understanding Kubernetes Components
Kubernetes Commands Basics
Kubernetes Success
What is Kubernetes?
[Figure: general cluster information panel showing Kubernetes status Active, version v1.17.8]
Governed by the Cloud Native Computing Foundation
(CNCF), Kubernetes is just one of several
open source projects needed for a functional and
operational container management platform.
Therefore, during the adoption of Kubernetes,
platform operators must decide on an approach to
delivering this capability to their organization.
By definition:
Kubernetes (K8s) is an open-source
system for automating deployment,
scaling, and management of
containerized applications.
4 Main Approaches to
Production Kubernetes:
DIY Open Source
Enterprises that take a Do It Yourself
(DIY) Open Source approach choose
to build their own Kubernetes
platform with custom automation,
testing, and integration of projects
from the CNCF community. This option
may or may not include a vendor to
provide support when issues arise.
Public Cloud
Public cloud providers offer
Kubernetes as a service in their
cloud. These services are typically
specific to the provider (for example,
Amazon’s EKS) and are generally
a simple, bare-bones Kubernetes
deployment. Support and some
basic lifecycle management of
the service are often handled by the
cloud provider. Some additional
integration work is often required of
the consumer to enable monitoring
or other capabilities needed for
ongoing production operations.
Software
Software solutions generally provide
a complete tested and certified
distribution of Kubernetes, integrated
add-on (or addon) components,
and automation for management of
the platform. These solutions are
generally designed to simplify the
deployment and management of
Kubernetes on-premises, in the cloud,
or at the edge. Software solutions
are backed by vendors for support
and managed by the customer. They
are increasingly focused on providing
features for centralized management
of multiple clusters.
Managed Service
In this case, a service provider
takes responsibility for the
management of Kubernetes either
in the customer environment
(on-premises or cloud) or in their own
cloud (SaaS). Managed service providers
lower the operational burden of
Kubernetes by managing it on
behalf of the customer. Managed
service offerings can be a black box,
without offering much in the way
of control over the configuration or
implementation of Kubernetes.
There are a few key decision criteria when
considering which approach to take:
01 Degree of lock-in:
Is the approach based on upstream, open
source Kubernetes and other Cloud Native Computing
Foundation (CNCF) open source components, or
limited by proprietary solutions?
02 Degree of portability:
Can you consistently run on-premises, in the cloud,
at the edge, on VMs, on bare metal, and in an air-gapped
environment or DMZ?
03 Degree of control required:
Is it flexible and customizable, standardized, or rigid;
integrated with upstream open source projects, open
APIs, or tied to proprietary solutions?
04 Degree of operational burden:
Does it require custom automation, provide
automation, or need no automation (because it is fully
managed)? Is it owned and operated by the customer, by
the service provider, or somewhere in between? How
complete is the solution, and how much additional
integration work is required to have a production-ready
system?
05 Cost:
Kubernetes, while open source, is not free. The
solution should be cost-effective and not tie you in
by being bundled with other solutions, software,
hardware, cloud resources, or years of professional
services. What is the total cost of ownership?
Planning for Kubernetes is a step that
can often be missed, as many Kubernetes
projects start with experimentation.
There are several design considerations when planning
for Kubernetes in production. The criteria depend
heavily on the approach taken, as discussed in the
previous section. Some level of planning for the
following capabilities will be required regardless of the
approach. The diagram below describes many of the
topics that will need to be considered from a technical
perspective. Planning for a production system will also
need to address the people and process aspects, which
include proper training and support.
Diagram 1
Storage Integration
Some applications that run on the
Kubernetes platform may require
backing storage. Containers, by default,
are ephemeral: if they are terminated,
any data (or state) is lost. Containers
may be attached to backing storage (of
many types) to maintain state or store
data. The Kubernetes platform should
be able to leverage storage plug-ins to
enable interoperability with different
storage providers, classes, or types.
The Container Storage Interface (CSI)
defines a standard specification for
how the platform interacts with storage
systems. The Kubernetes platform
should be able to use storage systems
that comply with the CSI standard.
Backups and Disaster Recovery
To restore a production system in the
event of catastrophic failure, backups
and a disaster recovery strategy are a
necessity. There are a number of ways
that operators approach this problem,
which may be dictated by the types
of workloads running on the cluster,
business criticality, and a number of
other factors. Some environments may
be rebuilt completely from scratch
with the “click of a button,” using an
infrastructure as code (IaC) approach
coupled with GitOps for application
deployment. In those cases, all of
the configuration for the cluster is
in code and easily reproducible. But
in many cases, that approach may
not be feasible. Complexities exist
for applications that store state
(such as databases), and dependencies
may exist due to the type of
infrastructure (edge, cloud, or
on-premises).
Observability
Observability refers to a set of
capabilities that help with logging,
monitoring, alerting, and tracing
against defined criteria for the
cluster, resources, and deployed
applications. There are three key
aspects to observability that
need to be addressed: (1) logging:
aggregating all cluster logs, inclusive
of the infrastructure, system,
and applications; (2) monitoring
and alerting: aggregating metrics
(measures and counters), inclusive
of the infrastructure, system, and
applications, with threshold-based
notifications; and (3) tracing, which
enables the instrumentation needed
for root cause analysis of issues
where multiple different components
interact. Tracing is primarily used
to debug distributed applications
running on Kubernetes, since most
cloud-native applications follow
a loosely coupled, microservices
architecture approach.
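The "threshold-based notifications" part of monitoring and alerting can be illustrated with a few lines of Python. This is only a sketch of the idea: the metric names, threshold values, and the function name `firing_alerts` are hypothetical, and a production cluster would normally delegate this evaluation to a monitoring system rather than hand-written code.

```python
def firing_alerts(samples, thresholds):
    """Return the sorted names of metrics whose latest value meets or exceeds its threshold."""
    return sorted(
        name
        for name, value in samples.items()
        if name in thresholds and value >= thresholds[name]
    )


# Hypothetical latest samples aggregated from the infrastructure and applications.
samples = {"node_cpu_percent": 91.0, "node_memory_percent": 62.5, "pod_restarts": 0}
# Hypothetical alerting thresholds; metrics without a threshold never fire.
thresholds = {"node_cpu_percent": 85.0, "node_memory_percent": 90.0}

alerts = firing_alerts(samples, thresholds)
```

Here only the CPU metric crosses its threshold, so it is the only alert that would be sent as a notification.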
Security
Security is a broad topic and requires
careful consideration. Security
starts with ensuring that the
Kubernetes cluster itself is
configured according to best practices
and free of vulnerabilities. Images that
are deployed to the cluster need
to be scanned for CVEs and bad
practices. Users on the system
need to be authenticated and have
specific access rights associated
with their role or responsibility.
Appropriate observability must be in
place to enable audit trails, change
management, and more. Fortunately,
there is an entire category of open
source and commercially available
products to help organizations
address security considerations for
production operations.
Pure Kubernetes Foundation
As depicted in Diagram 1, Kubernetes alone is only the foundation of the platform required for a production deployment.
Minimally, any Kubernetes cluster will require backing storage to maintain state (in this case, an etcd database), as well as a choice
of container runtime environment (in this case, containerd). It is also important that the distribution of Kubernetes be from the
pure upstream source. The CNCF has a conformance (or certification) process to identify whether a distribution meets the upstream
requirements.
With a standard, open source, and upstream base, a number of additional areas will need to be addressed for an operations-ready
solution. Some of the areas that will need to be considered are:
• Storage Integration
• Backups and Disaster Recovery
• Observability
• Security
Access Management
Access management addresses
the needs of user authentication
and authorization. It is critical to
ensure that access to production
systems is governed and that the
level of access granted is aligned
with the role or responsibility of the
user of the platform. Enterprises
may integrate Kubernetes with an
identity provider (such as LDAP or
Active Directory) to align with other
access management processes.
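To make "level of access aligned with the role" concrete, the authorization half of access management can be sketched in Python. The sketch below assumes a simplified model loosely based on Kubernetes role-based access control, in which rules only grant access and never deny it. The role names, user names, and the `is_allowed` helper are hypothetical, and authentication against an identity provider is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    verbs: tuple      # actions the rule grants, "*" meaning any verb
    resources: tuple  # resource types the rule covers, "*" meaning any


# Hypothetical roles: what each role is allowed to do.
ROLES = {
    "viewer": [Rule(("get", "list", "watch"), ("pods", "services"))],
    "deployer": [Rule(("*",), ("deployments",))],
}
# Hypothetical bindings: which roles each authenticated user holds.
BINDINGS = {"alice": ["viewer"], "bob": ["viewer", "deployer"]}


def is_allowed(user, verb, resource):
    """Allow the request if any rule of any role bound to the user grants it."""
    for role in BINDINGS.get(user, []):
        for rule in ROLES[role]:
            verb_ok = "*" in rule.verbs or verb in rule.verbs
            resource_ok = "*" in rule.resources or resource in rule.resources
            if verb_ok and resource_ok:
                return True
    return False  # nothing granted it, so the request is denied by default
```

With these bindings, `is_allowed("alice", "list", "pods")` is granted, while `is_allowed("alice", "delete", "pods")` is denied because no rule bound to alice includes the delete verb.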
Service Mesh
A service mesh can provide advanced
networking capabilities, such as
ingress to applications, enhanced
security, circuit breaking, routing,
tracing, and more. These capabilities
can introduce some additional
overhead in terms of resource
utilization as well as some additional
operational complexity, requiring
planning prior to deployment.
Networking
Production Kubernetes clusters must
meet several different networking
requirements. Like storage, Kubernetes
networking leverages an open
standard (CNI, or Container Network
Interface) for interoperability with
networking providers. CNI providers
enable a variety of networking
capabilities, such as a network overlay,
IP address management, routing, and
much more. The Kubernetes platform
should be able to use software-defined
networking capabilities that comply
with the CNI standard. The platform
may require an ingress controller to
route traffic from outside the cluster to
running applications, a load balancing
component, or integrated DNS services.
Summary
The CNCF ecosystem provides a wide
variety of choices when it comes to
addressing the many requirements and
capabilities of a production
Kubernetes platform; however, navigating
the ecosystem is not easy. Identifying a
partner that can help an organization
navigate the complexities, or leveraging
a commercial product, can help remove
many of the burdens that would come
with assembling a production-ready
platform alone.
Diagram 2: CNCF Cloud Native Landscape https://landscape.cncf.io/
[Diagram categories shown include application monitoring, debugging, training, storage, Kubernetes installer, container runtime, container platform, resource manager, service discovery, and mesh networking]
Beyond the basics
With the basic needs of the production
platform addressed, the next set of
requirements addresses the ongoing needs
of testing and integration.
These additional requirements need to address integration
with desired and standard operating systems, hardware types,
virtualization platforms, and cloud infrastructure. This will
generally result in a set of capabilities that focus on automation
and leverage the open standards defined by the
ecosystem. These sets of requirements are identified in the
outer ring of the image in Diagram 1.
Like the integration of a production platform, testing can be
operationally expensive considering the scope of testing needs.
Offloading testing to a Kubernetes provider can significantly
reduce the total cost of ownership for managing, operating, or
owning a Kubernetes platform.
Standard Components of Kubernetes
These are the minimum components required for a Kubernetes cluster:
Master Nodes:
API SERVER
• Handles all requests coming in to the cluster
• Processes requests and updates etcd
• Performs authentication/authorization
• More: https://bit.ly/3lJxkuI
ETCD SERVER
• Consistent, highly available key-value store
• Persistent data storage for all Kubernetes objects
• More: https://bit.ly/3tU923M
SCHEDULER
• Decides where pods should run based on multiple factors,
such as affinity, available resources, labels, and QoS
• More: https://bit.ly/39dAW2Z
CONTROLLER MANAGER
• Daemon process that implements the control loops built
into Kubernetes, e.g. rolling deployments
• More: https://bit.ly/39ekQ9r
Worker Nodes:
KUBELET (AGENT ON EVERY WORKER)
• Instantiates pods (groups of one or more containers) using
the PodSpec and ensures all pods are running and healthy
• More: https://bit.ly/3chDzmk
KUBE PROXY (AGENT ON EVERY WORKER)
• Network proxy and load balancer for Kubernetes Services
• More: https://bit.ly/3lOzEjS
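The scheduler's job of deciding where pods should run can be pictured as a filter step followed by a scoring step. The Python sketch below is a deliberately simplified model, not the real scheduler: it filters on free resources, node labels, and taints (matched here by exact string equality), then prefers the node with the most free CPU. All class names, field names, node names, and the scoring rule are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int
    labels: dict = field(default_factory=dict)
    taints: frozenset = frozenset()  # e.g. {"key=value:NoSchedule"}


@dataclass
class Pod:
    name: str
    cpu_millicores: int
    memory_mib: int
    node_selector: dict = field(default_factory=dict)
    tolerations: frozenset = frozenset()


def is_feasible(pod, node):
    """Filter step: can this node run the pod at all?"""
    has_resources = (
        pod.cpu_millicores <= node.free_cpu_millicores
        and pod.memory_mib <= node.free_memory_mib
    )
    labels_match = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    taints_tolerated = node.taints <= pod.tolerations  # every taint must be tolerated
    return has_resources and labels_match and taints_tolerated


def pick_node(pod, nodes):
    """Score step: among feasible nodes, prefer the one with the most free CPU."""
    feasible = [n for n in nodes if is_feasible(pod, n)]
    if not feasible:
        return None  # no feasible node: the pod would remain unscheduled
    return max(feasible, key=lambda n: n.free_cpu_millicores).name


nodes = [
    Node("node-a", 500, 1024),
    Node("node-b", 4000, 8192, labels={"GPU": "true"},
         taints=frozenset({"key=value:NoSchedule"})),
    Node("node-c", 1000, 2048),
]
web = Pod("web", 750, 512)
gpu_job = Pod("gpu-job", 750, 512, node_selector={"GPU": "true"},
              tolerations=frozenset({"key=value:NoSchedule"}))
```

In this model `pick_node(web, nodes)` lands on node-c, because node-a is too small and node-b is tainted, while `pick_node(gpu_job, nodes)` lands on node-b because the pod both selects the GPU label and tolerates the taint.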
Standard Add-ons for Kubernetes
These are other core components that are required for the
majority of production scenarios:
Kube-DNS (CoreDNS)
• Provisioned as a pod and a service on Kubernetes
• Every service gets a DNS entry in Kubernetes
• Kube-DNS provides DNS-based service discovery in the cluster
Metrics Server
• Provides an API for cluster-wide usage metrics like CPU and
memory utilization
• Feeds the usage graphs in the Kubernetes Dashboard (GUI);
see the Dashboard image under the “Kubernetes Basic Concepts” section
Kubectl
• Official command line for Kubernetes
• Industry standard Kubernetes commands start with “kubectl”
Web UI (Dashboard)
• Official GUI of Kubernetes; industry standard GUI for
Kubernetes clusters
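The statement that every service gets a DNS entry can be made concrete by showing how such names are usually formed. The sketch below assumes the common `<service>.<namespace>.svc.<cluster-domain>` naming scheme and the default cluster domain `cluster.local`, which a cluster may override. The function names and the example service and namespace names are hypothetical.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified DNS name conventionally assigned to a service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def expand_short_name(name, pod_namespace, cluster_domain="cluster.local"):
    """Expand 'service' or 'service.namespace' the way a pod in pod_namespace would."""
    service, _, namespace = name.partition(".")
    # A bare service name is looked up in the pod's own namespace.
    return service_fqdn(service, namespace or pod_namespace, cluster_domain)
```

For example, a pod in a namespace called `web` that looks up `foo` would reach `foo.web.svc.cluster.local`, while looking up `foo.data` would reach the service `foo` in the `data` namespace.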
Required for a Kubernetes Platform
It is often said that Kubernetes alone is not
enough for a production system. These are some
of the auxiliary ecosystem components required
for any production Kubernetes platform.
Infrastructure
• Kubernetes can be installed on bare metal, public cloud
instances, or virtual machines
Ingress Controller
• HTTP traffic access control for Kubernetes services
• Interacts with the Kubernetes API for state changes
• Applies ingress rules to the service load balancer
Declarative and automated installer
• Deploys on any infrastructure
• Supports a repeatable and standardized deployment methodology
Private Container Registry
• Registry for an organization’s standard container images
• Requires access credentials (from IDM or secrets located in
the Kubernetes pod)
Monitoring
• Metrics collected on Kubernetes infrastructure and hosted objects
• Typical options: Prometheus, Sysdig, Datadog
Secrets Management
• Holds sensitive information such as passwords, OAuth
tokens, and SSH keys required for services, developers,
and operations
Container Runtime
• Manages the container lifecycle
• Typical options: containerd, runC
Network Plugin
• Network overlay for policy and software-defined networking
• Network overlays use the Container Network Interface (CNI)
standard that works with all Kubernetes clusters
Logging and Auditing
• Centralized logging for Kubernetes
• Typical options: Elastic, Splunk, Sumo Logic, Loki
Load Balancing
• Software load balancing for each Kubernetes service
Kubernetes Basic Concepts
These are some foundational Kubernetes concepts:
Image via the Kubernetes Dashboard GitHub:
https://github.com/kubernetes/dashboard
Namespaces: Virtual segmentation of a single cluster
ReplicaSet: Continuous loop that ensures a given number of pods are running
Nodes: Infrastructure fabric of Kubernetes (hosts of worker and master components)
Ingresses: Manage external HTTP traffic to hosted services
Services: A logical layer that provides IP/DNS/etc. persistence to dynamic pods
Pods: A logical grouping of one or more containers that is managed by Kubernetes
Roles: Role-based access controls for the Kubernetes cluster
Deployments: Manage a ReplicaSet, pod definitions/updates, and other concepts
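The "continuous loop" behind a ReplicaSet compares the desired number of pods with what is actually running and acts on the difference. A single pass of such a loop might look like the Python sketch below. It is an illustrative model only: the `reconcile` function, the sequential pod names, and the choice to delete the most recently listed pod first are all assumptions, not the behavior of the real controller.

```python
import itertools


def reconcile(desired_count, running_pods, name_prefix="foo"):
    """One pass of a control loop: return (pods after the pass, actions taken)."""
    pods = list(running_pods)
    actions = []
    counter = itertools.count(len(pods))
    # Too few pods: create new ones until the desired count is reached.
    while len(pods) < desired_count:
        name = f"{name_prefix}-{next(counter)}"
        if name in pods:
            continue  # skip names already in use
        pods.append(name)
        actions.append(("create", name))
    # Too many pods: delete surplus ones, most recently listed first.
    while len(pods) > desired_count:
        actions.append(("delete", pods.pop()))
    return pods, actions
```

Calling `reconcile(3, ["foo-0"])` creates two pods, and calling it again on the result takes no action, which is the property that lets the loop run continuously without side effects once the desired state is reached.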
Below are some useful
commands for IT professionals
getting started with Kubernetes.
COMMANDS
A full list of kubectl commands can be found in the reference documentation:
https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands
kubectl [command] [TYPE] [NAME] [flags]
Kubectl Command / Description
$ kubectl version
    Print the client and server version information
$ kubectl cluster-info
    Display the addresses of the master and cluster services
$ kubectl get namespaces
    List all the namespaces in the Kubernetes cluster
$ kubectl cordon NODE
    Mark the node as unschedulable; used for maintenance of the cluster
$ kubectl delete pod foo
    Delete the pod named foo from the cluster
$ kubectl delete svc foo
    Delete the service called foo from the cluster
$ kubectl create service clusterip foo
    Create a ClusterIP service named foo (a port mapping such as
    --tcp=PORT:TARGETPORT is normally supplied as well)
$ kubectl autoscale deployment foo --min=2 --max=10 --cpu-percent=70
    Autoscale the deployment foo between a minimum of 2 and a maximum of
    10 replicas, targeting an average CPU utilization of 70%
$ kubectl drain NODE
    Remove pods from the node via graceful termination for maintenance
$ kubectl drain NODE --dry-run=client
    Find the names of the objects that will be removed
$ kubectl drain NODE --force=true
    Continue even if the node runs pods that are not managed by a
    controller (ReplicationController, ReplicaSet, Job, DaemonSet, or
    StatefulSet); those pods are deleted
$ kubectl taint nodes node1 key=value:NoSchedule
    Taints repel work, so only pods with a matching “toleration” are
    allowed to be scheduled on that node. Pods with the appropriate
    toleration include the identically-tainted nodes in the list of
    candidate nodes during scheduling. Pods without the matching
    toleration do not.
$ kubectl explain RESOURCE
    Print documentation of resources. This is especially useful if you
    require information such as what API group a particular construct is
    a part of, what fields are needed in the manifest, and what those
    fields do
$ kubectl scale --replicas=COUNT rs/foo
    Scale a ReplicaSet (rs) named foo to a particular number of pods. Can
    also scale a ReplicationController or StatefulSet
$ kubectl apply -f frontend-v2.json
    Apply the configuration in frontend-v2.json to the cluster, creating
    or updating the resources it defines
$ kubectl label pods foo GPU=true
    Add the label ‘GPU’ with the value ‘true’ to the pod ‘foo’. If the
    label GPU already exists on the pod foo, this will fail unless we add
    the flag --overwrite
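The kubectl autoscale example in the table above (--min=2, --max=10, --cpu-percent=70) sets a target rather than a simple trigger, which is easier to see with the arithmetic written out. The Python sketch below follows the documented Horizontal Pod Autoscaler ratio, desired = ceil(current replicas × current utilization / target utilization), clamped to the minimum and maximum. The function name is hypothetical, and the real autoscaler also applies a tolerance band and stabilization behavior that this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent,
                     target_cpu_percent=70, min_replicas=2, max_replicas=10):
    """Scale the replica count by the ratio of observed to target CPU utilization."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # Never go below the minimum or above the maximum configured for the autoscaler.
    return max(min_replicas, min(max_replicas, raw))
```

For instance, 4 replicas averaging 105% of their requested CPU against a 70% target give `desired_replicas(4, 105)`, which asks for 6 replicas, while a very quiet workload is still held at the minimum of 2.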
Kubernetes adoption
requires more than
just technology to
deliver outstanding
business outcomes.
For that reason, establishing the right partner to
guide your strategy, enable your teams with the
necessary training and certification, and provide end-
to-end support for both Kubernetes and the full stack
of associated cloud native services, becomes critical
to success. This approach addresses needs across
the domains of people, process and technology,
as the technology alone is not enough to achieve
success. The benefits of a comprehensive approach
include faster time to benefit, reduced total cost of
ownership, improved operational efficiency and less
stress on your ops teams.
Diagram 3
The D2iQ Kubernetes Platform
The D2iQ Kubernetes Platform (DKP)
provides a differentiated approach
and unique set of technologies, expert
professional services, training, and
support that enables you to deliver on the
promise of digital transformation.
As cloud native pioneers with over a decade of
large-scale distributed computing expertise,
D2iQ is uniquely positioned to help you leverage
the best of what open source Kubernetes has to
offer while adding the must-have, enterprise-
grade capabilities your organization needs.
The D2iQ Kubernetes Platform is built to ensure
your deployment of Kubernetes is done with
ease and agility, regardless of your level of
organizational maturity. Its unique foundation
was designed to provide a fully interoperable
and Cloud Native Computing Foundation (CNCF)
conformant open-source experience, which
helps make sure you always leverage the best
innovations that the industry has to offer
for Kubernetes in production at scale, while
ensuring the security, resilience and TCO that
your organization demands.
D2iQ Kubernetes Platform: https://d2iq.com/kubernetes-platform
Learn More
D2iQ simplifies and automates the really difficult tasks needed for enterprise
Kubernetes in production at scale, reducing operational burden and TCO. With
an open source and flexible approach to automation, the D2iQ Kubernetes
Platform will deliver the results required regardless of whether you are deploying in the
cloud, on-premises, in a highly secure air-gapped environment, or on the edge.
As a cloud native pioneer, we have more than a decade of experience tackling
the most complex, mission-critical deployments in the industry.
Ready to see how D2iQ can power Kubernetes
in your organization? Contact sales@d2iQ.com
today to get started. From weekly touch-base
meetings to biweekly roadmap calls,
customer success managers and solution
architects work in lockstep with your technology
organization to eliminate the learning curve.
For more information visit www.D2iQ.com