BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
Building a CI/CD driven infrastructure for managing kubernetes clusters on bare metal
1. Stephan Fudeus, Expert Continuous Delivery
Dr. Sascha Mühlbach, Expert Infrastructure Architect
BUILDING A CI/CD DRIVEN INFRASTRUCTURE FOR
MANAGING KUBERNETES CLUSTERS ON BARE METAL
2. 1&1 Mail & Media Development & Technology GmbH
United Internet / 1&1 Mail & Media
18.06.18!2
United Internet
▪ is a leading European internet specialist
▪ > 9000 employees
▪ 90k servers in 10 data centers
▪ Access
• DSL and mobile
▪ Applications
• Hosting
• Consumer (WEB.DE etc.)
1&1 Mail & Media
▪ main brands GMX, WEB.DE and MAIL.COM
▪ various services around a free or paid mail account (calendar, news portal, cloud
storage)
▪ 33 million active users / month
3. 1&1 Mail & Media Development & Technology GmbH
Speakers
18.06.18!3
▪ Stephan Fudeus
▪ Joined 1&1 in 2005
▪ Long-term experience in building highly scalable multi-tenant applications
▪ Product Owner and TechLead for our Kubernetes Clusters
▪ Twitter: @der_sfu
▪ Dr. Sascha Mühlbach
▪ Expert Infrastructure Architect
▪ 15 years professional experience
▪ Responsible for the global operations strategy of the applications and
systems infrastructure
4. 1&1 Mail & Media Development & Technology GmbH
■ Motivation / Environment
■ Cluster-Design
■ Network-Setup
■ Load Balancing and Cluster Ingress
■ Deployment
■ Git-driven cluster operations
■ Multi-Tenancy
■ Onboarding & Training
!4
Agenda
18.06.18
5. 1&1 Mail & Media Development & Technology GmbH
Motivation
18.06.18!5
▪ Why Container?
▪ Strong coupling between code and application runtime environment
▪ One build responsibility
▪ Hide core infrastructure from application
▪ Reproducibility in development
▪ Follow new standards in software development
▪ Requirements
▪ Reliable platform that provides the same level of availability that our existing environment is delivering
▪ Efficient deployment for geo-redundant services in multiple data centers
▪ Self-service for the development and application teams
▪ Multi-tenancy with strong separation for security reasons
▪ Must fit into the existing network environment
▪ Mostly automated operation of the base Kubernetes platform
6. 1&1 Mail & Media Development & Technology GmbH
Environment
6/18/18!6
▪ Organizational Environment
▪ Approx. 25 Dev Teams with 10 Ops Teams
▪ Strong organizational separation between PM / Dev / Ops
▪ Ops Teams provide 24/7 and are responsible for operating the application and the operating system stack
▪ One dedicated Ops Team to build and run the Kubernetes platform
▪ Technical Environment
▪ Existing infrastructure of bare metal and virtual machines (KVM, ESX)
▪ All servers are Puppet managed
• Our infrastructure has ~15.000 Puppet clients
▪ Majority of services are written in Java
▪ Ongoing transition to CD and microservices
▪ 3 datacenters (2 in DE, 1 in US) that are owned by us
7. 1&1 Mail & Media Development & Technology GmbH
Cluster Design
18.06.18!7
8. 1&1 Mail & Media Development & Technology GmbH
Network Setup for Frontend Zone
!8
▪ Integration of existing F5 BigIP load balancing platform with their features
▪ Service IPs are BGP-routed to Balancer and then forwarded with SNAT to NodePorts
▪ BGP enables global redundancy
▪ No public IPs inside Kubernetes cluster
kube-proxy kube-proxy kube-proxy
POD
POD
POD
POD
POD
ServiceType: NodePort
10.8.0.1:30001 10.8.0.2:30001 10.8.0.3:30001
82.165.230.17:443
Pool Member
10.8.0.1:30001
10.8.0.2:30001
10.8.0.3:30001
F5 - K8S
KubernetesAPI
REST Calls
Worker
Node
9. 1&1 Mail & Media Development & Technology GmbH
Network configuration via ConfigMap for F5
18.06.18!9
10. 1&1 Mail & Media Development & Technology GmbH
▪ In backend networks, we use MetalLB (no specific Layer 7 requirements)
▪ Service IPs are BGP-announced with ECMP distribution (easy scaling)
▪ LoadBalancing only with K8S base algorithms or ingress controller features
Network Setup for Backend Zone
18.06.18!10
kube-proxy kube-proxy kube-proxy
POD
POD
POD
POD
POD
Worker Node
MetalLB
10.176.0.6:80
MetalLB
11. 1&1 Mail & Media Development & Technology GmbH
Network configuration via Service for MetalLB
18.06.18!11
12. 1&1 Mail & Media Development & Technology GmbH
Load Balancing & Cluster Ingress
18.06.18!12
▪ Additional L7 ingress support by using Traefik
▪ Easier maintenance via wildcard DNS and wildcard TLS cert
▪ Additional functionalities
▪ Load balancing (wrr, drr)
▪ Circuit breaking
▪ HTTP/HTTPS redirect
▪ Host-based / Path-based routing
▪ Health-Checks
▪ Session-Stickyness
▪ Multiple ingress domains
▪ Personal namespaces
▪ System services
▪ Applications
13. 1&1 Mail & Media Development & Technology GmbH
Lessons Learned Load Balancing
18.06.18!13
▪ F5 with BIG-IP Controller for Kubernetes
▪ Solves the automation issue
▪ Provides all loadbalancing features of the existing platform
▪ Nodeport balancing mode has some caveats
▪ Extensive use of SNAT in our environment sometimes difficult to debug
▪ Not all features of the F5 configuration implemented yet
▪ MetalLB
▪ Works well with our BGP setup
▪ Has some limitations, e.g., balancing methods
▪ ECMP distribution may drop open connections on every topology change (e.g. during rolling upgrades)
▪ Traefik
▪ Needs to be combined with MetalLB for BGP functionality
▪ Separation of ingress domains with multiple controllers
14. 1&1 Mail & Media Development & Technology GmbH
Deployment
18.06.18!14
▪ Container Linux
▪ No SSH by default – fully automated operations
▪ Only the essential base system, reduced attack surface
▪ etcd, apiserver, kubelet managed via systemd
▪ Automated processes based on changes in an etcd instances
▪ Matchbox / Terraform
▪ Creation of ignition files via terraform and confd based on data in dedicated etcd
▪ Updates triggered via GitLab-CI
▪ Additional manual Gitlab-CI jobs for deployment of addons, namespaces, …
▪ Own H/A Image Registry
▪ User management via OIDC by using Dex for LDAP integration
▪ Vault as Cluster-CA
15. 1&1 Mail & Media Development & Technology GmbH
Git-driven cluster operations
18.06.18!15
▪ Maturity level via 3 branches (master, integration, production)
▪ All cluster operations are triggered based on Gitlab-CI pipelines
▪ automatically on git-pushes to relevant branch
▪ manually triggered jobs for cluster changes
▪ scheduled jobs for periodic changes (namespace updates / purges)
▪ The amount of clusters makes this a bit cumbersome
16. 1&1 Mail & Media Development & Technology GmbH
Git-driven operations use cases
18.06.18!16
▪ Full redeployment of clusters
▪ Only if cluster is broken, will wipe everything
▪ Will redeploy all nodes in parallel
▪ Rolling upgrade of clusters
▪ Usually done on a weekly basis
▪ Will wipe and reset nodes one by one
▪ Namespaces update
▪ Nightly updates for production
▪ on-push for integration
▪ Addon update
▪ Addons as helm charts, rendered via helm template and injected via kubectl apply
▪ Done ad-hoc for addon changes without redeployment
17. 1&1 Mail & Media Development & Technology GmbH18.06.18!17
18. 1&1 Mail & Media Development & Technology GmbH
Multi Tenancy
18.06.18!18
▪ Common platform for several teams
▪ PodSecurityPolicies (no-root, no host-net, r/o layers)
▪ Dedicated resources for teams
• Dedicated in-cluster prometheus for scraping
• Configurable log-sink (Elasticsearch, Kafka)
▪ Authentication via OIDC <-> Dex <-> LDAP
▪ Maximum separation between teams targeted
▪ Namespaces are a „managed“ resource
▪ Resource constraints defined centrally per namespace
▪ Users are restricted to their namespaces via RBAC
▪ Network policies
▪ Team-centric „helper“ namespace
• e.g. $team-helper
• Used for managed resources, e.g. team-prometheus
▪ Individual namespaces per (group of) application and stage
• $team-$app-live, $team-$app-prelive
19. 1&1 Mail & Media Development & Technology GmbH
Multi Tenancy
18.06.18!19
▪ Dedicated namespaces for individuals
▪ Purpose: Training, PoC, Experiments
▪ Daily process to read users from LDAP and generate and flush namespaces
▪ Service exposure via central ingress controller (traefik)
20. 1&1 Mail & Media Development & Technology GmbH
Namespace-Config via yaml
18.06.18!20
Rendered via helm
36 resulting manifests:
1 kind: Deployment
1 kind: Ingress
1 kind: Service
2 kind: ConfigMap
2 kind: ServiceAccount
4 kind: LimitRange
4 kind: Namespace
4 kind: ResourceQuota
8 kind: NetworkPolicy
9 kind: RoleBinding
21. 1&1 Mail & Media Development & Technology GmbH
Onboarding & Training
18.06.18!21
▪ 4 training blocks for system administrators (1-2 days each)
▪ Docker & Kubernetes
▪ GoCD & Helm
• Pipeline Design
• Helm Templating
▪ Development Techniques for Ops
• Repositories and versioning
• Secure Software Development Lifecycle
▪ Operating Container Applications
• Monitoring, Logging and Failure Handling
• Operations Lifecycle