THE FOLLOWING CONTAINS CONFIDENTIAL INFORMATION.
DO NOT DISTRIBUTE WITHOUT PERMISSION.
Kubernetes Navigation Stories
DevOpsStage 2019
Director of Infrastructure Engineering at thredUP
Senior Engineering Manager at Hotwire
Roman Chepurnyi
Staff Software Engineer at thredUP
Senior Software Engineer at Toptal
Oleksii Asiutin
3
4
ThredUP Technology
100% in k8s since mid-2018
● 70 Software Engineers
● 5 Infrastructure Engineers
● 50 applications
● 100 EC2 nodes
Stack
● Node.js, React
● Ruby, .NET, Java
● RabbitMQ, SQS
● Redis
● MySQL Aurora
5
CNCF Case study
https://www.cncf.io/thredup-case-study/
6
ThredUP Infrastructure
7
Life after Kubernetes migration
● Fixing shortcuts and gaps
○ IAM
○ Secrets management
● Developer experience
○ Staging environment
○ Local development
● Infrastructure optimization
○ Auto-scaling
○ Spot Instances
○ Security
○ Networking
8
Authentication
Hey Infra team, I need access to the k8s cluster
Oh my
9
Auth mechanisms
Signed certificates for everyone
OpenID Connect
aws-iam-authenticator
?
10
AWS-IAM-Authenticator
[Diagram: the client runs the aws-iam-authenticator binary to obtain a token; each of the three API servers validates it through a webhook to an aws-iam-authenticator server, which runs as a DaemonSet on the master nodes]
11
AWS-IAM-Authenticator – kubeconfig
Cluster prod – Role read-only
Cluster stage – Role developer
Cluster dev – Role admin
kubeconfig
12
AWS-IAM-Authenticator – kubeconfig generation
dev
dev lead
infra team
kubeconfig generation service
IAM identity: john-smith
Kubeconfig for dev
IAM identity: lara-jones
Kubeconfig for dev-lead
prod
stage
dev
+ group
kubeconfig
IAM user group
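A minimal sketch of what such a kubeconfig generation service might emit, assuming each cluster maps to one IAM role per user group. The function name, cluster IDs, server URLs, and role ARNs are hypothetical; the `token -i ... -r ...` arguments are the aws-iam-authenticator client invocation, and the exec `apiVersion` shown is the one in use around 2019 (newer clients expect a later version). Certificate authority data is omitted for brevity.

```python
import json
from typing import Dict


def build_kubeconfig(clusters: Dict[str, Dict[str, str]], user: str) -> dict:
    """Build a kubeconfig with one context per cluster.

    clusters maps a cluster name to a dict with the (hypothetical) keys
    "server", "cluster_id", and "role_arn".
    """
    config = {"apiVersion": "v1", "kind": "Config",
              "clusters": [], "users": [], "contexts": []}
    for name, info in clusters.items():
        user_entry = f"{user}@{name}"
        config["clusters"].append(
            {"name": name, "cluster": {"server": info["server"]}})
        config["users"].append({
            "name": user_entry,
            "user": {"exec": {
                "apiVersion": "client.authentication.k8s.io/v1alpha1",
                "command": "aws-iam-authenticator",
                # kubectl runs this command to obtain a token for the role.
                "args": ["token", "-i", info["cluster_id"],
                         "-r", info["role_arn"]],
            }},
        })
        config["contexts"].append(
            {"name": name, "context": {"cluster": name, "user": user_entry}})
    return config


if __name__ == "__main__":
    example = build_kubeconfig(
        {"dev": {"server": "https://dev.example.com",
                 "cluster_id": "dev.example.com",
                 "role_arn": "arn:aws:iam::000000000000:role/k8s-admin"}},
        user="john-smith",
    )
    # kubectl accepts a JSON kubeconfig as well as YAML.
    print(json.dumps(example, indent=2))
```

Keeping the role per cluster in one place means a change of group (dev, dev lead, infra team) only changes the `role_arn` values fed to the generator.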
13
Secrets Management
14
Hashicorp Vault
https://www.vaultproject.io/
Init Container
App Container
shared in-memory
volume
app secrets
https://github.com/cruise-automation/daytona
k8s Pod
15
SOPS – Secrets OPerationS
https://github.com/mozilla/sops
Supported formats:
YAML
JSON
.env
# secrets.production.yaml
app_secrets:
  db_username: cart_service
  db_password: supersecret
16
SOPS – Encryption
$ sops -e --kms <AWS-KMS-ARN> secrets.production.yaml
app_secrets:
  db_username: ENC[AES256_GCM,data:KuhPWLhijVc/9wa6,iv:V7YS/QglsuYwpmBcTZjOwFz8p10yt+qOcRgg+/OL4Uo=,tag:jchhWABpUVYK4kpRKlrYPQ==,type:str]
  db_password: ENC[AES256_GCM,data:TWjWb4up6nx+gSk=,iv:VoI9vnYrIdYxjTmSsqFzbXZ9z8LsZp4ud8LgVocxGAs=,tag:PVNKEAq3OvWGiUSmM3aHpw==,type:str]
sops:
  kms:
  - arn: AWS-KMS-ARN
    created_at: '2019-09-26T09:00:30Z'
    enc: AQICAHhGGWsaRwq5wtMieLutm2hnsC2WqAifhQ6HgfjDUdbvpQE5pwGLIOabNseXxCnNWo0YAAAAfjB8BgkqhkiG9w0BBwagbzBtAgEAMGgGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMhPJ/IHKNPgmqzN8vAgEQgDvTzDYH71MHx5nGWHjzNjpNDjnTw3pgS8IPf26qVhcdrO7Uv1g7yjKsJIVdcD00/hSNCgg6+KgulNgHmw==
  gcp_kms: []
17
SOPS – helm template and secrets storage
18
SOPS – Deployment
$ sops -d -i ./helm/env/secrets.production.yaml
$ helm upgrade --install --wait --timeout 600 \
    -f ./helm/env/secrets.production.yaml \
    -f ./helm/env/production.yaml \
    app_name ./helm/app_name
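The in-place decrypt above leaves plaintext secrets in the working tree. One possible variation, sketched here as an assumption rather than the speakers' pipeline, decrypts to a private temporary file and removes it after the deploy. The function names are hypothetical; `sops -d FILE` writing to stdout and the `helm upgrade` flags are taken from the commands shown, and `--timeout 600` follows the slide's form.

```python
import os
import subprocess
import tempfile
from typing import List


def build_helm_command(release: str, chart: str, values_files: List[str],
                       timeout: int = 600) -> List[str]:
    """Assemble the `helm upgrade --install` invocation shown on the slide."""
    cmd = ["helm", "upgrade", "--install", "--wait", "--timeout", str(timeout)]
    for path in values_files:
        cmd += ["-f", path]
    return cmd + [release, chart]


def deploy(release: str, chart: str, encrypted: str, plain: str) -> None:
    """Decrypt secrets to a temp file, deploy, then delete the plaintext."""
    # `sops -d FILE` prints the decrypted document to stdout.
    decrypted = subprocess.run(["sops", "-d", encrypted], check=True,
                               capture_output=True).stdout
    # mkstemp creates the file readable and writable only by the current user.
    fd, tmp_path = tempfile.mkstemp(suffix=".yaml")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(decrypted)
        subprocess.run(build_helm_command(release, chart, [tmp_path, plain]),
                       check=True)
    finally:
        os.remove(tmp_path)
```

A call such as `deploy("app_name", "./helm/app_name", "./helm/env/secrets.production.yaml", "./helm/env/production.yaml")` would mirror the deployment above without modifying the encrypted file on disk.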
19
Staging Environments
prepare persistent layer
weekly DB snapshot
store helm charts config override
- values
- tags
- secrets
- dependencies
web-client
content
checkout
backend
setup CD deploy
20
Staging Environment
$ git checkout -b devopsstage
$ git push -u origin devopsstage
wait 4-5 min
use https://devopsstage.threduptest.com/
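The branch-to-URL mapping can be sketched as below. Only the `devopsstage` to `https://devopsstage.threduptest.com/` pair comes from the slide; the function name and the sanitizing rules for other branch names (lowercasing, replacing unsupported characters with hyphens, the 63-character DNS label limit) are assumptions for illustration.

```python
import re


def staging_url(branch: str, domain: str = "threduptest.com") -> str:
    """Derive a per-branch staging URL (hypothetical naming rules)."""
    # Keep only characters that are valid in a DNS label.
    label = re.sub(r"[^a-z0-9-]+", "-", branch.lower()).strip("-")
    # A DNS label is limited to 63 characters.
    label = label[:63].strip("-")
    if not label:
        raise ValueError(f"branch {branch!r} has no usable characters")
    return f"https://{label}.{domain}/"


if __name__ == "__main__":
    print(staging_url("devopsstage"))
```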
21
Local development
When your service has a lot of dependencies (MySQL, Redis, RabbitMQ and 5 other services)
22
Local Development
macbook: Thredup $ git clone git@github.com:thredup/node-proxy.git
Cloning into 'node-proxy'...
...
macbook: Thredup $ cd node-proxy/
macbook: node-proxy (master) $ npm install
added 6 packages from 8 contributors and audited 6 packages in 0.595s
found 0 vulnerabilities
macbook: node-proxy (master) $ npm test
> proxy@1.0.0 test ~/Thredup/node-proxy
...
macbook: node-proxy (master) $ npm start
> proxy@1.0.0 start
> node server.js
23
Local Development with Docker
macbook: Thredup $ docker run -it -v ${PWD}:/app -p 3000:3000 \
    node:12-alpine sh
/ $ apk add --no-cache mysql-dev
/ $ npm install
/ $ npm test
/ $ npm start
> proxy@1.0.0 start
> node server.js
24
Local Development with Docker Compose
version: "3.7"
services:
  web:
    image: node:12-alpine
    volumes:
      - ./:/app
    ports:
      - "3000"
    environment:
      REDIS_HOST: "127.0.0.1"
  mysql:
    image: ...
    ...
  redis:
    image: ...
25
Local Development with Docker Compose
macbook: Thredup $ docker-compose up -d
…
macbook: Thredup $ docker-compose exec web sh
/ $ npm install
/ $ npm test
/ $ npm start
> proxy@1.0.0 start
> node server.js
26
Local Development with Docker Compose
And then you need another service as a dependency ;-)
...and another one
…
docker-compose.yaml ~ 330 lines
MySQL DB ~25 GB
27
Local Development with Docker Compose
And you need to keep it
UP TO DATE
28
Dynamic Staging Env
Local development - Telepresence
https://www.telepresence.io/
Service A
Service A
Service B Service C Service D
29
Local development with Telepresence
macbook: Thredup $ telepresence --swap-deployment deployment-name \
    --expose 3000 \
    --method container \
    --docker-run --rm -it -v ${PWD}:/app \
    000000000001.dkr.ecr.us-east-1.amazonaws.com/cart:latest
...
...
/ $ npm install
/ $ npm test
/ $ npm start
> proxy@1.0.0 start
> node server.js
30
Horizontal Pod Autoscaling (HPA)
● Do not over-provision
● Be ready for traffic spikes
metrics:
- type: External
  external:
    metricName: trace.rack.request.hits
    metricSelector:
      matchLabels:
        env: production
        service: some-service
    targetAverageValue: 10
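To see how `targetAverageValue: 10` turns into a replica count, here is a simplified sketch of the HPA scaling rule: the external metric total is divided across the current pods, and the controller scales so that the per-pod average approaches the target. The function name and the min/max defaults are illustrative; the 10% tolerance is the controller's documented default. Pod readiness, stabilization windows, and scale-rate limits are left out.

```python
import math


def desired_replicas(current_replicas: int, total_metric: float,
                     target_average: float, min_replicas: int = 1,
                     max_replicas: int = 30, tolerance: float = 0.1) -> int:
    """Simplified HPA decision for an External metric with targetAverageValue."""
    if current_replicas <= 0 or target_average <= 0:
        raise ValueError("replicas and target must be positive")
    # Ratio of the current per-pod average to the target.
    ratio = (total_metric / current_replicas) / target_average
    # Within tolerance the controller leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # current_replicas * ratio simplifies to total_metric / target_average.
    desired = math.ceil(total_metric / target_average)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods serving 100 request hits in total against a target of 10 per pod.
    print(desired_replicas(4, 100, 10))
```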
31
HPA lessons learned
offender pods:
request 1 core
use 3+ cores on start
response time spikes
autoscaling pattern
32
HPA lessons learned
add warmup script
update deployment strategy
33
Cluster autoscaler
● overflow capacity in production
● utilize spot instances
34
Spot instances and AZRebalance
● spot termination handling works: https://github.com/mumoshu/kube-spot-termination-notice-handler
● except when the instance is terminated by Availability Zone rebalancing (AZRebalance)
Terminating EC2 instance: i-0e685dc2a84b65f63
Cause: At 2019-07-18T06:09:59Z instances were launched to balance instances in
zones us-east-1a us-east-1e with other zones resulting in more than desired number of
instances in the group. At 2019-07-18T06:11:30Z an instance was taken out of service
in response to a difference between desired and actual capacity, shrinking the
capacity from 4 to 3. At 2019-07-18T06:11:30Z instance i-0e685dc2a84b65f63 was
selected for termination.
35
Spot instances and AZRebalance
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2017-10-12T16:28:23Z
  generation: 2
  name: m4xlarge
spec:
  image: 405610825889/harden-k8s-x.14-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: m4.2xlarge
  maxPrice: "0.20"
  maxSize: 30
  minSize: 5
  role: Node
  rootVolumeSize: 100
  subnets:
  - us-east-1a
  - us-east-1c
  - us-east-1e
  suspendProcesses:
  - AZRebalance
36
Container vulnerability scan
https://github.com/arminc/clair-scanner
https://snyk.io/blog/top-ten-most-popular-docker-images-each-contain-at-least-30-vulnerabilities/
37
Container runtime security
https://falco.org
https://snyk.io/blog/top-ten-most-popular-docker-images-each-contain-at-least-30-vulnerabilities/
38
Service Mesh
39
Service Mesh
● Visibility
● Simple configuration
● Security (policies, mTLS)
40
What’s next
● Finish Istio rollout
● More security
● Knative builds
● Have fun!
THANK YOU
https://www.thredup.com/devopsstage-2019

Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
 

Kubernetes Navigation Stories – DevOpsStage 2019, Kyiv

  • 1. THE FOLLOWING CONTAINS CONFIDENTIAL INFORMATION. DO NOT DISTRIBUTE WITHOUT PERMISSION. Kubernetes Navigation Stories DevOpsStage 2019
  • 2. Roman Chepurnyi: Director of Infrastructure Engineering at thredUP, Senior Engineering Manager at Hotwire. Oleksii Asiutin: Staff Software Engineer at thredUP, Senior Software Engineer at Toptal.
  • 4. ThredUP Technology: 100% in k8s since mid-2018 ● 70 Software Engineers ● 5 Infrastructure Engineers ● 50 applications ● 100 EC2 nodes. Stack ● Node.js, React ● Ruby, .NET, Java ● RabbitMQ, SQS ● Redis ● MySQL Aurora
  • 7. Life after Kubernetes migration ● Fixing shortcuts and gaps ○ IAM ○ Secrets management ● Developer experience ○ Staging environment ○ Local development ● Infrastructure optimization ○ Auto-scaling ○ Spot Instances ○ Security ○ Networking
  • 8. Authentication: "Hey Infra team, I need access to the k8s cluster." "Oh my."
  • 9. Auth mechanisms: Signed certificates for everyone, OpenID Connect, aws-iam-authenticator, ?
  • 11. AWS-IAM-Authenticator – kubeconfig: Cluster prod – Role read-only; Cluster stage – Role developer; Cluster dev – Role admin
  • 12. AWS-IAM-Authenticator – kubeconfig generation dev dev lead infra team kubeconfig generation service IAM identity: john-smith Kubeconfig for dev IAM identity: lara-jones Kubeconfig for dev-lead prod stage dev + group kubeconfig IAM user group
  • 14. Hashicorp Vault https://www.vaultproject.io/ Init Container App Container shared in-memory volume app secrets https://github.com/cruise-automation/daytona k8s Pod
  • 15. SOPS – Secrets OPerationS https://github.com/mozilla/sops Supported formats: YAML, JSON, .env
      # secrets.production.yaml
      app_secrets:
        db_username: cart_service
        db_password: supersecret
  • 16. SOPS – Encryption
      $ sops -e --kms <AWS-KMS-ARN> secrets.production.yaml
      app_secrets:
        db_username: ENC[AES256_GCM,data:KuhPWLhijVc/9wa6,iv:V7YS/QglsuYwpmBcTZjOwFz8p10yt+qOcRgg+/OL4Uo=,tag:jchhWABpUVYK4kpRKlrYPQ==,type:str]
        db_password: ENC[AES256_GCM,data:TWjWb4up6nx+gSk=,iv:VoI9vnYrIdYxjTmSsqFzbXZ9z8LsZp4ud8LgVocxGAs=,tag:PVNKEAq3OvWGiUSmM3aHpw==,type:str]
      sops:
        kms:
        - arn: AWS-KMS-ARN
          created_at: '2019-09-26T09:00:30Z'
          enc: AQICAHhGGWsaRwq5wtMieLutm2hnsC2WqAifhQ6HgfjDUdbvpQE5pwGLIOabNseXxCnNWo0YAAAAfjB8BgkqhkiG9w0BBwagbzBtAgEAMGgGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMhPJ/IHKNPgmqzN8vAgEQgDvTzDYH71MHx5nGWHjzNjpNDjnTw3pgS8IPf26qVhcdrO7Uv1g7yjKsJIVdcD00/hSNCgg6+KgulNgHmw==
        gcp_kms: []
  • 17. SOPS – helm template and secrets storage
  • 18. SOPS – Deployment
      $ sops -d -i ./helm/env/secrets.production.yaml
      $ helm upgrade --install --wait --timeout 600 -f ./helm/env/secrets.production.yaml -f ./helm/env/production.yaml app_name ./helm/app_name
  • 19. Staging Environments: prepare persistent layer (weekly DB snapshot); store helm charts; config override (values, tags, secrets, dependencies); web-client, content, checkout, backend; setup CD deploy
  • 20. Staging Environment
      $ git checkout -b devopsstage
      $ git push -u origin devopsstage
      wait 4-5 min
      use https://devopsstage.threduptest.com/
  • 21. Local development: when your service has a lot of dependencies (MySQL, Redis, RabbitMQ and 5 other services)
  • 22. Local Development
      macbook: Thredup $ git clone git@github.com:thredup/node-proxy.git
      Cloning into 'node-proxy'...
      ...
      macbook: Thredup $ cd node-proxy/
      macbook: node-proxy (master) $ npm install
      added 6 packages from 8 contributors and audited 6 packages in 0.595s
      found 0 vulnerabilities
      macbook: node-proxy (master) $ npm test
      > proxy@1.0.0 test ~/Thredup/node-proxy
      ...
      macbook: node-proxy (master) $ npm start
      > proxy@1.0.0 start
      > node server.js
  • 23. Local Development with Docker
      macbook: Thredup $ docker run -it -v ${PWD}:/app -p 3000:3000 node:12-alpine sh
      / $ apk add --no-cache mysql-dev
      / $ npm install
      / $ npm test
      / $ npm start
      > proxy@1.0.0 start
      > node server.js
  • 24. Local Development with Docker Compose
      version: "3.7"
      services:
        web:
          image: node:12-alpine
          volumes:
            - ./:/app
          ports:
            - "3000"
          environment:
            REDIS_HOST: "127.0.0.1"
        mysql:
          image: ...
          ...
        redis:
          image: ...
  • 25. Local Development with Docker Compose
      macbook: Thredup $ docker-compose up -d
      …
      macbook: Thredup $ docker-compose exec web sh
      / $ npm install
      / $ npm test
      / $ npm start
      > proxy@1.0.0 start
      > node server.js
  • 26. Local Development with Docker Compose: And then you need another service as a dependency ;-) ...and another one … docker-compose.yaml ~330 lines, MySQL DB ~25 GB
  • 27. Local Development with Docker Compose: And you need to keep it UP TO DATE
  • 28. Dynamic Staging Env Local development - Telepresence https://www.telepresence.io/ Service A Service A Service B Service C Service D
  • 29. Local development with Telepresence
      macbook: Thredup $ telepresence --swap-deployment deployment-name --expose 3000 --method container --docker-run --rm -it -v ${PWD}:/app 000000000001.dkr.ecr.us-east-1.amazonaws.com/cart:latest
      ...
      ...
      / $ npm install
      / $ npm test
      / $ npm start
      > proxy@1.0.0 start
      > node server.js
  • 30. Horizontal Pod Autoscaling (HPA) ● Do not over-provision ● Be ready for traffic spikes
      metrics:
      - type: External
        external:
          metricName: trace.rack.request.hits
          metricSelector:
            matchLabels:
              env: production
              service: some-service
          targetAverageValue: 10
  • 31. HPA lessons learned: offender pods request 1 core but use 3+ cores on start; response time spikes; autoscaling pattern
  • 32. HPA lessons learned: add warmup script, update deployment strategy
  • 33. Cluster autoscaler ● overflow capacity in production ● utilize spot instances
  • 34. Spot instances and AZRebalance ● spot termination works https://github.com/mumoshu/kube-spot-termination-notice-handler ● except when the instance is terminated by Availability Zone rebalancing
      Terminating EC2 instance: i-0e685dc2a84b65f63
      Cause: At 2019-07-18T06:09:59Z instances were launched to balance instances in zones us-east-1a us-east-1e with other zones resulting in more than desired number of instances in the group. At 2019-07-18T06:11:30Z an instance was taken out of service in response to a difference between desired and actual capacity, shrinking the capacity from 4 to 3. At 2019-07-18T06:11:30Z instance i-0e685dc2a84b65f63 was selected for termination.
  • 35. Spot instances and AZRebalance
      apiVersion: kops/v1alpha2
      kind: InstanceGroup
      metadata:
        creationTimestamp: 2017-10-12T16:28:23Z
        generation: 2
        name: m4xlarge
      spec:
        image: 405610825889/harden-k8s-x.14-debian-stretch-amd64-hvm-ebs-2019-08-16
        machineType: m4.2xlarge
        maxPrice: "0.20"
        maxSize: 30
        minSize: 5
        role: Node
        rootVolumeSize: 100
        subnets:
        - us-east-1a
        - us-east-1c
        - us-east-1e
        suspendProcesses:
        - AZRebalance
  • 36. Container vulnerability scan https://github.com/arminc/clair-scanner https://snyk.io/blog/top-ten-most-popular-docker-images-each-contain-at-least-30-vulnerabilities/
  • 37. Container runtime security https://falco.org https://snyk.io/blog/top-ten-most-popular-docker-images-each-contain-at-least-30-vulnerabilities/
  • 39. Service Mesh ● Visibility ● Simple configuration ● Security (policies, mTLS)
  • 40. What’s next ● Finish Istio rollout ● More security ● Knative builds ● Have fun!

Editor's notes

  1. [Roman] Let me introduce Oleksii, staff engineer at thredUP. Oleksii is an infrastructure enthusiast and a co-organizer of the monthly DevOps digest on dou.ua; he likes sports cars and runs an Instagram account dedicated to cooking. [Olek] Thank you, Roman. Roman is the Director of our distributed Infrastructure team. I'd say Roman is a leader: he manages us in a way that lets us bring innovations to our company platform. Before thredUP, Roman worked at one of the biggest hotel discount aggregators, Hotwire. He lives in California. Roman is as confident navigating Kubernetes as navigating a sailing boat in San Francisco Bay on weekends. Great to have Roman at the helm! I know it personally.
  2. Switching to case studies. Think about how to do it.
  3. [Olek] In: access. Mid: danger of shared root key. Out: granular permissions. Okay, that was a brief introduction; now come the navigation stories themselves. As Roman said, one day you wake up and realize you have migrated your infra to k8s, and yeah, it's cool. But during the migration you cut corners, and now it's probably time to review and fill some gaps. Lots of us have been in this situation: "Hey Infra team, I need access to a Kubernetes cluster." "Really? What are you going to do there?" When we created our k8s clusters we used a shared admin certificate inside the team. In the early stages we also gave it to engineers who asked for access. "Okay, here it is, but please use it carefully." "Aha." And then, you know... "Guys, checkout is down, where is our checkout service? Guys?" "Oh, I might have deleted it on prod instead of dev, ouch." So we need to organize users into groups and give them granular permissions per cluster.
  4. [Olek] In: granular access. Mid: certs, OpenID: no. Out: aws-auth: yes, review. For authorization we use RBAC; it's the de facto standard for k8s now. We can create user groups and separate permissions with it. We reviewed multiple authentication mechanisms for users. We started with a shared root certificate, as I said before, and realized it would be hard to create a separate certificate for each user (mainly because k8s does not support certificate revocation). After that we reviewed the OpenID Connect mechanism. It works fine, but the downside for us was that our single sign-on provider did not support user groups with OpenID, so you can authenticate a user but you cannot get their groups, and we need those for our ACLs. Finally we settled on the tool now named aws-iam-authenticator. When we implemented auth in k8s it was called heptio-authenticator. Nowadays it's the default auth method in AWS EKS, and GCP and Azure have similar tools for their platforms. Let's briefly review how it works.
  5. [Olek] Our kubectl auth config uses tokens, which are generated by the client-side aws-iam-authenticator binary. The token is generated based on your AWS credentials and contains a cluster name and a role. For simplicity, let's assume a role represents a user group here. So if you're a cluster admin you specify the admin role; if you're a read-only user you have a different one. Then you send your API token to the k8s API server, which has a webhook configured to talk to a DaemonSet that checks whether the user is allowed to use the role from the token. If everything is okay, the user is successfully authenticated and the proper user groups are assigned to the session.
  6. [Olek] So basically, for each of our clusters a user needs the proper IAM role ARN. For example, a user can access the prod cluster with read-only permissions but has the admin role on the development cluster. And that's really not something our engineers should have to care about when maintaining their local kubeconfig.
  7. [Olek] So we created a service which generates a kubeconfig based on the user's AWS IAM credentials. A user executes a one-liner shell script in a terminal, and after that the engineer has a cronjob installed which regenerates the kubeconfig periodically. Why did we implement this as a cronjob? From time to time we update our group hierarchy, adding or removing users from groups, and with a cronjob these changes are delivered to users' machines automatically. [CONCLUSION] What did our engineers get from it? Everyone has kubectl with a kubeconfig fully managed by the infrastructure team. And the Infra team has visibility and control in terms of identity and access management. That way we applied IAM best practices to k8s auth management.
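
A minimal sketch of what such a generation service might emit, not thredUP's actual implementation. The group-to-role table, the account ID, the `k8s-<role>` role naming and the cluster endpoints are all hypothetical placeholders; only the `dev` row mirrors the roles shown on slide 11. The `exec` stanza calls `aws-iam-authenticator token -i <cluster> -r <role-arn>`, and the exec API version is an assumption that depends on the kubectl version in use.

```python
import json

# Hypothetical mapping: IAM user group -> role per cluster.
# Only the "dev" row follows slide 11; "infra" is an illustrative guess.
GROUP_ROLES = {
    "dev": {"prod": "read-only", "stage": "developer", "dev": "admin"},
    "infra": {"prod": "admin", "stage": "admin", "dev": "admin"},
}


def build_kubeconfig(group, clusters, account_id="000000000001"):
    """Build a kubeconfig dict with one context per cluster for the given group."""
    roles = GROUP_ROLES[group]
    config = {"apiVersion": "v1", "kind": "Config",
              "clusters": [], "users": [], "contexts": []}
    for name, server in clusters.items():
        role = roles[name]
        # Placeholder ARN scheme; a real setup would look the ARN up in IAM.
        role_arn = f"arn:aws:iam::{account_id}:role/k8s-{role}"
        user = f"{name}-{role}"
        config["clusters"].append({"name": name, "cluster": {"server": server}})
        config["users"].append({"name": user, "user": {"exec": {
            # Assumed exec credential API version; older kubectl used v1alpha1.
            "apiVersion": "client.authentication.k8s.io/v1beta1",
            "command": "aws-iam-authenticator",
            "args": ["token", "-i", name, "-r", role_arn],
        }}})
        config["contexts"].append(
            {"name": name, "context": {"cluster": name, "user": user}})
    return config


# Hypothetical endpoints; JSON is valid YAML, so kubectl can read this output.
example = build_kubeconfig("dev", {
    "prod": "https://prod.example.invalid",
    "dev": "https://dev.example.invalid",
})
print(json.dumps(example, indent=2))
```

Regenerating this file from a cronjob, as described above, means a change to the group table reaches every laptop without engineers editing their kubeconfig by hand.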
  8. [Olek] So here is our secrets management evolution path. It looks a little strange at first glance, but let me explain why.
  9. [Olek] We set up HashiCorp Vault. We love it, it's super cool, and it gives you everything you need: secrets management, a good security level, infra perks. Here is how we work with it: an init container grabs all the necessary secrets and puts them into a shared volume, then the main service container reads them from the volume and initializes its env vars with the secret values. There is even an open-source project for the init container called Daytona. We have Vault set up in our clusters and our engineers can use it, but in fact it didn't spread well. Maybe engineers didn't have enough time to dive into it, maybe our guides were not good enough. We succeeded in setting it up but failed to spread it and get our colleagues to use it. Our engineers did not add secrets to Vault and did not use it. So we started to investigate further.
  10. [Olek] And we settled on the SOPS project, which stands for Secrets OPerationS. It's a simple and flexible tool for managing secrets. What it does is encrypt and decrypt text files, with support for YAML, JSON and .env formats. It supports AWS, GCP and Azure key management systems and good old PGP encryption. Here is an example of a YAML file containing database credentials for a service.
  11. [Olek] And here is how this file looks after encryption: values are encrypted key by key rather than the file as a whole.
  12. [Olek] We deploy our services with the Helm package manager, and for a Helm release we specify both unencrypted values with generic release configuration and SOPS-encrypted values which are used to create secrets. You can see an example of a Helm template for secrets creation.
  13. [Olek] So to deploy a service, all you need to do is decrypt your secrets stored in GitHub first, and then run a Helm release. This solution got good adoption among our engineers and turned out to be more popular than Vault. It may also be simpler; SOPS turned out to be more developer-friendly at thredUP. That way we moved from fault-intolerant, de-synchronized and unmanageable manual secret creation to a fully predictable and monitored secret management solution, filling one more migration gap.
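
The two deployment steps can be sketched as a small command builder. This is an illustrative assumption about how a CI job might assemble them, not the team's actual tooling: the function name and the `./helm/env/...` layout are taken from or modeled on slide 18, and the `--timeout 600` form is copied from that slide (newer Helm releases expect a duration such as `600s`). The sketch only prints the commands; it does not run them.

```python
import shlex


def deploy_commands(app_name, env="production", helm_dir="./helm"):
    """Return the argv lists for: decrypt SOPS secrets in place, then helm upgrade."""
    secrets = f"{helm_dir}/env/secrets.{env}.yaml"
    values = f"{helm_dir}/env/{env}.yaml"
    # Step 1: decrypt the encrypted values file in place (-d decrypt, -i in-place).
    decrypt = ["sops", "-d", "-i", secrets]
    # Step 2: install or upgrade the release with secret and plain values.
    upgrade = ["helm", "upgrade", "--install", "--wait", "--timeout", "600",
               "-f", secrets, "-f", values,
               app_name, f"{helm_dir}/{app_name}"]
    return [decrypt, upgrade]


# Dry run: show what would be executed for a release named "app_name".
for cmd in deploy_commands("app_name"):
    print(shlex.join(cmd))
```

Because `-i` leaves plaintext in the working tree, a job like this would typically run on a throwaway checkout so decrypted secrets are never committed.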
  14. All Helm charts are available. We can use them to run an on-demand staging setup.
  15. Advantages: 1) always up to date with the latest code and data; 2) scalable.
  16. [Olek] Okay, so we just told you how we manage dynamic staging environments so engineers can present the results of their work to coworkers. But where do engineers spend most of their working hours? In local development: writing code on a laptop, running tests and debugging.
  17. [Olek] And when we talk about local development, the really basic workflow is just to clone a git repo, install dependencies and run the service (let's assume it's a web application). Here is an example of doing it that way with Node.js. BUT it's not that simple in the real world, right?
  18. [Olek] When you install a service, it might have native extensions among its dependencies. In that case you might need to install specific libraries on your machine. That's okay if there is a good guide on how to do it and if its libraries don't conflict with another service's libraries, and another's, and, because we have this trendy microservices architecture, yet another service's libraries. It becomes cumbersome to set up on a local machine, and ... it's good we have such a thing as Docker. So you create a Docker container from the Node.js image, mapping your codebase and the ports to work with, install all the necessary libraries and do the same stuff as you did locally. And everything is fine, you are good to go. Not really.
  19. [Olek] So we moved from literally OS-native development to containerized development; what's next? It's probably convenient to use docker-compose to set up service dependencies. Usually that's a database, a caching layer, queues, workers.
  20. [Olek] Then you run it, it works, it’s convenient to use it locally.
  21. [Olek] Until your docker-compose file becomes 300+ lines long and your local database weighs 25 GB.
  22. [Olek] Why is it hard and inconvenient? Because you have to keep your local env up to date, and because it consumes a lot of resources (we do have powerful laptops, but even they have problems with resource consumption from time to time). And if you have some issues with some service it’s hard to