Deploying software and controlling infrastructure quickly and safely is a hard task.
In this talk, Brice Fernandes, Customer Success Engineer at Weaveworks, discusses GitOps, an operational model for Kubernetes and beyond to speed up development, while retaining extremely strong security guarantees. Brice describes and shows several open source tools developed at Weaveworks to support this approach. You will have a good idea of how to use the GitOps principles to create software pipelines that are fast, safe, and reproducible, while creating clear and high quality audit trails.
Check out the full presentation on YouTube: https://youtu.be/QdCwUUtcj4I
8. 8
GitOps is...
An operation model
Derived from CS and operation knowledge
Technology agnostic (name notwithstanding)
9. 9
GitOps is...
An operation model
Derived from CS and operation knowledge
Technology agnostic (name notwithstanding)
A set of principles (Why instead of How)
10. 10
GitOps is...
An operation model
Derived from CS and operation knowledge
Technology agnostic (name notwithstanding)
A set of principles (Why instead of How)
Although
Weaveworks
can help
with how
11. 11
GitOps is...
An operation model
Derived from CS and operation knowledge
Technology agnostic (name notwithstanding)
A set of principles (Why instead of How)
A way to speed up your team
25. 25
1 The entire system is described declaratively.
2 The desired system state is versioned
3 Approved changes to the desired state are
automatically applied to the system
4 Software agents ensure correctness
and alert on divergence
37. Typical CICD pipeline
Continuous Integration
Cluster API
Continuous Delivery/Deployment
Container
Registry
CI
Code
Repo
Dev RW
CI credsGit creds
RW
CR creds3
RO
RW
API creds
CR creds1
Shares credentials cross several logical security boundaries.
Boundary
RO RW
Container
Registry (CR)
creds2
38. Cluster API
GitOps pipeline
Container
Registry
CI
Code
Repo
Dev RO
CR creds2
CI credsGit creds
RO
Deploy
CR creds3
RO
RW
Config repo
creds
CR creds1
Credentials are never shared across a logical security boundary.
RW RW
RW
Cluster API
creds
Canonical desired
state store
Config Repo
39. Cluster API
GitOps pipeline
Container
Registry
CI
Code
Repo
Dev RO
CR creds2
CI credsGit creds
RO
Deploy
CR creds3
RO
RW
Config repo
creds
CR creds1
Credentials are never shared across a logical security boundary.
RW RW
RW
Cluster API
creds
Operator RW Config Repo
40. Operator
Cluster API
GitOps pipeline
Container
Registry
CI
Code
Repo
Dev RO
CR creds2
CI credsGit creds
RO
Deploy
CR creds3
RO
RW
Config repo
creds
CR creds1
Credentials are never shared across a logical security boundary.
RW RW
RW
Cluster API
creds
RW Config Repo
Process & constraints
enforcement
41. Operator
Cluster API
GitOps pipeline
Container
Registry
CI
Code
Repo
Dev RO
CR creds2
CI credsGit creds
RO
Deploy
CR creds3
RO
RW
Config repo
creds
CR creds1
Credentials are never shared across a logical security boundary.
RW RW
RW
Cluster API
creds
RW Config Repo
Exceptional auditing
and attribution*
42. 42
● Trivialises rollbacks
● Exceptional auditing and attribution*
● Separation of concerns
● No crossing security boundary
● Process & constraints enforcement
● Great Software ↔ Human collaboration point
● Easy to validate for correctness (Policies)
● System can self heal
Why should we care?
43. 43
● Trivialises rollbacks
● Exceptional auditing and attribution*
● Separation of concerns
● No crossing security boundary
● Process & constraints enforcement
● Great Software ↔ Human collaboration point
● Easy to validate for correctness (Policies)
● System can self heal
Why should we care?
* If you’ve secured your Git repositories properly
45. Move from access to cluster to access to
repository.
...So how to secure your repository?
Moving the burden of security
45
46. Mitigating user impersonation
46
1. Enforce Strong Identity in VCS (GitHub/GitLab)
with GPG Signed Commits *
2. Use Physical GPG Keys to increase security
3. Run GPG-Validating Code in CI
47. 47
Signing each commit is [a bad idea]. It just
means that you automate it, and you make the
signature worth less. It also doesn't add any real
value, since the way the git DAG-chain of SHA1's
work, you only ever need one signature to make
all the commits reachable from that one be
effectively covered by that one. So signing each
commit is simply missing the point.
* A word from above
53. 53
1) Estimated time needed to fix
prod software bugs ~60% less
time after switching
2) Estimated time to respond to
customer requests ~43% less
time after switching
3) Service uptime went to 100%
from 99%
Customer tools: Jenkins, GKE and
Weave Cloud. Apps: web, ML.
The GitOps effect
72. 72
What if you could roll back your sales processes?
What if you could create a pull request on your
organisation policy?
What if you could get it reviewed and approved in an
afternoon?
What if your entire business could be reproduced the
same way a git repository is cloned?
74. 74
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
*“stress-reduced”
76. 76
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
*“stress-reduced”
78. 78
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
*“stress-reduced”
80. 80
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
*“stress-reduced”
83. 83
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
*“stress-reduced”
90. 90
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
*“stress-reduced”
91. 91
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
92. 92
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
Dozens of changes per day
with a very small team
93. 93
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
Dozens of changes per day
with a very small team
Incredibly fast
regression response
94. 94
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
Dozens of changes per day
with a very small team
Incredibly fast
regression response
Permissive approach
to production access
95. 95
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
Dozens of changes per day
with a very small team
Incredibly fast
regression response
Permissive approach
to production access
Excellent developer experience
96. 96
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
Dozens of changes per day
with a very small team
Incredibly fast
regression response
Permissive approach
to production access
Excellent developer experience
Stress-free on-call*
97. 97
Kubernetes operator (Flux, Open Source)
Multiple clusters (staging and prod)
CD into staging
Promotion from staging to prod
Kubernetes
Automated diff tools
(*diff operators, Open Source)
Dashboard definitions in Git
(Grafanalib, Open Source)
Alert definitions in git
Read-only access to production
for all developers
Gated, PR-driven changes to
production
⇒
< 30 minute total disaster recovery
Dozens of changes per day
with a very small team
Incredibly fast
regression response
Permissive approach
to production access
Excellent developer experience
Stress-free on-call*
*“stress-reduced”
98. Where to find out more
98
Search for “Weaveworks GitOps” in your favourite search engine
Take a look at our opensource work on https://github.com/weaveworks
Questions?
Weaveworks
@weaveworks
https://weave.works
Brice Fernandes
@fractallambda
brice@weave.works