A talk about DevOps that I gave at a SysARmy meetup while visiting MuleSoft's Buenos Aires DevOps team. I've been thinking a lot recently about what DevOps is and what it means to be a DevOps Engineer (or, in my case, a DevOps Engineering Manager). Putting this together was really helpful for clarifying some ideas I've been kicking around.
2. A little about me…
• Managing DevOps at MuleSoft
• Senior SRE at Netflix
• Lead SRE at Domino’s Pizza
3. …and what I’m about…
• Running large-scale Linux installations (100s to
10,000s of instances)
• Maintaining 99.99%+ uptime for critical business
services
• Building Systems Engineering teams
6. Proposed Def. 1
Adjective: A buzzword substituted anywhere you
might see the words ‘operations’ or ‘systems’, e.g.
‘operations engineer’ becomes ‘devops engineer’,
‘systems monitoring’ becomes ‘devops monitoring’,
etc.
16. What happened here?
• Dev and Ops have totally different priorities.
• Opposition of priorities leads to inevitable conflict.
• Since neither team is responsible for the whole
project lifecycle, communication breaks.
17. What do?
• This model is flawed by design.
• Ops used to hold all the cards because they
controlled the infrastructure.
• With public cloud, tables are turned.
• If Ops doesn’t start a new conversation, they (we)
will fade into obsolescence.
19. Proposed Defs. 2, 3
Noun: A methodology attempting to close the gap
between Dev and Ops by making everyone
responsible for the whole project lifecycle.
Adjective: A descriptor for staff responsible for
DevOps.
23. Caution…
The goal of DevOps is to make everyone aware of and
accountable for the whole product lifecycle, including
release and maintenance.
If we’re not careful, a DevOps engineer becomes an
Ops engineer, and the whole mission is corrupted.
24. Protip
We can take a page from DevOps’ older brother,
Agile.
The path to success as a ‘DevOps Engineer’ is to act
as a DevOps coach for the product team.
25. Centralized vs. Embedded
• I’ve seen both.
• The only problem with a centralized DevOps team
is that it’s actually an Operations team.
• To succeed, it’s absolutely imperative that the
DevOps engineer be part of the product team.
26. Embedding DevOps
• Attend the Dev team’s meetings
• Help with architecture and design (as early as
possible!)
• Assist with troubleshooting and building
proof-of-concept systems
• Generally, be a value-add to the dev effort
27. What about Ops?
Someone still needs to do all that systems & networking
stuff, though…
• Continuous Integration
• Production Changes
• Infrastructure Provisioning and Configuration
• Monitoring & Alerting
• Incident Response
28. Dividing the Kingdom
The right division of responsibility for these tasks is
going to vary depending on your business and your
teams.
Typically, businesses that are more ‘enterprise-y’ need
firmer processes and more division between ops and
dev.
29. Continuous Integration
Continuous Integration (CI) and continuous delivery
(CD) reduce project risk. DevOps is usually well
positioned to help set up and manage these
services.
• Continuous Integration: get notified early when a
change is committed that breaks some downstream
build.
• Continuous Delivery: keep the master branch in a
deployable state at all times, and deploy often.
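The "get notified early" idea can be sketched as follows. This is a minimal illustration, not any particular CI server's API: the `DOWNSTREAM` dependency map, the project names, and the `run_build` callback are all hypothetical. When a project changes, the sketch walks the map to find every build that consumes it and reports the first one that breaks.

```python
from collections import deque

# Hypothetical dependency map: each project lists the projects that consume it.
DOWNSTREAM = {
    "core-lib": ["billing-api", "search-api"],
    "billing-api": ["web-frontend"],
    "search-api": ["web-frontend"],
    "web-frontend": [],
}


def builds_to_run(changed, downstream=DOWNSTREAM):
    """Return the changed project plus everything downstream of it, breadth-first."""
    seen = [changed]
    queue = deque([changed])
    while queue:
        project = queue.popleft()
        for consumer in downstream.get(project, []):
            if consumer not in seen:
                seen.append(consumer)
                queue.append(consumer)
    return seen


def first_broken_build(changed, run_build, downstream=DOWNSTREAM):
    """Run each affected build in order; return the first failure, or None."""
    for project in builds_to_run(changed, downstream):
        if not run_build(project):  # run_build is an assumed callback: name -> bool
            return project
    return None
```

In a real setup `run_build` would trigger the build job and the returned name would feed a notification to whoever committed the change.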
30. Production Changes
Some change process is always necessary, even if
the process is no process.
• Consumer firms can be really light, like Netflix:
  • Devs deploy; changes are audited (project
  Chronos) but not approved by anyone.
• Enterprise firms will need change controls,
approvals, etc.
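The two ends of that spectrum can be sketched as one function with a policy switch. This is an illustration under assumed names (`Change`, `record_and_check`, `require_approval`); it does not represent how Netflix's Chronos or any real change-control tool works. Every change is written to the audit log either way; only the enterprise-style policy blocks unapproved changes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    author: str
    summary: str
    approved_by: Optional[str] = None  # None means nobody has signed off


def record_and_check(change, audit_log, require_approval):
    """Audit every change; allow it unless policy demands a missing approval."""
    audit_log.append(change)  # recorded regardless of the outcome
    if require_approval and change.approved_by is None:
        return False
    return True


audit_log = []
deploy = Change(author="dev-a", summary="bump cache TTL")
light_process = record_and_check(deploy, audit_log, require_approval=False)
strict_process = record_and_check(deploy, audit_log, require_approval=True)
```

Here `light_process` is allowed and `strict_process` is blocked, yet both attempts appear in `audit_log`.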
31. Infrastructure Management
The most critical thing is to know exactly what is
running, where it is, and what it’s for. Never
provision or support a service you don’t
understand!
To accomplish this, configuration management is
crucial. Develop your templates, commit them to
source control, version them, maintain them.
Public cloud makes it possible to use templates for
infrastructure elements (e.g. AWS CloudFormation).
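As a rough sketch of what such a template might look like, the snippet below builds a small CloudFormation-style document for a single EC2 instance and renders it as JSON for committing to source control. The helper `instance_template`, the service and team names, and the `ami-PLACEHOLDER` image ID are assumptions for illustration; the tags are one way to record what a resource is and what it's for.

```python
import json


def instance_template(service, owner, purpose, image_id, instance_type="t3.micro"):
    """Build a minimal CloudFormation-style template for one tagged instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"{service}: {purpose}",
        "Resources": {
            "AppInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,  # placeholder; supply a real AMI ID
                    "InstanceType": instance_type,
                    "Tags": [
                        {"Key": "Service", "Value": service},
                        {"Key": "Owner", "Value": owner},
                        {"Key": "Purpose", "Value": purpose},
                    ],
                },
            }
        },
    }


template = instance_template(
    "billing-api", "payments-team", "invoice generation", image_id="ami-PLACEHOLDER"
)
# sort_keys keeps the rendered file stable, so diffs in source control stay small.
rendered = json.dumps(template, indent=2, sort_keys=True)
```

Writing `rendered` to a versioned file gives a reviewable history of what was provisioned and why.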
32. Monitoring and Alerting
This is a huge topic; we could easily do a few hours just on
monitoring and alerting. Here’s the TL;DR:
• Treat monitoring as testing for your running services &
systems
• Cover several different levels of abstraction
• Aggregate the event streams to one gateway and manage
them there
• The most ‘devops-y’ thing to do is to have developers set
up, receive and respond to their own alerts.
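The points above can be sketched together: checks behave like tests (pass or fail), they run at several levels, and every result goes through one gateway that routes failures to the team owning the service. The `Event` and `Gateway` classes, the level names, and the team names are hypothetical, not taken from any monitoring product.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    service: str
    level: str   # assumed levels, e.g. "host", "service", "business"
    check: str   # human-readable description of the assertion
    ok: bool     # True if the check passed, like a passing test


@dataclass
class Gateway:
    owners: dict                               # service name -> owning team
    inbox: dict = field(default_factory=dict)  # team -> list of failed events

    def publish(self, event):
        """Drop passing checks; route failures to the service's owning team."""
        if event.ok:
            return None
        team = self.owners.get(event.service, "unowned")
        self.inbox.setdefault(team, []).append(event)
        return team


gateway = Gateway(owners={"billing-api": "payments-team"})
routed = gateway.publish(Event("billing-api", "service", "GET /health returns 200", ok=False))
ignored = gateway.publish(Event("billing-api", "host", "disk usage below 90%", ok=True))
orphan = gateway.publish(Event("legacy-cron", "host", "responds to ping", ok=False))
```

Failures for a service nobody claims land in an `unowned` bucket, which is itself a useful signal that something is running without a clear owner.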
33. Incident Management
Things go wrong. Who responds, how do they handle
it, how do you escalate?
• Process will be entirely dependent on the needs
of the business
• At minimum, though, there should be an
understanding about on-call, escalations,
response process, and investigation process.
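One way to make the on-call and escalation understanding explicit is to write it down as data. The sketch below assumes a simple time-based policy; the roles and the 15- and 30-minute thresholds are invented for illustration, and the right values depend on the business.

```python
# Hypothetical policy: (minutes an alert has gone unacknowledged, who gets paged).
ESCALATION_POLICY = [
    (0, "primary on-call"),
    (15, "secondary on-call"),
    (30, "engineering manager"),
]


def who_to_page(minutes_unacknowledged, policy=ESCALATION_POLICY):
    """Return everyone who should have been paged by this point in the incident."""
    return [who for after, who in policy if minutes_unacknowledged >= after]
```

For example, `who_to_page(20)` returns the primary and secondary on-call, and the manager is added once the alert has gone unacknowledged for 30 minutes.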
38. Takeaways
• DevOps is something a whole organization does.
• The purpose of it is to avoid conflicts caused by the
old division of responsibility.
• No ‘DevOps Engineer’ can do this by themselves.
• ‘DevOps Tools’ are great, but using them doesn’t
mean you’re doing DevOps.
39. The Real Takeaway
If you share responsibility for the entire project with the
whole team, everything else will fall into place.