6. build the right thing
every business idea is
a hypothesis until you
get user feedback
7. corollary: don’t waste money on the wrong thing
Standish Group: how often features are used
8. constant user connection
by releasing every day:
your users can be delighted by new stuff all the time
your users get the feeling you are reacting to what they want
9. ability to move quickly
react to the market
explore new revenue streams
10. better aligned people
development & operations close to the business
IT no longer perceived as
the bottleneck
12. time
[timeline diagram, adapted from John Allspaw: "uh oh" (problem happens) → "wheeew" (problem solved); the gap between them is TTR (time to recover) = time to figure out the cause of the problem + time to fix the problem]
15. fast, automated feedback on the
production readiness of your
applications every time there is a
change
whether code, infrastructure, configuration or database
Jez Humble
16. continuous delivery
software always production ready
releases tied to business needs, not IT constraints
minimize the lead time from idea to live
concept to cash
time
small feature chunks
17. systems thinking is [a philosophy] based on the
belief that the component parts of a system can best
be understood in the context of relationships with
each other and with other systems, rather than in
isolation.
18. value stream mapping
process tool for improving the flow from a customer request to
the fulfillment (concept to cash)
emphasis is on improving the whole not just the parts
uses quantitative data to identify waste and inefficiency
originally developed by toyota to assist in the improvement of
manufacturing and supply-chain processes
24. full production pipeline
automated implementation of your system’s
build, deploy, test, approval processes
visibility
traceability
compliance
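The pipeline idea can be sketched as an ordered chain of automated gates. A minimal, illustrative Python sketch (the `run_pipeline` helper and the stage names are assumptions for illustration, not any particular CI tool's API):

```python
# Illustrative sketch of a deployment pipeline: each change moves through
# ordered, automated stages, stopping at the first failure. Stage names
# and pass/fail logic here are made up for demonstration.
def run_pipeline(change, stages):
    """Run each (name, stage) pair in order; a failed stage blocks promotion."""
    results = []
    for name, stage in stages:
        passed = stage(change)
        results.append((name, passed))
        if not passed:
            break  # nothing downstream runs until the failure is fixed
    return results

stages = [
    ("build", lambda c: True),
    ("unit tests", lambda c: True),
    ("deploy to staging", lambda c: True),
    ("acceptance tests", lambda c: c != "bad change"),
]
```

Because every stage records a result, the same structure gives you the visibility and traceability the slide calls for: for any change you can see exactly how far it got and where it stopped.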
25. treat everything like code
check in, automate, test in CI, promote in deployment pipeline
• database: DDL & static data
• deployment automation
• infrastructure/configuration mgmt
• monitoring configuration
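Treating the database as code usually means versioned, ordered migration scripts applied automatically, in the spirit of tools like Flyway or dbdeploy. A minimal sketch (the `schema_version` table and the inline DDL list are assumptions for illustration):

```python
# Minimal sketch of versioned database migrations. In practice each DDL
# statement would live in its own checked-in script; the schema_version
# bookkeeping table is an assumption, not a specific tool's format.
import sqlite3

MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn):
    """Apply only the pending migrations, in order; safe to re-run."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    for version, ddl in MIGRATIONS:
        if version > current:
            conn.execute(ddl)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # re-running is a no-op: applied versions are skipped
```

Because the runner is idempotent, the same script can run in CI, staging, and production, which is exactly what "check in, automate, test in CI, promote in deployment pipeline" asks for.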
27. adhere to the test pyramid
Adapted from Mike Cohn (Automated Test Pyramid)
and Lisa Crispin & Janet Gregory (Agile Testing)
28. trunk based development
[branching diagram: Professor Plum's commits (P1–P5) and Reverend Green's commits (G1–G6) accumulate on long-lived branches (B1, B2) off the Mainline; the longer the branches live, the larger and more painful the merges back to mainline become]
branches discourage refactoring
branches delay integration and hide risk
merging wastes time and is tedious
29. trunk based development
feature toggles let you deploy incomplete features
turned off
branch-by-abstraction lets you make architectural
changes
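A feature toggle can be as simple as a conditional driven by configuration. A minimal Python sketch (the toggle name and config shape are illustrative assumptions, not any particular toggle library's API):

```python
# Minimal feature-toggle sketch: the incomplete feature ships to
# production dark, behind a toggle, while mainline stays releasable.
# Toggle names and the dict-based config are illustrative assumptions.
def render_checkout(toggles):
    """Return the new flow only when its toggle is explicitly on."""
    if toggles.get("new_checkout", False):
        return "new checkout flow"
    return "old checkout flow"

# Deploy with the toggle off; flip it on when the feature is finished.
assert render_checkout({"new_checkout": False}) == "old checkout flow"
assert render_checkout({"new_checkout": True}) == "new checkout flow"
```

Note that a toggle is a temporary seam: once the feature is fully rolled out, the conditional and the flag should be deleted.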
30. consistency from development to production
accidental inconsistency >> necessary inconsistency
deployment process
environment configuration
testing tools
31. pull IT ops onto the delivery team
sit together: biz dev, qa & sysadmin
share KPIs for stability and change
same story wall and iterations
32. concerns
reliability & stability
compliance & traceability
releasing 10 times/day
• don’t need to, just keep your code always production releasable
complexity of my systems
• it's about continuous improvement. start with the low-hanging fruit
it will take investment
• yes it will, but it will also start paying dividends fairly quickly
Going to talk about what, why and how
Want you to walk away with a good picture of CD
And some ideas & questions for your organization
AOL story
- 2 month release delay to a project
- dev was done on Windows; QA and prod were Solaris, so deployment was slow and error-prone
big impact to delivery: if it takes days or weeks to deploy to qa, that means the feedback from qa is massively delayed
- system we were replacing took a weekend to deploy. That meant several people from ops & dev … missing beautiful, well overcast, weekends in london
- conan the deployer script – started out telling people what to do, then automated the steps
that was great for them and it made a difference to the business being able to execute
There must be a better way … Lunch with the VP of Ops for Yahoo. Asked him about the purchase of Flickr …
Flickr story
So what are the key characteristics of what flickr (and facebook and other web2.0 companies) are doing …
So what they are doing is pushing very small chunks into production. Changes so small that they release 10 times a day.
So what are the business drivers that led flickr and other web2.0 companies to this approach?
you don’t know if the hypothesis is correct until you run it against real users in production.
flickr started off as a gaming site with a little side feature that allowed people to store and share photos
another company started off making BASIC interpreters and then discovered the money was in operating systems
chances are high that your idea is wrong. the longer you spend building it before trialing it with real users, the more money you are likely wasting
Some companies can afford to build the wrong thing.
Well, maybe they can afford it.
users will want to come back
even if you aren’t solving their particular problem, they can see you improving things all the time
We all live in markets that change incredibly quickly. It is critical to our survival that we can move fast.
Facebook expects every new developer to push code to production their first day on the job. If they can do that, imagine how fast they can react to change.
Beyond moving quickly, Facebook has another desire. They want to ensure their devs are close to the users …
If you are delivering quickly, your dev & ops people are going to need to be side by side with your business people. They need to be organized by business goals, just like the business people. This aligns their day-in, day-out work to the business, not to an abstract IT sub-department. And this allows them to deliver fast. Dev & ops folks make many business decisions every day, not just technical decisions. If they are close to the business, if they understand the business, they can make smarter decisions. You would be shocked how smart techies can be, even in business.
Some people (enterprisey ones) worry that doing this will damage reliability and stability. people depend on our apps to do their jobs, not just to view vacation pictures. this is exactly the objection yahoo raised with flickr. so back to our story. flickr responded with a challenge. let's compare the data on production uptime for the past year on all yahoo properties with flickr. who had the best uptime? turned out that flickr did … why?
if you release every 3 months there is a huge delta and that means large risk
usual reaction is to lock down, add more signoffs and release less frequently
but that is wrong. you should release more frequently for 2 reasons:
1) you have shorter MTTR
Explain pic
2) If something is difficult, do it early and often. Then you will get very good at it.
the only measure of done is live
people talk about 30% done or dev-complete, but it is usually misleading
done, done done, done done done
to everyone in delivery process: dev, tester, dba, sysadmin, ba, etc
from: compilation, unit tests, functional tests, integration tests, performance tests, etc
fast feedback turns out to be the secret. When you achieve that you …
move away from a model where releases go out monthly or quarterly, to a model where you are constantly delivering value to production from the beginning of the project.
2 beneficial effects of fast, automated feedback
- your software is always production ready. you should be able to press a button and release your software to production at any time
- and crucially, when you press that button, it should be a decision of the business. You shouldn't be constrained by waiting for the next release window in a month. It should be possible for the business to say they want a feature, have it built in a week or two, and then push it into production. And this is a big change, a strategic benefit for the business: if you see a market opportunity and can react and get something out in a couple of weeks, you have an incredible strategic advantage over your competitors.
requires big changes in an organization. so how do we approach this?
So what is the first step?
Show of hands – who here has done a value stream map? This came out of Toyota back in the late 90’s. It can really help an organization look for friction points.
Start by mapping how your organization delivers, who is involved, how do changes get promoted downstream, etc
Collect metrics: lead time, process time, % complete and accurate (how often are stories actually ready for dev when they reach dev? how often does work marked ready for QA turn out to have bugs?)
This will tell you where the constraints are in your delivery process. Remember that the reason UAT is slow might be something done in coding (a completely different step)
Find the bottlenecks, improve them, iterate
This is about continuous improvement. You are never done.
I’m only going to talk about the part from coding to release. Now that might not be the constraint in your process. It might be how you manage requirements and ideas. You should look at your highest priority constraints first.
Let’s assume your constraint is in here (yellow boxes). How do you improve the delivery process.
back to the year 2000. non-techies, don't worry: this is the only code quiz in my presentation.
TW's most important 4 lines of code – what are they?
It says: every 60 seconds, check source control for changes; if there are any, build and test.
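That loop can be sketched in a few lines of Python (the `has_new_commits` and `build_and_test` callables stand in for your VCS polling and build tooling; the extra parameters exist so the sketch can terminate):

```python
# Hedged sketch of the classic CI loop: poll source control, and build
# and test whenever something changed. has_new_commits/build_and_test
# are placeholders for your actual VCS check and build command.
import time

def ci_loop(has_new_commits, build_and_test, poll_seconds=60, max_polls=None):
    polls = 0
    while max_polls is None or polls < max_polls:
        if has_new_commits():   # e.g. fetch and compare against last-built revision
            build_and_test()    # e.g. run your build script and test suite
        polls += 1
        time.sleep(poll_seconds)
```

Real CI servers add build queues, notifications, and artifact handling, but this poll-then-build loop is the core idea.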
Who here is doing CI?
Who’s doing trunk based development?
Does integrated mean finished?
No, it doesn't. There are a lot of things that need to happen before production.
If you look at this pipeline though you will notice a key characteristic. The further to the left you are the faster the feedback. The absolute fastest is a local dev build. Then comes CI.
So we want to move as much stuff from the right to the left as possible. That will give us the best fast feedback.
And we want to make moving through the pipeline as frictionless as possible so that we can do it as quickly as possible. So we will want to automate the pipeline as much as possible …
this gives you: you know who did what, when, where and why. from requirements to dev to test to production.
you can see for every checkin which stage of the delivery process it passed: compile, package, publish, unit, smoke, regression,
you can see what other components it integrates with and depends on
you can see for all changes: code, db, config, infra, whether it breaks something
operations people can see what is coming down the pipeline
testers can see what is ready to deploy
So what are key practices that support this:
Database migrations
Infrastructure as code
Automated monitoring
naming machines as an antipattern
do nothing manually. use automation to ensure they are identical
even logging into a server is an antipattern
consider phoenix server concept – never maintain, just redeploy
This slide may start a holy war …
Hands up if you do CI (keep hands up)
Put your hands down if you don’t check into mainline at least once a day
Put your hands down if you use feature branches that aren’t merged into mainline daily
At one company, the first task in a story was always to create a flag that turns the new functionality off
Remember to delete flags!!!!
While keeping systems releasable the whole time!!!!
We deployed this morning with one of our more Enterprise clients. They’re actually getting better, but they’re not done with their CD journey yet. The deploy was at 4 AM – my alarm went off at 3. And one of the first things we said as we dialed in was “I wonder what the surprise will be!” – because production isn’t the same as staging by a long shot.
DevOps is a hot topic these days. We think it's more of a culture, a set of principles, than a methodology – but it's important. You can start really easily by getting your dev team to invite the ops team to lunch.