DevOpsDays Austin 2013
@ernestmueller| @bazaarvoice
2012: A Release Odyssey
Hi, I’m Ernest Mueller from Bazaarvoice here in Austin. We’re the biggest SaaS company you’ve never heard of;
our primary application is for the collection and display of user generated content – for example, ratings and
reviews – and a lot of the biggest Internet retailers use our solution on their sites for that purpose. We pushed
out more than 1bn reviews last Cyber Monday. I’m going to tell you how we went from releasing our code once
every ten weeks to once a week in a pretty short time.
The Monolith! Bazaarvoice Conversations, aka PRR, has 15,000 files and 4.9M lines of code, the oldest from Feb
2006, and that’s not counting UI versions, customer config, or operations code repos (all of which get released
along with it). Written by generations of coders, including outsourcer partners.
It runs across 1200 hosts in 4 datacenters: Rackspace, and AWS East, West, and Ireland.
So by any measure this was a large legacy system.
BV had gone agile and said “Let’s release more quickly too! All the cool kids are doing it! We’re doing two week
sprints, so let’s release biweekly - go!”
They tried it two weeks after a big ten-week release, and PRR v5.1 launched on January 19th, 2012.
Whoops, it’s not that easy - 44 client tickets logged, mass hysteria. “Let’s not do that again!”
Enter yours truly on January 30th. “You’re hired! We want biweekly releases in a month. With zero user facing
downtime. Failure is not an option! Go!”
It wasn’t just an irrational need for speed; the product organization wanted to get faster A/B testing, more
piloting, etc. and the engineering team wanted the benefits of a more continuous flow as well.
Careful analysis of the situation was warranted. Luckily a SWAT team had been analyzing the problem already.
The two major impediments, which are frequently encountered factors in legacy implementations:
• Lack of automation in testing - testing was a huge burden and couldn’t be done sufficiently in the time
allotted
• Poor SCM code discipline - checkins continuing up to the release
Path One - Testing! We hired up QA automation people and set them to work. We set the expectation, backed
up strongly by the product team, that the development teams had to stop and do three testing sprints. We have
a standard four-environment setup - dev, QA, staging, production.
JUnit testing and CIT testing in TeamCity were ramped up.
A Selenium-based “Testmaster” system was used to improve the level of regression automation to safe levels.
More importantly perhaps, we adopted a new discipline of not running all the tests all the time - feature/story
tests in dev, regression in QA, smoke testing in staging and production.
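The "not all the tests all the time" rule can be pictured as a simple lookup from environment to test suite. This is only a sketch of the idea; the suite names and the `suites_for` helper are made up for illustration and are not taken from the actual Testmaster or TeamCity setup.

```python
# Hypothetical mapping of environment -> test suites, following the rule
# "feature/story in dev, regression in QA, smoke in staging and production".
SUITES_BY_ENV = {
    "dev": ["feature_story"],
    "qa": ["regression"],
    "staging": ["smoke"],
    "production": ["smoke"],
}


def suites_for(env):
    """Return the list of test suites that should run in the given environment."""
    try:
        return SUITES_BY_ENV[env]
    except KeyError:
        # Fail loudly on a typo rather than silently running nothing.
        raise ValueError("unknown environment: %r" % env) from None
```

The point of keeping it this explicit is that each environment runs only the tests that earn their time there, instead of every suite everywhere.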
Branching - changed over to a trunk/release branch model, splits off every 2 weeks, no commits to branch
without going through a code freeze break process. Process enforcement via wiki!
Trunk goes to dev twice daily, branch goes to QA, when labeled “verified” it goes to staging and then to
production.
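The promotion path for a release branch build can be sketched as a small gate function. This is an illustration under assumptions: the real process was enforced via wiki and existing tooling, and the `next_environment` name and label handling here are hypothetical.

```python
from typing import Optional, Set

# Order in which a release-branch build moves forward (trunk goes to dev separately).
PROMOTION_PATH = ["qa", "staging", "production"]


def next_environment(current: str, labels: Set[str]) -> Optional[str]:
    """Return the next environment for a branch build, or None if it must stay put."""
    if current not in PROMOTION_PATH:
        raise ValueError("unknown environment: %r" % current)
    if current == "production":
        return None  # end of the line
    if current == "qa" and "verified" not in labels:
        return None  # QA has not labeled the build "verified" yet
    return PROMOTION_PATH[PROMOTION_PATH.index(current) + 1]
```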
We also had a team write a feature flagging system, like the cool kids use, so we could launch features dark and
then enable them later.
We made the rule that all new features must be launched dark.
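For anyone who hasn't seen one, a feature flagging system in its simplest form looks something like the sketch below. It is not the system our team wrote; the `FeatureFlags` class and the flag name are invented purely to show what "launched dark, enabled later" means.

```python
class FeatureFlags:
    """Minimal in-memory feature flag registry; every flag starts dark (off)."""

    def __init__(self):
        self._flags = {}

    def register(self, name):
        # New features are launched dark: registered, but disabled.
        self._flags.setdefault(name, False)

    def enable(self, name):
        if name not in self._flags:
            raise KeyError("unregistered flag: %r" % name)
        self._flags[name] = True

    def is_enabled(self, name):
        # Unknown flags are treated as off, so a typo never turns a feature on.
        return self._flags.get(name, False)


flags = FeatureFlags()
flags.register("new_review_widget")  # hypothetical flag name
dark_at_launch = flags.is_enabled("new_review_widget")
flags.enable("new_review_widget")
live_later = flags.is_enabled("new_review_widget")
```

The code ships in the release with the flag off, and turning it on later is a separate, much smaller decision.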
We couldn’t fix a couple things in time.
Our Solr indexes are 20 GB, and reindexing and distributing them while doing a zero downtime deployment and
keeping replication lag down needed more engineering.
And our build and deploy system was pretty bad. It’s buzzword compliant - svn, TeamCity, maven, yum,
puppet, rundeck, noah - but it’s actually a bit of a spaghetti mess in a big crufty bash framework; builds take more
than an hour and deploys take 3+ hours.
We got a delay of game due to our IPO and then were “no go” March 1. We were under a lot of management
pressure to ship, but tests weren’t passing and at the new go/no-go meeting the dev managers sucked it up and
declared “no go.”
First biweekly release - PRR 5.2 went out on March 6, 5 days late. 5 issues were reported by customers.
5.3 went out March 22, 1 issue reported. 5.4 went out April 5, zero issues reported. I kept in depth release
metrics - number of checkins, number of process faults, number of support tickets - and they showed consistent
improvement.
It took a lot of collaboration and good old fashioned project management. Product, QA, DevOps, various
engineering teams, Support, and other stakeholders had to all get on the same page.
We didn’t really change tooling besides adding the feature flagging - still Confluence, JIRA, and all our other
tools - just using them more effectively.
http://www.flickr.com/photos/senorwences/2366892425/
And the release train kept spinning. We had one major disaster on May 17, when a major architectural change to
our product feeds went out in a release and generated 28 client reported issues (from a nice rolling average of
0.5). We enhanced our process to link each svn checkin to a ticket and put together a page requiring per-ticket
signoff from the release, and started tracking more quality metrics. This got us consistently smooth releases
through the summer of 2012.
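The checkin-to-ticket link and the per-ticket signoff can both be expressed as simple checks over commit messages. The sketch below assumes JIRA-style ticket keys appear in the commit message; the `PRR-` project key, the regex, and the function names are all hypothetical and not how the actual signoff page was built.

```python
import re

# JIRA-style key such as "PRR-123" (the project key is an assumption).
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def tickets_in(message):
    """Return the sorted, de-duplicated ticket keys mentioned in a commit message."""
    return sorted(set(TICKET_RE.findall(message)))


def unlinked_checkins(messages):
    """Commit messages that reference no ticket at all (a process fault)."""
    return [m for m in messages if not tickets_in(m)]


def unsigned_tickets(messages, signed_off):
    """Tickets referenced by the release's checkins that nobody has signed off yet."""
    referenced = set()
    for message in messages:
        referenced.update(tickets_in(message))
    return sorted(referenced - set(signed_off))
```

A release would only be a "go" when both lists come back empty.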
But we weren’t done there. We wanted to totally pwn the old way, and the next step was weekly releases. There
were still some parts of the process that were manual and painful, and we were still having some “misses”
causing production issues. “If it’s painful, do it more often” is a message that some folks still balk at,
but it is absolutely true.
This was a lot easier - the QA team worked in the background to get the test coverage numbers up and then we
said to the teams, “We’re going weekly in two weeks... Same process otherwise.”
Version 6.7 launched on September 27, a week after 6.6. Client reported issues stemming from a code release
have averaged around zero since that time.
Solr index distribution was automated; the indexes get regenerated beforehand, shipped out to the data centers,
brought up to date, and then swapped in during releases.
Solr reindexing automation went live October 18, 2012.
Then we trained the developers to take over the release process.
We skipped some releases during Black Friday, but are shipping PRR 9.0 this week (in most of our absence!).
As I mentioned, our build and deployment is already automated (somewhat sketchily) with TeamCity, puppet,
Rundeck, and noah.
Our next step in killing off the old way is in progress by renovating our build system - moving to git with gerrit
for code reviewing, and upgrading our TeamCity installation so it can be API controlled - and fixing the crappy
CIT tests that have been languishing there. We have trouble currently with failing CIT because we don’t block
people on it, because the failures are intermittent. We’ll get build and CIT running fast (currently a 1 hour build
and a 40 minute CIT).
After that we will get rid of the bash-spaghetti deployment system we have and make deploys faster and
better (currently 3 hours). We’re removing the separate staging roll (staging = production because it’s client
facing) and going to continuous deployment off trunk to our QA system. Some of this is technology-faster and
some is process-faster - having to promote up four environments, when it takes 4 hours per, and when staging
and production have to happen in maintenance windows, is slow.
And eventually... Continuous deployment. The cloud kids get to start there, but it takes some heavy lifting to
get a large, established system there. But that’s the sequel, 2013: A Release Odyssey.
And that’s my story!
Hit me up at theagileadmin.com
And thanks to 2001: A Space Odyssey for all the screen caps I used as part of this presentation.

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

2012 - A Release Odyssey

  • 1. DevOpsDays Austin 2013 | @ernestmueller | @bazaarvoice | 2012: A Release Odyssey. Hi, I’m Ernest Mueller from Bazaarvoice here in Austin. We’re the biggest SaaS company you’ve never heard of; our primary application is for the collection and display of user generated content – for example, ratings and reviews – and a lot of the biggest Internet retailers use our solution on their sites for that purpose. We pushed out more than 1 billion reviews last Cyber Monday. I’m going to tell you how we went from releasing our code once every ten weeks to once a week in a pretty short time.
  • 2. The Monolith! Bazaarvoice Conversations, aka PRR, has 15,000 files and 4.9M lines of code, the oldest from Feb 2006, and that’s not counting UI versions, customer config, or operations code repos (all of which get released along with it). Written by generations of coders, including outsourcer partners. It runs across 1200 hosts in 4 datacenters; Rackspace, and AWS East, West, and Ireland. So by any measure this was a large legacy system.
  • 3. BV had gone agile and said “Let’s release more quickly too! All the cool kids are doing it! We’re doing two-week sprints, so let’s release biweekly - go!” They tried it two weeks after a big ten-week release, and PRR v5.1 launched on January 19th, 2012. Whoops, it’s not that easy - 44 client tickets logged, mass hysteria. “Let’s not do that again!”
  • 4. Enter yours truly on January 30th. “You’re hired! We want biweekly releases in a month. With zero user-facing downtime. Failure is not an option! Go!” It wasn’t just an irrational need for speed: the product organization wanted faster A/B testing, more piloting, etc., and the engineering team wanted the benefits of a more continuous flow as well.
  • 5. Careful analysis of the situation was warranted. Luckily a SWAT team had been analyzing the problem already. The two major impediments, which are frequently encountered factors in legacy implementations, were a lack of automation in testing (testing was a huge burden and couldn’t be done sufficiently in the time allotted) and poor SCM code discipline (checkins continuing up to the release).
  • 6. Path One - Testing! We hired up QA automation people and set them to work. We set the expectation, backed up strongly by the product team, that the development teams had to stop and do three testing sprints. We have a standard four-environment setup - dev, QA, staging, production.
  • 7. JUnit testing and CIT testing in TeamCity were ramped up. A Selenium-based “Testmaster” system was used to improve the level of regression automation to safe levels. More importantly perhaps, we adopted a new discipline of not running all the tests all the time: feature/story tests in dev, regression in QA, and smoke testing in staging and production.
  • 8. Branching - we changed over to a trunk/release branch model: a release branch splits off every 2 weeks, with no commits to the branch without going through a code freeze break process. Process enforcement via wiki! Trunk goes to dev twice daily and the branch goes to QA; when labeled “verified” it goes to staging and then to production.
  • 9. We also had a team write a feature flagging system, like the cool kids use, so we could launch features dark and then enable them later. We made the rule that all new features must be launched dark.
  • 10. We couldn’t fix a couple of things in time. Our Solr indexes are 20 GB, and reindexing and distributing them while doing a zero downtime deployment and keeping replication lag down needed more engineering. And our build and deploy system was pretty bad. It’s buzzword compliant - svn, TeamCity, Maven, yum, Puppet, Rundeck, noah - but it’s actually a bit of a spaghetti mess in a big crufty bash framework; builds take more than an hour and deploys take 3+ hours.
  • 11. We got a delay of game due to our IPO and then were “no go” March 1. We were under a lot of management pressure to ship, but tests weren’t passing and at the new go/no-go meeting the dev managers sucked it up and declared “no go.”
  • 12. First biweekly release - PRR 5.2 went out on March 6, 5 days late. 5 issues were reported by customers. 5.3 went out March 22, 1 issue reported. 5.4 went out April 5, zero issues reported. I kept in-depth release metrics - number of checkins, number of process faults, number of support tickets - and they showed consistent improvement.
  • 13. It took a lot of collaboration and good old-fashioned project management. Product, QA, DevOps, various engineering teams, Support, and other stakeholders all had to get on the same page. We didn’t really change tooling besides adding the feature flagging - still Confluence, JIRA, and all our other tools - just using them more effectively. http://www.flickr.com/photos/senorwences/2366892425/
  • 14. And the release train kept spinning. We had one major disaster on May 17, when a major architectural change to our product feeds went out in a release and generated 28 client-reported issues (from a nice rolling average of 0.5). We enhanced our process to link each svn checkin to a ticket, put together a page requiring per-ticket signoff for the release, and started tracking more quality metrics. This got us consistently smooth releases through the summer of 2012.
  • 15. But we weren’t done there. We wanted to totally pwn the old way, and the next step was weekly releases. There were still some parts of the process that were manual and painful, and we were still having some “misses” causing production issues. “If it’s painful, do it more often” is a message that some folks still balk at, but it is absolutely true.
  • 16. This was a lot easier - the QA team worked in the background to get the test coverage numbers up and then we said to the teams, “We’re going weekly in two weeks... Same process otherwise.” Version 6.7 launched on September 27, a week after 6.6. Client-reported issues stemming from a code release have averaged around zero since that time. Solr index distribution was automated; the indexes get regenerated beforehand, shipped out to the data centers, brought up to date, and then swapped in during releases. Solr reindexing automation went live October 18, 2012. Then we trained the developers to take over the release process. We skipped some releases during Black Friday, but are shipping PRR 9.0 this week (mostly in our absence!).
  • 17. As I mentioned, our build and deployment is already automated (somewhat sketchily) with TeamCity, Puppet, Rundeck, and noah. Our next step in killing off the old way is in progress: renovating our build system - moving to git with gerrit for code reviewing, and upgrading our TeamCity installation so it can be API controlled - and fixing the crappy CIT tests that have been languishing there. We currently have trouble with failing CIT because the failures are intermittent, so we don’t block people on it. We’ll get build and CIT running fast (currently a 1 hour build and a 40 minute CIT run).
  • 18. After that we will get rid of the bash-spaghetti deployment system we have and make deploys faster and better (currently 3 hours). We’re removing the separate staging roll (staging = production because it’s client facing) and going to continuous deployment off trunk to our QA system. Some of this is technology-faster and some is process-faster - having to promote up four environments, when it takes 4 hours per, and when staging and production have to happen in maintenance windows, is slow.
  • 19. And eventually... Continuous deployment. The cloud kids get to start there, but it takes some heavy lifting to get a large, established system there. But that’s the sequel, 2013: A Release Odyssey.
  • 20. And that’s my story! Hit me up at theagileadmin.com. And thanks to 2001: A Space Odyssey for all the screen caps I used as part of this presentation.
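
As an illustration of the "launch dark, enable later" rule from slide 9, here is a minimal sketch in Python. It is not the Bazaarvoice feature flagging system, whose design the talk does not describe. The class `FeatureFlags`, the flag name `new_review_layout`, the client names, and the idea of enabling per client are all assumptions made for the example.

```python
class FeatureFlags:
    """Hypothetical in-memory flag store; every flag starts off ("dark")."""

    def __init__(self):
        # Maps flag name -> set of clients the feature is enabled for.
        self._enabled = {}

    def register(self, name):
        # New features ship dark: registering a flag never turns it on.
        self._enabled.setdefault(name, set())

    def enable(self, name, client):
        # Switch the feature on for one client, after the release is out.
        self._enabled.setdefault(name, set()).add(client)

    def is_enabled(self, name, client):
        # Unknown flags are treated as dark rather than raising an error.
        return client in self._enabled.get(name, set())


def render_reviews(flags, client):
    # Both code paths are deployed; the flag picks one at request time.
    if flags.is_enabled("new_review_layout", client):
        return "new layout"
    return "old layout"


flags = FeatureFlags()
flags.register("new_review_layout")            # released, but still dark
before = render_reviews(flags, "client-a")     # old path is served

flags.enable("new_review_layout", "client-a")  # enabled later, no deploy
after = render_reviews(flags, "client-a")      # new path for client-a
other = render_reviews(flags, "client-b")      # still dark for client-b
```

The point of the pattern is that shipping the code and exposing the feature become separate decisions, which is what makes frequent releases of unfinished or unproven features tolerable.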