Leaving the platform: branching for independent systems at thetrainline
1. Leaving the Platform
Branching for independent
products at thetrainline.com
Owain Perry and Matthew Skelton
London Continuous Delivery #londoncd
25 September 2013
2. • Owain Perry
– Software architect at thetrainline.com
– @owainperry
– http://owainperry.com
• Matthew Skelton
– Build and deployment architect at thetrainline.com
– @matthewpskelton
– http://matthewskelton.net
18. Thank you
http://engineering.thetrainline.com/ - blog
Thanks to Matt Richardson (@Squire_Matt) and #londoncd meetup group
@owainperry
@matthewpskelton
@WinPkgMgt – Windows Package Management
http://blog.lastminute.com/wp-content/uploads/Tube1.jpg - Lastminute.com
http://www.e993.com/ 4303216821_e47ea5315e_z.jpg – Kamiya Satoshi
http://www.3dwallz.com/wp-content/uploads/2013/06/Natural-pond-Windows-7-Desktop-Wallpaper.jpg
http://www.atwistedspoke.com/wp-content/uploads/2010/07/big-cycle.jpg
http://www.mebpersoneli.com/upload/news/bu-ders-saatleri-artik-saate-dustu52c13dc392.png
http://thejosevilson.com/wp-content/uploads/2012/05/responsibility.jpg - Jose Vilson
http://www.candymania.com/images/uploads/quizzes/Candymania_8-1_BabyBottlePop_-_Whacky_Words_-_Gobbledygook.jpg
http://www.visualphotos.com/photo/2x3686630/baguette_rolls_of_different_sizes_957602.jpg
Editor's notes
About us
What are we going to talk about?
• A bit of history on the codebase
• How we branch and release at the moment
• Current thinking on where we are heading
• Implications of this
• The journey: how are we going to get there?
Background
• Largest vendor of UK rail tickets
• Largest UK travel booking website
• Started online ticket sales in 1999
• £1.3 billion turnover
• 30+ million online customers
• Development team of 150+ in two locations (London and Bangalore)
• Parts of the code base go back 9+ years
• Major redevelopment of core systems in .NET started 7 years ago
• Mostly .NET (1, 2, 3, 4)
• A couple of VB6 apps (very core and expensive to change)
• 4 to 5 million lines of code (depending on what you count and how you count it)
The platform scheme is a natural, organic evolution
• We adopted the platform release approach 4 years (or so) ago
• Conway's Law between teams
• To deliver complete features to customers across multiple teams in multiple geographies, synchronisation was hard
• Out-of-hours deployments in the past (with less automation) were expensive and tiring for humans
• Treat the platform as one system and test accordingly
– A collection of integrated components
– All built independently with multiple CI pipelines
– All tested together: NFT, regression, integration
• Lots of TDD
• With a large legacy code base, retrofitting integration tests is hard
• We have lots of “end to end” integration tests
– Heavyweight
– Not as fast as we would like
– Brittle to change
Platform deployment (the old/existing scheme)
• We release everything in regular 6-week drops, whether it has changed or not
• “Out of hours” deployments
• Lots of deployment automation (fast)
– VMware
– Chef
• Outages reduced from 6 hours (2011) to 17 minutes (Sept 2013) with some ‘blue-green’-ish techniques
• Post-deployment testing is still quite manual
• To deliver complete features to customers across multiple teams in multiple geographies, synchronisation was hard
• Out-of-hours deployments in the past (with less automation) were expensive and tiring for humans
• Treat the platform as one system and test accordingly
• A few patch releases as required after the platform release
Branching at the moment
• We have 3 releases in flight
– Production
– Next release (in test)
– In development
• We don’t use trunk, master or mainline
• Branch ahead of time, so it’s ready for use
• One single ‘platform’ version, with multiple components each at their own build and version
• The collection of component version numbers makes up the platform version
• All components share the same branch name across all repos
• This is quite hard to manage (at the moment)
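To illustrate how a platform version can be made up of individual component versions, here is a minimal Python sketch. It is not the tooling used at thetrainline.com: the `PlatformRelease` class, the `changed_components` helper, the branch names and the component names and versions are all hypothetical. It only shows that comparing two such collections reveals which components actually changed between platform releases.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PlatformRelease:
    """One platform release: a shared branch name plus each component's version."""
    branch: str                 # hypothetical branch name shared across all repos
    components: Dict[str, str]  # component name -> that component's own version


def changed_components(old: PlatformRelease, new: PlatformRelease) -> List[str]:
    """Return the names of components whose version differs between two releases."""
    names = set(old.components) | set(new.components)
    return sorted(
        name for name in names
        if old.components.get(name) != new.components.get(name)
    )


# Hypothetical data: only 'search' has a new version in the next release.
production = PlatformRelease(
    "release-41",
    {"search": "3.2.0.118", "booking": "5.0.1.77", "payments": "2.4.0.9"},
)
in_test = PlatformRelease(
    "release-42",
    {"search": "3.3.0.131", "booking": "5.0.1.77", "payments": "2.4.0.9"},
)

print(changed_components(production, in_test))  # ['search']
```

In this made-up example only one of three components changed, yet under a platform scheme all three would be released together.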
Implications of this
• Big regression cycles
• Patch releases
• Long cycle times: (6 + 5) weeks
• Time to fix defects (classic cost-of-change graph)
Business – reduce cycle time
• Lower cost of testing
• Value for money more quickly (ROI)
• Faster cycle time: delivering business value to customers as quickly as possible
• Move away from big 6-week heartbeat releases to smaller, more frequent releases and deployments (excludes mobile apps)
• With a large legacy code base, retrofitting integration tests is hard
• Lots of state stored in databases, which makes testing harder
• We have lots of “end to end” integration tests
– Heavyweight
– Not as fast as we would like
– Brittle to change
• Opens up options for multivariate testing (MVT)
• Rolling back changes (based on MVT)
• Swift bug fixes
• Supports faster cycle times for smaller items
• As a developer, I can keep a small, recent change in my head; it is simpler to diagnose an issue there than in a large change from 12 weeks ago
Implement Continuous Delivery
• How do you do this with a large legacy code base?
• How do you change the direction of an ‘oil tanker’?
• How do you remove risk on a legacy code base without all the right attributes?
• Continuous Delivery for key parts of the thetrainline.com systems and services (not necessarily everything, at least not soon!)
Organisational changes
• Moving away from ‘any developer, any code, anywhere’
• Moving to product (component) aligned teams
• Code and build ownership should increase
• This allows the code bases to “get more loving”
• Cross-cutting people across teams as required
• Social contract between components as much as a technical service contract
– I will test my component; you, Mr Consumer, trust me
– I will honour all my contracts to be consistent unless I advertise a breaking change
– I will clean up my mess
• Spotify ‘tribes’ is a similar idea
• Working WITH Conway’s Law, not against it
I will clean up my mess
• New approach for teams to adopt, removing reliance on external support
• Is this DevOps?!?
Deployment changes
• Bring component NFT into the CI pipelines
• Component-level regression testing
• Reduce the scale of the platform tests
• Need to freeze the smaller deployments before a platform release
– NFT cycles
– PreProd cycles
– Check for breaking changes with the new functionality
– Stop deployment of any new functionality during this period?
• Deploy everything during the heartbeat release
• Over time, the strength of the heartbeat diminishes
• Perhaps the heartbeat remains only for ‘core’ or ‘critical’ components or subsystems
Versions, Semantic Versioning (SemVer in the .NET world)
• “Semantics (from Greek: σημαντικός sēmantikós) is the study of meaning.”
• Semantic versioning: getting meaning into the version number
• A.B.C.D
– A increases if a change breaks behaviour for a consumer
– B increases for a new feature or group of features
– C increases for a patch or bug fix
– D is a build number
• We recommend each platform release is considered a breaking change
– There is too much complexity to know whether it is or not
• The platform version is not currently semantically versioned
• What happens to the platform version in the future?
• The version number for a component can either be independent or use the platform version
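The A.B.C.D rules above can be sketched in a few lines of Python. This is an illustration only, not the versioning tooling used at thetrainline.com: the `Version` type and the `parse` and `bump` helpers are hypothetical names, and resetting the lower-order fields to zero on a bump is the usual SemVer convention rather than something stated in the slides.

```python
from typing import NamedTuple


class Version(NamedTuple):
    breaking: int  # A: increases when behaviour breaks for a consumer
    feature: int   # B: increases for a new feature or group of features
    patch: int     # C: increases for a patch or bug fix
    build: int     # D: build number, supplied by the build system

    def __str__(self) -> str:
        return ".".join(str(part) for part in self)


def parse(text: str) -> Version:
    """Parse an 'A.B.C.D' string into a Version."""
    parts = [int(piece) for piece in text.split(".")]
    if len(parts) != 4:
        raise ValueError(f"expected A.B.C.D, got {text!r}")
    return Version(*parts)


def bump(version: Version, change: str, build: int) -> Version:
    """Return the next version for a 'breaking', 'feature' or 'patch' change.

    Assumption: lower-order fields reset to zero, as in common SemVer practice.
    """
    if change == "breaking":
        return Version(version.breaking + 1, 0, 0, build)
    if change == "feature":
        return Version(version.breaking, version.feature + 1, 0, build)
    if change == "patch":
        return Version(version.breaking, version.feature, version.patch + 1, build)
    raise ValueError(f"unknown change type: {change!r}")


current = parse("2.4.1.350")
print(bump(current, "feature", build=351))   # 2.5.0.351
print(bump(current, "breaking", build=351))  # 3.0.0.351
```

Under the recommendation above, every platform release would be treated as a `"breaking"` change, because it is too hard to know whether it really is one.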
Branching for independent components
• Develop on master (trunk for you SVN guys)
• Only branch if required, based on breaking changes
• Run CI on the branch to see if anything breaks; if yes, keep the branch
• Feature toggles for future delivery dates: develop early, test early, deploy early (toggle off); ‘release’ is toggle on
• This is for changes which are isolated to one subsystem or group of related subsystems
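The feature toggle idea (deploy with the toggle off, ‘release’ by switching it on) can be sketched as follows. This is a minimal illustration in Python, not the mechanism used at thetrainline.com: the toggle name `new-seat-picker`, the `is_enabled` helper and the `choose_seats` function are hypothetical, and a real system would read toggles from configuration rather than a dictionary in code.

```python
from typing import Dict


def is_enabled(name: str, toggles: Dict[str, bool]) -> bool:
    """Return True only if the toggle exists and is switched on."""
    return toggles.get(name, False)


def choose_seats(journey: str, toggles: Dict[str, bool]) -> str:
    """Route to the new code path only when its toggle is on."""
    if is_enabled("new-seat-picker", toggles):
        return f"new seat picker for {journey}"
    return f"classic seat picker for {journey}"


# Deployed early with the toggle off: customers still get the old behaviour.
deployed = {"new-seat-picker": False}
print(choose_seats("London to Leeds", deployed))   # classic seat picker for London to Leeds

# 'Release' is just switching the toggle on; no new deployment is needed.
released = {"new-seat-picker": True}
print(choose_seats("London to Leeds", released))   # new seat picker for London to Leeds
```

Because both code paths live on master, the new work is built and tested by CI from the start, without a long-lived branch.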
Old and new branching schemes
The future
• Smaller bits are easier to bake
• New projects are taking some new approaches (web services, front-end UIs)
• Faster pace of deployments
• Ideally heading towards daily releases in office hours