Daniel Prager, Tomasz Janowski and I ran this all-day workshop, based on the Legacy CodeRetreat, at this year's LASTconf.
Get ready to level up your refactoring at LAST Conference's first Refactoring Developer workshop. For the past few years we have run a similar session, inspired by Code Retreat, on the basics of agile development at LAST Conference. We feel it is important to support learning in the technical disciplines that agile software development depends on.
Too many Agile and DevOps initiatives are stymied by code bases that are hard to change and understand.
While disciplined teams who rigorously practice pair programming, test-driven design (TDD) and other technical Agile practices avoid producing new legacy code in the first place, cleaning up a pre-existing mess is notoriously difficult and dangerous. Without the safety net of excellent automated test coverage, the risk of breaking something else as you refactor is extremely high. Also, code that wasn't designed and written with testability in mind makes it really difficult to get started. So most don't even try ...
In the Refactoring workshop developers learn how to build an initial safety net before applying multiple refactorings, and have lots of fun along the way!
To read more about how to run a classic CodeRetreat, I recommend this blogpost: https://medium.com/seek-blog/coderetreat-at-seek-clean-code-vs-comfort-zone-cfb1da64909d
2. Your Facilitators
Dan Prager
Agile Consultant @ Skillfire & ZXM
@agilejitsu
Victoria Schiffer
Delivery Manager @ SEEK
@Erdbeervogel
Tomasz Janowski
Technical Lead @ REA Group
@janowskit
4. Ice-breaker
1. Pair up and invent a groovy handshake with five distinct movements
2. Practice your handshake
3. Whip round the room and each pair does a demo
5. Plan for the Day
Session A: 11 am – 12:30 pm
● Welcome & Ice-breaker
● Sprint 0: Golden Master testing (Victoria)
Lunch
Session B: 1:30 – 3 pm
● Sprint 1: Remove Duplication & Improve Names (Dan)
● Sprint 2: Extract to Pure Functions (Tomasz)
Afternoon tea
Session C: 3:30 – 5 pm
● Sprint 3: Add a Feature (Tomasz)
● Sprint 4: Group Debrief
6. What is Legacy Code?
“Legacy Code is code without tests.”— Michael Feathers
equivalently
“Legacy Code is code you are afraid to change.”— J B Rainsberger
Reflection: “What are some of the pros of Legacy Code?”
7. Comparison to Fundamentals Immersion
Similarities to Fundamentals / Regular Code Retreat
● One problem for the day
● Lots of sprints, with a different learning focus each sprint
● Pair programming, and rotate partners each sprint
● Huge focus on fast feedback through automated testing
Differences
● Instead of a spec, we start with a steaming pile of legacy code
● We do not delete our code at the end of each sprint!
● Emphasis on Golden Master testing technique
● Use of TDD is now selective, BUT still highly recommended
9. Why and What of Golden Master Testing
● Why it’s hard to get started to refactor legacy code
● What Golden Master testing is and how it helps us get started with Legacy Code
● Why Build your own Golden Master infrastructure
● What about TDD and unit testing?
10. Why it’s hard to start refactoring legacy code
● Lack of tests
● ‘Fear’ of changing anything
● Low confidence in not breaking anything
● Not sure what will break, or which behaviours might be impacted by the change
11. Why a ‘Golden Master’?
“In nearly every legacy system, what the system does is more important than what it’s supposed to do.”
“A characterisation test is a test that characterizes the actual behaviour of a piece of code.”
(Michael C. Feathers, Working Effectively with Legacy Code)
→ It isn’t right or wrong, it simply exists
12. What’s a Golden Master Test?
“A gold master test is a regression test for complex, untested systems that asserts a consistent macro-level behavior.”
(Jake Worth)
→ black box test
13. ‘Golden Master’ Analogy
“Mastering, a form of audio post production, is the process of preparing and transferring recorded audio from a source containing the final mix to a data storage device (the master); the source from which all copies will be produced. Mastering requires critical listening; however, software tools exist to facilitate the process. Results still depend upon the intent of the engineer, the accuracy of the speaker monitors, and the listening environment. Mastering engineers may also need to apply corrective equalization and dynamic compression in order to optimise sound translation on all playback systems. It is standard practice to make a copy of a master recording, known as a safety copy, in case the master is lost, damaged or stolen.” (Wikipedia)
Golden Record: NASA
14. Why Build Your Own Golden Master Infrastructure?
● Every Golden Master is unique to the specific legacy system and its behaviours
● Capture the right level of output data to compare against
● Might need to scrub data (e.g. if using a copy of the production database in your golden master)
16. Working with Golden Master
1st Run: Run Code/Test → Capture Data/Output → Save as Golden Master
Subsequent Runs: Run Code/Test → Capture Data/Output → Compare to Golden Master → Equal?
● Yes! Keep Refactoring.
● No! Prompts Decision: Re-Baseline Golden Master OR roll-back
17. Steps to building a Golden Master
1. Get the code to run
2. Randomly generate inputs, leading to plentiful output
● Fix the random number seed, leading to repeatable output
3. Capture the output of every run
● The output of the unmodified legacy system is your “Golden Master” (save it somewhere)
● Options for output capture:
○ Roll your own “print” function and do a careful find & replace, or
○ Monkey patch “print”, or
○ Redirect stdout
4. Compare the captured output of each run to the Golden Master output
● Report success or failure → every failure prompts decision
● Report line(s) that don’t match, actual vs expected
5. Extension: Incorporate two golden-master tests in your unit test suite
● Number of lines in the current run vs number of lines in the Golden Master
● Expected empty diff (all lines match) vs Mis-matched lines
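The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `legacy_game` is a hypothetical stand-in for the real legacy code (any function that prints as it runs), and `capture_run`, `capture_all` and `compare` are invented helper names. It uses the "redirect stdout" option for capture and a fixed seed per run for repeatability.

```python
import contextlib
import difflib
import io
import random


def legacy_game(rng):
    """Hypothetical stand-in for the legacy code: it prints as it goes."""
    place = 0
    for _ in range(5):
        roll = rng.randint(1, 6)
        place = (place + roll) % 12
        print(f"rolled {roll}, now at {place}")


def capture_run(seed):
    """Run the code with a fixed seed and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        legacy_game(random.Random(seed))
    return buffer.getvalue()


def capture_all(seeds):
    """Concatenate the output of many seeded runs into one text."""
    return "".join(f"--- seed {seed} ---\n{capture_run(seed)}" for seed in seeds)


def compare(golden, actual):
    """Return mismatched lines as a unified diff; an empty list means success."""
    return list(difflib.unified_diff(
        golden.splitlines(), actual.splitlines(),
        fromfile="golden", tofile="actual", lineterm=""))
```

In practice the first `capture_all(...)` result would be written to a file and kept as the Golden Master; each later run is captured the same way and passed to `compare`, and any non-empty diff prompts the re-baseline or roll-back decision.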
20. Over to you!
Build your own Golden Master
● Pair up!
● Get the Legacy Code in your programming language of choice: http://github.com/lastconf2018/trivia
● Build your own Golden Master!
22. Leaning on the Golden Master
Focus of the Sprint
1. Find duplicated code and poor names, and fix them
2. Keep re-running the Golden Master test to confirm that you haven’t changed the system behaviour
Stretch activities
● Find and fix a subtle defect in the Legacy code
● Introduce a unit test
23. With Duplication
def current_category():
    if places[current_player] == 0: return 'Pop'
    if places[current_player] == 4: return 'Pop'
    if places[current_player] == 8: return 'Pop'
    if places[current_player] == 1: return 'Science'
    if places[current_player] == 5: return 'Science'
    if places[current_player] == 9: return 'Science'
    if places[current_player] == 2: return 'Sports'
    if places[current_player] == 6: return 'Sports'
    if places[current_player] == 10: return 'Sports'
    return 'Rock'
Refactored
def current_category():
    position = places[current_player]
    if position in [0, 4, 8]: return 'Pop'
    if position in [1, 5, 9]: return 'Science'
    if position in [2, 6, 10]: return 'Sports'
    return 'Rock'
● Undo old cut & pastes!
● New code? “Rule of three”
24. Extract code snippets to a well-named function
# This code appears in several places (duplication)
...
places[current_player] = places[current_player] + roll
if places[current_player] > 11:
    places[current_player] = places[current_player] - 12
...
# Extracted once, under a descriptive name
def move_current_player_forward():
    places[current_player] = places[current_player] + roll
    if places[current_player] > 11:
        places[current_player] = places[current_player] - 12  # the board wraps around
Now we have (in a few places):
...
move_current_player_forward()
...
25. Kent “XP” Beck’s four rules of Simple Design
Our Priorities
1. Passes its tests
2. Remove duplication
3. Improve names
4. Fewer functions, classes, etc.
27. Extract to Pure Functions
● What they are
● Why they’re good: easy to unit test
● Pure functions + Commands + Glue
● Example(s)
Emphasis of the sprint:
1. Start adding unit tests around your pure functions
2. Stretch: TDD your pure functions
3. Keep using the Golden Master as a back-stop
28. Pure Functions
A pure function is a function where the return value is only determined by its input values, without observable side effects.
29. Pure Functions
● A pure function can only access what you pass it, so it’s easy to see its dependencies.
● When a function accesses some other program state, such as a global variable, it is no longer pure.
● A given invocation of a pure function can be replaced by its result. There’s no difference between “add(2,3)” and “5”. This property is called referential transparency.
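As a minimal sketch of the difference, assuming the trivia game's `places` list and `current_player` index from the earlier slides: `next_place` is a hypothetical name, and it uses modulo arithmetic instead of the slide's `> 11` check, which gives the same result for a 12-place board and rolls of 1 to 6. The impure command shrinks to a thin piece of glue around the pure calculation.

```python
def next_place(place, roll, board_size=12):
    """Pure: the result depends only on the arguments and nothing is modified."""
    return (place + roll) % board_size


# Module-level state, as in the legacy code (values here are illustrative).
places = [0, 0]
current_player = 0


def move_current_player_forward(roll):
    """Impure command: reads and mutates module-level state."""
    places[current_player] = next_place(places[current_player], roll)
```

Any call such as `next_place(10, 5)` can be replaced by its result without changing the program, whereas calling `move_current_player_forward(5)` twice leaves `places` in a different state each time.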
30. Testing
● To test a pure function, you declare the values that will act as arguments and pass them to the function. The output value needs to be verified against the expected value.
● No context to set up, no current user or request
● No side effects to mock or stub
● Testing doesn’t get more straightforward than this
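A unit test for a pure function might look like the sketch below. It assumes a hypothetical `category_for(position)`, a variant of the refactored `current_category` that receives the position as an argument instead of reading `places` and `current_player`.

```python
import unittest


def category_for(position):
    """Pure variant of current_category: the position is passed in."""
    if position in (0, 4, 8):
        return 'Pop'
    if position in (1, 5, 9):
        return 'Science'
    if position in (2, 6, 10):
        return 'Sports'
    return 'Rock'


class CategoryForTest(unittest.TestCase):
    def test_known_positions(self):
        # Arguments in, expected value out: no set-up, no mocks.
        self.assertEqual(category_for(0), 'Pop')
        self.assertEqual(category_for(5), 'Science')
        self.assertEqual(category_for(10), 'Sports')

    def test_remaining_positions_are_rock(self):
        for position in (3, 7, 11):
            self.assertEqual(category_for(position), 'Rock')
```

Saved in a file, this can be run with `python -m unittest`.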
32. Memoization
● Pure functions always return the same output for the same input
● We only need to compute the output once for given inputs
● Caching and reusing the result is called memoization and can only be done safely with pure functions
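In Python, one way to memoize a pure function is the standard library's `functools.lru_cache` decorator. The sketch below applies it to a hypothetical `next_place` function (not part of the workshop code) and uses `cache_info()` to show that the second identical call is served from the cache.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def next_place(place, roll, board_size=12):
    """Pure function, so caching its results cannot change behaviour."""
    return (place + roll) % board_size


first = next_place(10, 5)    # computed
second = next_place(10, 5)   # returned from the cache
stats = next_place.cache_info()
```

Here `stats.misses` counts the computed calls and `stats.hits` the cached ones. Memoizing a function that reads or changes other state would risk returning stale results, which is why the technique is reserved for pure functions.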
43. Add a feature (or two)
Use TDD to add a new feature. Your choice of
● New question categories: History, Politics, Religion
● Support more than 6 players
● Add an optional Polish or German translation
● Invent and add a new rule
Emphasis of the sprint
● Re-incorporating TDD for new features: don’t just lean on the Golden Master
● Re-baselining your Golden Master
45. Split into two or three groups
How can we apply what we have learned in the “real world” of work?
● Where might we use Golden Master testing?
● What additional adaptations would we need?
● What about other techniques?
● Cultural issues