A year ago, our software development team ended up in a funk. Simply put, we had some bugs in our processes, relationships and environment that were preventing us from being the best team we could be. So we did what any good dev team does when it encounters bugs: we deconstructed the problems, determined the root causes and implemented some fixes. I’ll share our story and discuss the lessons we learned along the way. You’ll take away ideas and tools that can help you explore these critical, but often tricky, topics in order to prepare your team to really scale.
1. Lessons Learned in an Introspective Year
Rebooting the Team
Fran Fabrizio
IT Director, Minnesota Population Center
Twitter: @franfabrizio Email: fran@umn.edu
2. Motivation for Rebooting the Team
DISCLAIMER
This is not a talk specific to scalability.
So, why am I here?
It goes like this...
12. Motivation for Rebooting the Team
SCALABILITY EDITION!
The Surge organizers contacted me and
said “We came across your talk about
rebooting your team. We’re filling out our
organizational scalability track - would
your talk work well there?”
This was an interesting question. I
embarked upon a “pondering walk”...
17. In the next 50 minutes we’ll review our
case study, including:
• Why we need reboots
• Gut feelings --> specific symptoms
• The collaborative process used to get to
the root causes
• Leveraging the insight to build consensus
for change and action
• Where we are and what we learned
• Q&A
18. Audience Takeaway
Techniques, tools, activities, processes,
ideas, sparks.... stuff you can use to help
figure out what’s ailing your team, whether
your team needs a reboot, and how to
collaboratively enact lasting change.
Not going to be a silver bullet - I’m giving
you design patterns. You will have
homework!
20. What’s a reboot?
A conscious decision to engage in deeper,
more radical change than just incremental
improvements.
A reboot typically impacts staff structure,
work processes and communication
patterns for your team and organization.
21. Why Do We Need Reboots?
[Figure: Dev Team Power Meter, scale from 100% to 0%]
Team firing on all cylinders, shipping
great code, everyone contributing, team
greater than sum of its parts!
22. Why Do We Need Reboots?
Enlightened organizations keep
the needle at 100% by proactively
anticipating the changes in their
environment and responding to
them gracefully over time.
23. Why Do We Need Reboots?
Most organizations are not that enlightened.
The needle begins to slide as the org is slow
to respond to changes in their environment.
24. Why Do We Need Reboots?
By the time there is awareness and consensus
for change, the amount of change needed is often
too great to achieve incrementally.
25. Wetware Reboots are Hard!
• Wetware is wonderfully, messily analog,
nondeterministic, and mysterious - it
doesn’t respond predictably to change.
• Reboots are costly and disruptive - we
wouldn’t do them if we didn’t have to
• An engineer’s wetware skillset is typically
less developed than their software/
hardware skills
26. The Dev Manager’s Mission
Something feels wrong. The team’s not
working as well as it used to. You’re not
quite sure what it is yet, or more
importantly why it’s happening, or how to
fix it. But you’re the one everyone’s
looking to for a fix.
How do you get to the whats, whys, and
hows? You need...
27. Organizational Debugging
A framework for turning observed behaviors
and stakeholder input into a clear
understanding of where your team has
deficiencies and how to address them.
1. Observe behavior and get stakeholder views
2. Distill into themes
3. Dig until you converge on root causes
4. Execute action plans for each root cause
28. Do we need a reboot?
[Diagram: Symptoms → Smells (Themes) → Root Causes → Actions, on an axis
from Concrete to Abstract; the path is split into Diagnosis and Treatment,
with a Decision Point marked.]
29. How is this Different from Normal Management?
It’s amplified.
30. How to Approach a Reboot
• Respect the day job
• Change is scary. Be consistent,
overcommunicate, and check in often
• You are a facilitator: listen more than talk
• HRT - Humility, Respect and Trust - is
essential. (from the book Team Geek)
31. Why HRT is so Important
• When organizational debugging is done
collaboratively and with HRT, it produces
momentum for change.
• When it’s not, it produces resentment and
friction, and gets in the way of the organization
executing needed change.
33. Observing the Symptoms
• Recall this typically starts as a gut feeling
that “something’s not right”
• Start by writing down a list of symptoms
that are giving you this gut feeling.
• Then put on your facilitator hat and start
asking others (inside and outside your
team) targeted but open-ended questions
34. Good Questions to Ask
• How did your week go?
• What are your biggest pain points right
now?
• What do you think are the most important
things for us to be doing / thinking about?
• Do you need anything from me?
• How are things with <insert customer>?
35. Listen Mindfully
As you’re having these conversations,
listen for symptoms...
“Well, I spent the first half of the week setting up
my dev environment on my new system, and then
I had to put out a lot of fires on that one project.
By the time I had any breathing room it was Friday
and I couldn’t get any time with Mark, so I didn’t
do any new feature development on my main
project. Have you heard from the product team? I
was wondering whether they want that new UI
widget now, or if they want us to work on
optimizing the query performance first?...”
36. Avoid
• What do you think is wrong with the team?
– Sets off alarm bells and people feel compelled
to answer even if they weren’t thinking it
• Try to avoid diving into solutions just yet
– Suggestions fine, but there’s risk of treating
symptoms and not the root cause at this stage.
• Don’t go too deep
–First pass over your stakeholders, just getting
a feel for things right now. Should feel casual.
37. You’ve been committed to things without your knowledge
Ship dates slip
Expected to do 5 things at once
No 1:1 meetings
Small changes take longer than they should
Setting up a dev environment takes 3 days
No one person knows how to do a full deployment
Every deployment results in a big mess
People miss key pieces of info
Sick time is spiking
You get a pit in your stomach when you walk in the door
New requirements appear late in projects
Staff working on similar problems not collaborating
I can’t move code between projects easily.
Staff members reluctant to share knowledge
Documentation is out of date
Status meetings turn into bitchfests
People lament the quality of their work
People are quitting!
Can’t say no to anything
Customers don’t trust what IT says.
Estimates are unreliable
Projects getting later and later
Secondary projects fall through the cracks
Build is always broken!
You’re on a death march, and everyone knows it.
People are struggling with their tools.
Every decision is made by committee
No consensus on priorities
Nobody knows what ‘done’ means
Team focusing on peripheral issues
Job descriptions no longer match reality
Surprises are common
Uptime is decreasing
Recruitment is getting more difficult
Long term goals aren’t getting closer
Exploratory work has stopped
I cannot determine the status of our systems.
I’m doing all the same things I was a year ago.
Vacations are disruptive
It’s too quiet!
38. Grouping Symptoms
• Go back to your symptom list and see if
they group into related themes.
• Themes are “anchors” for discussion: they
keep the conversation from getting bogged
down in the weeds of specific symptoms
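The grouping step can be sketched in a few lines of Python. This is only an illustration, not the process the team actually used: the symptoms and theme names are taken from the slides, but the assignment of each symptom to a theme, and the names `SYMPTOM_THEMES`, `group_by_theme` and `largest_themes`, are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical assignments: the symptom and theme wording comes from the
# slides, but which symptom belongs to which theme is assumed here.
SYMPTOM_THEMES = {
    "Setting up a dev environment takes 3 days": "Ops Tools Deficient",
    "Build is always broken!": "Ops Tools Deficient",
    "Estimates are unreliable": "Overpromising/Underdelivering",
    "Ship dates slip": "Overpromising/Underdelivering",
    "Staff working on similar problems not collaborating": "The team is too silo'ed",
}


def group_by_theme(symptom_themes):
    """Invert a symptom -> theme mapping into theme -> list of symptoms."""
    themes = defaultdict(list)
    for symptom, theme in symptom_themes.items():
        themes[theme].append(symptom)
    return dict(themes)


def largest_themes(themes):
    """Return theme names ordered by number of symptoms, largest first."""
    return sorted(themes, key=lambda name: len(themes[name]), reverse=True)


if __name__ == "__main__":
    grouped = group_by_theme(SYMPTOM_THEMES)
    for theme in largest_themes(grouped):
        print(f"{theme}: {len(grouped[theme])} symptom(s)")
```

Ordering the themes by how many symptoms they anchor is one possible way to decide which to discuss first. The count says nothing about severity, so treat it as a conversation starter rather than a ranking.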
39. Theme: Teams are too silo’ed
[Repeats the symptom list from slide 37, labeled with this theme.]
40. Theme: Ops Tools Deficient
[Repeats the symptom list from slide 37, labeled with this theme.]
41. Theme: Overpromising / Underdelivering
[Repeats the symptom list from slide 37, labeled with this theme.]
42. Themes that We Found
Unfulfilled team members
Bad office vibe
Routine things are complicated
Something’s always on fire
Lack of trust in the developers
Every day is a surprise party
Ops Tools Deficient
Single points of failure abound
Overpromising/Underdelivering
Leadership is distracted
The team is too silo’ed
46. Peeling off the Layers
• “IT isn’t good at estimates” really meant
“We need to work towards more
transparency and communication with our
customers and management.”
• Deeper problem: communication
disconnect between dev, management,
and our customers. We need to treat this
problem, not the “IT isn’t good at
estimates” symptom.
47. The (Not So) Big Secret!
When you dig into a wetware problem,
you’re always going to find at least one of
these things:
TRUST / COMMUNICATION / PROCESS
But “We suck at communicating”
is not actionable.
Get to more specific root causes.
48. Engaging the Stakeholders
To get the insight you need, people must be
engaged in a way that’s meaningful to them.
Tailor approach, content and detail.
Understand their motivations, what
they can uniquely contribute, and
meet them there!
49. Engaging the Dev Team
• Their motivations:
– Autonomy, Mastery and Purpose
– Daniel Pink’s 2009 TED talk
www.danpink.com/ac/ted-talk/
– Build awesome stuff with awesome people
• What they can contribute:
–They do root cause analysis all the time
–Evidence-based view of the world
–And of course, deep understanding of tech bits
50. Engaging the Dev Team
• Your Goals:
–Give the team as much ownership of the
reboot as possible
• Only fair, this is where the brunt of reboot change
will land
–Give visibility to long term strategy and
external dependencies on the team
• “State of the Outside World”
51. Dev Team Engagement Tactics
• Create dedicated time/space to foster group
dialogue about the themes and root causes
– Focus Friday
– Forums/chat rooms
• “Squishy” book discussion. Example:
Team Geek
• Meet privately with individuals regularly
(should be doing this already).
52. Engage Management
• Their motivations: Exploiting market
opportunities, mitigating risk, efficient use
of resources - “macro things”
• What they can contribute:
–org strategy
–deep understanding of what other areas of the
org are doing
–better awareness of external opportunities and
pressures
53. Engage Management
• Your goals: Provide visibility to issues,
incorporate their big picture perspectives,
obtain support for change
• Example Activities:
– Roundtables - ask the “macro” questions
– Vision & perception interview with key people
– Use concise, powerful tools, such as...
56. Engaging the Customer
• Their Motivations:
–Want solutions which solve their problems
–Want to have transparency into the process
• What they can contribute:
–User centric design and prioritization
–More honest feedback - they don’t have as
many relationships or as much baggage to
protect
57. Engaging the Customer
• Your Goals:
–Bring the customer closer into the team
–Get them to help your org prioritize
–Make them happy :-)
58. Customer Engagement Tactics
• Roundtables with our customer teams
• Retrospectives after each major release
• Example tool: Mapping the
communication flow
60. Converging on Root Causes
Why do we think “Something’s Always on Fire?”
Dev Team Insight: “We get a lot of requests that pull
us away from our core projects.”
Management Insight: “We want to help campus folks
who are doing demographic research, but we don’t
have a process for vetting or prioritizing requests.”
Customer Insight: “The MPC agreed to do my
research web site, but when I call over there it seems
that there’s always something more pressing.”
Suggests the root cause might be
Lack of Focus on Mission
61. Some of our Root Causes
• Lack of Focus on Core Mission
• Team Poorly Aligned with Strategy
• Operational Deficits
• Lack of Customer-Developer
Transparency
62. Getting to Solutions
• For each root cause, collaboratively
design an action plan that mixes easy
wins with longer-term fixes
Quick sampling of our reboot action plans...
63. Solutions: Lack of Focus on Mission
• Early Wins:
– Defined the mission!
– Made all hidden work visible to management
– Outsourced or killed fringe projects and other
distractions and aligned effort to the core mission
• Long-term Work
–Align projects to org vision, not vice-versa
–Team structure realignment
64. Solutions: Poor Alignment with Strategy
• Early Wins:
– Product Vision document
– Some early hires not tied to specific projects
• Long-term Work
– Conway’s Law and its application
65. Conway’s Law
“Organizations which design systems are
constrained to produce designs which are
copies of the communication structures of
these organizations.”
Melvin Conway, 1968
67. Solutions: Operational Deficits
• Early Wins:
–Evolve tools: SVN → Git, TeamCity → Jenkins,
Chef, Vagrant
–Management support for 20% tech debt time
–HipChat
• Long-term Work:
–Ongoing technical debt reduction
–System Monitoring API
68. Solutions: Lack of Customer-Developer
Transparency
• Early Wins:
–Formalize a quarterly planning process
–Open work tracking tool to customers
• Long-term Work:
–Refactoring the communications model
70. Measuring Outcomes
• Probably the most important one is
qualitative: “Hey, things feel better now!”
• Quantitative indications will eventually
come. Focus on measures that have
meaning to your org’s culture (KPIs?)
• Share metrics and results widely to keep
momentum going
72. Where are we now?
• We feel good about...
–Management support for the development team
–Long term vision and goals
–Awareness of the need to prioritize/focus efforts
–Communication patterns
• We are still working on...
–Acting holistically within a project-specific funding model
–Aligning the staff to best support the desired product
–More effective use of ops tools to automate routine work
73. Lessons Learned
• Know who you are as an organization
– History, context, strengths, weaknesses, constraints, strategy
• An Org-first AND People-first approach is possible
• People are messy
– HRT underlies successful org change
– Assume that everyone has good intentions
– Don’t hide behind technology - wetware issues require face
time
• Change is scary. Expect resistance. This is a process of
influence - you can’t make others change directly.
• Don’t fear going against the grain - do what’s right for
your org, don’t be a slave to any particular process,
system, methodology
74. Lessons Learned
Avoid reboots if at all possible!
They’re expensive, disruptive and tricky to execute.
Instead...
Incorporate your newfound tools and techniques into your
daily workflow to pay more attention to how change is
impacting your team, engage your colleagues, and
respond to it proactively before you need another reboot!
75. Thank You!
Continue the conversation...
Twitter: @franfabrizio
Email: fran@umn.edu
Special thanks to Peter Clark (@pclark) for his
contributions to an earlier version of this presentation.