What if we designed our organizations like we design our systems? Applying scalability principles that we know from building large-scale distributed systems, as well as practical lessons learned at eBay and Google, this session covers how we can design and evolve our engineering organizations to scale.
4. Universal Scalability Law
System throughput is limited by
• Contention
o Queueing on a shared resource, O(N)
• Coherence
o Coordination and communication between all nodes, O(N²)
http://www.perfdynamics.com/Manifesto/USLscalability.html
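A minimal sketch of the law in Python, using the standard form published at the link above: C(N) = N / (1 + α(N − 1) + βN(N − 1)), where α weights contention and β weights coherence. The function names and the coefficient values are illustrative assumptions, not measurements from eBay or Google.

```python
import math


def usl_throughput(n: int, alpha: float, beta: float) -> float:
    """Relative capacity C(N) under the Universal Scalability Law.

    alpha: contention coefficient (queueing on a shared resource, linear term).
    beta:  coherence coefficient (all-to-all coordination, quadratic term).
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def peak_n(alpha: float, beta: float) -> float:
    """N at which C(N) is maximal: sqrt((1 - alpha) / beta), for beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be > 0 for a finite peak")
    return math.sqrt((1 - alpha) / beta)


if __name__ == "__main__":
    # Assumed, purely illustrative coefficients.
    ALPHA, BETA = 0.05, 0.001
    for n in (1, 10, 31, 100):
        print(n, round(usl_throughput(n, ALPHA, BETA), 2))
    print("peak near N =", round(peak_n(ALPHA, BETA), 1))
```

With these assumed coefficients, relative throughput peaks near N = 31 and is lower at N = 100 than at the peak: adding nodes past that point reduces total throughput.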
5. Universal Scalability Law
• Implications
o Find ways to remove contention points
o Find ways to reduce or eliminate coordination overhead
o Increased N → more contention, more coherence
• → Multicore processor design
o Fast to stay within a core
o Expensive to synchronize across cores
• → Distributed system design
o Sharding
o Eventual Consistency
6. “What if we designed our organizations like we design our systems?”
8. Small “Service” Teams
• Amazon “2 Pizza” Teams
o No team should be larger than can be fed by 2 large pizzas
o Typically 3-5 people
o Mix of junior and senior people
• Team == Component | Service
o Clear, well-defined area of responsibility
o Single service or set of related services
o Minimal, well-defined “interface”
• Applying the Universal Scalability Law
o Reduce N within teams
o Well-defined responsibilities reduce synchronization / coordination points between teams
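One way to see why smaller teams help is to count pairwise communication links, n(n − 1)/2 for n parties who all talk to each other. The sketch below is a simplified model under stated assumptions: everyone within a team coordinates with everyone else, each pair of teams shares exactly one well-defined interface, and the function names are hypothetical.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct pairs among n parties: n * (n - 1) / 2."""
    if n < 0:
        raise ValueError("n must be >= 0")
    return n * (n - 1) // 2


def split_org_links(total_people: int, team_size: int) -> int:
    """Total links when people are split into equal-sized teams.

    Assumption: full coordination inside each team, plus one link
    (a single interface) between each pair of teams.
    """
    if team_size < 1 or total_people % team_size != 0:
        raise ValueError("total_people must be a multiple of team_size")
    teams = total_people // team_size
    within_teams = teams * pairwise_links(team_size)
    between_teams = pairwise_links(teams)
    return within_teams + between_teams


if __name__ == "__main__":
    print("one team of 20:", pairwise_links(20))
    print("four teams of 5:", split_org_links(20, 5))
```

Under this model, 20 people in a single team have 190 links, while four teams of five have 40 links inside teams plus 6 between teams, 46 in total.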
9. End-to-End Ownership
• Teams own their roadmap
• No separate maintenance or sustaining engineering team
• Engineers own service from design to deployment to retirement
10. Team Anti-Patterns
• Skill-based teams
o Based around tiers or technologies (e.g., front-end team, application team, DBA team, Ops team)
o (-) Every project crosses many team boundaries
o (-) No end-to-end ownership of the system
o (-) No end-to-end ownership of the customer experience
• Project-based teams
o Form ad-hoc team for a particular project, then disband
o (-) No long-term ownership of code, product, service
o (-) Encourages short-term approach instead of sustainable management of technical debt
11. Team Anti-Patterns
• Large teams
o Teams larger than 6-8 should be split
o (-) Communication and coordination overhead makes it increasingly difficult to sustain velocity
13. Autonomy and Accountability
• Give teams autonomy
o Freedom to choose technology, methodology, working environment
o Responsibility for the results of those choices
• Make teams self-sufficient
o Team has inside it all skill sets needed to do the job
o Depend on other teams for supporting services
• Hold team accountable for *results*
o Give a team a goal, not a solution
o Let team own the best way to achieve the goal
14. Autonomy and Accountability
• Clear “contract” provided to other teams
o Functionality: agreed-upon scope of responsibility
o Service levels and performance
15. Decision-Making Anti-Patterns
• Single authority
o Decisions made or approved by single person (CTO?)
o (-) Single bottleneck / contention point
o (-) Single point of failure
o (-) Unsustainable for decision-maker
o (-) Discourages autonomy, ownership, growth
• Unanimity / Consensus
o Decisions made or approved by “everyone”
o (-) Constant need for coordination / coherence
o (-) Increasingly ineffective / counterproductive as organization grows
o (-) Discourages autonomy, ownership, growth
17. Effective Global Teams
• Local Ownership
o Well-defined area of responsibility
o Clean interface with the rest of the organization
• Individual teams are co-located
o High-bandwidth communication within a team
o Minimal coordination across teams
18. Global Team Anti-Patterns
• Anti-Pattern: Split Teams Over Geographies
o (-) Constant need for coordination over time zones
o (-) Local conversations become disruptive rather than helpful
o (-) No local pride of ownership
• Anti-Pattern: Remote Team as Job Shop
o (-) Constant need for management and task assignment
o (-) Resentment between first-tier and second-tier sites
o (-) No local pride of ownership
o Ex. eBay remote offices vs. Google remote offices
19. Distributed Teams
• Fully distributed *OR* fully co-located
o Distributed teams rely on virtual proximity (chat, hangouts, IRC)
o Co-located teams rely on physical proximity (co-working)
• Anti-Pattern: “Mostly” co-located
o (-) Co-located majority ends up determining communication methods
o (-) Remote individuals left out, less able to contribute, less productive