The document discusses Google's engineering culture and infrastructure. It provides an overview of Google's practices around code review, team programming using tools like Gerrit, and the engineering pipeline. It also shares personal stories from software engineers and principles for balancing process with creativity.
1. Engineering Culture & Infrastructure
Team Programming, Code Review, Robotization, Pipeline and Feedback
2016.12.29
Schubert Zhang
2. Engineering infrastructure (technologies, processes, tools, and culture) that
enable engineers to innovate and release software continuously with agility,
quality, and productivity.
This keynote gives an overview of the overall architecture, workflow, and scale.
3. Agenda
• What? Why?
• Google Practices
• Code Review
• Gerrit, Team Programming
• Big-Picture of Engineering Pipeline
• Personal Stories of being a Software Engineer
6. Because
We are technology engineers, and our goal is to
keep creating new value for users through Innovation & Creation.
We want our company and our teams to be an Innovation Factory.
“We take our jobs to be innovators and we are failing if we are not innovating quickly enough!”
— Eric Schmidt, Google CEO
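“I know exactly what it feels like to be crushed by technical quality debt until you can hardly breathe. In that state, every innovative idea gets suppressed for fear of accidentally breaking a fragile product.”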
— Patrick Copeland,
the senior director of Engineering Productivity and the top of the testing food chain at Google
8. Balance
Process-heavy vs. Process-less
Burdensome, anti-creative culture (contrary to the creative nature of innovation)
vs. heroic culture (unable to deliver repeatedly)
There needs to be a balance!
We need to focus on staying airborne for the long term.
We need to motivate smart minds to solve hard problems and deliver rich features to customers.
Balance “Uniform Workflow” with “Freeform”! To be agile!
10. The entire product team is responsible for quality, and is judged
on their ability to enable innovation, anticipate problems, make
plans, and implement high quality software.
Under a common-sense workflow, teams adopt processes that
are in their own self interest and that allow them to focus on
innovation.
11. • Development teams write good tests and do more review because
they care about the products.
• More time is spent writing and innovating on features, and less on
debugging and fixing bugs in later phases.
• Code review and check-in become social practices.
13. The infrastructure and workflow
follow these principles
• Speed: All test, review and analysis systems need to return results very fast. If it takes too long, engineers will
either ignore or not bother looking for that data.
• Feedback: The focus of test and review systems must be on high-quality feedback. We want engineers to keep
code at production quality at all times, rather than spending extra time later fixing code that was broken earlier.
• Simplicity: Engineers should not have to understand how the underlying build and test systems work. All data
and feedback must be easy to understand, integrated into commonly-used productivity tools, and presented
in a workflow that allows them to take appropriate action.
• Extensible: The infrastructure pipeline / loop / architecture forms a common-sense framework in which teams or
individuals can add new tools or plugins at any stage.
• Enjoy: Don’t make me think, just enjoy it.
• By reducing the window of opportunity for bad code to go unnoticed, overall debugging and bug isolation
time is radically reduced. The net result is that the engineering teams no longer sink hours into debugging
build problems and test failures.
14. We need an approach that provides developers with nearly
instant feedback on every code check-in.
Build a “production-line style”, “replicable” factory for innovative R&D.
15. Imagined Basic System for Engineering
Make it web-based,
with a good user experience
Make it robotized (automated),
with instant feedback
19. Single Large Code Repository
• Google’s monolithic cloud repository provides a common source
of truth for tens of thousands of developers around the world.
• Foundation of many of Google’s developer workflows.
• Pros: unified versioning, extensive code sharing, simplified
dependency management, atomic changes, large-scale
refactoring, collaboration across teams, flexible code ownership,
code visibility, clear tree structure providing implicit namespacing.
• Cons: Heavy in-house development of the support infrastructure
systems, complexity.
• Distributed over 10 data centers: Paxos guarantees consistency
across replicas (Bigtable -> Spanner)
• Most traffic originates from Google’s distributed build-and-test
systems. (Powerful cloud-based build system and test
infrastructure, tools.)
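Google's centralized repository, built on Perforce in 1999, has evolved to the present day. Git did not exist back then; today Git has become the standard, and developers enjoy its convenience and are used to its workflow.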
20. Google Code Infrastructure
(Diagram; only the recoverable labels are listed.)
• Google Code Repository System in Cloud: Perforce -> Piper (Workflow, ACL), on Google Infrastructure, Bigtable -> Spanner
• World-wide developers, each shown with Heuristics and a CitC local client
• Code Writing: Code Browsing, Search, Editing Tools; Code Search
• Critique: Web-based Code Review, Instant Feedback
• Code Review Tools (Mondrian -> Critique)
• Codestyle Checker; Google CodeStyle Guide
• Static Analysis (Tricorder)
• Codebase Health (Rosie)
• Maintainer tools: Refaster, ClangMR, Clipper …
• Automatic Build Infrastructure; Automatic Test Infrastructure
• Unit-Test, Verification, test coverage
• Other stuff: Mercurial; GFLAG / GMOCK / GTEST for C++; Bazel Build System: http://www.bazel.io; …
Google “presubmit” infrastructure provides automated testing and analysis of changes before they are added to the codebase.
21. Presubmit Workflow
Automatic Testing, Analysis, and Code Review
Culture: All code is reviewed before being committed to the repository.
No code, for any product, for any project, gets checked in until it gets a positive review.
Many automatic checks and verifications run at presubmit.
LGTM (“Looks Good To Me”) from a reviewer,
plus automatic analysis,
automatic testing …
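The presubmit rule above (automatic checks plus a positive review before anything is checked in) can be sketched as a small gate. This is only an illustration under assumptions: the `Change` type, the two toy checks `no_tabs` and `has_tests`, and the rule that an author's own LGTM does not count are all hypothetical and are not Google's presubmit API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A check receives the changed files (path -> content) and returns problems found.
Check = Callable[[Dict[str, str]], List[str]]


@dataclass
class Change:
    """A hypothetical change list waiting to be submitted."""
    author: str
    files: Dict[str, str]
    lgtm_by: List[str] = field(default_factory=list)


def no_tabs(files: Dict[str, str]) -> List[str]:
    """Toy style check: flag files that contain tab characters."""
    return [f"{path}: contains a tab character"
            for path, text in sorted(files.items()) if "\t" in text]


def has_tests(files: Dict[str, str]) -> List[str]:
    """Toy test check: every foo.py in the change needs a foo_test.py next to it."""
    sources = [p for p in sorted(files)
               if p.endswith(".py") and not p.endswith("_test.py")]
    return [f"{p}: no matching test file in this change"
            for p in sources if p[:-3] + "_test.py" not in files]


def presubmit(change: Change, checks: List[Check]) -> List[str]:
    """Run all automatic checks, then require an LGTM from someone else."""
    problems: List[str] = []
    for check in checks:
        problems.extend(check(change.files))
    reviewers = [r for r in change.lgtm_by if r != change.author]
    if not reviewers:
        problems.append("no LGTM from a reviewer other than the author")
    return problems


def can_submit(change: Change, checks: List[Check]) -> bool:
    """A change may be checked in only when presubmit reports no problems."""
    return not presubmit(change, checks)
```

Because every check returns a list of readable problems instead of a bare pass/fail, the same function can feed the instant feedback shown to the author in the review tool.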
22. Trunk-based Development
(one source tree, no branching)
• With the exception of a few core systems, all of Google works from a single source tree and from head. Changes are made to
the repository in a single, serial ordering.
• Developers work on a consistent view of the codebase.
• Avoids the painful merges that often occur with long-lived branches.
• Development on branches is unusual, branches are typically used for releases.
• Release branches are cut from a specific revision of the repository.
• Bug fixes and enhancements that must be added to a release are typically developed on mainline, then cherry-picked into the
release branch.
• Use of long-lived branches with parallel development on the branch and mainline is exceedingly rare.
• When new features are developed, both new and old code paths commonly exist simultaneously, controlled through
conditional flags and configuration. (e.g. GFLAG; refer to RFC1122)
• A small set of very low-level core libraries uses a mechanism similar to a development branch to enforce additional testing
before new versions are exposed to client code. (Similar to our own usual practice: test thoroughly before checking in.)
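The flag-controlled code paths mentioned above can be sketched as follows. This is a minimal illustration under assumptions: the `FLAG_` environment-variable convention, the flag name `new_ranking`, and the two `rank_*` functions are hypothetical, and this is not the gflags API.

```python
import os


def flag_enabled(name, default=False, environ=None):
    """Read a boolean flag from the environment (hypothetical FLAG_<NAME> convention)."""
    env = os.environ if environ is None else environ
    raw = env.get("FLAG_" + name.upper())
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def rank_old(items):
    """Old code path: plain alphabetical order."""
    return sorted(items)


def rank_new(items):
    """New code path: shortest first, then alphabetical."""
    return sorted(items, key=lambda s: (len(s), s))


def rank(items, environ=None):
    """Both paths live on mainline; the flag picks one at run time."""
    if flag_enabled("new_ranking", environ=environ):
        return rank_new(items)
    return rank_old(items)
```

With the flag off by default, the new path can be checked in to head early and switched on later through configuration, so no long-lived feature branch is needed.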
23. Code Review Principles
• All change lists (CLs) must be reviewed
• The first thing reviewers check is readability.
• Documentation for classes, functions, and modules.
• Code is linted automatically to keep it error-free.
• Check whether the code follows best practices and uses few resources.
• Reusability
• Any CL can be reviewed by any engineer at Google.
• The author may get review comments from the engineers who reviewed the files.
• Once the review is done, the author gets an LGTM.
• Each directory / project has a list of owners.
• It’s not a process, it’s a habit, or culture, and they are all proud of it.
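The per-directory owner lists above can be sketched as a lookup that walks up the source tree. This is an illustration under assumptions: the `owners_by_dir` mapping, the rule that owners of every enclosing directory may approve, and the function names are hypothetical and do not describe Google's actual ownership tooling.

```python
import posixpath


def owners_for(path, owners_by_dir):
    """Collect owners of every directory that encloses `path`.

    `owners_by_dir` maps a directory ("" is the repository root) to its owners.
    Assumption: owners listed higher in the tree may also approve.
    """
    owners = set()
    directory = posixpath.dirname(path)
    while True:
        owners |= set(owners_by_dir.get(directory, ()))
        parent = posixpath.dirname(directory)
        if parent == directory:  # reached the top of the tree
            return owners
        directory = parent


def approved(changed_paths, lgtm_by, owners_by_dir):
    """Every changed file needs an LGTM from at least one of its owners."""
    reviewers = set(lgtm_by)
    return all(owners_for(p, owners_by_dir) & reviewers for p in changed_paths)
```

Anyone can still review and comment on the change; the owner lookup only decides whose LGTM is sufficient for a given file.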
24. Open and Collaborative Culture
• Over 99% of code is visible to all full-time Google engineers.
• Anyone can use, study, review, comment on, or contribute to any code.
• Code Ownership (Tree based)
• Cross team collaboration
• The fact that most Google code is available to all Google developers
has led to a culture where some teams expect other developers to
read their code rather than providing them with separate user
documentation.
26. Originally Code Review by email
then Mondrian from 2006, a web-based Code Review Workbench
27. Organization
• Flat & Autonomous
• Bottom-up
• Social
• Teams are aligned along business lines / focus areas / scrums
• Projects live and die based on free-market Darwinism: a project must
produce value to survive.
28. For Your Reading
• Google’s Innovation Factory: Testing, Culture, And Infrastructure
• Why Google Stores Billions of Lines of Code in a Single Repository
• Book: How Google Tests Software
• Quora: What is Google's internal code review policy/process?
• Quora: How does Google manage to solve the compilation, build, code review process?
• GoogleTechTalks on Youtube: Mondrian, Code Review On The Web
• GoogleTechTalks on Youtube: Using Gerrit to enhance your Git
• What we learned from Google: code reviews aren’t just for catching bugs
32. • Catch bugs (the least important benefit, trivial)
• Shared code style
• Build stability
• Social (with social psychology)
• Knowledge sharing (mentoring, learning, and avoiding a single point of failure, SPoF)
• Early feedback
• Team engagement
• Qualitative code selection
33. Pitfalls and Mistakes of Inexperienced Reviewers
(bad experiences, cause a lot of trouble)
• Trying to find all bugs (psychological pressure)
• Judging code by whether it's what the reviewer would have written
(hard feelings and frustration, pain and defeat!)
• Feeling obligated to say something (psychological pressure)
• Speed: don't rush through a review, but do it ASAP (procrastination)
http://goodmath.scientopia.org/2011/07/06/things-everyone-should-do-code-review/
34. “Code Review” is a misleading name
The goal is cooperation and social interaction among engineers, not
bug-finding
Co-located teams,
Collaboration, Share, Discuss
Open, Transparency
Social
The most positive result of a Code Review cycle, in addition to higher code
quality standards, is getting different people to look at a version of
the solution and have a constructive discussion around it.
36. Psychological Effects
If you're programming and you know that your coworkers are going to look at
your code, you program differently. You'll write code that's neater, better
documented, and better organized -- because you'll know that people whose
opinions you care about will be looking at your code.
Without review, you know that people will look at code eventually. But because
it's not immediate, it doesn't have the same sense of urgency, and it doesn't
have the same feeling of personal judgement.
39. Pre-launch Review, and review anywhere
to ensure that products answer common sense questions before release
• Is the design secure and customer data private?
• Will the service scale with the anticipated load?
• Does the UI meet standards?
• What are the data center utilization estimates?
• What are the latency estimates?
• …
40. For Your Reading
• Things Everyone Should Do: Code Review
• Things Everyone Should Do: Coding Standards
• What we learned from Google: code reviews aren’t just for catching bugs
• Ways to Make Code Reviews More Effective
• Key points you “must” know about Code Review …
• Why you should stick with code review
• Teacher Chen | My “code review” growth journey
• Microsoft: Code Reviews Do Not Find Bugs - How the Current Code Review Best Practice Slows Us Down
42. Gerrit Practices
Google Android, Wikipedia, Spotify, Baidu, Eclipse, LibreOffice, Openstack,
SAP, HP, Motorola, SONY Mobile, Intel, Qualcomm
Palantir, eBay, 个推, and many Startups
Initially created by Shawn Pearce (Google) as a tool for Android OS
development, derived from Google Mondrian via Rietveld
Lineage (diagram): Google Mondrian -> Google Rietveld and Critique -> Gerrit (Google, for Android)
43. Covers the Benefits of Team Programming
• Catch bugs (the least important benefit, trivial)
• Shared code style
• Build stability
• Social (with social psychology)
• Knowledge sharing (mentoring, learning, and avoiding a single point of failure, SPoF)
• Early feedback
• Team engagement
• Qualitative code selection
(refers to the earlier slide on the benefits of code review)
50. Gerrit Style vs. Gitlab (Github) Style
Gerrit: suits enterprise teams and collaboration between teams
1. clone the project
2. create commit
3. push for review
Github: for a “one-man band’s project”; suits scattered individual contributors
1. fork the project
2. clone the project
3. create commit
4. push to the forked project
5. create pull request
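The Gerrit step "push for review" can be sketched as a helper that builds the push command. In Gerrit, pushing to `refs/for/<branch>` creates a change for review instead of updating the branch directly, and a `%topic=` suffix attaches a topic. The helper name `gerrit_push_command` and its parameters are hypothetical, and the function only builds the argument list; it does not run git.

```python
def gerrit_push_command(branch, remote="origin", topic=None):
    """Build the git command that uploads HEAD to Gerrit for review.

    Pushing to refs/for/<branch> asks Gerrit to open (or update) a change
    targeting <branch>; the optional %topic= suffix groups related changes.
    """
    target = f"refs/for/{branch}"
    if topic:
        target += f"%topic={topic}"
    return ["git", "push", remote, f"HEAD:{target}"]


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(gerrit_push_command("master", topic="flag-cleanup")))
```

Compared with the fork and pull-request flow, the review request here is a side effect of a single push, which is why the Gerrit list is two steps shorter.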
51. BTW: Learn from the abstract designs of Github and Gerrit,
which can also be applied to the design of other products
• Gerrit’s permission scheme (Project +
Group + Permission)
• Subject: Single or multiple sets of people
identified by Gerrit
• Action: The ability to allow or deny a
specific operation.
• Resource: Single or multiple sets of Gerrit
objects (typically Git reference) that are
controlled by the permission
• Github’s Organization + Team + People
Model design borrowed for the merchant system: (figure not reproduced)
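The Subject / Action / Resource abstraction above can be sketched as a small rule table. This is an illustration under assumptions: the `Rule` type, the action names, the glob matching of refs, and the "deny overrides allow, default deny" policy are hypothetical simplifications, not Gerrit's actual evaluation rules.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    group: str          # Subject: a set of people
    action: str         # Action: the operation being allowed or denied
    ref_pattern: str    # Resource: a glob over Git references
    allow: bool = True


def is_allowed(user_groups, action, ref, rules):
    """Return True if some rule allows the action and no matching rule denies it."""
    matching = [
        r for r in rules
        if r.group in user_groups
        and r.action == action
        and fnmatchcase(ref, r.ref_pattern)
    ]
    if any(not r.allow for r in matching):
        return False  # assumption: an explicit deny always wins
    return any(r.allow for r in matching)
```

The same three-part shape (who, may do what, on which object) carries over to other products by swapping Git references for that product's own resources.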
52. Review Etiquette (social manners)
• Review as discussion board
• Review each file
• It’s all about the code
• Always answer all comments
• Use code in comments
• One change, one thing
• Use topics
• …
• Write comments carefully
• Write commit messages carefully
• Jargon tags for commit messages
• [Minor], [Major], [Crucial]
• [RFC] (Request For Comments)
• [WIP] (Work In Progress)
• [TBR] (To Be Released)
• [TBL] (To Be Launched)
• …
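The commit-message tags above lend themselves to an automatic check. The sketch below assumes that tags are written as bracketed prefixes at the start of the subject line, which the slide does not state; `KNOWN_TAGS` holds only the tags listed on the slide, and the function names are hypothetical.

```python
import re

# Only the tags named on the slide; the slide's list is open-ended.
KNOWN_TAGS = {"Minor", "Major", "Crucial", "RFC", "WIP", "TBR", "TBL"}

_TAG = re.compile(r"\[([^\]]+)\]\s*")


def split_tags(subject):
    """Split leading [Tag] prefixes from a commit subject line."""
    tags = []
    rest = subject
    while True:
        match = _TAG.match(rest)
        if not match:
            break
        tags.append(match.group(1))
        rest = rest[match.end():]
    return tags, rest


def unknown_tags(subject):
    """Return leading tags that are not in the team's agreed vocabulary."""
    tags, _ = split_tags(subject)
    return [t for t in tags if t not in KNOWN_TAGS]
```

A check like this could run as one of the robots in the review loop, so a mistyped tag is reported immediately instead of during review.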
60. For Your Reading
• Handbook: Learning Gerrit Code Review
• Book Review: Learning Gerrit Code Review
• Gerrit Code Review - A Quick Introduction
• About Gerrit and its Google Gene (FanQiang: requires a VPN to access)
• GerritForge: Gerrit Code Review Enterprise
• Android - Submit Patches (FanQiang: requires a VPN to access)
• Android - Life of a Patch (FanQiang: requires a VPN to access)
• GoogleTechTalks on Youtube: Using Gerrit to enhance your Git
• Quora: What is Google's internal code review policy/process?
• A successful Git branching model
• Top 10 tools for code review and 15 Best Code Review Tools for Developers
64. (Architecture diagram; only the recoverable labels are listed.)
• Codebase in the Cloud: Git Repos in IDC (Review Board), next to the Online Services in the IDC
• Shenzhen Office Local Git Repos and Shanghai Office Local Git Repos, each connected to the cloud codebase by "sync"
• Shenzhen Local CI for Dev & Test (Robots) and Shanghai Local CI for Dev & Test (Robots), each connected by "fetch"
• Developers at Shenzhen, Developers at Shanghai, Developers Anywhere: collaborative work
66. For Your Reading
• Static code analysis: do it the right way
• Analysis vs. Preview vs. Incremental Preview in SonarQube
• Google Engineering & Technology
• List of tools for static code analysis