MathWorks has approximately 100 products derived from a single large code base, with over 1,000 developers contributing changes to almost one million source files. Their products are used to develop safety-critical systems, so managing a continuous influx of changes while guaranteeing quality and correctness is challenging. Learn how MathWorks, unable to use a simple model of “component as a directory,” created an elegant system using virtual streams and the Perforce broker in their ongoing efforts to modularize their code base.
2. #
• Background
• Component-Based Development
• Migrating to Streams
• Remaining Challenges
• Questions
3. #
• We are a 3000+ person company dedicated to
accelerating the pace of engineering and science
• We have ~90 products based upon our
core platforms
– MATLAB – The Language of Technical Computing
– Simulink – Simulation and Model-Based Design
4. #
• Unified code base from which full product family is
released twice a year
• Integrating changes from ~1000 developers
• Managing an almost 1 million file code base
6. #
• Components have metadata (XML files) that
describe their key characteristics
– what portions of the SCM namespace they own
• A mixture of directories and files
– what components they depend on
• Each source file is owned by exactly one
component
– Some directories consist of files from different
components
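The "exactly one owner" rule above can be checked mechanically. The following is a minimal sketch, assuming the XML metadata has already been expanded into a set of file paths per component; the function name and data layout are hypothetical, not the MathWorks tooling.

```python
from collections import defaultdict


def find_ownership_problems(components, source_files):
    """Report files that break the 'exactly one owner' rule.

    components   -- dict: component name -> set of owned file paths
                    (assumed to be pre-expanded from the XML metadata)
    source_files -- iterable of every file path in the source tree
    """
    owners = defaultdict(list)
    for name, files in components.items():
        for path in files:
            owners[path].append(name)

    # Files claimed by more than one component.
    multiply_owned = {p: sorted(n) for p, n in owners.items() if len(n) > 1}
    # Files that no component claims.
    unowned = sorted(set(source_files) - set(owners))
    return multiply_owned, unowned
```

A directory may still hold files from several components; only the per-file ownership has to be unique.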
7. #
• Each branch is composed of a collection of
components
– We call this collection the CTB
(or “Components To Build”) list
• Branches are hierarchical
– Children have a proper subset of their parent’s
components
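As an illustration of the hierarchy rule, here is a small sketch that flags any branch whose CTB list is not a proper subset of its parent's. The dictionaries and branch names are hypothetical stand-ins for however the CTB lists are actually stored.

```python
def check_hierarchy(parent_of, ctb):
    """Return (child, parent) pairs that violate the proper-subset rule.

    parent_of -- dict: child branch name -> parent branch name
    ctb       -- dict: branch name -> set of component names (its CTB list)
    """
    violations = []
    for child, parent in parent_of.items():
        # '<' on sets is the proper-subset test: every child component
        # must be in the parent, and the parent must have at least one more.
        if not ctb[child] < ctb[parent]:
            violations.append((child, parent))
    return sorted(violations)
```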
8. #
• Wide-open client views
– Eliminates need to manage client views
– Eliminates constraints with adding/moving files
• Restricted branch views
– Computed from the component metadata
– Minimizes integration records in server database
– Makes merging fast
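To make "computed from the component metadata" concrete, the sketch below builds the lines of a restricted branch view from a CTB list. It assumes each component's owned paths are already known, that directories are written with a trailing slash, and that the depot roots shown in the usage are made-up examples.

```python
def branch_view(ctb, owned_paths, parent_root, child_root):
    """Build branch-view mapping lines (source path, then target path).

    ctb         -- set of component names on the child branch
    owned_paths -- dict: component name -> list of owned paths;
                   a trailing '/' marks a directory (assumed convention)
    parent_root -- depot root of the parent codeline
    child_root  -- depot root of the child codeline
    """
    lines = []
    for component in sorted(ctb):
        for path in sorted(owned_paths[component]):
            if path.endswith("/"):
                # A directory maps recursively with the '...' wildcard.
                suffix = path.rstrip("/") + "/..."
            else:
                suffix = path
            lines.append(f"{parent_root}/{suffix} {child_root}/{suffix}")
    return lines
```

Because only the paths owned by components in the CTB list are mapped, files outside that list never acquire integration records on the child branch.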
9. #
• Branch views don’t define the “width” of branches
– They define mappings between related codelines
– With a wide-open client view, stale files remain visible
[Figure labels: Mapping Width, Branch Width, “Dead wood”]
10. #
• Narrow client views
– Makes certain workflows more cumbersome
• Adding a new component
• Expanding the CTB list to include an existing component
• Changing the files a component owns (its “shape”)
11. #
• Our strategies for managing client and branch
views were not meeting our needs
– Initial efforts worked for small, relatively static branches
– Larger, more dynamic branches became a nightmare to
manage
13. #
• Streams replace separate client and branch views
with a unified stream view
• Streams are true hierarchical branches
• Their evolution is recorded as part of depot history
14. #
• Avoid the need for developers to know about or
manipulate client and/or branch views
• Make componentization and refactoring easier
• Keep workflows simple and robust
• Use Perforce efficiently so as to avoid
performance and scalability issues
15. #
• Any change to a stream immediately affects all
subsequent operations involving the implied client
or branch views
• So a developer who changes a stream’s shape
can affect the work-in-progress of all other
developers working in that stream, even if they
haven’t re-synced their workspace
16. #
• Wide-open client views
• The ability to widen a branch (stream) from below
• Stream (shape) changes to be atomic with content
changes
18. #
• Client uses real stream
– Wide open view
“share …”
• Each real stream has a
corresponding virtual
stream
– Holds the limited
component-derived view
19. #
• A broker wrapper dynamically switches the client
to the virtual stream when running commands that
should work in a limited view
– p4 sync, flush, update
– p4 merge, copy, integrate, populate
• Transparent for both CLI and P4V usage
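The routing decision the wrapper makes can be sketched as a pure function. This is only an illustration of the command split listed above, not the actual broker configuration; the "-virtual" naming convention for the companion stream is an assumption.

```python
# Commands that should see the limited, component-derived view.
CLIENT_SWITCH_COMMANDS = {"sync", "flush", "update"}
ARGUMENT_REWRITE_COMMANDS = {"merge", "copy", "integrate", "populate"}


def virtual_stream_for(real_stream):
    """Hypothetical naming convention: each real stream has a '-virtual' twin."""
    return real_stream + "-virtual"


def route_command(command, real_stream):
    """Decide how a p4 command should be handled for a given real stream.

    Returns (action, stream), where action is one of:
      'switch-client'    -- temporarily point the client at the virtual stream
      'rewrite-argument' -- substitute the virtual stream in the command arguments
      'pass-through'     -- run unchanged against the wide-open real stream
    """
    if command in CLIENT_SWITCH_COMMANDS:
        return ("switch-client", virtual_stream_for(real_stream))
    if command in ARGUMENT_REWRITE_COMMANDS:
        return ("rewrite-argument", virtual_stream_for(real_stream))
    return ("pass-through", real_stream)
```

Commands such as edit, add, and submit fall through to the real stream, which is what keeps the client view wide open for refactoring.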
23. #
• The shape of the virtual stream is updated on
submit
– via a change-content trigger
– only when the submission contains metadata changes
• Validates user changes to component metadata
• Virtually no workflow changes
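The "only when the submission contains metadata changes" test is what keeps ordinary submits fast. A minimal sketch of that check follows, assuming the metadata files can be recognized by name; the `component.xml` pattern is hypothetical, since the real file naming is not given here.

```python
import fnmatch


def needs_shape_update(changed_files, metadata_pattern="*/component.xml"):
    """Return True if any file in the changelist is component metadata.

    changed_files    -- depot paths of the files being submitted
    metadata_pattern -- glob for metadata files (assumed naming convention)
    """
    return any(
        fnmatch.fnmatchcase(path, metadata_pattern) for path in changed_files
    )
```

A trigger built on a check like this would recompute and validate the virtual stream's view only when the function returns True, and let every other submit proceed untouched.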
24. #
• Provides a wide client/narrow branch paradigm
without the shortcomings
• No need for developers to know about or
manipulate client or branch or stream views
• Sophistication occurs in the re-computation of the
stream shape at submit time
25. #
• Developed automated tooling for merging
– Temporary virtual stream to bound merge down
– Wide open stream for merge up
• Insulates users from the complexities of merging
with time-varying component definitions (shapes)
27. #
• Test Infrastructure that mirrors production
• Every night at midnight, the test server is rebuilt
from the backup of production
• Allows us to test any sort of approach we want
with real live data without impacting production
28. #
• Couldn’t just “p4 populate” the streams
– Too much change in flight
• Used deep renames
– Preserved revision history
– Enabled incremental migration
– Undocumented command
– Experienced random loss of integration records
31. #
• Rename is powerful, but has many merge issues
– Renames that cross view boundaries
• Old name or new name is out of the destination view
– Complex sequences of changes cannot be merged
atomically
• Rename followed by re-add of old name
– Incorrect automatic merge results
• Unnecessary manual resolves
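The "crosses view boundaries" case can be detected before a merge is attempted. The sketch below classifies a rename against a destination view, simplifying the view to a list of path prefixes; real stream views have exclusions and remappings that this deliberately ignores.

```python
def classify_rename(old_path, new_path, view_prefixes):
    """Classify a rename relative to a destination view.

    view_prefixes -- path prefixes that make up the destination view
                     (a simplification of a real view spec)
    """
    def in_view(path):
        return any(path.startswith(prefix) for prefix in view_prefixes)

    old_in, new_in = in_view(old_path), in_view(new_path)
    if old_in and new_in:
        return "inside view"
    if old_in:
        return "new name out of view"
    if new_in:
        return "old name out of view"
    return "outside view"
```

The two middle results are the problematic ones: the destination sees only half of the rename, so it appears as a bare delete or a bare add.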
32. #
John LoVerso
john.loverso@mathworks.com
The Team:
Marc Ullman
Michael Mirman
Karishma Panjwani
Raghuvir Leelasagar
Editor's notes
I’m John LoVerso from MathWorks.
I’ll share with you
how we used streams to complete moving our development into Perforce.
We migrated from a home-grown CI system
to Classic Perforce.
I’ll cover some issues we hoped Streams would solve,
and describe the unique and novel way of using virtual streams that enabled us to be successful
They serve a wide array of markets
Aero, Auto, Bio, Communications, Finance, Medical, etc.
Products are used to develop safety-critical systems
Quality and correctness are paramount
build and test of inbound code changes before they are committed to the branch
As I said, we have a single unified source tree that builds all of our products.
Our code base has grown organically over the years, which means our products have many interdependencies.
To help manage this, we have broken our source tree up into proper subsets called components.
A component consists of a unique subset of the source tree:
a collection of subdirectories and individual files.
It owns all the files in that subset.
The directories these subsets cover can overlap, but each file is owned by exactly one component.
Each of our products is made up of multiple components. We have over 4000 components.
The characteristics of each component are recorded in a metadata file that
is checked into and part of our source tree.
This metadata was initially used to drive our build and release processes.
Our branches are composed of the collection of components for the sources
that developers of that branch need to modify
The branch shape is computed from the union of the component metadata for all the components in that branch
As development happens, the subdirectories included in a component can change.
Additionally, the set of components needed by a branch can change,
for instance to include an additional library.
Thus, branches change shape over time.
Our branches are organized in a hierarchy, with a main branch at the top,
integration branches under that, all the way down to leaf branches used by product teams.
All of our branches are proper subsets of the branch above them.
When we started to move our development into Perforce,
we used wide open client views so that developers had freedom in their day to day development.
We used restrictive branch views to restrict the files merged between branches,
to avoid huge leaf branches,
limit the number of integration records,
and make merging faster.
The problem with this approach is that source files that are no longer logically on your branch continue to remain on your branch, and visible.
This is because our branches are constantly changing shape.
For example, you might need to add a component to your branch to make changes to its source code.
Later on, when that component is no longer needed on that branch, it would be removed from the components to build list, to narrow the branch and decrease your build time.
As our branch views are computed from the component metadata,
the set of files merged between branches would be correct.
But the wide open client view would continue to expose the source files
for any component that was ever part of the branch.
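The leftover files described here (the "dead wood") are simply the files present on the branch whose owning component is no longer in the CTB list. A minimal sketch of that computation, with hypothetical data structures standing in for the depot and the metadata:

```python
def dead_wood(files_on_branch, ctb, owner_of):
    """List files still visible on a branch that no longer belong there.

    files_on_branch -- iterable of file paths present on the branch
    ctb             -- set of component names currently on the branch
    owner_of        -- dict: file path -> owning component name
    Files with no recorded owner are also reported.
    """
    return sorted(
        path for path in files_on_branch if owner_of.get(path) not in ctb
    )
```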
We next tried using a narrow client view computed from the component metadata.
Developers ran a script that computed the view and then did the sync for them.
Of course, if they did the sync directly, they would get the wrong results.
Using a narrow client view made for more difficult developer workflows,
such as renames, as you first had to update the view before
you could make any changes that crossed existing component boundaries.
Any refactoring required coordinated changes to your client and the branch view,
and possibly to your parent’s client view in the case of trying to widen your branch from below.
This was a nightmare to manage for both developers and release engineers.
Scalability begot componentization,
but while client views seemed well suited to componentization, the reality was that things were painful.
At the beginning of 2013, we had fewer than 30% of our developers using 50 branches in Perforce.
We didn’t think we could move the rest.
Streams replace separate client and branch views with a unified stream view.
We thought this would help solve our client and branch view problems.
The branch and client views generated from a stream track together, so that the effective width of a stream (branch) is reduced when the stream view is shrunk.
Because stream views are common to all users of a given stream, changing the shape of a stream immediately affects all other users.
That happens asynchronously from the code change that drove the stream shape change.
Because our streams generated limited client views, we had the same problems as when we used Classic Perforce with a narrow client view. It was not easy to add or move files outside the current stream definition until the stream had been widened.
It made refactoring tasks more difficult.
Because stream views are inherited and bounded by the width of their ancestors, it is impossible to widen a stream “from below” (starting with a leaf stream)
The truth is that streams are far less flexible.
The streams inheritance model did not mesh well with our desired workflows.
We realized we needed additional goals to make streams a viable solution for us
Virtual streams provide the ability to limit the view into a regular stream.
In their normal usage, virtual streams don’t help to avoid any of the drawbacks we encountered.
All they can do is further narrow the view of an existing stream.
Working with folks at Perforce,
we realized we can use a combination of regular and virtual streams to meet all of our design goals
“Time Consistent Stream Shapes”
The user’s client uses a stream with wide open view – mapping the entire branch
Enables refactoring workflows
Allows widening the branch from below
Can work even on files not yet on parent stream
Key is to substitute virtual stream only for those operations
That should act on narrow branch
Certain commands are redirected to the virtual stream
For commands like sync, we change the stream in the client spec.
For commands like merge, we change the argument being passed in the command.
This syncs the content from the given change, but uses the shape of the tip. You have to explicitly set StreamAtChange in your client to get the correct shape, and that’s not the default.
Developers can use “p4 submit” when making configuration changes (rather than a special tool)
Key concept:
Committing stream shape changes and associated content changes becomes an atomic operation
All of these characteristics serve to ensure the integrity of our code base
Cost is that submit with a config change takes longer (up to 90 seconds)
Developers can work with wide-open client view
Branches are no wider than needed
When branches are narrowed, abandoned files go out of view
Developers can add or move files outside of current branch width without needing to modify stream view first
New stream views can be qualified without impact to common stream definition
Branches can be widened from below
Stream views automatically remain in sync with the checked in code.
developers don’t even need to think about SCM views – they only need to change the component metadata (which was already a requirement of our build system)
Success
The tradeoff for making day-to-day workflows easier was that we didn’t impact merging between branches.
Merging narrow branches was already complicated; we didn’t directly make it easier.
95% of our users are insulated from this
Real magic is auto merge process
As a result, we’ve made component-based development and merging a relatively painless process
Users are largely insulated from complexity of this … until we hit P4 merge bugs
Avoid extra integration records for every file on every branch.
Used array of test servers to develop tooling and pre-qualify each deep rename before doing it on production server
Deployed trigger to old depot to start creating the virtual stream even before the deep rename.