People are always happy to see this. Project Managers don’t want to spend all their time mucking about with spreadsheets and status reports. Team members don’t want to be distracted from their work to perform overhead activities. The minimalist approach to measurement is always met with smiles. And then...
The key attributes of useful metrics:
(1) Used for decisions. The measurement is used by some stakeholder to make decisions at some level. Measurements that are just filed away and never used are merely waste.
(2) Level of detail. Each stakeholder can consume and use information at a particular level of detail. An executive will not be able to consume static code analysis statistics about cyclic dependencies, but will be able to consume information about code quality at a higher level of abstraction than that.
(3) Scope. A team member will care about information pertaining to the team and project; a program manager will care about information pertaining to all the projects in his/her program; an executive will care about information pertaining to the enterprise as a whole.
(4) Time frame. The customer or Product Owner needs up-to-the-minute information throughout the project; the program manager needs information pertaining to a release; an executive needs information pertaining to the time frame of a strategic plan or budget period. The executive can’t consume or use a daily report of iteration progress; the time frame is too small to be meaningful in his/her job.
...we get down to the nitty-gritty about what is “necessary.” The definition of “necessary” includes all stakeholders. Everyone involved in a project must understand and accept that a certain amount of time will be spent taking measurements they, personally, aren’t interested in. We don’t want to waste time measuring and tracking information that nobody uses. We do have to ensure all stakeholders receive the information they need.
Many organizations have experimented with agile methods, and a significant minority have moved beyond the proof-of-concept or pilot-project stage. However, as this is written, no large organization is using agile as its primary, mainstream software development approach; agile remains a secondary or alternative approach for development projects. Your organization probably falls somewhere along a spectrum between the “fully traditional” and “fully agile” extremes. Some of the organizational differences that have implications for project metrics are listed on the slide. For stakeholders whose interest is at the level of a single team or single project, if the organization is “fully agile,” then the metrics presented so far will be sufficient (with the addition of financial metrics). If the organization is not “fully agile,” you may need to provide additional project metrics to ensure the project’s true status is properly understood by all stakeholders, and to help immature agile teams improve their effectiveness. The specifics will vary by circumstances: different organizations have different problems and are at different levels of maturity with agile and lean thinking. We will present a few examples in this presentation, but you may well have to think of metrics that are meaningful in the context of your own situation.
Can we derive a basic set of metrics for agile teams based on the principles of the Agile Manifesto and what we know of project stakeholders’ needs? If working software is the primary measure of progress, then let’s measure the amount of working software the team delivers. The Product Backlog contains a list of features the customer wants to see in the software product. (These are the functional requirements.) As the team delivers each feature, we can count the number of features that have been completed and that are running in the development environment with all tests passing. This number should climb as the team builds up more and more of the software.
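For teams that want to report this as a number, the count can be as simple as the sketch below. This is only a minimal illustration; the Feature record and its field names (accepted, deployed_to_dev, all_tests_passing) are assumptions for the example, not taken from any particular tool. The metric climbs only when a feature is complete, deployed, and green.

```python
# Minimal sketch of a Running Tested Features count, assuming a simple
# in-memory record per feature. The field names are illustrative only.
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    accepted: bool            # customer has accepted the feature as complete
    deployed_to_dev: bool     # it is running in the development environment
    all_tests_passing: bool   # its automated tests pass in the latest build

def running_tested_features(features):
    """Count features that are complete, deployed, and passing all tests."""
    return sum(
        1 for f in features
        if f.accepted and f.deployed_to_dev and f.all_tests_passing
    )

backlog = [
    Feature("login", True, True, True),
    Feature("search", True, True, False),    # failing test: not counted
    Feature("reports", False, False, False),
]
print(running_tested_features(backlog))  # -> 1
```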
This is a metric to help track the “valuable” part of “valuable software.” Earned Business Value (EBV) may be measured in terms of hard financial value based on the anticipated return on investment prorated to each feature or User Story. Alternatively, EBV may be expressed as the relative value of features or User Stories. In either case, the development team must ask the customer or customer proxy to assign a value to each feature or User Story so that there will be a basis for this measurement. If the customer will not or cannot assign a value to each feature or User Story, then the next best thing is to assume the highest priority features are also the highest value features, and track “value” on the basis of the team’s delivery of high priority stories.
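As a rough illustration of the idea (the story names and values below are invented, and the data shape is assumed for the example), EBV can be reported as the share of total customer-assigned value that has been delivered to date:

```python
# Illustrative sketch of Earned Business Value (EBV), assuming the customer has
# assigned a relative value to each User Story.
def earned_business_value(stories):
    """Return EBV as the fraction of total assigned value that has been delivered."""
    total_value = sum(value for _, value, _ in stories)
    delivered_value = sum(value for _, value, done in stories if done)
    return delivered_value / total_value if total_value else 0.0

# (story, customer-assigned value, delivered?)
stories = [
    ("accept credit cards", 50, True),
    ("order history", 20, True),
    ("gift wrapping", 5, False),
]
print(f"EBV: {earned_business_value(stories):.0%}")  # -> EBV: 93%
```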
Sample velocity chart.
This principle suggests three things: (1) Customer satisfaction, (2) early and continuous delivery, and (3) valuable software. Velocity is a measure of the amount of work the team completes per iteration. Features are usually divided into User Stories, and User Stories are sized by the development team in terms of story points. When the customer accepts a story as complete, the team is credited with the number of story points associated with that story. The team receives no “partial credit” for incomplete stories. Therefore, by tracking velocity we are tracking customer satisfaction. Since velocity is calculated in each iteration, and software is demonstrated to the customer in each iteration, by tracking velocity we are tracking “early and continuous delivery.”
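The calculation itself is trivial, which is part of velocity’s appeal. The sketch below is only an illustration with invented point values; note that the incomplete story earns nothing, no matter how close to done it is.

```python
# Minimal sketch of a velocity calculation: sum the story points of stories the
# customer accepted during the iteration; incomplete stories earn no partial credit.
def velocity(iteration_stories):
    return sum(points for points, accepted in iteration_stories if accepted)

iteration_7 = [(5, True), (3, True), (8, False), (2, True)]  # (points, accepted?)
print(velocity(iteration_7))  # -> 10; the 8-point story carries over, uncounted
```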
Snapshot of static code analysis output for a real project. It looks like the team has gotten carried away with the graphical capabilities of their reporting tool. Too small to read on a slide. Important parts: Test coverage 73%, tests passing 99.1%, most complex packages, least tested methods, some of the statistics in the blue section.
Closeup of statistics from static code analysis shown on the previous slide.
Graphic taken from here: http://hackystat.ics.hawaii.edu/hackystat/docbook/ch10s04.html
EV is the sum of the planned value (PV) of all the work items completed to date. It is based on budgeted cost (BCWP), not actual cost (ACWP). Comparing EV against PV therefore reveals schedule variance, but EV on its own tells us nothing about cost overruns, because actual costs are not part of the calculation. Many people include an ACWP trend line in their EV charts alongside EV so that the real costs are visible.
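For reference, the underlying arithmetic is simple. The sketch below uses invented numbers purely as an illustration; it shows why an ACWP figure is needed before any cost variance can be computed.

```python
# Standard earned value arithmetic, shown as a small sketch. EV is built from
# budgeted cost (BCWP) of completed work; adding actual cost (ACWP) is what
# makes cost overruns visible. The numbers are invented for illustration.
def earned_value(work_items):
    """Sum the planned value of completed work items (BCWP)."""
    return sum(pv for pv, done in work_items if done)

work_items = [(10_000, True), (8_000, True), (12_000, False)]  # (planned value, complete?)
pv_to_date = 25_000   # value of work scheduled to be complete by now
ac_to_date = 21_000   # actual cost of work performed (ACWP)

ev = earned_value(work_items)          # 18_000
schedule_variance = ev - pv_to_date    # negative -> behind schedule
cost_variance = ev - ac_to_date        # needs ACWP; negative -> over budget
print(ev, schedule_variance, cost_variance)  # 18000 -7000 -3000
```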
Differences between predictive and adaptive planning for purposes of using EVM. The main difference is that the scope is defined at a finer level of granularity with predictive planning than with adaptive planning. With predictive planning, the level of detail gives the illusion of accuracy through false precision. In fact, details cannot be known in advance with a high degree of accuracy. With adaptive planning, the coarse level of detail means low precision. Provided people understand the plan is only approximate, the result may be higher accuracy, but only within a relatively wide margin of error.
Agile development projects follow either an iterative or a non-iterative process. With an iterative process, the work is divided into equal-length time periods called “iterations” or “sprints.” In each iteration the team commits to deliver a fixed amount of work, selected by the primary stakeholder according to business priorities. With a non-iterative process, the team works from a prioritized queue of work items and completes them one at a time; this process is based on the lean manufacturing concepts of “customer pull” and “single-piece flow.” In either case we can usually apply EVM by breaking the costs down into fixed-length time intervals.
The EVM calculations depend on our being able to define a discrete level of effort for each work item. In situations when that is not feasible, EVM may yield inaccurate and misleading results.
The Value Delivery quadrant of the scorecard might look something like this. The example has snapshots of three charts from a spreadsheet program that show metrics relevant to value delivery and release status: Earned Business Value, Running Tested Features, and Release Burndown. It also contains a simple indication of the general status of delivery risks. The example shows a yellow light, which means not every issue has been resolved, but there are no critical issues.

On the flipchart or whiteboard, write “Delivery Effectiveness” as the title of the upper right-hand quadrant on the scorecard. Ask participants which agile metrics pertain to this category. Possible answers:

Burn chart. The bar chart version of the burndown chart is based on the same data as the line version we displayed in the Value Delivery quadrant, but makes the team’s effectiveness visible by correcting for scope changes. The tops of the bars descend smoothly when the team’s velocity is stable. Additional scope is shown at the bottom of each bar in a contrasting color, dropping below the zero line.

Velocity chart. This shows the quantity of work the team has completed (through customer acceptance) in each iteration. Note that velocity cannot be compared directly across teams or across projects. Story sizes depend on the particular team, the problem they are solving, and the technical environment of the solution. Different teams may settle on different scales for story sizes and will reach different consensual agreements about how many points a story deserves. What is of interest in the velocity chart are the variations, trends, and patterns over time.

Those two metrics should be sufficient for a mature agile team operating in a supportive organizational culture. However, if the team is not applying agile methods very well, or if the surrounding organization does not understand where the waste lives in its pre-agile methods, then some additional metrics may be appropriate here. Your specific needs will vary and you should come up with solutions tailored to your situation. A couple of the items we’ve discussed might be appropriate: Al Goerner’s Release Progress Report Card will show the gap between value-added work and overhead work, and the simple metric Story Cycle Time will expose problems with “hangover” – stories started in one iteration and completed in a subsequent iteration (a small sketch of this calculation follows).
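Measured in iterations, Story Cycle Time is simply the span from the iteration in which work on a story starts to the iteration in which the customer accepts it. The sketch below is only an illustration; the story names and iteration numbers are invented.

```python
# Illustrative sketch of Story Cycle Time in iterations. A value above 1
# indicates "hangover": the story was not finished in the iteration it started.
def story_cycle_time(started_iteration, accepted_iteration):
    return accepted_iteration - started_iteration + 1

# story -> (iteration started, iteration accepted)
stories = {"login": (3, 3), "search": (3, 4), "reports": (4, 6)}
for name, (start, accept) in stories.items():
    print(name, story_cycle_time(start, accept))  # login 1, search 2, reports 3
```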
Here is an example of the Delivery Effectiveness quadrant of a scorecard. This example includes the release burndown in bar chart form. Although the burndown is displayed in the Value Delivery quadrant, the bar chart format highlights the team’s effectiveness more obviously than does the line chart version. The velocity chart shows how consistently the team delivers the quantity of work that has been established as “normal” for this particular team on this particular project under this particular set of circumstances. Excessive variation in velocity indicates a problem with delivery effectiveness. In the example, the team attained a very low velocity in one of the past iterations. Whatever the problem was, it appears the team has dealt with it successfully, since the pattern has not recurred and there is no negative trend in velocity. This example also includes the Release Progress Report Card, which may be useful when a team is spending an inordinate amount of time on overhead work as opposed to value-added work. It also includes the Story Cycle Time metric, in this case showing that the team often takes two iterations to bring a story to completion. That is a “smell” that calls for further investigation.

On the flipchart or whiteboard, write “Software Quality” as the title of the lower left-hand quadrant. Ask participants to name some factors or metrics that pertain to software quality. Most people will probably mention bugs or defects without much quantification of what metrics they are thinking about. Some people might mention customer satisfaction as a quality attribute. Some of the –ilities may come up, as well. These are all good answers. There are also some static code analysis metrics that pertain to code quality.
This example lists several items that might be appropriate under the heading of Software Quality. Customer satisfaction is not really a “metric.” It may be any sort of feedback, formal or informal, that indicates the customer’s level of satisfaction with the code delivered to date, or with his/her interaction with the team. Non-functional requirements are characteristics of a software system that may be seen as quality attributes. For example, “availability” is a quality attribute if there are specific requirements for system availability; availability is measurable, and you can report metrics to show the current level of quality of the system with respect to that requirement. Other quality attributes can only be determined subjectively; “usability,” for example. Most contemporary software development environments include static code analysis features, and some structural attributes of a code base speak to quality in one way or another. Be wary of overdoing it, as many static code analysis tools offer a huge variety of statistics and a wide array of compelling graphical representations of the data. Only include metrics the team can act upon to improve quality.

Defect density is a somewhat crude but widely used indicator of quality. A “defect” may be defined in whatever way makes sense in your environment. Typically, defects include the bugs that have been reported against code the team has already delivered plus the number of failing tests in the latest build. Express the sum of these values as a ratio over KLOC (a small sketch of this arithmetic appears after these notes). Industry norms reported by IBM for applications written in languages like Java, C++, and C# are about 0.362 defects per KLOC. This gives your team a target to aim for, although one would hope that disciplined use of agile methods would keep the level very close to zero.

Mention that for external-facing status reports (for instance, reports upward in the management hierarchy), the fourth quadrant might be devoted to financial metrics. For inward-facing status reports (information the team will use for its own purposes), the fourth quadrant can be devoted to continuous improvement.

Let’s briefly consider financial metrics, bearing in mind that this is a topic of some complexity that cannot be covered well in a couple of minutes. The chain of command from project manager to program manager to CIO will be interested in how each level of management below his/her own is handling the budget. Corporate IT departments are treated as cost centers. As such, they are allocated a fixed budget, usually on an annual basis. Managers are considered good financial managers if they burn their budget allocation smoothly over the course of the fiscal year and end up at zero. Obviously, this is not a very businesslike view of financial management. Yet it may be a requirement that you report up the chain on how your portion of the IT budget is burning. It is easy enough to include this information on the scorecard.

A more interesting financial metric might be the project’s ROI. We use the term ROI loosely in this context. It is not possible to base the ROI calculation on real numbers, since the business sponsor of an IT project is gambling on the value of the project. He/she has calculated an expected return somehow, but however he/she calculated it, it remains a somewhat subjective and hopeful number. We cannot know whether the new system yielded any value until after the fact, when we can collect data on it in production and in the context of the business operations it supports.
Even then, we cannot know what proportion of the ROI is attributable to the software as such and what proportion is a result of general business process improvement or marketing activities. Despite the limitations, a report of ROI may be a useful communication tool in a mixed environment where the agile teams must prove their worth to a skeptical organization. The next couple of slides provide some general background information on applying Throughput Accounting principles to agile software development projects.
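Returning to the defect density calculation mentioned above: the arithmetic is simple enough to show in a few lines. This is only a sketch, and the defect counts and code size below are invented for illustration.

```python
# Sketch of the defect density calculation described earlier: open defects plus
# failing tests, expressed per thousand lines of code (KLOC).
def defect_density(open_bugs, failing_tests, lines_of_code):
    return (open_bugs + failing_tests) / (lines_of_code / 1000.0)

density = defect_density(open_bugs=4, failing_tests=2, lines_of_code=48_000)
print(f"{density:.3f} defects per KLOC")  # -> 0.125 defects per KLOC
```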
A scorecard for the team to use might include a section on continuous improvement opportunities. Any team can find ways to improve its effectiveness; and let’s face it, most teams that claim to be using agile methods today are not really very disciplined about pushing the envelope. Any areas the team decides it wants to work on can be included. By displaying these on the scorecard, the team has a visible reminder of its commitment to continuous improvement and of the specific areas the team members have chosen to focus on just now. The results of any improvements in the team’s working style will eventually be reflected in the other quadrants of the scorecard, as well as in the quality of the code the team delivers. Of the examples shown, “build frequency” can be taken directly from the continuous integration server; “escaped defects” (those that get past the development team to be discovered later, possibly by users or by a QA group) can be determined from production support tickets or bug reports; “use of TDD” can be directly measured on Java projects by a new Eclipse plug-in under development at the University of Hawaii called “Zorro.” Use of TDD may also be inferred indirectly from some of the other metrics, such as cyclomatic complexity, structural complexity, and defect density. “Big-bang refactorings” may be reported by team members or noticed in tell-tale trends in velocity metrics. Ask participants what other areas of agile practice their teams might want to consider as opportunities for improvement.
The next few slides provide some examples of scorecards that other companies have come up with. They are provided as examples only, and we won’t spend much time on them.
Nothing much to note here except that this is a good example of a “fancy” agile tracking tool. Beware of the allure of pretty graphics. It’s easy to be led astray and start including a lot of unnecessary data on information radiators and status reports. This can cause the useful information to get lost in the noise.
This is a screenshot from Serena’s agile project tracking tool. Like other scorecards, it divides the display into sections that focus on particular aspects of the project. The product presents different views of the same data depending on what you’re interested in. The example shows a Release Status View. The scorecard omits information that isn’t especially pertinent to release status. That helps keep the scorecard small and reduces visual clutter so people can see what they need to see.