Jairus Hihn and Tim Menzies present research on improving software cost estimation. Data at JPL indicates that planned flight software effort grows by 55% to 75% and that schedules slip by 24%. They have developed an estimation tool called 2CEE that builds cost models from historical NASA data, automatically pruning irrelevant past projects and uninformative features, and that outperformed existing methods in their studies. 2CEE is undergoing a 12-month trial, running in parallel with current models, to evaluate it for acceptance as a JPL-supported effort estimation tool.
Sas 07 Costing
1. Software Development Cost:
How Much? You Sure?
Executive Summary
Jairus Hihn
Jet Propulsion Laboratory
California Institute of Technology
jhihn@jpl.nasa.gov
Tim Menzies
Lane Department of Computer Science
West Virginia University
tim@menzies.us
NASA Software Assurance Symposium,
Morgantown, WV
September 25-27, 2007
SAS_07_Software Development_Cost_Hihn/Menzies
2. IMPORTANCE/BENEFITS
Data at JPL indicates that
flight software planned effort grows by
75% from Initial Confirmation
55% from Confirmation Review
Schedule slips by 24% from Confirmation Review
Allocated budgets are seriously out of line with software team
estimates
The products of this research task will enable us to
Improve our performance against these metrics
Develop ‘reasonable’ estimates for IV&V resource allocation
to verify NASA mission software
3. RELEVANCE to NASA
NASA, even at the center level, has very limited
knowledge of its actual software cost performance
and will be doomed to repeat the past if it cannot learn
the lessons from its past
NASA must establish the capability to
Update estimates quickly as designs evolve
Have sufficient basis of estimate to defend reasonable
software cost estimates
Understand the risk and uncertainty within our estimates and
budgets
4. TASK Problem
“Cost” is a quality issue
If development cost is underestimated…
… developers will be forced
into many quality-threatening
cost-cutting measures.
Can you think of anything
you can do ...
… that harms a project more…
… than allocating insufficient
resources?
5. Problem: Are we using the right models?
A major reason for poor software cost estimation:
… NASA’s managers don’t have the information they need
Not enough relevant data
Current costing models are brittle and improperly tuned
e.g. “officially”, COCOMO’s
tuning parameters vary
2.5 <= a <= 2.94
0.91 <= b < 1.01
Which is nothing like what
we see with real NASA data,
3.5 <= a <= 14
0.65 <= b <= 1
Conclusion: we have to
take far more care when building our cost models
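To make the gap concrete: the tuning parameters above belong to the COCOMO effort equation, whose nominal form is effort = a * KSLOC^b (in person-months). The following is a minimal sketch, assuming a hypothetical 100 KSLOC project and ignoring COCOMO's effort multipliers; the function names are illustrative and are not part of any tool described in these slides.

```python
def cocomo_effort(ksloc, a, b):
    """Nominal COCOMO-style effort in person-months: a * KSLOC**b.
    Effort multipliers are omitted (treated as 1.0) to keep the sketch small."""
    return a * ksloc ** b


def effort_range(ksloc, a_range, b_range):
    """Min and max effort over the corners of the (a, b) parameter box.
    For ksloc > 1 the equation is increasing in both a and b,
    so the extremes lie at the corners."""
    corners = [cocomo_effort(ksloc, a, b) for a in a_range for b in b_range]
    return min(corners), max(corners)


KSLOC = 100  # hypothetical project size, not taken from the slides

# "Official" COCOMO tuning range (upper bound on b treated as 1.01).
official = effort_range(KSLOC, (2.5, 2.94), (0.91, 1.01))
# Range reported for real NASA data.
nasa = effort_range(KSLOC, (3.5, 14.0), (0.65, 1.0))

print(f"official range: {official[0]:.0f} to {official[1]:.0f} person-months")
print(f"NASA data range: {nasa[0]:.0f} to {nasa[1]:.0f} person-months")
```

Under these assumptions the NASA-derived range is far wider than the official one, which is the brittleness the slide points to.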
6. APPROACH: validated, relevant
models that handle uncertainty
Change the rules of the game
Old way: reuse someone else’s cost model
New way: build relevant and validated models,
Used with uncertainty “what-if” queries
Relevant: NASA historical data = table
Rows are projects (some of which aren’t relevant)
Columns are project features (some of which don’t
matter to you)
Row pruners: automatically find relevant projects
Column pruners: automatically dump
uninformative features
Validated:
In studies with real NASA data
This automatic method out-performed
existing state-of-the-art methods
Uncertainty & “what-ifs”
Planned projects aren’t “one thing”
But a range of possible “things”
Explore the “things” looking for the range of possibilities
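The row and column pruning idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in 2CEE: the column pruner here drops features weakly correlated with effort, the row pruner keeps the k nearest historical projects, and the spread of their efforts stands in for the range of possibilities. The table, feature names, thresholds, and function names are all hypothetical.

```python
import math
import statistics

# Hypothetical historical table: rows are projects, columns are features.
# All values are invented for illustration only.
HISTORY = [
    {"ksloc": 10, "complexity": 3, "team_exp": 4, "noise": 7, "effort": 40},
    {"ksloc": 12, "complexity": 3, "team_exp": 2, "noise": 1, "effort": 55},
    {"ksloc": 50, "complexity": 4, "team_exp": 3, "noise": 5, "effort": 260},
    {"ksloc": 55, "complexity": 5, "team_exp": 3, "noise": 2, "effort": 330},
    {"ksloc": 200, "complexity": 5, "team_exp": 1, "noise": 6, "effort": 1500},
    {"ksloc": 180, "complexity": 4, "team_exp": 5, "noise": 3, "effort": 1100},
]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def prune_columns(rows, target="effort", min_abs_corr=0.5):
    """Column pruner (assumed rule): keep features whose absolute
    correlation with the target reaches the threshold."""
    ys = [r[target] for r in rows]
    features = [c for c in rows[0] if c != target]
    return [c for c in features
            if abs(pearson([r[c] for r in rows], ys)) >= min_abs_corr]


def prune_rows(rows, query, columns, k=3):
    """Row pruner (assumed rule): keep the k projects closest to the query.
    Features are not normalized here; a real tool would likely scale them."""
    def dist(r):
        return math.sqrt(sum((r[c] - query[c]) ** 2 for c in columns))
    return sorted(rows, key=dist)[:k]


def estimate_range(rows, query, k=3):
    """Return the kept columns and a (low, median, high) effort range."""
    cols = prune_columns(rows)
    efforts = sorted(r["effort"] for r in prune_rows(rows, query, cols, k))
    return cols, (efforts[0], statistics.median(efforts), efforts[-1])


# "What-if": the planned project is a range of possible sizes, not one thing.
for ksloc in (15, 52, 150):
    cols, rng = estimate_range(HISTORY, {"ksloc": ksloc, "complexity": 4})
    print(ksloc, cols, rng)
```

Reporting a low, median, and high value rather than a single number reflects the point that a planned project is a range of possible "things".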
7. Accomplishments
2CEE (Windows application)
21st century estimation 50% to 70%
New high-water mark in cost probability range
estimation methods
Generates a space of possible
estimates. TRL=7
Validated on NASA data
We have tried 158 ways to build cost models
Rejected 154 methods
Found the four row/column pruners that matter
Out-performs existing
cost models
Deployed at NASA
Currently undergoing a 12-month trial
8. Major Accomplishments
Published 11 papers so far
Including publications in IEEE TSE and IEEE Software
Obtained, cleaned, and translated 124 records into a
COCOMO II data set
2CEE: an effort estimation and model tuning tool
Estimator and Expert mode
Runs in Windows
9. Next Steps
Finish minor tool cleanup and release the tool.
Currently being validated in real world setting
Checking performance against current JPL estimation tools
Update training materials
Deliver two training workshops at JPL and one other
center
Deliver data set
Final report and it is all over.
Run in parallel with current models over the next year
to evaluate 2CEE for acceptance as a JPL-supported effort
estimation tool
Submit at least one more paper to a major journal
10. Come see our demo. It will knock your socks off!