1. Cleared for unlimited release: CL#08-3913
NASA Study
Flight Software Complexity
Sponsor: NASA OCE Technical Excellence Program
JPL Task Lead: Dan Dvorak
GSFC POC: Lou Hallock
JSC POC: Pedro Martinez, Brian Butcher
MSFC POC: Cathy White, Helen Housch
APL POC: Steve Williams
HQ Sponsor: Adam West
NASA Advisors: John Kelly, Tim Crumbley
2/11/2009 Flight Software Complexity 1
2. Task Overview
Flight Software Complexity
Origin
• Chief engineers identified cross-cutting issues warranting further study
• Brought software complexity issue to Baseline Performance Review
Charter
• Bring forward deployable technical and managerial strategies to
effectively address risks from growth in size and complexity of
flight software
Areas of Interest
1. Clear exposé of growth in NASA FSW size and complexity
2. Ways to reduce/manage complexity in general
3. Ways to reduce/manage complexity of fault protection systems
4. Methods of testing complex logic for safety and fault protection provisions
Initiators / Reviewers
Ken Ledbetter, SMD Chief Engineer
Stan Fishkind, SOMD Chief Engineer
Frank Bauer, ESMD Chief Engineer
George Xenofos, ESMD Dep. Chief Engineer
[Figure: Growth in Code Size for Robotic and Human Missions. Vertical axis: NCSL (log scale); horizontal axis: year of mission.]
3. Growth Trends in NASA Flight Software
[Figure: Growth in Code Size for Robotic and Human Missions. Vertical axis: NCSL (log scale); horizontal axis: year of mission, 1968 to 2004. Series: robotic (unmanned) and human (manned) missions, each with an exponential trend line.]
Note log scale
NCSL = Non-Comment Source Lines
Robotic missions: 1969 Mariner-6 (30), 1975 Viking (5K), 1977 Voyager (3K), 1989 Galileo (8K), 1990 Cassini (120K), 1997 Pathfinder (175K), 1999 DS1 (349K), 2003 SIRTF/Spitzer (554K), 2004 MER (555K), 2005 MRO (545K)
Human missions: 1968 Apollo (8.5K), 1980 Shuttle (470K), 1989 ISS (1.5M)
The ‘year’ used in this plot for a mission is typically the year of launch, or of completion of the primary software.
Line counts are either from best available source or direct line counts (e.g., for the JPL and LMA missions).
The line count for Shuttle Software is from Michael King, Space Flight Operations Contract Software Process Owner, April 2005.
Note well: This shows exponential growth
~10X growth every 10 years
Source: Gerard Holzmann, JPL
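The "~10X growth every 10 years" reading can be checked against the robotic-mission data points listed on this slide. The sketch below is not part of the original study: it assumes an ordinary least-squares fit of log10(NCSL) against mission year, and the function name is illustrative. Under those assumptions the fitted factor should come out near ten per decade.

```python
import math

# (mission year, non-comment source lines) for the robotic missions on the slide.
ROBOTIC = [
    (1969, 30), (1975, 5_000), (1977, 3_000), (1989, 8_000), (1990, 120_000),
    (1997, 175_000), (1999, 349_000), (2003, 554_000), (2004, 555_000),
    (2005, 545_000),
]


def growth_per_decade(points):
    """Fit log10(size) = a + b * year by least squares; return 10 ** (10 * b)."""
    years = [year for year, _ in points]
    logs = [math.log10(size) for _, size in points]
    mean_year = sum(years) / len(years)
    mean_log = sum(logs) / len(logs)
    sxy = sum((y - mean_year) * (v - mean_log) for y, v in zip(years, logs))
    sxx = sum((y - mean_year) ** 2 for y in years)
    slope = sxy / sxx  # orders of magnitude gained per year
    return 10 ** (slope * 10)


if __name__ == "__main__":
    print(f"Robotic missions: ~{growth_per_decade(ROBOTIC):.1f}x per decade")
```

A straight-line fit on a log scale is only one way to summarize the trend; it weights the early, small missions as heavily as the recent, large ones.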
4. Software Growth in Human Spaceflight
JSC data
[Figure: Growth in Software Size. Bar chart of KSLOC by flight vehicle.]
Flight Vehicle       KSLOC
Apollo 1968          8.5 (8500 lines)
Space Shuttle        650
Orion (est.)         1244
The Orion (CEV) numbers are current estimates.
To make Space Shuttle and Orion comparable, neither one includes
backup flight software since that figure for Orion is TBD.
Space Shuttle and ISS estimates dated Dec. 2007
Source: Pedro Martinez, JSC
5. How Big is a Million Lines of Code?
A novel has ~500K characters
(~100K words × ~5 characters/word)
A million-line program has ~20M characters
(1M lines × ~20 characters/line), or about 40 novels
Source: Les Hatton, University of Kent, Encyclopedia of Software Engineering, John Marciniak, editor in chief
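The novel comparison is plain arithmetic, written out here as a small sketch. The constants are the rough figures quoted on the slide; the function name is illustrative only.

```python
# Rough figures from the slide.
WORDS_PER_NOVEL = 100_000
CHARS_PER_WORD = 5
CHARS_PER_LINE = 20


def novels_equivalent(lines_of_code):
    """Express a program's size as a number of novel-length texts."""
    chars_per_novel = WORDS_PER_NOVEL * CHARS_PER_WORD  # ~500K characters
    program_chars = lines_of_code * CHARS_PER_LINE
    return program_chars / chars_per_novel


print(novels_equivalent(1_000_000))  # 40.0
```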
6. Size Comparisons of Embedded Software
System                                      Lines of Code
Mars Reconnaissance Orbiter                 545K
Orion Primary Flight Sys.                   1.2M
F-22 Raptor                                 1.7M
Seawolf Submarine Combat System AN/BSY-2    3.6M
Boeing 777                                  4M
Boeing 787                                  6.5M
F-35 Joint Strike Fighter                   5.7M
Typical GM car in 2010                      100M (Yes, really. 100 Million)
NASA flight s/w is not among the largest embedded software systems
7. NSF Concerned About Complexity
“As the complexity of current systems has grown, the time needed to develop
them has increased exponentially, and the effort needed to certify them has
risen to account for more than half the total system cost.”
NSF solicitation on cyber-physical systems (Jan. 2009)
8. Complex interactions and high coupling raise
risk of design defects and operational errors
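High-risk systems
[Figure: interaction/coupling chart. Horizontal axis: INTERACTIONS (Linear to Complex); vertical axis: COUPLING (Urgency) (Low to High). Approximate placement of systems:]
High coupling, linear interactions: Dams, Power grids, Marine transport, Rail transport, Airways
High coupling, complex interactions: Nuclear plant, Aircraft, Chemical plants, Space missions, Military early-warning
Low coupling, linear interactions: Junior college, Trade schools, Most manufacturing, Post Office
Low coupling, complex interactions: Military actions, Mining, R&D firms, Universities
Source: Charles Perrow, “Normal Accidents: Living with High-Risk Technologies”, 1984.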
10. Why is Flight Software Growing?
“The demand for complex hardware/software systems
has increased more rapidly than the ability to design,
implement, test, and maintain them. …
“It is the integrating potential of software that has
allowed designers to contemplate more ambitious
systems encompassing a broader and more
multidisciplinary scope ...”
Michael Lyu
Handbook of Software Reliability Engineering, 1996
11. Software Growth in Military Aircraft
Flight software is growing because it is providing an
increasing percentage of system functionality
With the newest F-22 in 2000, software controls
80% of everything the pilot does
Designers put functionality in software or firmware
because it is easier and/or cheaper than hardware
[Figure: Software in Military Aircraft. Vertical axis: percent of functionality provided by software (0 to 90); horizontal axis: year of introduction: 1960 (F-4), 1964 (A-7), 1970 (F-111), 1975 (F-15), 1982 (F-16), 1990 (B-2), 2000 (F-22).]
“Crouching Dragon, Hidden Software: Software in DoD
Weapon Systems”, Jack Ferguson, IEEE Software, vol. 18,
no. 4, pp. 105-107, Jul/Aug, 2001.
12. NASA Missions
Factors that Increase Software Complexity
• Human-rated Missions
– May require architecture redundancy and associated complexity
• Fault Detection, Diagnostics, and Recovery (FDDR)
– FDDR requirements may result in complex logic and numerous potential paths of
execution
• Requirements to control/monitor increasing number of system
components
– Greater computer processing, memory, and input/output capability enables control
and monitoring of more hardware components
• Multi-threads of execution
– Virtually impossible to test every path and associated timing constraints
• Increased security requirements
– Using commercial network protocols may introduce vulnerabilities
• Including features that exceed requirements
– Commercial Off the Shelf (COTS) products or re-use code may provide capability that
exceeds needs or may have complex interactions
Source: Cathy White, MSFC
13. About Complexity
• But what is complexity?
• Where does it appear?
• Why is it getting bigger?
10/09/2008 Flight Software Complexity 13
14. Definition
What is Complexity?
• Complexity is a measure of how hard something is to understand or
achieve
– Components — How many kinds of things are there to be aware of?
– Connections — How many relationships are there to track?
– Patterns — Can the design be understood in terms of well-defined patterns?
– Requirements — Timing, precision, algorithms
• Two kinds of complexity:
– Essential Complexity – How complex is the underlying problem?
– Incidental Complexity – What extraneous complexity have we added?
• Complexity appears in at least four key areas:
– Complexity in requirements
– Complexity of the software itself
– Complexity of testing the system
– Complexity of operating the system
“Complexity is a total system issue, not just a software issue.”
– Orlando Figueroa
15. Causes of Software Growth
Expanding Functionality
“Flight software is a system’s complexity sponge.”
Source: Bob Rasmussen, JPL
[Figure: flight software functions arranged in three columns: Past, Planned, Future]
Command sequencing
Telemetry collection & formatting
Attitude and velocity control
Aperture & array pointing
Payload management
Fault detection and diagnosis
Safing and fault recovery
Critical event sequencing
Momentum management
Aero-braking
Fine guidance pointing
Guided descent and landing
Data priority management
Dynamic resource management
Event-driven sequencing
Long distance traversal
Surface sample acquisition & handling
Landing hazard avoidance
Surface mobility and hazard avoidance
Model-based reasoning
Relay communications
Plan repair
Science event detection
Guided ascent
Automated planning and scheduling
Rendezvous and docking
Operation on or near small bodies
Guided atmospheric entry
Formation flying
Star identification
Tethered system soft landing
Opportunistic science
Robot arm control
Interferometer control
and more to come . . .
and many others …
16. Scope, Findings, Observations
Requirements Complexity
• Challenging requirements raise downstream complexity (unavoidable)
• Lack of requirements rationale permits unnecessary requirements
System-Level Analysis & Design
• Engineering trade studies not done: a missed opportunity
• Architectural thinking/review needed at level of systems and software
Flight Software Complexity
• Inadequate software architecture and lack of design patterns
• Coding guidelines help reduce defects and improve static analysis
• Descopes often shift complexity to operations
Verification & Validation Complexity
• Growth in testing complexity seen at all centers
• More software components and interactions to test
• COTS software is a mixed blessing
Operations Complexity
• Shortsighted FSW decisions make operations unnecessarily complex
• Numerous “operational workarounds” raise risk of command errors
17. Categorized Recommendations
Architecture
R4 More up-front analysis and architecting
R5 Software architecture review board
R9 Invest in a reference architecture
R6 Grow and promote software architects
Project Management
R2 Emphasize requirements rationale
R3 Serious attention to trade studies
R10 Technical kickoff for projects
R16 Use software metrics
R7 Involve operations engineers early and often
Verification
R11 Use static analysis tools
Fault Management
R12 Standardize fault management terminology
R13 Conduct fault management reviews
R14 Develop fault management education
R15 Research s/w fault containment techniques
Complexity Awareness
R1 Educate about downstream effects of decisions
18. Category: Architecture
Recommendation 4
More Up-Front Analysis & Architecting
Finding: Clear trends of increasing complexity in NASA missions
– Complexity is evident in requirements, FSW, testing, and ops
– We can reduce incidental complexity through better architecture
“Point of view is worth 80 IQ points.”
– Alan Kay, 1982 (famous computer scientist)
Recommendation: Spend more time up front in requirements
analysis and architecture to really understand the job and its
solution (What is architecture?)
– Architecture is an essential systems engineering responsibility, and the
architecture of behavior largely falls to software
– Cheaper to deal with complexity early in analysis and architecture
– Integration & testing becomes easier with well-defined interfaces and well-
understood interactions
– Be aware of Conway’s Law
(software reflects the organizational structure that produced it)
19. Architecture Investment “Sweet Spot”
Predictions from COCOMO II model for software cost estimation
[Figure: Fraction of budget spent on rework + architecture (0% to 120%) versus fraction of budget spent on architecture (0% to 70%), with one curve each for 10 KSLOC, 100 KSLOC, 1000 KSLOC, and 10000 KSLOC. Equations from Reinholtz, ArchSweetSpotV1.nb.]
Lesson: Projects that allocate adequately for architecture do better
Trend: The bigger the software, the bigger the fraction to spend on architecture
Note: Prior investment in a reference architecture pays dividends (R9)
Source: Kirk Reinholtz, JPL
20. Category: Architecture
Recommendation 5
Software Architecture Review Board
Finding: In the 1990’s AT&T had a standing
Architecture Review Board that examined proposed
software architectures for projects, in depth,
and pointed out problem areas for rework
– The board members were experts in architecture & system analysis
– They could spot common problems a mile away
– The review was invited and the board provided constructive feedback
– It helped immensely to avoid big problems
Recommendation: Create a professional architecture
review board and add architecture reviews as a best
practice (details)
Maybe similar to Navigation Advisory Group (NAG)
Options:
1. Strengthen NPR 7123 re when to assess s/w architecture
2. Tune AT&T’s architecture review process for NASA
3. Leverage existing checklists for architecture reviews [8]
4. Consider reviewers from academia and industry for very large projects
21. Category: Architecture
Recommendation 9
Invest in Reference Architecture & Core Assets
• Finding: Although each mission is unique, they must all address
common problems: attitude control, navigation, data management,
fault protection, command handling, telemetry, uplink, downlink,
etc. Establishment of uniform patterns for such functionality,
across projects, saves time and mission-specific training. This
requires investment, but project managers have no incentive to
“wear the big hat”.
• Recommendation: Earmark funds for development of a reference
architecture (a predefined architectural pattern) and core assets, at
each center, to be led and sustained by the appropriate technical
line organization, with senior management support
Key – A reference architecture embodies a huge set of lessons learned, best
practices, architectural principles, design patterns, etc.
• Options:
1. Create a separate fund for reference architecture (infrastructure investment)
2. Keep a list of planned improvements that projects can select from as their
intended contribution
See backup slide on reference architecture
22. Category: Project Mgmt.
Recommendation 2
Emphasize Requirements Rationale
Finding: Unsubstantiated requirements have caused
unnecessary complexity. Rationale for requirements
often missing or superficial or misused.
Recommendation: Require rationales at Levels 2 and 3
– Rationale explains why a requirement exists
– Numerical values require strong justification (e.g. “99% data
completeness”, “20 msec response”, etc). Why that value rather than
an easier value?
Notes:
Work with systems engineering to provide guidance on rationale
from software complexity perspective.
NPR 7123, NASA System Engineering Requirements, specifies in an
appendix of “best typical practices” that requirements include
rationale, but offers no guidance on how to write a good rationale or
check it. NASA Systems Engineering Handbook provides some
guidance (p. 48).
23. Category: Project Mgmt.
Recommendation 3
Serious Attention to Trade Studies
Finding: Engineering trade studies often not done or
done superficially or done too late
– Kinds of trade studies: flight vs. ground, hardware vs. software vs.
firmware (including FPGAs), FSW vs. mission ops and ops tools
– Possible reasons: schedule pressure, unclear ownership, culture
Recommendation: Ensure that trade studies are
properly staffed, funded, and done early enough
This is unsatisfying because it says
“Just do what you’re supposed to do”
Options:
1. Mandate trade studies via NASA Procedural Requirement
2. For a trade study between x and y,
make it the responsibility of the manager
that holds the funds for both x and y
3. Encourage informal-but-frequent trade
studies via co-location (co-location
universally praised by those who
experienced it)
“As the line between systems and software engineering blurs,
multidisciplinary approaches and teams are becoming imperative.”
– Jack Ferguson
Director of Software Intensive Systems, DoD
IEEE Software, July/August 2001
24. Cautionary Note
Some recommendations are common sense, but aren’t
common practice. Why not? Some reasons below.
Cost and schedule pressure
– Some recommendations require time and training,
and the benefits are hard to quantify up front
Lack of Enforcement
– Some ideas already exist in NASA requirements and local practices, but
aren’t followed, in part because nobody checks for them
Pressure to inherit from previous mission
– Inheritance can be a very good thing, but “inheritance mentality”
inhibits new ideas, tools, and methodologies
No incentive to “wear the big hat”
– Project managers focus on point solutions for their missions,
with no infrastructure investment for the future
25. Summary
Big-Picture Take-Away Message
• Flight software growth is exponential, and will continue
– Driven by ambitious requirements
– Accommodates new functions more easily
– Accommodates evolving understanding (easier to modify)
• Complexity is better managed/reduced through …
– Well-chosen architectural patterns, design patterns, and coding guidelines
– Fault management that is dyed into the design, not painted on
– Substantiated, unambiguous, testable requirements
– Awareness of downstream effects of engineering decisions
– Faster processors and larger memories (timing and memory margin)
• Architecture addresses complexity directly
– Confront complexity at the start (can’t test away complexity)
– Architecture reviews (follow AT&T’s example)
– Need more architectural thinkers (education, career path)
– See “Thinking Outside the Box” for how to think architecturally
26. Hyperlinks to Reserve Slides
Other Findings and Recommendations
Software Size and Growth
Reasons for Growth
About Complexity
Software Defects and Verification
Observations on NASA Software Practices
Historical Perspective
Architecture and Architecting
Software Complexity Metrics
Miscellaneous
27. Other Findings & Recommendations
R1 Downstream effects of decisions
R6 Grow and promote software architects
R7 Involve operations engineers early and often
R10 Technical kickoff for projects
R11 Use static analysis tools
R12 Standardize fault protection terminology
R13 Conduct fault protection reviews
R14 Develop fault protection education
R15 Research in software fault containment techniques
R16 Use software metrics
28. Category: Awareness
Recommendation 1
Education about “effect of x on complexity”
Finding: Engineers and scientists often don’t realize the
downstream complexity entailed by their decisions
– Seemingly simple science “requirements” and avionics designs can
have large impact on software complexity, and software decisions
can have large impact on operational complexity
Recommendations:
– Educate engineers about the kinds of decisions that affect
complexity
• Intended for systems engineers, subsystem engineers, instrument
designers, scientists, flight and ground software engineers, and
operations engineers
– Include complexity analysis as part of reviews
Options:
1. Create a “Complexity Primer” on a NASA-internal web site (link)
2. Populate NASA Lessons Learned with complexity lessons
3. Publish a paper about common causes of complexity
29. Category: Architecture
Recommendation 6
Grow and Promote Software Architects
Finding: Software architecture is vitally important in
reducing incidental complexity, but architecture skills
are uncommon and need to be nurtured
Reference: (what is architecture?) (what is an architect?)
Recommendation: Increase the ranks of software
architects and put them in positions of authority
Analogous to Systems Engineering Leadership Development Program
Options:
1. Target experienced software architects for strategic hiring
2. Nurture budding architects through education and mentoring
(think in terms of a 2-year Master’s program)
3. Expand APPEL course offerings:
Help systems engineers to think architecturally
The architecture of behavior largely falls to software, and systems
engineers must understand how to analyze control flow, data flow,
resource management, and other cross-cutting issues
30. Category: Project Mgmt.
Recommendation 7
Involve Operations Engineers Early & Often
Findings that increase ops complexity:
– Flight/ground trades and subsequent FSW descope decisions often
lack operator input
– Shortsighted decisions about telemetry design, sequencer features,
data management, autonomy, and testability
– Large stack of “operational workarounds” raises risk of command
errors and distracts operators from vigilant monitoring
Findings are from a “gripe session
on ops complexity” held at JPL
Recommendations:
– Include experienced operators in flight/ground trades
and FSW descope decisions
– Treat operational workarounds as a cost and risk upper;
quantify their cost
– Design FSW to allow tests to start at several well-known states
(shouldn’t have to “launch” spacecraft for each test!)
31. Category: Project Mgmt.
Recommendation 10
Formalize a ‘Technical Kickoff’ for Projects
Finding: Flight project engineers move from project to
project, often with little time to catch up on technology
advances, so they tend to use the same old stuff
Recommendation:
– Option 1: Hold ‘technical kickoff meetings’ for projects as a way
to infuse new ideas and best practices, and create champions
within the project
(Michael Aguilar, NESC, is a strong proponent)
• Inspire rather than mandate
• Introduces new architectures, processes, tools, and lessons
• Supports technical growth of engineers
– Option 2: Provide 4-month “sabbatical” for project engineers to
learn a TRL 6 software technology, experiment with it, give
feedback for improvements, and then infuse it
Steps:
1. Outline a structure and a technical agenda for a kickoff meeting
2. Create a well-structured web site with kickoff materials
3. Pilot a technical kickoff on a selected mission
32. Category: Verification
Recommendation 11
Static Analysis for Software
• Finding: Commercial tools for static analysis of source
code are mature and effective at detecting many kinds of
software defects, but are not widely used
– Example tools: Coverity, Klocwork, CodeSonar
• Recommendation: Provide funds for: (a) site licenses of
source code analyzers at flight centers, and (b) local
guidance and support
• Notes:
1. Poll experts within NASA and industry regarding best tools for C, C++,
and Java
2. JPL provides site licenses for Coverity and Klocwork
3. Continue funding for OCE Tool Shed, expand use of common tools
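To make "static analysis" concrete, here is a toy checker that inspects source code without running it. It is only an illustration: the tools named above are commercial analyzers for C, C++, and Java, and this sketch does not reflect how they work internally. The two patterns it flags, the sample code, and all names are assumptions chosen for the example; it uses Python's standard `ast` module.

```python
import ast

# Hypothetical code under inspection (never executed, only parsed).
SAMPLE = '''
def read_sensor(bus, retries=[]):
    try:
        return bus.read()
    except:
        return None
'''


def find_issues(source):
    """Return sorted (line, message) pairs for two common defect patterns."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # Pattern 1: a bare "except:" swallows every error, expected or not.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            issues.append((node.lineno, "bare 'except:' hides unexpected errors"))
        # Pattern 2: a mutable default value is shared between calls.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for default in node.args.defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    issues.append(
                        (default.lineno, "mutable default argument is shared across calls")
                    )
    return sorted(issues)


for line, message in find_issues(SAMPLE):
    print(f"line {line}: {message}")
```

Production analyzers go much further (data-flow and inter-procedural analysis, for instance), which is why the recommendation pairs tool licenses with local guidance and support.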
33. Category: Fault Management
Recommendation 12
Fault Management Reference Standardization
• Finding: Terminology for fault management is inconsistent
among NASA centers and their contractors, and there is a
lack of reference material against which to assess the
suitability of fault management approaches to mission
objectives.
– Example Terminology: Fault, Failure, Fault Protection, Fault
Tolerance, Monitor, Response.
• Recommendation: Publish a NASA Fault Management
Handbook or Standards Document that provides:
– An approved lexicon for fault management.
– A set of principles and features that characterize software
architectures used for fault management.
– For existing and past software architectures, a catalog of recurring
design patterns with assessments of their relevance and adherence
to the identified principles and features.
Findings from NASA Planetary
Spacecraft Fault Management Workshop
Source: Kevin Barltrop, JPL
34. Category: Fault Management
Recommendation 13
Fault Management Proposal Review
• Finding: The proposal review process does not
assess in a consistent manner the risk entailed by a
mismatch between mission requirements and the
proposed fault management approach.
• Recommendation: For each mission proposal
generate an explicit assessment of the match
between mission scope and fault management
architecture. Penalize proposals or require follow-up
for cases where proposed architecture would be
insufficient to support fault coverage scope.
– Example: Dawn recognized the fault coverage scope problem, but
did not appreciate the difficulty of expanding fault coverage using
the existing architecture.
– The handbook or standards document can be used as a reference
to aid in the assessment and provide some consistency.
Findings from NASA Planetary
Spacecraft Fault Management Workshop
Source: Kevin Barltrop, JPL
35. Category: Fault Management
Recommendation 14
Develop Fault Management Education
• Finding: Fault management and autonomy receive
little attention within university curricula, especially
within engineering programs. This hinders the
development of a consistent fault management
culture needed to foster the ready exchange of
ideas.
• Recommendation: Sponsor or facilitate the addition
of a fault management and autonomy course within
a university program, such as a Controls program.
– Example: University of Michigan could add a “Fault
Management and Autonomy Course.”
Findings from NASA Planetary
Spacecraft Fault Management Workshop
Source: Kevin Barltrop, JPL
36. Category: Fault Management
Recommendation 15
Do Research on Software Fault Containment
• Finding: Given growth trends in flight software, and
given current achievable defect rates, the odds of a
mission-ending failure are increasing (see link)
– A mission with 1 million lines of flight code and a low residual defect
ratio of 1 per 1000 lines of code carries about 1000 residual defects:
900 benign, 90 medium, and 9 potentially fatal
(i.e., these are defects that will happen, not those that could happen)
– Bottom line: As more functionality is done in software, the probability of
mission-ending software defects increases (until we get smarter)
• Recommendation: Extend the concept of onboard fault
protection to cover software failures. Develop and test
techniques to detect software faults at run-time and
contain their effects
– One technique: upon fault detection, fall back to a simpler-but-more-
verifiable version of the failed software module
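The defect arithmetic and the fallback technique above can both be sketched in a few lines. Everything here is illustrative rather than taken from the study: the 900/90/9 severity split is the one quoted on the slide, while the function names, the plausibility check, and the two toy rate controllers are hypothetical stand-ins for a complex module and its simpler, more verifiable counterpart.

```python
# Severity split quoted on the slide, in parts per thousand residual defects.
SEVERITY_PER_MILLE = {"benign": 900, "medium": 90, "potentially_fatal": 9}


def residual_defects(lines_of_code, lines_per_defect=1000):
    """Estimate residual defects by severity for a given code size."""
    total = lines_of_code // lines_per_defect
    return {name: total * share // 1000 for name, share in SEVERITY_PER_MILLE.items()}


def run_with_fallback(primary, fallback, is_plausible, *args):
    """Run primary; if it raises or returns an implausible value, use fallback."""
    try:
        result = primary(*args)
        if is_plausible(result):
            return result, "primary"
    except Exception:
        pass  # fault detected at run time; contain it and fall through
    return fallback(*args), "fallback"


def tuned_rate(error_deg, dt_s):
    """Hypothetical 'complex' module: fails when dt_s is zero."""
    return error_deg / dt_s


def safe_rate(error_deg, dt_s):
    """Hypothetical simpler module: fixed small rate toward the target."""
    if error_deg > 0:
        return 0.1
    if error_deg < 0:
        return -0.1
    return 0.0


def within_limits(rate):
    """Plausibility check: commanded rate must stay inside a fixed bound."""
    return abs(rate) <= 2.0


print(residual_defects(1_000_000))
print(run_with_fallback(tuned_rate, safe_rate, within_limits, 1.0, 1.0))
print(run_with_fallback(tuned_rate, safe_rate, within_limits, 1.0, 0.0))
```

The wrapper treats both a crash and an out-of-bounds result as a detected fault. In a real system the plausibility check and the fallback would themselves need to be simple enough to verify thoroughly.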
37. Category: Project Mgmt.
Recommendation 16
Apply Software Metrics
• Finding: No consistency in flight software metrics
– No consistency in how to measure and categorize software size
– Hard to assess amount and areas of FSW growth, even within a center
– NPR 7150.2 Section 5.3.1 (Software Metrics Report) requires measures of
software progress, functionality, quality, and requirements volatility
• Recommendations: Development organizations should …
– Seek measures of complexity at code level and architecture level
– Add ‘complexity’ as new software metrics category in NPR 7150.2
– Compare to historical size & complexity for planning and monitoring
– Save flight software from each mission in a repository for undefined future
analyses (software archeology, SARP study)
• Non-Recommendation: Don’t attempt NASA-wide metrics. Better to
drive local center efforts. (See slide)
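As one possible code-level measure of complexity, the sketch below approximates McCabe-style cyclomatic complexity: one plus the number of decision points in each function. The slides do not name a specific metric, so this choice, the node list, and the sample code are assumptions for illustration. The count is deliberately rough: nested functions are counted in their parent, and a chained boolean expression counts once.

```python
import ast

# Syntax nodes treated as decision points (an approximation).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.BoolOp)

# Hypothetical code to be measured.
SAMPLE = '''
def select_mode(battery, fault):
    if fault:
        return "safe"
    if battery < 20 and not fault:
        return "low_power"
    return "nominal"

def noop():
    return None
'''


def cyclomatic_complexity(source):
    """Map each function name to 1 + its number of decision points."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            decisions = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            scores[node.name] = 1 + decisions
    return scores


print(cyclomatic_complexity(SAMPLE))  # {'select_mode': 4, 'noop': 1}
```

Tracking such a number per module over time, within one center's code base, fits the recommendation to compare against historical size and complexity.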
“The 777 marks the first time The Boeing Company has applied software metrics uniformly across a new commercial-airplane
programme. This was done to ensure simple, consistent communication of information pertinent to software schedules among
Boeing, its software suppliers, and its customers—at all engineering and management levels. In the short term, uniform
application of software metrics has resulted in improved visibility and reduced risk for 777 on-board software.”
Robert Lytz, “Software metrics for the Boeing 777: a case study”, Software Quality Journal, Springer Netherlands
38. Category: Verification
Observation
Analyze COTS for Testing Complexity
COTS software is
a mixed blessing
Finding: COTS software provides valuable functionality,
but often comes with numerous other features that are
not needed. However, the unneeded features often
entail extra testing to check for undesired interactions.
Recommendation: In make/buy decisions, analyze
COTS software for separability of its components and
features, and thus their effect on testing complexity
– Weigh the cost of testing unwanted features against the cost of
implementing only the desired features
39. Software Size and Growth
Software Growth in Military Aircraft
Size Comparison of Embedded Software
Growth in Automobile Software at GM
FSW Growth Trend in JPL Missions
MSFC Flight Software Sizes
GSFC Flight Software Sizes
APL Flight Software Sizes
40. Flight Software Growth Trend: JPL Missions
JPL data
With a vertical axis of size × speed, this chart shows growth keeping pace with Moore’s Law
[Chart: size × speed (bytes × MIPS), log scale from 10^3 to 10^9, versus launch year, 1970 to 2010. Missions plotted, from lowest to highest: Viking, VGR, GLL and Magellan, MO, Cassini, Pathfinder/MGS/DS1…, MER, MSL.]
Doubling time < 2 years
Consistent with Moore’s Law (i.e., bounded by capability)
Source: Bob Rasmussen, JPL
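The "doubling time < 2 years" claim can be checked with a short calculation. The sketch below assumes two rough readings from the chart (about 10^3 bytes × MIPS near 1975 and about 10^9 near 2010). These are eyeballed from the axis ticks and are not tabulated mission data. The function name doubling_time is illustrative only.

```python
import math

def doubling_time(start_value, end_value, elapsed_years):
    """Years per doubling, assuming steady exponential growth between two points."""
    doublings = math.log2(end_value / start_value)
    return elapsed_years / doublings

# Rough chart readings (assumptions, not tabulated data):
# about 1e3 bytes x MIPS near 1975 and about 1e9 near 2010.
years = doubling_time(1e3, 1e9, 2010 - 1975)
print(f"approximate doubling time: {years:.2f} years")
```

Under these assumed readings the result comes out below two years, in line with the annotation on the chart. Different endpoint readings would shift the figure somewhat.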
41. MSFC Flight Software Organization (no trend)
MSFC data
• SSME – Space Shuttle Main Engine, ~30K SLOC, C/assembly (1980’s – 2007)
• LCT – Low Cost Technology (FASTRAC engine), ~30K SLOC, C/Ada (1990’s)
• SSFF – Space Station Furnace Facility, ~22K SLOC, C (cancelled 1997)
• MSRR – Microgravity Science Research Rack, ~60K SLOC, C (2001 – 2007)
• UPA – Urine Processor Assembly, ~30K SLOC, C (2001 – 2007)
• AVGS DART – Advanced Video Guidance System for Demonstration of Automated Rendezvous Technology, ~18K SLOC, C (2002 – 2004)
• AVGS OE – AVGS for Orbital Express, ~16K SLOC, C (2004 – 2006)
• SSME AHMS – Space Shuttle Main Engine Advanced Health Management System, ~42.5K SLOC, C/assembly (2006 flight)
• FC – Ares Flight Computer, estimated ~60K SLOC, TBD language (2007 SRR)
• CTC – Ares Command and Telemetry Computer, estimated ~30K SLOC, TBD language (2007 SRR)
• Ares J-2X engine, initial estimate ~15K SLOC, TBD language (2007 SRR)
[Chart: Source Line of Code (SLOC) History. Vertical axis: K SLOC, 0 to 70. Horizontal axis: Project (SSME, SSFF, UPA, AVGS OE, Ares FC, Ares J-2X).]
Source: Cathy White, MSFC
42. GSFC Flight Software Sizes (no trend)
[Chart: FSW Size for GSFC Missions. Vertical axis: NCSL, 0 to 160,000. Horizontal axis: year and mission: TRMM (1997), MAP (2001), ST-5 (2006), SDO (2009), LRO (2009).]
Source: David McComas, GSFC
Note: LISA expected to be much larger
43. APL Flight Software Sizes (no trend)
[Chart: APL flight software sizes. Vertical axis: lines (label truncated in source), 0 to 160,000. Horizontal axis: launch date, 1995 to 2007. Missions plotted: ACE, NEAR, Contour, MSX, TIMED, Messenger, Stereo, New Horizons.]
Source: Steve Williams, APL
44. Software Defects and Verification
Residual Defects in Software
Software Development Process
Defects, latent defects, residual defects
Is there a limit to software size?
45. Technical Reference
Residual Defects in Software
• Each lifecycle phase involves human effort and therefore inserts some defects
• Each phase also has reviews and checks and therefore also removes defects
• The difference between the insertion and removal rates determines the defect propagation rate
• The propagation rate at the far right (after testing) determines the residual defect rate
• For a good industry-standard software process, residual defect rate is typically 1-10 per KNCSL
• For an exceptionally good process (e.g., Shuttle) it can be as low as 0.1 per KNCSL
• It is currently unrealistic to assume that it could be zero….
Propagation of residual defects
Phase:                   reqs   design   coding   testing
Defect insertion rate:      6       23       46         1
Defect removal rate:        4       20       26        24
Defects passed onward:      2        5       25         2  (residual defects after testing, i.e., anomalies)
S.G. Eick, C.R. Loader et al., Estimating software fault content before coding,
Proc. 15th Int. Conf. on Software Eng., Melbourne, Australia, 1992, pp. 59-65
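The insertion and removal figures in the diagram above can be tallied phase by phase to confirm how the residual count arises. The sketch below is only a bookkeeping illustration: each phase receives the defects passed on by the previous phase, adds its own insertions, and subtracts its removals. The variable and function names are illustrative and do not come from the cited paper.

```python
PHASES = ["reqs", "design", "coding", "testing"]
INSERTED = [6, 23, 46, 1]   # defect insertion rate per phase (from the diagram)
REMOVED = [4, 20, 26, 24]   # defect removal rate per phase (from the diagram)

def propagate(inserted, removed):
    """Return the number of defects passed onward after each phase."""
    carried = 0
    passed_on = []
    for ins, rem in zip(inserted, removed):
        carried = carried + ins - rem   # incoming + newly inserted - removed
        passed_on.append(carried)
    return passed_on

propagated = propagate(INSERTED, REMOVED)
for phase, count in zip(PHASES, propagated):
    print(f"{phase:8s} passes on {count}")
```

The tally reproduces the middle row of the diagram (2, 5, 25, 2), with the last value being the residual defects after testing.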
46. Software Development Process
for Safety- & Mission-Critical Code
Phases: requirements, design, coding, testing
1: reduce defect insertion rates
2: increase effectiveness of defect removal with tool based techniques
3: reduce risk from residual software defects
Techniques by phase:
• requirements: requirements capture and analysis tools
• design: model-based design, prototyping / formal verification techniques, logic model checking, code synthesis methods
• coding: static source code analysis, increased assertion density, NASA standard for Reliable C, verifiable coding guidelines, compliance checking tools
• testing: run-time monitoring techniques, property-based testing techniques, sw fault containment strategies
• across phases: test-case generation from requirements / traceability
Source: Gerard Holzmann, JPL
47. How good are state-of-the-art
software testing methods?
• Most estimates put the number of residual defects for a good software process at 1 to
10 per KNCSL
– A residual software defect is a defect missed in testing that shows up in mission operations
– A larger, but unknowable, class of defects is known as latent software defects – these are all
defects present in the code after testing that could strike – only some of which reveal
themselves as residual defects in a given interval of time.
• Residual defects occur in any severity category
– A rule of thumb is to assume that the severity ratios drop off by powers of ten: if we use 3
severity categories with 3 being least and 1 most damaging, then 90% of the residual defects will
be category 3, 9% category 2, and 1% category 1 (potentially fatal).
– For a mission with 1 million lines of flight code and a low residual defect ratio of 1 per KNCSL,
this translates into 900 benign defects, 90 medium, and 9 potentially fatal residual software
defects (i.e., these are defects that will happen, not those that could happen)
[Figure: defect funnel for 1 million lines of code]
• Defects caught in unit & integration testing (99%)
• Latent defects (1%): software defects missed in testing
• Residual defects (0.1%): defects that occur in flight; conservatively 100-1,000
• Severity 1 defects (potentially fatal) (0.001%): conservatively 1-10
Source: Gerard Holzmann, JPL
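The severity rule of thumb above can be worked through as arithmetic. The sketch below assumes the "powers of ten" drop-off is applied literally (nine tenths of the residual defects in the least severe category, then one tenth as many in each more severe category), which reproduces the 900 / 90 / 9 figures. A strict 1% share for category 1 would instead give 10, so the slide's percentages are treated here as rounded. The function name and parameters are illustrative.

```python
def severity_breakdown(ncsl, residual_per_kncsl=1, categories=3):
    """Split residual defects across severity categories by powers of ten.

    Category `categories` is the least severe; category 1 is the most severe.
    Integer arithmetic is used so the counts are exact.
    """
    residual = ncsl * residual_per_kncsl // 1000
    return {
        cat: residual * 9 // 10 ** (categories - cat + 1)
        for cat in range(categories, 0, -1)
    }

breakdown = severity_breakdown(1_000_000)
for category, count in breakdown.items():
    print(f"category {category}: {count} residual defects")
```

With 1 million lines and 1 residual defect per KNCSL this yields 900 in category 3, 90 in category 2, and 9 in category 1.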
48. Thought Experiment
Is there a limit to software size?
Assumptions:
• 1 residual defect per 1,000 lines of code (industry average)
• 1 in every 100 residual defects occurs in the 1st year of operation
• 1 in every 1000 residual defects can lead to mission failure
• System/software methods are at current state of the practice (2008)
[Chart: probability of system failure (0.0 to 1.0) versus code size in NCSL. Code-size axis marks: spacecraft software, commercial software, 50M, 100M; a time arrow runs along the same axis. Annotation at probability 0.5: beyond this size code is more likely to fail than to work. Annotation at probability 1.0: certainty of failure beyond this size.]
Long-term trend: increasing code size with each new mission
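One reading of the thought experiment, sketched below, multiplies the three stated assumptions to get the expected number of mission-ending defects that show up in the first year, and treats that expected count (capped at 1) as the failure probability. This linear model is an assumption made here for illustration; it is chosen because it reproduces the 50M and 100M marks on the chart. The function names are illustrative.

```python
LINES_PER_RESIDUAL_DEFECT = 1000   # 1 residual defect per 1,000 lines of code
DEFECTS_PER_FIRST_YEAR_HIT = 100   # 1 in 100 residual defects occurs in year 1
DEFECTS_PER_FATAL = 1000           # 1 in 1,000 residual defects can end the mission

def expected_fatal_defects(ncsl):
    """Expected mission-ending defects manifesting in the first year."""
    return ncsl / (LINES_PER_RESIDUAL_DEFECT
                   * DEFECTS_PER_FIRST_YEAR_HIT
                   * DEFECTS_PER_FATAL)

def failure_probability(ncsl):
    """Linear approximation: expected count, capped at certainty."""
    return min(1.0, expected_fatal_defects(ncsl))

for size in (1_000_000, 50_000_000, 100_000_000):
    print(f"{size:>11,} NCSL -> failure probability {failure_probability(size):.2f}")
```

Under these assumptions the probability reaches 0.5 at 50M NCSL and 1.0 at 100M NCSL. A Poisson-style model would approach certainty more gradually, so the sharp limits here are a property of the simplified reading, not a measured result.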
50. Impediments to Software Architecture
within NASA
• Inappropriate modeling techniques
– “Software architecture is just boxes and lines”
– “Software architecture is just code modules”
– “A layered diagram says it all”
• Misunderstanding about role of architecture in product lines and architectural reuse
– “A product line is just a reuse library”
• Impoverished culture of architecture design
– No standards for arch description and analysis
– Architecture reviews are not productive
– Architecture is limited to one or two phases
– Lack of architecture education among engineers
• Failure to take architecture seriously
– “We always do it that way. It’s cheaper/easier/less risky
to do it the way we did it last time.”
– “They do it a certain way ‘out there’ so we should too.”
– “We need to reengineer it from scratch because the
mission is different from all others.”
As presented by Prof. David Garlan (CMU) at NASA Planetary Spacecraft Fault Management Workshop, 4/15/08
51. Observations
Poor Software Practices within NASA
No formal documentation of requirements
Little to no user involvement during requirements definition
Rushing to start design & code before requirements are understood.
Wildly optimistic beliefs in re-use (especially when it comes to costing and planning).
Planning to use new compilers, operating systems, languages, computers for the first time as if they
were proven entities.
Poor configuration management (CM)
Inadequate ICDs
User interfaces left up to software designers rather than prototyping and baselining as part of the
requirements
Big Bang Theory: All software from all developers comes together at end and miraculously works
Planning that software will work with little or no errors found in every test phase.
Poor integration planning (both SW-to-SW and SW-to-HW) (e.g., no early interface/integration testing)
No pass/fail criteria at milestones (not that software is unique in this). Holding reviews when artifacts are
not ready.
Software too far down the program management hierarchy to have visibility into its progress
Little to no life-cycle documentation
Inadequate to no developmental metrics collected/analyzed
No knowledgeable NASA oversight
An illustrative but incomplete list of poor software practices observed in NASA. John Hinkle, LaRC
53. History
NATO Software Engineering Conference 1968
• This landmark conference, which introduced the term
“software engineering”, was called to address “the
software crisis”.
• Discussions of wide interest:
– problems of achieving sufficient reliability in software systems
– difficulties of schedules and specifications on large software projects
– education of software engineers
Quotes from the 1968 report:
“There is a widening gap between ambitions and achievements in software engineering.”
“Particularly alarming is the seemingly unavoidable fallibility of large software,
since a malfunction in an advanced hardware-software system can be a
matter of life and death …”
“I am concerned about the current growth of systems, and what I expect is probably an
exponential growth of errors. Should we have systems of this size and complexity?”
“The general admission of the existence of the software failure in this group of responsible
people is the most refreshing experience I have had in a number of years, because the admission
of shortcomings is the primary condition for improvement.”
54. Epilogue
• Angst about software complexity in 2008 is the same
as in 1968 (See NATO 1968 report, slide)
– We build systems to the limit of our ability
– In 1968, 10K lines of code was complex
– Now, 1M lines of code is complex, for the same price
“While technology can change quickly, getting your people to change takes a great
deal longer. That is why the people-intensive job of developing software has had
essentially the same problems for over 40 years. It is also why, unless you do
something, the situation won’t improve by itself. In fact, current trends suggest that
your future products will use more software and be more complex than those of
today. This means that more of your people will work on software and that their
work will be harder to track and more difficult to manage. Unless you make some
changes in the way your software work is done, your current problems will
likely get much worse.”
Winning with Software: An Executive Strategy, 2001
Watts Humphrey, Fellow, Software Engineering Institute, and
Recipient of 2003 National Medal of Technology
56. What is Architecture?
• Architecture is an essential systems engineering
responsibility, which deals with the fundamental organization
of a system, as embodied in its components and their
relationships to each other and to the environment
– Architecture addresses the structure, not only of the system, but also
of its functions, the environment within which it will work, and the
process by which it will be built and operated
• Just as importantly, however, architecture also deals with the
principles guiding the design and evolution of a system
– It is through the application and formal evaluation of architectural
principles that complexity, uncertainty, and ambiguity in the design of
complicated systems may be reduced to workable concepts
– In the best practice of architecture, this aspect of architecture must
not be understated or neglected
Source: Bob Rasmussen, JPL
57. Architecture
Some Essential Ideas
• Architecture is focused on fundamentals
– An architecture that must regularly change as issues arise provides
little guidance
– Architecture and design are not the same thing
• Guidance isn’t possible if the original concepts have little
structural integrity to begin with
– Choices must be grounded in essential need and solid principles
– Otherwise, any migration away from the original high level design
is easy to justify
• Even if the structural integrity is there, it can be lost if it is
poorly communicated or poorly stewarded
– The result is generally ever more inflexible and brittle
Source: Bob Rasmussen, JPL
58. Reference
What is Software Architecture?
• “The software architecture of a program or computing system is the
structure or structures of the system, which comprise software
elements, the externally visible properties of those elements, and the
relationships among them.” Software Architecture in Practice, 2nd edition,
Bass, Clements, Kazman, 2003, Addison-Wesley.
• Noteworthy points:
– Architecture is an abstraction of a system that suppresses some details
– Architecture is concerned with the public interfaces of elements and how
they interact at runtime
– Systems comprise more than one structure, e.g., runtime processes,
synchronization relations, work breakdown, etc. No single structure is
adequate.
– Every software system has an architecture, whether or not documented,
hence the importance of architecture documentation
– The externally visible behavior of each element is part of the architecture,
but not the internal implementation details
– The definition is indifferent as to whether the architecture is good or bad,
hence the importance of architecture evaluation
59. What is an Architect?
• An architect defines, documents, maintains, improves, and
certifies proper implementation of an architecture — both its
structure and the principles that guide it
– An architect ensures through continual attention that the elements of
a system come together in a coherent whole
– Therefore, in meeting these obligations the role of architect is
naturally concerned with leadership of the design effort throughout
the development lifecycle
• An architect must ensure that…
– The architecture (elements, relationships, principles) reflects
fundamental, stable concepts
– The architecture is capable of providing sound guidance throughout
the whole process
– The concept and principles of the architecture are never lost or
compromised
Source: Bob Rasmussen, JPL
60. Architect
Essential Activities
• Understand what a system must do
• Define a system concept that will accomplish this
• Render that concept in a form that allows the work to be
shared
• Communicate the resulting architecture to others
• Ensure throughout development, implementation, and
testing that the design follows the concepts and comes
together as envisioned
• Refine ideas and carry them forward to the next
generation of systems
Source: Bob Rasmussen, JPL
61. Architectural Activities in More Detail (1)
• Function
– Help formulate the overall system objectives
– Help stakeholders express what they care about in an actionable form
– Capture in scenarios where and how the system will be used, and the
nature of its targets and environment
– Define the scope of the architecture, including external relationships
• Definition
– Select and refine concepts on which the architecture might be based
– Define essential properties concepts must satisfy, and the means by
which they will be analyzed and demonstrated
– Perform trades and assess options against essential properties — both to
choose the best concept and to help refine objectives
• Articulation
– Render selected concepts in elements that can be developed further
– Choose carefully the structure and relationships among the elements
– Identify the principles that will guide the evolution of the design
– Express these ideas in requirements for the elements and their
relationships that are complete, but preserve flexibility
Source: Bob Rasmussen, JPL
62. Architectural Activities in More Detail (2)
• Communication
– Choose how the architecture will be documented — what views need to be
defined, what standards will be used to define them…
– Create documentation of the architecture that is clear and complete,
explaining all the choices and how implementation will be evaluated against
high level objectives and stakeholder needs
• Oversight
– Monitor the development, making corrections and clarifications, as
necessary to the architecture, while enforcing it
– Evaluate and test to ensure the result is as envisioned and that objectives
are met, including during actual operation
• Advancement
– Learn from others and document your experience and outcome for others to
learn from
– Stay abreast of new capabilities and methods that can improve the art
Source: Bob Rasmussen, JPL