A summary of the key results of the Quamoco project that enabled an integrated software quality assessment from high-level quality attributes down to concrete measures.
The Quamoco Quality Modelling and Assessment Approach
1. The Benchmark for Software Quality
www.uni-stuttgart.de
The Quamoco
Product Quality Modelling
and
Assessment Approach
Stefan Wagner
Institute of Software Technology
ICSE 2012
Zürich, Switzerland
8 June 2012
2. "Quality is a complex and multi-faceted concept...
it is also the source of great confusion."
–David A. Garvin
3. Software Quality Models
IEC 61508
Siemens Technical
Topic
CMMI Classification
COQUALMO ISO 15504 -
SPICE
ISO 9126
Maintainability Index
Musa basic SAP Q-Index
Visser et al.
Activity-Based
Musa-Okumoto Quality Models Littlewood-Verall
Bayesian
iDAVE Avizienis et al.
Marinescu & Boehm et al. MISRA
Ratiu
McCall & Dromey Common Criteria
Walters
NHPP ISO 15005
SAP Quality
Capgemini sd&m Standards SQUID
Software-Blutbild ISO 25000 ROSQ
Models
4. ISO 9126
Quality model: Functionality, Reliability, Usability, Performance, Maintainability, Portability
5. Quality Models in Practical Use
Percentages of answers, multiple answers possible
Company-specific: 71
ISO 9126: 28
Domain-specific: 20
None: 4
Wagner et al., 2010
6. "The -ilities are good for management talk only."
–Anonymous developer
Wagner et al., ESEM'09
7. Quality attribute: ISO 9126, ISO 25010
Measure: Comment ratio, Clone coverage, Cyclomatic complexity
8. Quality attribute: ISO 9126, ISO 25010
?
Measure: Comment ratio, Clone coverage, Cyclomatic complexity
16. Quality attribute: ISO 9126, ISO 25010
Multitude of models
Too abstract
Not operationalised
Not adaptable
Unreproducible assessments
Measure: Comment ratio, Clone coverage, Cyclomatic complexity
Differing definitions
Unclear relationship to quality goals
41. Calibration for Java
6000 compilable open source systems
Random selection
110 open source systems
Measurement
Distributions for all measures
Calculations and reviews
Evaluation functions
42. Interpretation with School Grades
[Fig. 7. Interpretation Model: assessment as a school grade from 1 (best) to 6 (worst) on the y-axis, plotted against the evaluation (utility) from 0.80 to 1.00 on the x-axis.]
48. Experiment with OSS Projects
Ranking Ranking
Model Experts
Good Checkstyle Checkstyle
Log4J RSSOwl Log4J
RSSOwl
TV-Browser TV-Browser
Bad
JabRef JabRef
49. Experiment with Industry System
Ranking by model (good to bad): Subsystem D, Subsystem A, Subsystem C, Subsystem E, Subsystem B
Ranking by expert (good to bad): Subsystem D, Subsystem A, Subsystem B, E, Subsystem C
50. Visibility of Quality Improvements
[Figure: grade (y-axis, 0 to 4.2) per version (x-axis: 1.9.0, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.2.1), with an annotation marking the investment in quality improvement.]
55. Drill-Downs
"Modeled relations are comprehensible and reasonable."
"It is good to get an overall view on the quality of a software product."
"It clarifies software metrics."
"It is the best that can be done with static code analysis."
64. The Benchmark for Software Quality
www.uni-stuttgart.de
The Quamoco
Product Quality Modelling
and
Assessment Approach
Stefan Wagner, Klaus Lochmann, Lars Heinemann, Michael
Kläs, Adam Trendowicz, Reinhold Plösch, Andreas Seidl,
Andreas Goeb and Jonathan Streit
65. Linear Utility Function
[Figure: linear decreasing utility function for measure M4. The utility falls from 1.0 at min = 0.0 to 0.0 at max = 8.50E-6; the measured value M4 = 2.17E-06 yields a utility of 0.74.]
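The numbers on this slide can be reproduced with a few lines of code. The following is a minimal sketch, assuming that the utility is clamped to 1.0 below min and to 0.0 above max; the function name `linear_decreasing_utility` is made up for this illustration and is not taken from the Quamoco tooling.

```python
def linear_decreasing_utility(value: float, minimum: float, maximum: float) -> float:
    """Map a measure value to a utility in [0, 1]; lower values are better.

    Assumption for this sketch: values at or below `minimum` get utility 1.0,
    values at or above `maximum` get utility 0.0, linear in between.
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    if value <= minimum:
        return 1.0
    if value >= maximum:
        return 0.0
    return 1.0 - (value - minimum) / (maximum - minimum)


# Values from the slide: min = 0.0, M4 = 2.17E-06, max = 8.50E-6
utility_m4 = linear_decreasing_utility(2.17e-6, 0.0, 8.50e-6)
print(round(utility_m4, 2))  # 0.74
```

With the slide's values, 1 - 2.17/8.50 is roughly 0.745, which matches the 0.74 shown in the figure.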
Editor's notes
I'm delighted to present to you the results on quality modelling and assessment of our three-year research project Quamoco. I've always found quality an interesting concept because it determines so much about a system, but it is also very complex.
As David Garvin pointed out, quality is complex and multifaceted, and therefore it is also the source of great confusion. So what do computer scientists do to handle complexity? They abstract! Researchers have developed a variety of software quality models.
The range of quality models goes from collections of metrics over academic models and domain standards to company-specific models. But none of them has been able to get really broad acceptance. When we set out to work on quality assessments, we wanted to find a good basis. What do you do when there is no clear leader? You look at the ISO standard, here 9126.
ISO 15005: Road vehicles - Ergonomic aspects of transport information and control systems - Dialogue management principles and compliance procedures
NHPP: Non-homogeneous Poisson process (reliability growth models)
It breaks quality down into quality attributes such as reliability or maintainability, the "-ilities". It then breaks them down further and gives some metrics to measure them. How well is it doing in practice? We asked over a hundred practitioners.
And the result is not very good. Only 28% of the respondents in our international survey said that they use ISO 9126. Only 28%, a bit more than a quarter. Why is that so? We asked about the reasons in detailed interviews.
The developers told us that there is a huge gap between the abstract quality attributes of ISO 9126 and the concrete implementation and assessment of the product. Operationalising the quality attributes is considered extremely difficult. Hence, they use some metrics. The existing metrics are concrete but lack a clear connection to quality goals.
Hence, we have the abstract quality attributes of ISO 9126, or similarly of the new standard 25010, as well as various measures.
And there is this gap that prevents quality attributes from being assessed and measures from clearly contributing to quality goals.
The Quamoco project worked for three years on providing, among other things, four results to help overcome these problems.
Closing the gap, end-to-end relationships
Example
The higher we are in this quality model, the more general the model should be. So on the top level, we have the ISO 25010 quality attributes, which are applicable to almost all software products. The product factors should be quite generally applicable, but on the measure level we are often specific to technologies or languages. To have general measures and to decouple them from technical implementations, we introduced instruments.
For example, we have a product factor "Uselessness of Methods", which we measure, for instance, with "Statically unused method". This measure is applicable to different languages. Hence, we refine it with an instrument that uses Gendarme for C# and PMD for Java.
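The chain from product factor over measure to instrument can be pictured as a small data structure. The following is a minimal sketch using the example from this note; the class and method names (`ProductFactor`, `Measure`, `Instrument`, `instrument_for`) are assumptions made for illustration and do not reflect the actual Quamoco meta-model or tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Instrument:
    """A concrete way to collect a measure (hypothetical structure)."""
    tool: str      # analysis tool that collects the value
    language: str  # programming language the tool applies to


@dataclass
class Measure:
    """A language-independent measure refined by instruments."""
    name: str
    instruments: List[Instrument] = field(default_factory=list)

    def instrument_for(self, language: str) -> Optional[Instrument]:
        """Return the instrument for a language, or None if there is none."""
        for instrument in self.instruments:
            if instrument.language == language:
                return instrument
        return None


@dataclass
class ProductFactor:
    """A product factor that is quantified by one or more measures."""
    name: str
    measures: List[Measure] = field(default_factory=list)


# Example from the talk: one measure, two language-specific instruments.
unused_method = Measure(
    "Statically unused method",
    [Instrument("Gendarme", "C#"), Instrument("PMD", "Java")],
)
factor = ProductFactor("Uselessness of Methods", [unused_method])

print(unused_method.instrument_for("Java").tool)  # PMD
```

Keeping the measure free of tool names is what allows the same product factor to be assessed for several languages.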
The base model describes qualities important for almost any kind of software.
The aim is to use it as a basis for more specific quality models and also to be able to apply it directly to very popular paradigms and technologies.
It has a modular structure: the root contains quality attributes and very general product factors, with modules for object-oriented factors and operationalisations for Java and C#.
Mostly static analysis tools and inspection.
Prototypical development for C, C++ and GUI.
284 factors
524 measures
With that, we can do an actual assessment or evaluation of a software product. We measure using the instruments and collect values for all the measures. Now, we need evaluations and aggregations for the product factors and quality attributes.
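How evaluations could be aggregated upwards can be sketched in a few lines. The note does not spell out the aggregation operator, so the sketch below assumes a weighted average of utilities in [0, 1]; the function name `aggregate`, the factor names, the utility values and the weights are all hypothetical.

```python
from typing import Dict


def aggregate(evaluations: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine child evaluations (each in [0, 1]) into one value in [0, 1].

    Assumption for this sketch: a weighted average, with weights normalised
    so that they do not have to sum to 1.
    """
    if set(evaluations) != set(weights):
        raise ValueError("evaluations and weights must cover the same factors")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(evaluations[name] * weights[name] for name in evaluations) / total_weight


# Hypothetical utilities of two product factors and their hypothetical weights.
factor_utilities = {"factor A": 1.0, "factor B": 0.5}
factor_weights = {"factor A": 1.0, "factor B": 1.0}

print(aggregate(factor_utilities, factor_weights))  # 0.75
```

The same function can be applied again one level up, so that evaluations of product factors roll up into an evaluation of a quality attribute.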
From over a hundred systems, we found typical distributions, eliminated outliers and analysed quartiles. The result of an evaluation function is a value between 0 and 1.
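One way to picture this calibration step is the sketch below. It assumes that outliers are values more than 1.5 times the interquartile range outside the quartiles, and that the smallest and largest remaining values become the thresholds of an evaluation function; both the outlier rule and the name `calibrate_thresholds` are assumptions for illustration, and the sample values are made up.

```python
import statistics
from typing import List, Tuple


def calibrate_thresholds(values: List[float]) -> Tuple[float, float]:
    """Derive (min, max) thresholds for one measure from benchmark values.

    Assumption for this sketch: values further than 1.5 * IQR below the
    first quartile or above the third quartile are treated as outliers.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    kept = [v for v in values if lower_fence <= v <= upper_fence]
    return min(kept), max(kept)


# Made-up measure values for nine benchmark systems; 100 is an outlier.
benchmark_values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
thresholds = calibrate_thresholds(benchmark_values)
print(thresholds)  # (1, 8)
```

The resulting pair can then serve as the min and max of a utility function that maps a measured value to a number between 0 and 1.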
The model for this was a dictation in school: if you make some errors, you get a bad grade. Here we use German school grades, but you could plug in any interpretation model.
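A pluggable interpretation model can be sketched as a list of thresholds. The grade boundaries below are purely illustrative assumptions, chosen only so that small losses in utility already lead to worse grades; they are not the boundaries used in the Quamoco interpretation model, and the names `GRADE_THRESHOLDS` and `interpret` are made up.

```python
from typing import List, Tuple

# Hypothetical boundaries: (minimum utility, grade), best grade first.
GRADE_THRESHOLDS: List[Tuple[float, int]] = [
    (0.98, 1),
    (0.96, 2),
    (0.94, 3),
    (0.92, 4),
    (0.90, 5),
]
WORST_GRADE = 6  # German school grades run from 1 (best) to 6 (worst)


def interpret(utility: float,
              thresholds: List[Tuple[float, int]] = GRADE_THRESHOLDS,
              worst: int = WORST_GRADE) -> int:
    """Translate a utility in [0, 1] into a grade using the given thresholds."""
    if not 0.0 <= utility <= 1.0:
        raise ValueError("utility must be between 0 and 1")
    for minimum_utility, grade in thresholds:
        if utility >= minimum_utility:
            return grade
    return worst


print(interpret(0.99), interpret(0.95), interpret(0.50))  # 1 3 6
```

Passing a different `thresholds` list swaps in another interpretation model without touching the evaluation itself.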
Such an assessment must be valid if decisions are to be based on it. Have we made a good statement about the system? Here we compared our assessments with expert assessments. The result was almost exactly the same ranking. At this resolution, we were therefore already as good as experts.
Here for five subsystems of a commercial system, for the maintainability part.
And with that I want to close with thanks to all the Quamoco partners and supporters!