Aggregation techniques for software metrics

Aggregation of software metrics
Bogdan Vasilescu
b.n.vasilescu@student.tue.nl

Alexander Serebrenik
a.serebrenik@tue.nl

April 7, 2011

Aggregation techniques for software metrics 2/8

Better understand aggregation techniques for software metrics.
[Figure: density plot of source lines of code for freecol-0.9.4. x-axis: SLOC per class (0 to 3000); y-axis: density (0.000 to 0.004).]

Traditional: mean, sum, median, standard deviation, variance,
skewness, kurtosis.

Department of mathematics and computer science

[Figure: two density plots side by side. Left: household income in Ilocos, the Philippines (1998); x-axis: income (0 to 2500000), y-axis: density (0 to 5e-06). Right: source lines of code for freecol-0.9.4; x-axis: SLOC per class (0 to 3000), y-axis: density (0.000 to 0.004).]

Inequality indices: Gini, Theil, Atkinson, Hoover, Kolm.
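
To make the two families of aggregation techniques concrete, here is a minimal Python sketch of several of them, applied to a list of per-class SLOC values. It uses textbook population formulas. The function names, the sample data, and the Atkinson parameter `eps` are assumptions made for illustration and do not come from the slides. Kolm is left out for brevity.

```python
import math
from statistics import mean, median, pstdev


def skewness(xs):
    """Third standardized moment (undefined when all values are equal)."""
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)


def kurtosis(xs):
    """Fourth standardized moment (undefined when all values are equal)."""
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 4 for x in xs) / (len(xs) * s ** 4)


def gini(xs):
    """Gini index: mean absolute difference over all pairs, over 2 * mean."""
    n, m = len(xs), mean(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * m)


def theil(xs):
    """Theil index: mean of (x/m) * ln(x/m); needs strictly positive values."""
    m = mean(xs)
    return sum((x / m) * math.log(x / m) for x in xs) / len(xs)


def atkinson(xs, eps=0.5):
    """Atkinson index; eps is the inequality aversion, assumed 0 < eps < 1."""
    m = mean(xs)
    generalized_mean = (sum(x ** (1 - eps) for x in xs) / len(xs)) ** (1 / (1 - eps))
    return 1 - generalized_mean / m


def hoover(xs):
    """Hoover index: share of the total that would need to be redistributed."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / (2 * len(xs) * m)


# Hypothetical SLOC values for the classes of one package.
sloc_per_class = [12, 35, 48, 60, 95, 210, 850]

for name, technique in [("mean", mean), ("median", median), ("skewness", skewness),
                        ("kurtosis", kurtosis), ("Gini", gini), ("Theil", theil),
                        ("Atkinson", atkinson), ("Hoover", hoover)]:
    print(f"{name}: {technique(sloc_per_class):.3f}")
```

All four inequality indices in this sketch are 0 when every class has the same size and grow as a few large classes dominate the package.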


Correlation study 3/8

Aggregate SLOC from class to package level.
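
As an illustration of this step, the following sketch groups class-level SLOC values by package and applies one aggregation technique per package. The helper `aggregate_by_package`, the class names, and the rule that the package is everything before the last dot of the qualified class name are assumptions of this example, not the tooling used in the study.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_package(class_sloc, aggregator):
    """Group class-level SLOC by package and apply one aggregation technique.

    class_sloc maps fully qualified class names to SLOC. The package is taken
    to be everything before the last dot (an assumption of this sketch).
    """
    per_package = defaultdict(list)
    for qualified_name, sloc in class_sloc.items():
        package, _, _ = qualified_name.rpartition(".")
        per_package[package].append(sloc)
    return {pkg: aggregator(values) for pkg, values in per_package.items()}


# Hypothetical class-level measurements, for illustration only.
class_sloc = {
    "org.example.ui.Window": 120,
    "org.example.ui.Button": 40,
    "org.example.io.Reader": 300,
    "org.example.io.Writer": 100,
    "org.example.io.Buffer": 50,
}

package_sum = aggregate_by_package(class_sloc, sum)
package_mean = aggregate_by_package(class_sloc, mean)
print(package_sum)
print(package_mean)
```

Any function that maps a list of numbers to a single number can be passed as `aggregator`, so the same grouping serves both the traditional techniques and the inequality indices.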

Study statistical correlation between pairs of aggregation techniques.
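
One way to quantify such a correlation is Kendall's rank correlation coefficient, which the result plots in this deck report. The sketch below is a plain-Python tau-b (the variant that corrects for ties) using a quadratic scan over all pairs. The function name and the per-package values are hypothetical, and the slides do not say which Kendall variant was used.

```python
import math
from itertools import combinations


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equally long sequences (O(n^2) pair scan)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted nowhere
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + only_x_tied) * (untied + only_y_tied))
    if denominator == 0:
        return math.nan  # at least one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical per-package results of two aggregation techniques.
gini_per_package = [0.21, 0.35, 0.48, 0.52, 0.60]
theil_per_package = [0.08, 0.20, 0.41, 0.39, 0.66]

tau = kendall_tau_b(gini_per_package, theil_per_package)
print(f"Kendall tau-b: {tau:.2f}")
```

A coefficient near 1 means the two techniques rank the packages almost identically, a value near −1 means they rank them in opposite order, and a value near 0 means the rankings are unrelated.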

Not enough to measure.


Available datasets 4/8

Qualitas Corpus 20101126 r+e.
r (recent): the most recent versions from 106 systems.
e (evolution): all available versions from 13 systems (≥ 10 versions
available), 414 versions in total.


Tooling 5/8

Tooling developed and made available to analyze the corpus:
Extract metrics: SLOCCount, Understand (still not generic enough)
Compute inequality indices, perform statistical analyses: R (highly
scriptable)
Put everything together: Python toolchain (easily extendable)

[Figure: three panels showing the Kendall correlation coefficient (y-axis, −1.0 to 1.0) for the pairs Atkinson − skewness (SLOC), Gini − Theil (SLOC), and mean − kurtosis (SLOC).]


Sample results - shape 6/8

[Figure: three plots for jfreechart. Atkinson − skewness (SLOC): x-axis Atkinson (0.0 to 0.5), y-axis skewness (−2 to 4). Gini − Theil (SLOC): x-axis Gini (0.0 to 0.8), y-axis Theil (0.0 to 1.5). mean − kurtosis (SLOC): x-axis mean (0 to 300), y-axis kurtosis (up to 20).]


Sample results - evolution 7/8

[Figure: two panels for hibernate (86 releases, labelled 0.8.1 to 3.6.0-beta4), one row per release, showing the correlation coefficient on a scale from −1.0 to 1.0. Left: Kendall(Atkinson(SLOC), Kolm(SLOC)). Right: Kendall(Gini(SLOC), Theil(SLOC)).]
