As a software project ages, its source code is modified to add new features, restructure existing ones, and fix defects. These source code changes often induce changes in the build system, i.e., the system that specifies how source code is translated into deliverables. However, since developers are often not familiar with the complex and occasionally archaic technologies used to specify build systems, they may not be able to identify when their source code changes require accompanying build system changes. This can cause build breakages that slow development progress and impact other developers, testers, or even users. In this paper, we mine the source and test code changes that required accompanying build changes in order to better understand this co-change relationship. We build random forest classifiers using language-agnostic and language-specific code change characteristics to explain when code-accompanying build changes are necessary based on historical trends. Case studies of the Mozilla C++ system, the Lucene and Eclipse open source Java systems, and the IBM Jazz proprietary Java system indicate that our classifiers can accurately explain when build co-changes are necessary with an AUC of 0.60-0.88. Unsurprisingly, our highly accurate C++ classifiers (AUC of 0.88) derive much of their explanatory power from indicators of structural change (e.g., was a new source file added?). On the other hand, our Java classifiers are less accurate (AUC of 0.60-0.78) because roughly 75% of Java build co-changes do not coincide with changes to the structure of a system, but rather are instigated by concerns related to release engineering, quality assurance, and general build maintenance.
Mining Co-Change Information to Understand when Build Changes are Necessary
1. Mining Co-Change Information to Understand when Build Changes are Necessary
Shane McIntosh (@shane_mcintosh), Bram Adams, Meiyappan Nagappan, Ahmed E. Hassan
9. Continuous Integration: Enabled by the build system
Commit → Build → Test → Report
Commit 9719cf0 (.c, .mk) was successfully integrated
11. The Build “Tax”
“...nothing can be said to be certain, except death and taxes” - Benjamin Franklin
Up to 27% of source changes require build changes, too!
S. McIntosh, B. Adams, T. H. D. Nguyen, Y. Kamei, A. E. Hassan. An Empirical Study of Build Maintenance Effort. [ICSE 2011]
21. Neglected build maintenance can even impact end users
Not working due to linking of incorrect SQLite library version
When are build changes necessary?
27. Overview of the studied systems
C++ (Mozilla): 13 years of historical data; single build system for Firefox, Thunderbird, etc.
Java (Eclipse, Lucene, Jazz): total of 16 years of historical data; proprietary and open source systems
33. Grouping related changes according to the work items that they address
Changes: .c, .c, .c, .mk
Transactions: "Fix for bug #1234", "Add feature #2121", "Missed code in #2121"
Work items: 1234, 2121
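The grouping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `#<number>` ID pattern is taken from the example messages on the slide, while the file names, the assignment of files to transactions, and the helper `group_by_work_item` are hypothetical and not taken from the study's tooling.

```python
import re
from collections import defaultdict

# Assumed ID format, matching the "#1234" / "#2121" style of the slide's examples.
WORK_ITEM_ID = re.compile(r"#(\d+)")


def group_by_work_item(transactions):
    """Map each work item ID to the set of files changed by its transactions."""
    work_items = defaultdict(set)
    for message, files in transactions:
        for item_id in WORK_ITEM_ID.findall(message):
            work_items[item_id].update(files)
    return dict(work_items)


# Hypothetical transactions: (commit message, changed files).
TRANSACTIONS = [
    ("Fix for bug #1234", ["parser.c"]),
    ("Add feature #2121", ["widget.c", "rules.mk"]),
    ("Missed code in #2121", ["widget_util.c"]),
]

WORK_ITEMS = group_by_work_item(TRANSACTIONS)
```

With this input, the two transactions that mention #2121 collapse into one work item, so a build change committed separately from its source change still counts as a co-change.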
39. Case study structure
(RQ1) Co-change frequency
(RQ2) Co-change prediction accuracy
(RQ3) Most powerful co-change indicators
43. The proportion of build changes that are accompanied by source code changes
Changes: .mk, .c, .c, .mk, .mk, .c
Work item 1: .c, .c, .mk
Work item 2: .mk
Work item 3: .mk, .c
2 build co-changes ÷ 3 total build changes
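The ratio on this slide can be reproduced with a short Python sketch. It assumes that build files are recognised by a `.mk` suffix and source files by a `.c` suffix, as in the slide's icons; the file names and the function `build_co_change_rate` are hypothetical.

```python
def is_build_file(path):
    # Assumption: build files end in ".mk", as in the slide's icons.
    return path.endswith(".mk")


def is_source_file(path):
    # Assumption: source files end in ".c".
    return path.endswith(".c")


def build_co_change_rate(work_items):
    """Share of build-changing work items that also change source code."""
    build_items = [
        files for files in work_items.values()
        if any(is_build_file(f) for f in files)
    ]
    if not build_items:
        return 0.0
    co_changed = [
        files for files in build_items
        if any(is_source_file(f) for f in files)
    ]
    return len(co_changed) / len(build_items)


# The three work items from the slide, with hypothetical file names.
EXAMPLE_WORK_ITEMS = {
    1: ["a.c", "b.c", "rules.mk"],
    2: ["rules.mk"],
    3: ["rules.mk", "c.c"],
}

RATE = build_co_change_rate(EXAMPLE_WORK_ITEMS)  # 2 of 3 build changes
```

Work item 2 changes only the build file, so it counts toward the denominator but not the numerator.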
44. Build changes are often accompanied by changes to the source code
                  Mozilla  Eclipse  Lucene  Jazz
Build ⇒ Source    86%      82%      44%     72%
Build ⇒ Test      29%      36%      41%     36%
Build ⇒ Src/Test  88%      82%      53%     77%
45. Build changes are occasionally accompanied by changes to test code (Build ⇒ Test: 29%-41%)
46. 53%-88% of build changes are accompanied by changes to source or test code (Build ⇒ Src/Test)
47. Case study structure
(RQ1) Co-change frequency: Most build changes are accompanied by source/test code changes
(RQ2) Co-change prediction accuracy
(RQ3) Most powerful co-change indicators
55. Metrics that may indicate whether a code change will require an accompanying build co-change
Language-agnostic: file added (+++); historical .c/.mk co-change tendencies
Language-aware: changed dependencies (import); changed conditional compilation (#ifdef)
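As a rough illustration of how such metrics might be derived from a single code change, the Python sketch below counts added source files, changed dependency lines, and changed conditional-compilation lines. Everything here is an assumption made for illustration: the input format (status, path, changed lines), the metric names, the file extensions, and the prefix-based matching are hypothetical and much cruder than a real language-aware analysis.

```python
# Assumed set of source file extensions, for illustration only.
SOURCE_EXTENSIONS = (".c", ".cpp", ".h", ".java")


def extract_metrics(changed_files, prior_build_co_changes=0):
    """Compute simple change metrics from (status, path, changed_lines) tuples."""
    metrics = {
        # Language-agnostic metrics
        "source_added": 0,
        "num_files": len(changed_files),
        "prior_build_co_changes": prior_build_co_changes,
        # Language-aware metrics
        "changed_dependencies": 0,
        "changed_conditional_compilation": 0,
    }
    for status, path, changed_lines in changed_files:
        if status == "A" and path.endswith(SOURCE_EXTENSIONS):
            metrics["source_added"] += 1
        for line in changed_lines:
            text = line.strip()
            if text.startswith(("import ", "#include")):
                metrics["changed_dependencies"] += 1
            elif text.startswith(("#if", "#elif", "#else", "#endif")):
                metrics["changed_conditional_compilation"] += 1
    return metrics


# Hypothetical change: one added file and one modified file.
EXAMPLE_CHANGE = [
    ("A", "net/socket.c", ["#include <sqlite3.h>", "#ifdef XP_WIN", "int fd;", "#endif"]),
    ("M", "net/http.c", ["return 0;"]),
]

METRICS = extract_metrics(EXAMPLE_CHANGE, prior_build_co_changes=3)
```

The historical metric is passed in as a plain count because computing it requires the change history of the touched files, which is outside the scope of this sketch.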
56. We train classifiers to identify code changes that require build changes
Work items (code changes) → Random forest classification model → "Build change necessary" or "No build change necessary"
62. Language-aware metrics add significant explanatory power
AUC                                          Mozilla  Eclipse  Lucene  Jazz
Language-agnostic metrics                    0.84     0.66     0.71    0.57
Language-agnostic and language-aware metrics 0.88     0.69     0.78    0.60
63. Mozilla classifiers are highly accurate (AUC of 0.84-0.88)
64. Java classifiers are less accurate than the Mozilla one (AUC of 0.57-0.78)
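For readers unfamiliar with the measure, AUC can be read as the probability that a randomly chosen change that needed a build co-change receives a higher classifier score than a randomly chosen change that did not. The Python sketch below computes it directly from that definition; the scores and labels are made-up values, not data from the study.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores and actual outcomes (True = build change needed).
SCORES = [0.9, 0.8, 0.4, 0.3, 0.2]
LABELS = [True, False, True, False, False]

EXAMPLE_AUC = auc(SCORES, LABELS)  # 5 of 6 positive/negative pairs are ordered correctly
```

A value of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is why 0.88 for Mozilla reads as highly accurate and 0.60 for Jazz as only modestly better than chance.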
65. Java build changes are rarely driven by code changes
Category                   Task                      Amount    Correctly classified
System structure           Refactorings              19 (25%)  8
General build maintenance  Build tool configuration  15 (20%)  0
General build maintenance  Build defects             6 (8%)    0
Release engineering        Add platform support      12 (16%)  2
Release engineering        Packaging fixes           12 (16%)  3
Release engineering        Library versioning        8 (11%)   0
Test maintenance           Test infrastructure       3 (4%)    0
66. Case study structure
(RQ1) Co-change frequency: Most build changes are accompanied by source/test code changes
(RQ2) Co-change prediction accuracy: AUC of 0.88 for C++ and 0.60-0.78 for Java; Java build changes are rarely motivated by code changes
(RQ3) Most powerful co-change indicators
70. Measuring variable importance in random forest classifiers
L. Breiman. Random Forests. [Machine Learning 2001]
[Figure: work items 1-3, each with values for Var1, Var2, and Var3, are classified by the random forest classification model]
Build change? (actual): No, No, Yes
Classification: No, Yes, Yes
Misclassification rate of 1 ÷ 3 = 0.33
71. Measuring variable importance in random forest classifiers
Randomly permute the values of Var1, then reclassify the work items
75. Measuring variable importance in random forest classifiers
[Figure: the same work items with the Var1 values randomly permuted]
Build change? (actual): No, No, Yes
Classification: Yes, Yes, Yes
Misclassification rate of 2 ÷ 3 = 0.67
Var1 randomness increases misclassification rate by 0.33
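The permutation idea can be sketched in Python as follows. This is a simplified illustration, not the study's implementation: a hypothetical one-rule classifier (`toy_predict`) stands in for the random forest, the rows and labels are invented, and the importance is averaged over repeated shuffles of the whole data set, whereas Breiman's formulation works on the out-of-bag samples of each tree.

```python
import random


def misclassification_rate(predict, rows, labels):
    """Fraction of rows whose predicted label differs from the actual label."""
    wrong = sum(1 for row, label in zip(rows, labels) if predict(row) != label)
    return wrong / len(rows)


def permutation_importance(predict, rows, labels, column, repeats=200, seed=0):
    """Mean increase in misclassification rate after shuffling one column."""
    rng = random.Random(seed)
    baseline = misclassification_rate(predict, rows, labels)
    total_increase = 0.0
    for _ in range(repeats):
        values = [row[column] for row in rows]
        rng.shuffle(values)
        # Copy each row, replacing only the permuted column.
        permuted = [dict(row, **{column: v}) for row, v in zip(rows, values)]
        total_increase += misclassification_rate(predict, permuted, labels) - baseline
    return total_increase / repeats


def toy_predict(row):
    # Hypothetical stand-in for the trained model: predicts a build change
    # whenever the work item adds at least one source file.
    return row["source_added"] >= 1


# Invented work items; True means a build change was actually needed.
ROWS = [
    {"source_added": 2, "num_files": 5},
    {"source_added": 1, "num_files": 3},
    {"source_added": 3, "num_files": 4},
    {"source_added": 0, "num_files": 6},
    {"source_added": 0, "num_files": 2},
    {"source_added": 0, "num_files": 7},
]
LABELS = [True, True, True, False, False, False]

IMPORTANCE_USED = permutation_importance(toy_predict, ROWS, LABELS, "source_added")
IMPORTANCE_UNUSED = permutation_importance(toy_predict, ROWS, LABELS, "num_files")
```

Shuffling a variable the model relies on raises the misclassification rate, while shuffling one it ignores leaves the rate unchanged. That difference is what the importance score reports.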
78. Variable importance in the Mozilla classifier
[Figure: variable importance scores (0.00 to 0.10) in the Mozilla and Eclipse-core classifiers for the metrics non-core dependencies, conditional compilation, prior build co-changes, number of files, and source/test files added, deleted, modified, and renamed]
Structure-altering changes are most important in C++ classifiers
Historical co-change tendencies are also important
81. Variable importance in Java classifiers
[Figure: variable importance scores in the Eclipse-core, Lucene, and Jazz classifiers for the metrics non-core dependencies, prior build co-changes, number of files, and source/test files added, deleted, modified, and renamed]
Structure-altering changes are less important than in Mozilla
Java classifiers rely more on historical co-change tendencies
82. Case study structure
(RQ1) Co-change frequency: Most build changes are accompanied by source/test code changes
(RQ2) Co-change prediction accuracy: AUC of 0.88 for C++ and 0.60-0.78 for Java; Java build changes are rarely motivated by code changes
(RQ3) Most powerful co-change indicators: structure-altering changes for C++; historical co-change tendencies for Java