The CI as a partner for test improvement suggestions

Benoit Baudry
Project Coordinator and Scientific Leader
KTH, Sweden
Caroline Landry
Project Technical Manager
INRIA, France
19-Apr-2019

This work was partially supported by the EU Project STAMP ICT-16-10 No. 731529
• 4 research institutions
• 5 companies
• 1 open source consortium
• Automated Testing in DevOps

DevOps
• Better quality
• Shorter release cycles
• Reconnect Ops to Dev
• Automation
• Step acceptance & feedback

DevOps
[Diagram, built up over several slides. Labels: IDEs, CI, CD; completion, linter; unit, perf., UI, config., fuzzing; coverage, mutation; chaos, A/B testing; logs analysis, crash analysis; continuous testing]

STAMP’s concept: amplification
• Amplify (v.): to increase the size or effect of something (https://dictionary.cambridge.org/dictionary/english/amplify)
• Test amplification: increase the effect of test assets
• Test assets: test cases, configuration files, production logs
• Effect metrics: mutation score, feature interactions
• Automatic amplification

Unit test amplification
Descartes

Test Your Tests
• What do you expect from test cases?
• Cover requirements
• Code coverage
• Stress the application
• Reveal bugs

Example
long fact(int n) {
    if (n == 0) {
        return 1;
    }
    long result = 1;
    for (int i = 2; i <= n; i++) {
        result = result * i;
    }
    return result;
}

@Test
public void factorialWith5Test() {
    long obs = fact(5);
    assertTrue(obs > 5);
}

@Test
public void factorialWith0Test() {
    assertEquals(1, fact(0));
}

Coverage: together, the two tests execute every line of fact.
Is this test suite good at detecting bugs?
Let’s mutate our code to see.

Mutation Analysis
• Inject a fault into the program to create a mutant (Mutant 1, Mutant 2, Mutant 3)
• Run the test suite against each mutant
• Mutant 1: ✗ Killed
• Mutant 2: ✓ Alive
• Mutant 3: ✗ Killed
→ Mutation score
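The process above can be sketched in plain Java. This is a minimal illustration and not how PIT works internally: the three mutants are hand-written variants of the fact method from this talk, the test suite is modelled as a boolean function, and the names MutationScoreSketch, suitePasses, isKilled and mutationScore are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntToLongFunction;

class MutationScoreSketch {

    // Original method under test.
    static long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant 1: condition n == 0 replaced by n != 0.
    static long mutant1(int n) {
        if (n != 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant 2: loop bound i <= n replaced by i < n.
    static long mutant2(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i < n; i++) result = result * i;
        return result;
    }

    // Mutant 3: base case returns 1 + 1.
    static long mutant3(int n) {
        if (n == 0) return 1 + 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // The test suite, modelled as a function: true when every assertion holds.
    static boolean suitePasses(IntToLongFunction f) {
        boolean factorialWith5 = f.applyAsLong(5) > 5;
        boolean factorialWith0 = f.applyAsLong(0) == 1;
        return factorialWith5 && factorialWith0;
    }

    // A mutant is killed when at least one test fails on it.
    static boolean isKilled(IntToLongFunction mutant) {
        return !suitePasses(mutant);
    }

    // Mutation score = killed mutants / all mutants.
    static double mutationScore(Map<String, IntToLongFunction> mutants) {
        long killed = mutants.values().stream()
                .filter(MutationScoreSketch::isKilled)
                .count();
        return (double) killed / mutants.size();
    }

    static Map<String, IntToLongFunction> mutants() {
        Map<String, IntToLongFunction> m = new LinkedHashMap<>();
        m.put("Mutant 1", MutationScoreSketch::mutant1);
        m.put("Mutant 2", MutationScoreSketch::mutant2);
        m.put("Mutant 3", MutationScoreSketch::mutant3);
        return m;
    }

    public static void main(String[] args) {
        mutants().forEach((name, mutant) ->
                System.out.println(name + ": " + (isKilled(mutant) ? "Killed" : "Alive")));
        System.out.println("Mutation score = " + mutationScore(mutants()));
    }
}
```

With this suite, Mutant 1 and Mutant 3 are killed and Mutant 2 stays alive, because the weak check obs > 5 still holds when the loop stops one iteration early.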


Example
Mutant: n == 0 replaced by n != 0
long fact(int n) {
    if (n != 0) {
        return 1;
    }
    long result = 1;
    for (int i = 2; i <= n; i++) {
        result = result * i;
    }
    return result;
}
✗ The fact(5) test fails
✓ The fact(0) test passes

Mutants of fact:
• n != 0
• return 1 + 1
• i < n
• !(i <= n)
• i--
• result / i
• result + 1

→ Mutation score = 71% (5 of the 7 mutants killed)
Test suite:
• Weak oracle
• Missing input

A stronger oracle for the fact(5) test:
assertEquals(120, fact(5));
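The effect of a weak oracle can be illustrated on one mutant. The sketch below assumes a hand-written mutant of fact whose loop bound is i < n, and models each oracle as a boolean function; OracleStrengthSketch and its method names are hypothetical.

```java
import java.util.function.IntToLongFunction;

class OracleStrengthSketch {

    // Original method under test.
    static long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant: loop bound i <= n replaced by i < n, so fact(5) yields 2 * 3 * 4.
    static long loopBoundMutant(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i < n; i++) result = result * i;
        return result;
    }

    // Weak oracle: only checks that the observed value is larger than 5.
    static boolean weakOraclePasses(IntToLongFunction f) {
        return f.applyAsLong(5) > 5;
    }

    // Strong oracle: checks the exact expected value.
    static boolean strongOraclePasses(IntToLongFunction f) {
        return f.applyAsLong(5) == 120;
    }

    public static void main(String[] args) {
        IntToLongFunction mutant = OracleStrengthSketch::loopBoundMutant;
        System.out.println("weak oracle kills mutant:   " + !weakOraclePasses(mutant));
        System.out.println("strong oracle kills mutant: " + !strongOraclePasses(mutant));
    }
}
```

Both oracles pass on the original code, but only the exact-value check fails on the mutant, which is what killing the mutant requires.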

Mutation Analysis
•Tests are good if they can detect bugs
•Mutation operators
• Based on common faults
•PIT or PITest
• Open source, in active development and production ready
• Integrates with major build systems
• State of the art mutation testing
• Extensible via plugins
• Concurrent execution
• Test selection

Limitations of mutation testing
• Expensive computation
• Huge number of mutants
• Presence of equivalent mutants
→ Extreme mutation

Extreme mutation
• Proposed in 2016 by Niedermayr et al.
• Remove the body of the method
• Replace it with a single return
• Fewer mutants
• Most equivalent mutants can be detected
R. Niedermayr, E. Juergens, and S. Wagner, “Will my tests tell me if I break this code?,” in Proceedings of the International Workshop on Continuous Software Evolution and Delivery, 2016, pp. 23–29.

Example
Traditional mutants of fact: n != 0, return 1 + 1, i < n, !(i <= n), i--, result / i, result + 1

Extreme mutants of fact:
long fact(int n) {
    return 0;
}
long fact(int n) {
    return 1;
}
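As a rough sketch of how the two extreme mutants of fact are used, the code below runs two test suites against them: one with assertions, and one that only executes the method. It is an illustration under simplifying assumptions and not the Descartes implementation; the suites are modelled as boolean functions and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntToLongFunction;
import java.util.function.Predicate;

class ExtremeMutationSketch {

    // Original method under test.
    static long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Extreme mutants: the whole body is replaced by a single return.
    static final List<IntToLongFunction> EXTREME_MUTANTS =
            Arrays.<IntToLongFunction>asList(n -> 0L, n -> 1L);

    // A suite with assertions: true when every assertion holds.
    static boolean assertingSuite(IntToLongFunction f) {
        return f.applyAsLong(5) > 5 && f.applyAsLong(0) == 1;
    }

    // A suite that covers the method but asserts nothing.
    static boolean coverageOnlySuite(IntToLongFunction f) {
        f.applyAsLong(5);
        f.applyAsLong(0);
        return true;
    }

    // A covered method is pseudo-tested when the suite passes on every extreme mutant.
    static boolean isPseudoTested(Predicate<IntToLongFunction> suite) {
        return EXTREME_MUTANTS.stream().allMatch(suite);
    }

    public static void main(String[] args) {
        System.out.println("asserting suite, pseudo-tested:     "
                + isPseudoTested(ExtremeMutationSketch::assertingSuite));
        System.out.println("coverage-only suite, pseudo-tested: "
                + isPseudoTested(ExtremeMutationSketch::coverageOnlySuite));
    }
}
```

Only two mutants are needed per method here, compared with seven traditional mutants for the same code.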

Descartes: I mutate, therefore I am
• A mutation engine for PIT
• Implements extreme mutation
• Computes code coverage & mutation score
• Identifies weaknesses in your tests
• Finds pseudo-tested methods

Pseudo-tested methods
•A method executed by the test suite
•No extreme mutation is detected
•Found in well tested projects

A pseudo-tested method (Apache Commons Collections)
class SingletonListIterator
        implements Iterator<Node> {
    ...
    void add(Node obj) {
        throw new UnsupportedOperationException();
    }
    ...
}

@Test
void testAdd() {
    SingletonListIterator it = ...;
    try {
        it.add(value);
        // A fail is needed here
    }
    catch (Exception ex) {}
}

Pseudo-tested: when the body of add is removed, no exception is thrown and the test still passes.
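The missing fail can be shown without a test framework. In this sketch the Adder interface, the two implementations and the method names are hypothetical stand-ins for the iterator and its extreme mutant; returning false plays the role of a call to fail.

```java
class ExpectedExceptionSketch {

    // Hypothetical stand-in for the add method of the iterator.
    interface Adder {
        void add(Object value);
    }

    // Original behaviour: add is unsupported.
    static final Adder ORIGINAL = value -> {
        throw new UnsupportedOperationException();
    };

    // Extreme mutant: the body of the void method is removed.
    static final Adder EXTREME_MUTANT = value -> { };

    // Test as written on the slide: any exception is swallowed, nothing is asserted.
    static boolean weakTestPasses(Adder it) {
        try {
            it.add("value");
        } catch (Exception ex) {
            // ignored
        }
        return true;
    }

    // Fixed test: reaching the line after add means the exception was not thrown.
    static boolean fixedTestPasses(Adder it) {
        try {
            it.add("value");
            return false; // stands in for fail("expected UnsupportedOperationException")
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("weak test on mutant passes:  " + weakTestPasses(EXTREME_MUTANT));
        System.out.println("fixed test on mutant passes: " + fixedTestPasses(EXTREME_MUTANT));
    }
}
```

The weak test passes on both versions, so the method is pseudo-tested; the fixed test passes on the original and fails on the mutant.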

A pseudo-tested method (2)
public class VersionedSet {
    private long version = 0;
    private ArrayList elements = new ArrayList();
    public void add(Object item) {
        if (!elements.contains(item)) {
            elements.add(item);
            incrementVersion();
        }
    }
    private void incrementVersion() { version++; }
}

@Test
public void testAdd() {
    VersionedSet list = new VersionedSet();
    list.add(1);
    assertEquals(1, list.size());
}

Pseudo-tested: incrementVersion
Testability issue
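One possible way to address the testability issue is to expose the version so a test can observe it. The sketch below is an assumption, not the fix proposed in the talk: getVersion and size are hypothetical accessors, and incrementVersion is made protected only so that a subclass can simulate the extreme mutant.

```java
import java.util.ArrayList;
import java.util.List;

class VersionedSetSketch {

    static class VersionedSet {
        private long version = 0;
        private final List<Object> elements = new ArrayList<>();

        public void add(Object item) {
            if (!elements.contains(item)) {
                elements.add(item);
                incrementVersion();
            }
        }

        // Protected (private on the slide) so the sketch can override it below.
        protected void incrementVersion() { version++; }

        public int size() { return elements.size(); }

        // Hypothetical accessor that makes the version observable from a test.
        public long getVersion() { return version; }
    }

    // Simulates the extreme mutant: the body of incrementVersion is removed.
    static class MutantVersionedSet extends VersionedSet {
        @Override
        protected void incrementVersion() { }
    }

    // Test from the slide: only the size is checked.
    static boolean sizeOnlyTestPasses(VersionedSet set) {
        set.add(1);
        return set.size() == 1;
    }

    // Strengthened test: the version is checked as well.
    static boolean versionTestPasses(VersionedSet set) {
        set.add(1);
        return set.size() == 1 && set.getVersion() == 1;
    }

    public static void main(String[] args) {
        System.out.println("size-only test on mutant passes: "
                + sizeOnlyTestPasses(new MutantVersionedSet()));
        System.out.println("version test on mutant passes:   "
                + versionTestPasses(new MutantVersionedSet()));
    }
}
```

Whether to add an accessor or to test the version through other public behaviour is a design decision for the project; the sketch only shows that the mutant becomes detectable once the state is observable.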

Descartes & DSpot

DSpot
• Automatically enhances existing JUnit test suites
• Generates new assertions or new test cases

DSpot principle
[Figure, built up over several slides: test criterion (mutation score)]

DSpot: how does it work?
[Figure: mutation analysis of the test suite, with one ✗ / ✓ result per mutant, followed by an "amplify" step that produces Amplified Test 1]
Example
Original test case:
@Test
public void html() {
    Food kouignAmann = new Food("KouignAmann");
    PhD benjamin = new PhD("Benjamin");
    benjamin.eat(kouignAmann);
    assertFalse(benjamin.isHungry());
}

• Remove a method call
• Remove existing assertions
• Instrument the test:
@Test
public void html() {
    Food kouignAmann = new Food("KouignAmann");
    PhD benjamin = new PhD("Benjamin");
    benjamin.isHungry();
    Log.log(benjamin, "benjamin");
}
• Run the instrumented test:
benjamin.isHungry() → true
benjamin.isHappy() → false
• Generate assertions:
assertTrue(benjamin.isHungry());
assertFalse(benjamin.isHappy());

Amplified test case:
@Test
public void html() {
    Food kouignAmann = new Food("KouignAmann");
    PhD benjamin = new PhD("Benjamin");
    assertTrue(benjamin.isHungry());
    assertFalse(benjamin.isHappy());
}
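The "run, observe, generate assertions" step can be sketched with reflection. This is a simplified illustration, not DSpot's implementation: it only looks at public boolean getters whose name starts with "is", and the PhD class below is a hypothetical stand-in whose eat method takes no argument.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AssertionGenerationSketch {

    // Hypothetical stand-in for the class used in the example test.
    public static class PhD {
        private boolean hungry = true;
        private boolean happy = false;
        public void eat() { hungry = false; happy = true; }
        public boolean isHungry() { return hungry; }
        public boolean isHappy() { return happy; }
    }

    // Observes every public boolean "is..." getter and emits an assertion on its value.
    static List<String> generateAssertions(Object target, String variableName) {
        Method[] methods = target.getClass().getDeclaredMethods();
        // Sort by name so the output order does not depend on the JVM.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        List<String> assertions = new ArrayList<>();
        for (Method m : methods) {
            boolean isBooleanGetter = Modifier.isPublic(m.getModifiers())
                    && m.getParameterCount() == 0
                    && m.getReturnType() == boolean.class
                    && m.getName().startsWith("is");
            if (!isBooleanGetter) {
                continue;
            }
            try {
                boolean observed = (Boolean) m.invoke(target);
                String call = variableName + "." + m.getName() + "()";
                assertions.add((observed ? "assertTrue(" : "assertFalse(") + call + ");");
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("could not observe " + m.getName(), e);
            }
        }
        return assertions;
    }

    public static void main(String[] args) {
        // The eat call has been removed, so the object is observed in its initial state.
        generateAssertions(new PhD(), "benjamin").forEach(System.out::println);
    }
}
```

The generated assertions capture the current behaviour of the program, so they act as regression oracles for the new input.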

[Figure, built up over several slides: the "amplify" step is repeated to produce Amplified Test 1, Amplified Test 2 and Amplified Test 3; mutation analysis is run again after each step, with one ✗ / ✓ result per mutant]

DSpot
Automatic Test Improvement with DSpot: a Study with Ten Mature Open-Source Projects. B. Danglot, O. Luis Vera-Pérez, B. Baudry, M. Monperrus. Submitted to EMSE.

The pull request loop
[Figure, built up over several slides. Labels: collaborative platform, pull req., code, analyses, feedback, novel analyses]

Integration
•Command line
•Eclipse plugins
•Maven plugins
•Gradle plugins


Integration
•Jenkins
• Plugin to monitor score and pseudo-tested methods
• Xwiki
• Same strategy used with the code coverage
• Threshold on the mutation score
• https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
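A threshold on the mutation score can be pictured as a simple gate in the build. The sketch below is a hypothetical illustration, not the XWiki or Jenkins configuration: the class name, the method and the numbers in main are assumptions.

```java
class MutationThresholdGate {

    // True when the mutation score, as a percentage, reaches the threshold.
    static boolean meetsThreshold(int killedMutants, int totalMutants, double thresholdPercent) {
        if (totalMutants <= 0) {
            throw new IllegalArgumentException("totalMutants must be positive");
        }
        double scorePercent = 100.0 * killedMutants / totalMutants;
        return scorePercent >= thresholdPercent;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 5 of 7 mutants killed, threshold at 70%.
        boolean ok = meetsThreshold(5, 7, 70.0);
        System.out.println(ok ? "Mutation score OK" : "Mutation score below threshold");
        if (!ok) {
            System.exit(1); // a non-zero exit code makes the CI step fail
        }
    }
}
```

Used this way, the gate behaves like a coverage threshold: a change that lowers the score below the configured value fails the build.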


Integration
•In the CI as a service
• GitHub Application
• Find pseudo-tested methods in pull-requests
• Feedback using GitHub Check Runs API

More
•Open Source tools
• https://github.com/STAMP-project
•Public website
• https://www.stamp-project.eu
•Medium on Descartes
• https://medium.com/@almyre/short-circuiting-method-executions-to-assess-test-quality-2d3fda45bc7f
•Publications
• https://www.stamp-project.eu/view/main/publications

Contacts
https://github.com/STAMP-project/
http://stamp-project.eu/
baudry@kth.se
caroline.landry@inria.fr
http://www.diverse-team.fr/
barais@irisa.fr
