Search-Based Robustness Testing of Data Processing Systems
Daniel Di Nardo, Fabrizio Pastore, Andrea Arcuri, Lionel Briand
University of Luxembourg
Interdisciplinary Centre for Security, Reliability and Trust
Software Verification and Validation Lab
4. Data Processing System
• Essential component of systems that aggregate and analyse real-world data
• Robustness is “the degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions”
7. Contributions
• An evolutionary algorithm to automate robustness testing of data processing systems
• Four fitness functions (model-based and code-based) that enable the effective generation of robustness test cases by means of evolutionary algorithms
• An extensive study of the effect of fitness functions and configuration parameters on the effectiveness of the approach, using an industrial data processing system as a case study
8. Testing Automation Problems
• How to automatically generate test inputs?
• Data mutation methodology [ICST’15]
• How to automatically verify test execution results?
• Modelling methodology [ASE’13]
• How to identify the most effective inputs?
• Best size of inputs? Which data types to consider? How many data faults should be present? Which constraints should be broken?
• Meta-heuristic search approach [ASE’15]
13. Generic mutation operators
(reusable across projects)
• Class Instance Duplication
• Class Instance Removal
• Class Instances Swapping
• Attribute Replacement with Random
• Attribute Bit Flipping
• Attribute Replacement with Boundary Condition

14–17. Generic mutation operators — configurations for operators
(fit the fault model)
[UML class diagram: a Transmission contains 1..* Vcdu; each Vcdu has one Header with attributes versionNumber : Integer, spaceCraftId : Integer, checksum : Integer; the stereotype <Identifier> is attached to versionNumber]
• Configuration for mutation operators is provided by UML stereotypes used to select mutation targets
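To make the generic operators concrete, here is a minimal Java sketch of how two of them (Class Instance Duplication and Attribute Replacement with Random) could be applied to an in-memory object model; the DataInstance/MutationOperators classes and their methods are illustrative assumptions, not the authors' implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical, simplified object model of the input data (illustration only).
class DataInstance {
    final String className;                                  // e.g. "Vcdu", "Packet", "Header"
    final List<DataInstance> children = new ArrayList<>();
    final Map<String, Integer> attributes = new HashMap<>(); // integer attributes, e.g. versionNumber

    DataInstance(String className) { this.className = className; }

    DataInstance copy() {                                    // deep copy used by instance duplication
        DataInstance c = new DataInstance(className);
        c.attributes.putAll(attributes);
        for (DataInstance child : children) c.children.add(child.copy());
        return c;
    }
}

// Sketch of two of the generic operators; the fault-model configuration decides
// which class or attribute they target.
class MutationOperators {
    private final Random rnd = new Random();

    // Class Instance Duplication: duplicate one child instance of the target class.
    void duplicateInstance(DataInstance parent, String targetClass) {
        for (DataInstance child : new ArrayList<>(parent.children)) {
            if (child.className.equals(targetClass)) {
                parent.children.add(child.copy());
                return;                                      // apply the operator once
            }
        }
    }

    // Attribute Replacement with Random: overwrite the target attribute with a random value.
    void replaceAttributeWithRandom(DataInstance instance, String attributeName) {
        if (instance.attributes.containsKey(attributeName)) {
            instance.attributes.put(attributeName, rnd.nextInt());
        }
    }
}
```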
18. [Diagram: mutation-based generation turns field data into many candidate test inputs; the goal is a test suite that is both effective and small]
• How to evaluate the effectiveness of a test suite? By measuring specific objectives.
• How to generate an effective and small test suite? By means of a meta-heuristic search algorithm.
19–20. Generic mutation operators (reusable across projects) — configurations for operators (fit the fault model)
[UML class diagram as above, now with the stereotype <Derived> attached to checksum : Integer]
• UML stereotypes to select mutation targets
• UML stereotype to identify the fields to update (e.g., derived fields such as the checksum)
• OCL queries to express complex target selection criteria
22. Test Effectiveness Objectives
• O1: Include input data that covers all the classes of the data model
• Data has a complex structure
• O2: Cover all the data faults of a fault model
• A variety of faults might be present in a system
• O3: Cover all the clauses of the input/output constraints
• Input/output constraints can have multiple conditions under which
a given output is expected
• O4: Maximise code coverage
• Implemented features should be fully executed
23. O1: Cover all the classes of the data model
• Coverage of each class of a data model is tracked
• A test input covers a class if it contains at least one instance of the class

24. O1: Cover all the classes of the data model
[UML data model with classes Transmission, Vcdu, Header (versionNumber, vcFrameCount, checksum : Integer), PacketZone, ActivePacketZone, IdlePacketZone, and Packet: a Transmission contains 1..* Vcdu; each Vcdu has one Header and packet zones that contain Packet instances]

25. O1: Cover all the classes of the data model
[Example coverage matrix (objective targets × test inputs Inp1–Inp3): Vcdu, Header, and Packet are covered by all three inputs; IdlePacketZone and ActivePacketZone are each covered by two]
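As an illustration of the bookkeeping behind O1, the sketch below (a hypothetical Java helper, not the authors' tool) records which data-model classes, e.g. Vcdu, Header, Packet, each generated test input instantiates, and reports the classes that remain uncovered.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch of objective O1: a test input covers a class of the data model if it
// contains at least one instance of that class.
class ClassCoverageTracker {
    private final Set<String> allClasses;                 // classes declared in the data model
    private final Set<String> covered = new HashSet<>();

    ClassCoverageTracker(Collection<String> allClasses) {
        this.allClasses = new LinkedHashSet<>(allClasses);
    }

    // Record one test input, given the class names of all instances it contains.
    void record(Collection<String> classesInTestInput) {
        covered.addAll(classesInTestInput);
    }

    // O1 targets that no generated test input has covered yet.
    Set<String> uncoveredClasses() {
        Set<String> u = new LinkedHashSet<>(allClasses);
        u.removeAll(covered);
        return u;
    }
}
```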
26. O2: Cover the fault model
• Attributes and class instances of the input data model can be mutated in different ways by different mutation operators
• Keep track of which mutation operator(s) have been applied to a specific class/attribute instance when generating test data

27–30. O2: Cover the fault model
[Data model excerpt: a Vcdu has a Header (versionNumber, vcFrameCount : Integer) and 1..* Packet. Objective targets pair an attribute with an attribute-level operator (e.g. Header.vcFrameCount::ReplaceWithRandom) or a class with a class-instance operator (e.g. Packet::InstanceDuplication, Packet::InstanceRemoval, Packet::InstanceSwapping)]

31. O2: Cover the fault model
[Example fault-model coverage matrix over test inputs Inp1–Inp3: Header.versionNumber::ReplaceWithRandom is covered by two inputs, Header.vcFrameCount::ReplaceWithRandom and Packet::InstanceSwapping by one each, while Packet::InstanceRemoval and Packet::InstanceDuplication are not yet covered]
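One possible way to track O2 is to log every operator application against its target while test inputs are generated; the helper below is an illustrative Java sketch (names and granularity are assumptions, not the authors' implementation).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of objective O2: remember which mutation operator has been applied to which
// class or attribute while generating test inputs. Target names follow the slide
// notation, e.g. "Header.vcFrameCount::ReplaceWithRandom" or "Packet::InstanceDuplication".
class FaultModelCoverageTracker {
    // target -> identifiers of the test inputs that exercised it
    private final Map<String, Set<String>> coverage = new HashMap<>();

    void recordMutation(String testInputId, String classOrAttribute, String operator) {
        String target = classOrAttribute + "::" + operator;
        coverage.computeIfAbsent(target, k -> new HashSet<>()).add(testInputId);
    }

    boolean isCovered(String target) {
        return coverage.containsKey(target);
    }
}
```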
32. O3: Cover clauses of constraints
• An input/output constraint shows the output expected under a given input condition
• The test suite should stress all the conditions under which a given output is expected

33. O3: Cover clauses of constraints
context Vcdu inv:
  if previousFrameCount < 16777215
  then frameCount <> previousFrameCount + 1
  else
    previousFrameCount = 16777215 and frameCount <> 0
  endif
  implies
    VcduEvent.allInstances()->exists(e | e.eventType = COUNTER_JUMP)

34. O3: Cover clauses of constraints
• For each clause, keep track of whether a test input makes the clause true and/or false

35. O3: Cover clauses of constraints
[Example clause-coverage matrix over test inputs Inp1–Inp3: the outcomes True : previousFrameCount < 16777215, True : frameCount <> 0, False : frameCount <> previousFrameCount + 1, and False : previousFrameCount = 16777215 are covered by all three inputs; True : frameCount <> previousFrameCount + 1 by one input; the remaining outcomes (True : previousFrameCount = 16777215, False : previousFrameCount < 16777215, False : frameCount <> 0) are not yet covered]
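The clause-coverage matrix above can be produced by evaluating every clause of the constraint on the concrete values carried by each test input; the sketch below shows one hedged way to do this in Java for the invariant of slide 33 (how the field values are obtained is an assumption).

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of objective O3: for each clause of an input/output constraint, record whether
// some test input made it true and whether some made it false. The clauses mirror the
// OCL invariant on slide 33.
class ClauseCoverageTracker {
    private static final int MAX_FRAME_COUNT = 16777215;
    private final Set<String> outcomes = new HashSet<>();   // e.g. "True : frameCount <> 0"

    void evaluate(int previousFrameCount, int frameCount) {
        record("previousFrameCount < 16777215", previousFrameCount < MAX_FRAME_COUNT);
        record("frameCount <> previousFrameCount + 1", frameCount != previousFrameCount + 1);
        record("previousFrameCount = 16777215", previousFrameCount == MAX_FRAME_COUNT);
        record("frameCount <> 0", frameCount != 0);
    }

    private void record(String clause, boolean value) {
        outcomes.add((value ? "True : " : "False : ") + clause);
    }

    // Each clause contributes two objective targets (its true and its false outcome).
    int coveredOutcomes() { return outcomes.size(); }
}
```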
36. O4: Maximize code coverage
• Execute JaCoCo to measure the instructions covered by each test case
[Example instruction-coverage matrix over test inputs Inp1–Inp3: SesDaq.java, instruction 10 is covered by all three inputs; instruction 11 by one; …]
• Limitation: requires the execution of the system under test
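The slide only names JaCoCo; one possible way to turn its output into O4 targets is to read the execution data it writes and count covered instructions with the public org.jacoco.core API, as in the hedged sketch below (file locations and per-run aggregation are assumptions, not the authors' setup).

```java
import java.io.File;
import java.io.IOException;

import org.jacoco.core.analysis.Analyzer;
import org.jacoco.core.analysis.CoverageBuilder;
import org.jacoco.core.analysis.IClassCoverage;
import org.jacoco.core.tools.ExecFileLoader;

// Sketch of objective O4: count the instructions covered by one test-case execution,
// given the jacoco.exec file written by the JaCoCo agent and the compiled classes of the SUT.
public class InstructionCoverage {
    public static void main(String[] args) throws IOException {
        File execFile = new File("jacoco.exec");       // assumed output of one test execution
        File classesDir = new File("target/classes");  // assumed location of the SUT bytecode

        ExecFileLoader loader = new ExecFileLoader();
        loader.load(execFile);

        CoverageBuilder builder = new CoverageBuilder();
        Analyzer analyzer = new Analyzer(loader.getExecutionDataStore(), builder);
        analyzer.analyzeAll(classesDir);

        int covered = 0;
        for (IClassCoverage cc : builder.getClasses()) {
            covered += cc.getInstructionCounter().getCoveredCount();
        }
        System.out.println("Covered instructions: " + covered);
    }
}
```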
37. Evolutionary Algorithm with Archive
• How to generate an effective and small test suite?
• A huge number of test inputs can be generated; exhaustive test generation is not feasible
39–40. Sample new chunk: seeding
• Chunks are sampled from field data (a satellite transmission)
• No seeding: packets are randomly selected from the field data, so frequent packet types are selected more often
• With seeding: packet types are randomly selected, and all packet types have the same probability of being picked
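To make the difference concrete, here is a minimal Java sketch of the two sampling strategies, assuming packets are identified by their type name (a simplification for illustration, not the authors' implementation).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.stream.Collectors;

// Sketch of chunk sampling from field data. A packet is reduced to its type name;
// real chunks are, of course, richer.
class PacketSampler {
    private final List<String> fieldDataPackets;             // packets in the order they were captured
    private final Map<String, List<String>> packetsByType;   // the same packets grouped by type
    private final Random rnd = new Random();

    PacketSampler(List<String> fieldDataPackets) {
        this.fieldDataPackets = fieldDataPackets;
        this.packetsByType = fieldDataPackets.stream().collect(Collectors.groupingBy(p -> p));
    }

    // No seeding: pick uniformly among the captured packets,
    // so frequent packet types are selected more often.
    String sampleWithoutSeeding() {
        return fieldDataPackets.get(rnd.nextInt(fieldDataPackets.size()));
    }

    // With seeding: first pick a packet type uniformly (all types have the same
    // probability), then pick one captured packet of that type.
    String sampleWithSeeding() {
        List<String> types = new ArrayList<>(packetsByType.keySet());
        List<String> ofType = packetsByType.get(types.get(rnd.nextInt(types.size())));
        return ofType.get(rnd.nextInt(ofType.size()));
    }
}
```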
41. [Workflow diagram: field data (after filtering) feeds the search; each iteration either samples a new chunk or copies a test input from the archive, applies a mutation, runs the assessment, and puts the result in the archive (with pruning)]
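A minimal sketch of the loop the diagram depicts: each iteration either samples a new chunk from the field data or copies a test input from the archive, applies mutations, assesses which objective targets the candidate covers, and archives it only if it covers something new. Parameter names (pSampling, pMutation, maxMutations) follow the slides; the rest (generic type, archiving policy details) is an illustrative assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Sketch of the evolutionary algorithm with archive. A test input is abstracted to the
// set of objective-target names it covers (O1–O4 targets treated uniformly).
class ArchiveSearch<T> {
    private final double pSampling;      // probability of sampling a new chunk
    private final double pMutation;      // probability of mutating right after sampling
    private final int maxMutations;      // max mutations applied when reusing an archived input
    private final Random rnd = new Random();

    private final List<T> archive = new ArrayList<>();
    private final Set<String> coveredTargets = new HashSet<>();

    ArchiveSearch(double pSampling, double pMutation, int maxMutations) {
        this.pSampling = pSampling;
        this.pMutation = pMutation;
        this.maxMutations = maxMutations;
    }

    // sampleChunk: draws a (possibly seeded) chunk of field data
    // mutate: returns a mutated copy produced by one randomly chosen operator
    // assess: executes/analyses the input and returns the objective targets it covers
    List<T> run(int budget, Supplier<T> sampleChunk, UnaryOperator<T> mutate,
                Function<T, Set<String>> assess) {
        for (int spent = 0; spent < budget; spent++) {
            T candidate;
            if (archive.isEmpty() || rnd.nextDouble() < pSampling) {
                candidate = sampleChunk.get();                        // exploration
                if (rnd.nextDouble() < pMutation) candidate = mutate.apply(candidate);
            } else {
                candidate = archive.get(rnd.nextInt(archive.size())); // exploitation
                int n = 1 + rnd.nextInt(maxMutations);
                for (int i = 0; i < n; i++) candidate = mutate.apply(candidate);
            }
            Set<String> newlyCovered = new HashSet<>(assess.apply(candidate));
            newlyCovered.removeAll(coveredTargets);
            if (!newlyCovered.isEmpty()) {                            // keep only improving inputs
                archive.add(candidate);
                coveredTargets.addAll(newlyCovered);
            }
        }
        return archive;                                               // the generated test suite
    }
}
```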
46. Objective targets × test inputs (Inp1–Inp3)
[Combined coverage matrix grouping the targets of all four objectives:
• Objective 1, data model coverage: Vcdu, Header, ActivePacketZone, and Packet covered by all three inputs; IdlePacketZone by two
• Objective 2, fault model coverage: Header.versionNumber::ReplaceWithRandom covered by two inputs; Header.vcFrameCount::ReplaceWithRandom and Packet::InstanceRemoval by one each; Packet::InstanceDuplication and Packet::InstanceSwapping not yet covered
• Objective 3, constraint clause coverage: the true and false outcomes of each constraint clause, as on slide 35
• Objective 4, code coverage: SesDaq.java, Line 10 covered by all three inputs; Line 11 by one]
47–48. [Workflow diagram: field data (filtering) → sample new chunk or copy from archive → apply a mutation → assessment → put in archive (pruning); the resulting test inputs are executed on the system and the results validated, reporting constraint violations]
49. Research questions
• RQ1: How does the search algorithm compare with random and state-of-the-art approaches?
• RQ2: How does fitness based on code coverage affect performance?
• RQ3: How does seeding affect performance?
• RQ4: What are the configuration parameters that affect performance?
• RQ5: What configuration should be used in practice?
• Case study: Satellite DAQ developed by SES
50. [Workflow diagram repeated: sample new chunk or copy from archive, apply a mutation, assessment, put in archive; test inputs executed on the system and results validated, reporting constraint violations]
Configuration parameters explored:
• p seeding = 0, 0.5
• p mutation = 0, 0.5, 1
• p sampling = 0.3, 0.5, 0.8
• Max mutations = 1, 10, 100
• Stop after: 50k, 100k, 150k, 200k, 250k
• Coverage-fitness: on, off
This leads to 3 × 3 × 3 × 2 × 2 = 108 configurations
108 configurations × 5 search budgets = 540 different configurations of the search algorithm
Each experiment repeated 5 times to account for randomness: 540 × 5 = 2700 runs
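The arithmetic above can be checked mechanically; the snippet below just enumerates the parameter values listed on this slide (a sanity check, not part of the authors' tooling).

```java
public class ConfigurationCount {
    public static void main(String[] args) {
        double[]  pSeeding   = {0, 0.5};
        double[]  pMutation  = {0, 0.5, 1};
        double[]  pSampling  = {0.3, 0.5, 0.8};
        int[]     maxMut     = {1, 10, 100};
        boolean[] covFitness = {true, false};
        int[]     budgets    = {50_000, 100_000, 150_000, 200_000, 250_000};

        int configs = pSeeding.length * pMutation.length * pSampling.length
                * maxMut.length * covFitness.length;             // 3 * 3 * 3 * 2 * 2 = 108
        int withBudgets = configs * budgets.length;               // 108 * 5 = 540
        int runs = withBudgets * 5;                                // 5 repetitions = 2700

        System.out.printf("configurations=%d, with budgets=%d, total runs=%d%n",
                configs, withBudgets, runs);
    }
}
```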
51. RQ1: How does the search algorithm compare with random and state-of-the-art approaches?

Budget (in Cadus) | Configuration            | Coverage | # of Tests
50k               | Best: r=0.5,m=1,n=100    | 23424.4  | 28.4
50k               | BO:   r=0.5,m=1,n=100    | 23424.4  | 28.4
50k               | Rand: r=1,m=1,n=1        | 23386.8  | 43.2
100k              | Best: r=0.5,m=1,n=100    | 23487.8  | 31.6
100k              | BO:   r=0.5,m=1,n=100    | 23487.8  | 31.6
100k              | Rand: r=1,m=1,n=1        | 23436.8  | 52.0
150k              | Best: r=0.5,m=1,n=100    | 23502.0  | 34.0
150k              | BO:   r=0.5,m=1,n=100    | 23502.0  | 34.0
150k              | Rand: r=1,m=1,n=1        | 23453.4  | 57.8
200k              | Best: r=0.5,m=0.5,n=100  | 23519.6  | 34.6
200k              | BO:   r=0.5,m=1,n=100    | 23513.4  | 36.0
200k              | Rand: r=1,m=1,n=1        | 23465.8  | 60.2
250k              | Best: r=0.5,m=1,n=10     | 23538.6  | 38.4
250k              | BO:   r=0.5,m=1,n=100    | 23515.2  | 36.4
250k              | Rand: r=1,m=1,n=1        | 23482.6  | 62.4

r, probability of random sampling; m, probability of applying mutation when sampling;
n, maximum number of allowed mutations in a test (seeding not used)
Best, best configuration for the given search budget; BO, best configuration, on average,
over all the search budgets; Rand, random approach
52. RQ1: How does the search algorithm compare with
random and state-of-the-art approaches?
• Random approach
• Always sample and mutate; do not reuse archived items
• Previous approach (ICST’15)
• Stops test input generation when all attributes have been
mutated at least once by each applicable mutation operator
• Search-based algorithm
• Best overall configuration
• Best configuration for a given budget
53–55. RQ1: How does the search algorithm compare with random and state-of-the-art approaches?
[Same table as slide 51, with one additional baseline row: ICST’15 — Coverage 23283.0, # of Tests 43.0]
• The search algorithm achieves better coverage than both the random and the ICST’15 approaches, and it also generates significantly smaller test suites.
• With higher search budgets, search can achieve greater coverage, at the cost of a larger test suite.
• APT (the ICST’15 approach) achieved an average coverage of 23283 instructions, less than both search and random.
56–57. RQ2: How does fitness based on code coverage affect performance?

Budget | Code | Seeding | Configuration            | Coverage | # of Tests | # of Mut.
50k    | F    | 0.0     | Best: r=0.5,m=1,n=100    | 23361.4  | 17.0       | 4.8
50k    | T    | 0.0     | Best: r=0.5,m=1,n=100    | 23424.4  | 28.4       | 3.6
50k    | F    | 0.5     | Best: r=0.5,m=1,n=10     | 23417.2  | 21.0       | 4.0
50k    | T    | 0.5     | Best: r=0.5,m=1,n=10     | 23428.4  | 34.2       | 3.2
50k    | T    | 0.5     | BO:   r=0.3,m=0,n=10     | 23401.8  | 27.0       | 4.3
100k   | F    | 0.0     | Best: r=0.3,m=1,n=10     | 23404.4  | 16.8       | 8.2
100k   | T    | 0.0     | Best: r=0.5,m=1,n=100    | 23487.8  | 31.6       | 4.9
100k   | F    | 0.5     | Best: r=0.5,m=1,n=10     | 23442.2  | 21.0       | 6.4
100k   | T    | 0.5     | Best: r=0.3,m=0,n=10     | 23487.0  | 33.2       | 5.6
100k   | T    | 0.5     | BO:   r=0.3,m=0,n=10     | 23487.0  | 33.2       | 5.6
150k   | F    | 0.0     | Best: r=0.8,m=1,n=100    | 23418.4  | 28.2       | 4.0
150k   | T    | 0.0     | Best: r=0.5,m=1,n=100    | 23502.0  | 34.0       | 6.0
150k   | F    | 0.5     | Best: r=0.5,m=1,n=100    | 23447.4  | 23.4       | 7.5
150k   | T    | 0.5     | Best: r=0.3,m=0,n=10     | 23528.2  | 35.6       | 6.5
150k   | T    | 0.5     | BO:   r=0.3,m=0,n=10     | 23528.2  | 35.6       | 6.5
200k   | F    | 0.0     | Best: r=0.8,m=1,n=100    | 23426.0  | 28.0       | 4.7
200k   | T    | 0.0     | Best: r=0.5,m=0.5,n=100  | 23519.6  | 34.6       | 6.7
200k   | F    | 0.5     | Best: r=0.5,m=1,n=100    | 23456.0  | 23.2       | 9.2
200k   | T    | 0.5     | Best: r=0.3,m=0,n=10     | 23551.0  | 37.2       | 7.0
200k   | T    | 0.5     | BO:   r=0.3,m=0,n=10     | 23551.0  | 37.2       | 7.0
250k   | F    | 0.0     | Best: r=0.8,m=1,n=100    | 23433.2  | 28.6       | 5.4
250k   | T    | 0.0     | Best: r=0.5,m=1,n=10     | 23538.6  | 38.4       | 7.1
250k   | F    | 0.5     | Best: r=0.5,m=1,n=100    | 23461.8  | 23.6       | 10.3
250k   | T    | 0.5     | Best: r=0.3,m=0,n=10     | 23554.4  | 37.2       | 7.4
250k   | T    | 0.5     | BO:   r=0.3,m=0,n=10     | 23554.4  | 37.2       | 7.4

Code, code-coverage fitness enabled (T) or disabled (F); Seeding, seeding probability
r, probability of random sampling; m, probability of applying mutation when sampling;
n, maximum number of allowed mutations in a test
Best, best configuration for the given search budget; BO, best configuration, on average,
over all the search budgets
58. RQ2: How does fitness based on code coverage affect performance?
• For each search budget, we identified the best configuration with and without the code coverage objective enabled
• The code coverage objective results in test suites with higher code coverage, at the expense of a larger test suite (50% more test cases)
59. RQ3: How does seeding affect performance?
• For each search budget, we identified the best configuration with and without seeding
• Seeding is always part of the configurations that achieve the highest code coverage or the lowest number of test cases (for search budgets above 150k)
60. RQ4: What are the configuration parameters that
affect performance?
61–62. RQ3: How does smart seeding affect performance?
[Same table as slides 56–57, comparing configurations with seeding (0.5) and without (0.0)]
• For search budgets greater than 150k, smart seeding achieves the highest coverage or the lowest number of test cases.
63–74. RQ4: What are the configuration parameters that affect performance?
[Workflow diagram repeated, annotated with the explored parameter values: p sampling = 0.3, 0.5, 0.8; Max mutations = 1, 10, 100; p seeding = 0, 0.5; p mutation = 0, 0.5, 1; Coverage-fitness: on, off; Stop after: 50k, 100k, 150k, 200k, 250k]
• Coverage fitness is applied in the top configurations, never by the worst ones.
• For small search budgets, search achieves better results when more focused on exploitation (using archived inputs).
• For larger search budgets, with no seeding or coverage, putting more emphasis on exploration (new samples) pays off.
• If either seeding or coverage fitness is used, the need to explore the search landscape decreases.
• If both seeding and coverage fitness are used, the need to explore the search landscape decreases further.
• The average number of mutations per test input remains low (~10).
75. RQ5: What configuration should be used in practice?
[Workflow diagram repeated with the same parameter values as above]
76. RQ5: What configuration should be used in practice?
• Small probability of sampling new test data at random (p = 0.3)
• Do not mutate new inputs immediately when sampled
• Limit the maximum number of mutations (max mutations = 10)
• Seeding and code coverage fitness are used
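Collected in one place, the recommendation could look like the hypothetical settings class below; the class and constant names are illustrative, while the values are the ones reported on this slide and in the experiment tables.

```java
// Hypothetical holder for the configuration recommended in practice (RQ5).
public final class RecommendedConfig {
    public static final double  P_SAMPLING       = 0.3;  // small probability of sampling new test data at random
    public static final double  P_MUTATION       = 0.0;  // do not mutate new inputs immediately when sampled
    public static final int     MAX_MUTATIONS    = 10;   // limit the maximum number of mutations per test input
    public static final double  P_SEEDING        = 0.5;  // seeding enabled (the value used in the experiments)
    public static final boolean COVERAGE_FITNESS = true; // code coverage objective enabled

    private RecommendedConfig() { }
}
```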