TAROT 2013 9th International Summer School on Training And Research On Testing, Volterra, Italy, 9-13 July, 2013
These slides summarize Leonardo Mariani's presentation about "Automated Failure Analysis in Absence of Specification"
1. Automated Failure Analysis in Absence of Specification
Leonardo Mariani, University of Milano Bicocca, mariani@disco.unimib.it
2. Analysis of Software Behaviors
Analysis of Software Failures (semi-)automatically, when no specification is available
3. Automated Debugging
Fault localization: search for the fault location (e.g., search for faulty program statements in the program source code)
Failure analysis (aka anomaly detection): search for failure causes (e.g., search for erroneous events in the execution space)
5. Fault Localization (output obtained with Tarantula)
public class ChipsListener implements ServletContextListener {
    public ChipsListener() {
    }
    public void contextInitialized(ServletContextEvent evt) {
        ServletContext context = evt.getServletContext();
        JspApplicationContext jspContext = JspFactory.getDefaultFactory().getJspApplicationContext(context);
        jspContext.addELResolver(new ChipsELResolver());
    }
• even if related to the bug, the bug is not there!
• this is not the best ranked piece of code
• why is this fragment of code relevant?
[Jones et al. Visualization of Test Information to Assist Fault Localization. ICSE, 2002.]
6. Failure Analysis (output obtained with BCT)
Tomcat failed because
• javax.servlet.jsp.JspFactory.getDefaultFactory() returned null
• then org.apache.tomcat.util.modeler.Registry.unregisterComponent(javax.management.ObjectName) invoked org.apache.catalina.session.ManagerBase.postDeregister()
• and org.apache.tomcat.util.modeler.Registry.unregisterComponent(javax.management.ObjectName) invoked org.apache.catalina.loader.WebappLoader.postDeregister()
[Mariani, Pastore, Pezzè. Dynamic Analysis for Diagnosing Integration Faults. TSE, 2011.]
7. How to identify the events responsible for a failure when no spec is available?
13. Specification Mining: generate a model that represents the actual behavior from samples
Samples: X=-2, X=-9, X=0, X=5, X=7, …
Actual behavior: -10<X<10
Mined behavior: -10<X<10
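To make the idea concrete, here is a minimal sketch, assuming the model is a single closed interval over one variable. The names `mine_range` and `accepts` are illustrative and do not come from the slides. With only the five samples listed, the mined interval is narrower than the actual behavior -10<X<10; more samples would be needed to approach it.

```python
def mine_range(samples):
    """Mine the tightest closed interval [low, high] that covers all samples."""
    return min(samples), max(samples)


def accepts(model, x):
    """Check whether a new observation is consistent with the mined interval."""
    low, high = model
    return low <= x <= high


samples = [-2, -9, 0, 5, 7]          # the samples shown on the slide
model = mine_range(samples)
print(model)                          # mined bounds
print(accepts(model, 3), accepts(model, 9))
```

The value 9 is legal according to the actual behavior but is rejected by this mined model, which previews the imprecision discussed next.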
14. Model generation is imprecise…
Actual behavior: -10<X<10
Over-Generalization: mined specification X > -100
Over-Restriction: mined specification -5 < X < 5
Over-Generalization and Over-Restriction: mined specification -100 < X < 5
15. Specification Mining: Models Used as Specifications
Real Specification: -10<X<10
Mined Specification: -100<X<5
X = 100: correctly rejected behavior
X = 1: correctly accepted behavior
X = 7: erroneously rejected behavior
X = -50: erroneously accepted behavior
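The four cases above can be reproduced with a small sketch. The two predicates encode the real and mined specifications from the slide; the function names are illustrative assumptions.

```python
def real_spec(x):
    """Real specification from the slide: -10 < X < 10."""
    return -10 < x < 10


def mined_spec(x):
    """Mined specification from the slide: -100 < X < 5."""
    return -100 < x < 5


def classify(x):
    """Compare the verdict of the mined model with the real specification."""
    accepted = mined_spec(x)
    legal = real_spec(x)
    if accepted and legal:
        return "correctly accepted"
    if not accepted and not legal:
        return "correctly rejected"
    if accepted:
        return "erroneously accepted"   # over-generalization
    return "erroneously rejected"       # over-restriction


for value in (100, 1, 7, -50):
    print(value, classify(value))
```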
16. Models
• Full Ordering of Events [Figure: automaton with states 1-5 and transitions labeled a-f]
• Data Values: x > 0
• Ordering of Events + Data Values
• Partial Ordering of Events: open => close
18. Mining of Finite State Models
• Trace-based mining
– State-based merging
– Behavior-based merging
• State-based mining
[Figure: states (Total = 0, Elem = 0), (Total = 3, Elem = 1), (Total = 5, Elem = 2), (Total = 0, Elem = 0) and events onLoad, add, add, empty]
19. kTail (state-based merging)
TRACES, PTA, FSA
[Figure: example traces over the events a, b, c, with the corresponding PTA and FSA]
[Biermann and Feldman. On the synthesis of finite state machines from samples of their behavior. IEEE ToC, 1972.]
20. Build the PTA
TRACES, PTA
[Figure: example traces over the events a, b, c and the PTA built from them]
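The two steps (build the PTA, then merge states) can be sketched as follows. This is a simplified variant written for illustration, not the exact algorithm of Biermann and Feldman: states are merged in one pass when they have the same set of outgoing event sequences of length at most k, accepting states are not tracked, and the resulting automaton may be nondeterministic. The example traces are hypothetical, since the traces on the slide cannot be recovered from the transcript.

```python
from collections import defaultdict


def build_pta(traces):
    """Build a prefix tree acceptor: transitions[state][event] -> next state."""
    transitions = {0: {}}                      # state 0 is the root
    for trace in traces:
        state = 0
        for event in trace:
            if event not in transitions[state]:
                new_state = len(transitions)
                transitions[state][event] = new_state
                transitions[new_state] = {}
            state = transitions[state][event]
    return transitions


def k_future(transitions, state, k):
    """All event sequences of length <= k that can be executed from `state`."""
    futures = {()}
    if k > 0:
        for event, target in transitions[state].items():
            for tail in k_future(transitions, target, k - 1):
                futures.add((event,) + tail)
    return frozenset(futures)


def ktail(traces, k=2):
    """Merge PTA states that share the same k-future (simplified kTail)."""
    pta = build_pta(traces)
    block_ids = {}                             # k-future -> merged state id
    block_of = {}                              # PTA state -> merged state id
    for state in pta:
        future = k_future(pta, state, k)
        block_of[state] = block_ids.setdefault(future, len(block_ids))
    fsa = defaultdict(set)                     # (state, event) -> target states
    for state, outgoing in pta.items():
        for event, target in outgoing.items():
            fsa[(block_of[state], event)].add(block_of[target])
    return dict(fsa), block_of


traces = [("a", "b", "c"), ("a", "a", "b", "c")]   # hypothetical traces
fsa, block_of = ktail(traces, k=2)
print(len(build_pta(traces)), "PTA states ->", len(set(block_of.values())), "FSA states")
```

A smaller k makes more futures coincide, so more states are merged and the model generalizes more.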
34. kBehavior (behavior-based merging)
Traces: login home checkMsg logout login home checkMsg timeout login home checkMsg watchVideo home checkMsg logout
K = min length of matched behavior
36. kBehavior (behavior-based merging)
Traces: login home checkMsg logout login home checkMsg timeout login home checkMsg watchVideo home checkMsg logout login home checkMsg read home checkMsg logout reply
38. The Parameter K
• K determines the degree of generalization
• Empirically, behavior-based merging generates models that are more general than state-based merging [Lo et al., JSS, 2012]
[Figure labels: State-based merging, behavior-based merging]
39. State-Based Inference of FSM Models
Concrete states: (Total = 0, Elem = 0), (Total = 3, Elem = 1), (Total = 5, Elem = 2), (Total = 0, Elem = 0); events: onLoad, add, add, empty
Abstraction function: <0, ==0, >0
Abstract states: <init>, (Total == 0, Elem == 0), (Total > 0, Elem > 0); events: onLoad, add, add, empty
[Dallmeier, Lindig, Wasylkowski, Zeller: Mining Object Behavior with ADABU. WODA 2006]
[Marchetto, Tonella, Ricca: State-Based Testing of Ajax Web Applications. ICST 2008]
[Mariani, Marchetto, Nguyen, Tonella. Revolution: Automatic evolution of mined specifications. ISSRE. 2012]
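A minimal sketch of state-based inference with a sign abstraction follows. It assumes each event is observed together with the concrete state reached after it; this pairing of events and states is a reading of the slide figure, and the helper names are illustrative.

```python
def sign(value):
    """Abstraction of a numeric value into <0, ==0, >0."""
    if value < 0:
        return "<0"
    if value == 0:
        return "==0"
    return ">0"


def abstract(state):
    """Map a concrete state (dict of variables) to an abstract state."""
    return tuple((name, sign(value)) for name, value in state.items())


def mine_fsm(observations):
    """Collect transitions (source, event, target) between abstract states."""
    transitions = set()
    current = "<init>"
    for event, state in observations:
        target = abstract(state)
        transitions.add((current, event, target))
        current = target
    return transitions


# Events and the concrete states reached after them (assumed pairing).
observations = [
    ("onLoad", {"Total": 0, "Elem": 0}),
    ("add", {"Total": 3, "Elem": 1}),
    ("add", {"Total": 5, "Elem": 2}),
    ("empty", {"Total": 0, "Elem": 0}),
]
for transition in sorted(mine_fsm(observations), key=str):
    print(transition)
```

The two concrete states with positive values collapse into one abstract state, so the second add becomes a self-loop.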
40. The Abstraction Function
• Quality of the final model influenced by
– Completeness of the state information that is traced (numElements and numDistinctElements VS numElements)
– The kind of abstraction implemented by the abstraction function (<0, =0, >0 VS <-1, =-1, =0, =1, >1)
44. The Set of Template Expressions
• Expressiveness depends on the template expressions
• More template expressions => more candidate expressions => higher computational cost
• An approach to deal with polynomial and array expressions has recently been defined [Nguyen et al. ICSE 2012]
Example templates: _ + _ = _, _ < _, _ = _, _ > 0, 1 > _
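The cost argument can be seen in a small sketch of template-based constraint mining. This is only an illustration in the spirit of tools such as Daikon, not their implementation: every template is instantiated with every ordered choice of variables, and a candidate is kept only if it holds on all samples. The samples and the name `mine_constraints` are hypothetical.

```python
from itertools import permutations

# Template text -> (number of holes, check function); templates from the slide.
TEMPLATES = {
    "{0} + {1} = {2}": (3, lambda a, b, c: a + b == c),
    "{0} < {1}": (2, lambda a, b: a < b),
    "{0} = {1}": (2, lambda a, b: a == b),
    "{0} > 0": (1, lambda a: a > 0),
    "1 > {0}": (1, lambda a: 1 > a),
}


def mine_constraints(samples):
    """Return every instantiated template that holds on all samples."""
    variables = sorted(samples[0])
    mined = []
    for text, (arity, check) in TEMPLATES.items():
        # Each ordered choice of variables is one candidate expression.
        for names in permutations(variables, arity):
            if all(check(*(sample[n] for n in names)) for sample in samples):
                mined.append(text.format(*names))
    return mined


samples = [{"x": 1, "y": 2, "z": 3}, {"x": 2, "y": 5, "z": 7}]  # hypothetical
print(mine_constraints(samples))
```

With three variables, the ternary template alone yields six candidates, which shows how quickly the candidate set grows with more templates or variables.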
46. Mine Temporal Rules
Traces + Template Rules (<pre> → <post>) + CONFIDENCE AND SUPPORT THRESHOLDS → Temporal Rules
Traces:
start open close stop
start load stop
start open close stop
begin end
…
[Lo, Khoo, Liu. Mining temporal rules for software maintenance. JSME, 2008]
[Yang, Evans, Bhardwaj, Bhat, Das. Perracotta: mining temporal API Rules from Imperfect Traces. ICSE. 2006]
47. Mine Temporal Rules
Traces + Template Rules (<pre> → <post>) + CONFIDENCE AND SUPPORT THRESHOLDS → Temporal Rules
Traces:
start open close stop
start load stop
start open close stop
begin end
…
CONFIDENCE OF A RULE = (# traces rule holds) / (# traces pre applies)
start → open has 67% confidence
48. Mine Temporal Rules
Traces + Template Rules (<pre> → <post>) + CONFIDENCE AND SUPPORT THRESHOLDS → Temporal Rules
Thresholds: Conf = 100%, Supp = 20%
Temporal rules shown on the slide (event pairs): start, stop; close, open; …
Traces:
start open close stop
start load stop
start open close stop
begin end
SUPPORT OF A RULE = (# traces rule holds) / (# traces)
start → open has 50% support
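The confidence and support figures for start → open can be checked with a short sketch. It assumes a rule <pre> → <post> holds in a trace when <pre> occurs and every occurrence of it is eventually followed by <post>; this reading of the template, and the function names, are assumptions.

```python
def applies(trace, pre):
    """The premise applies if the event `pre` occurs in the trace."""
    return pre in trace


def holds(trace, pre, post):
    """True if `pre` occurs and its last occurrence is followed by `post`."""
    if pre not in trace:
        return False
    last_pre = len(trace) - 1 - trace[::-1].index(pre)
    return post in trace[last_pre + 1:]


def confidence(traces, pre, post):
    """# traces where the rule holds / # traces where pre applies."""
    applicable = [t for t in traces if applies(t, pre)]
    return sum(holds(t, pre, post) for t in applicable) / len(applicable)


def support(traces, pre, post):
    """# traces where the rule holds / # traces."""
    return sum(holds(t, pre, post) for t in traces) / len(traces)


traces = [
    ["start", "open", "close", "stop"],
    ["start", "load", "stop"],
    ["start", "open", "close", "stop"],
    ["begin", "end"],
]
print(f"confidence = {confidence(traces, 'start', 'open'):.0%}")
print(f"support = {support(traces, 'start', 'open'):.0%}")
```

On the same traces, start → stop reaches 100% confidence and 75% support, so it would pass the thresholds Conf = 100% and Supp = 20%, while start → open would not.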
49. Template Rules
• Expressiveness depends on the template rules
• Confidence and Support for tuning the technique wrt imperfect traces
50. Steering FSA Models with Temporal Rules
kTail with k=2
Overgeneralization problem:
- locally, a state merge looks like a good decision
- globally, it generates anomalous behaviors
51. Idea: mine global properties, exploit them when taking decisions locally
Traces → Mine Temporal Rules (GLOBAL PROPERTIES; events in the example: openFile, closeFile, closeConn, connDB)
Traces → Build PTA → Apply kTail (e.g. with k=2) BUT prevent state merges that violate temporal rules (LOCAL DECISIONS)
[Lo, Mariani, Pezzè. Automatic Steering of Behavioral Model Inference. ESEC/FSE 2009]
60. Empirical Studies - complexity -
Length of traces/Noise/Number of different events in the traces
• Mining simple FSA
• Mining extended FSA
• Mining temporal rules
• Mining constraints
[Lo, Mariani, Santoro, Learning extended FSA from Software: An Empirical Assessment. JSS, 2012]
[Yang, Evans, Bhardwaj, Bhat, Das. Perracotta: mining temporal API Rules from Imperfect Traces. ICSE. 2006]
[Nguyen, Marchetto, Tonella. Automated Oracles: An Empirical Study on Cost and Effectiveness, ESEC/FSE, 2013]
61. Empirical Studies - sensitivity -
[Figure: scale from "Capture small differences" to "Capture major differences": Mining simple FSA, Mining extended FSA, Mining temporal rules, Mining constraints]
FSAs are good at analytically capturing the behavior of small units (e.g., components)
Temporal rules and constraints are good at capturing some behaviors in relatively big applications
62. Quality of Models vs Number of Traces
[Figure labels: Component/API/Method, Traces, Ideal Model]
- Transition coverage is enough for mining good FSAs [Lo, JSS, 2012]
- Additional tests can be generated to improve models [Dallmeier, TSE, 2012]
63. Quality of Models vs Number of Traces
[Figure labels: Application, Traces, Ideal Model]
- Good FSAs are hard to mine
- Other models: several traces are necessary for particularly complex cases [Nguyen, ESEC/FSE, 2013]
64. Take Home About Specification Mining
• Think about your research area
– If you need models and specifications…
– …and you do not have any,
– but you have a way of executing your software
– Specification Mining could be an option!
66. [Figure labels: Application, Traces, Model (automaton with states 1-5 and transitions labeled a-f), Failure, Trace, Analysis]
67. Failure Analysis Based on Specification Mining
• Analysis of (Field and Regression) Failures
– BCT [Mariani et al. Dynamic Analysis for Diagnosing Integration Faults. TSE, 2011.]
• Analysis of Regression Failures
– Radar [Pastore et al. Dynamic Analysis of Upgrades in C/C++ Software. ISSRE, 2012.]
• Produce Descriptive Reports
– AVA [Babenko et al. AVA: automated interpretation of dynamically detected anomalies. ISSTA, 2009.]
71. Distilling Behavioural Models
I/O Data → Daikon → I/O Model (e.g., x != null)
Interaction Data → kBehavior → Interaction Model (e.g., over method1, method2, method3, method4)
Result: I/O and Interaction Models
72. Run-Time Verification and Failure Analysis
[Figure: System Failure!! preceded by three "unexpected interaction!" and two "unexpected value!" anomalies]
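As a rough sketch of run-time verification against mined models, the monitor below reports the two kinds of anomalies named on the slide. It is a strong simplification and not BCT's implementation: the I/O model is a single predicate per method, the interaction model is just a set of allowed callees per caller rather than an automaton, and all method names and events are hypothetical.

```python
# I/O model: method -> (property text, predicate on the return value).
io_model = {"getValue": ("returnValue != null", lambda value: value is not None)}

# Interaction model, simplified to the set of callees each caller may invoke.
interaction_model = {"initialize": {"getValue"}}


def monitor(trace):
    """Check observed events against the models and collect anomalies."""
    anomalies = []
    for kind, *data in trace:
        if kind == "exit":
            method, value = data
            if method in io_model:
                text, predicate = io_model[method]
                if not predicate(value):
                    anomalies.append(f"unexpected value: {text} violated on exit from {method}")
        elif kind == "call":
            caller, callee = data
            if caller in interaction_model and callee not in interaction_model[caller]:
                anomalies.append(f"unexpected interaction: {caller} called {callee}")
    return anomalies


trace = [
    ("call", "initialize", "getValue"),   # allowed by the interaction model
    ("exit", "getValue", None),           # violates returnValue != null
    ("call", "initialize", "undo"),       # never observed during mining
]
for anomaly in monitor(trace):
    print(anomaly)
```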
73. Filtering
• Re-execute tests and remove anomalies detected in both passing and failing tests
Regression testing
• Country==US is violated by passing regression tests because the new version of the application is available outside the US
• Violations of this property can be ignored
Field failures
• date==20/3/2013 is a spurious property violated by passing regression tests
• Violations of this property can be ignored
Remaining anomalies are re-arranged according to likely cause-effect relations
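The filtering step amounts to discarding every anomaly that also shows up in passing executions. A minimal sketch, with anomalies represented as plain strings and hypothetical example data:

```python
def filter_anomalies(failing, passing):
    """Keep only anomalies of failing runs that never occur in passing runs."""
    ignored = set(passing)
    return [anomaly for anomaly in failing if anomaly not in ignored]


# Hypothetical anomalies; the two properties come from the slide.
failing = ["Country==US", "returnValue != null", "unexpected call to undo"]
passing = ["Country==US", "date==20/3/2013"]
print(filter_anomalies(failing, passing))
```

Violations of Country==US are dropped because passing tests trigger them too, so they cannot explain the failure.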
74. Run-Time Verification and Failure Analysis: Relating Anomalies
Calls: start, initialize, getValue
• "I initialize the next component"
• "I need a proper value for initialization"
• "I do not know the value!!! I return null"
• "null is not a proper value! I return an exception"
• "we have an exception!!! Let's try to terminate safely"
75. Run-Time Verification and Failure Analysis: Relating Failures
Calls: start, initialize, getValue, undo, log, closeConnection
• "log the event and close the connection"
76. Run-Time Verification and Failure Analysis: Relating Failures
Calls: start, initialize, getValue, undo, log, closeConnection
One anomaly is the cause of many others!
Anomalies: return null value, throw exception, call undo, early close the connection
80. Output Obtained with BCT for the Tomcat Failure
ON EXIT from javax.servlet.jsp.JspFactory.getDefaultFactory()
MODEL VIOLATED returnValue != null = false
FROM org.apache.tomcat.util.modeler.Registry.unregisterComponent(javax.management.ObjectName)
UNEXPECTED CALL TO org.apache.catalina.session.ManagerBase.postDeregister()
FROM org.apache.tomcat.util.modeler.Registry.unregisterComponent(javax.management.ObjectName)
UNEXPECTED CALL TO org.apache.catalina.loader.WebappLoader.postDeregister()
84. Stopping Criterion
[Figure: plot of cohesion(graph); edges with weights greater than the chosen value are removed]
cohesion(graph) = avg(cohesion(CCs))
cohesion(CC) = avg(weight edges)
smaller value == better cohesion
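The two cohesion formulas can be sketched as follows, reading CC as connected component. The sketch only computes cohesion(graph) for a given threshold; how the threshold itself is selected is not recoverable from the transcript. Two assumptions are made: components without edges are skipped, and the example graph is hypothetical.

```python
def cohesion(edges, threshold):
    """Average, over connected components, of the average edge weight.

    Edges with weight greater than `threshold` are removed first.
    Returns None if no edge survives.
    """
    kept = [(u, v, w) for u, v, w in edges if w <= threshold]
    parent = {}

    def find(node):
        """Union-find lookup with path halving."""
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for u, v, _ in kept:
        parent[find(u)] = find(v)              # merge the two components

    weights = {}                               # component root -> edge weights
    for u, _, w in kept:
        weights.setdefault(find(u), []).append(w)
    if not weights:
        return None
    per_component = [sum(ws) / len(ws) for ws in weights.values()]
    return sum(per_component) / len(per_component)


edges = [("a", "b", 1), ("b", "c", 3), ("c", "d", 9), ("d", "e", 2)]
print(cohesion(edges, threshold=5))    # the heavy edge c-d is removed
print(cohesion(edges, threshold=10))   # all edges kept, one component
```

Removing the heavy edge splits the graph into two tighter components, which lowers (improves) the cohesion value.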
85. Resulting Graph
…
• the components are inspected from the biggest to the smallest
• the first two graphs are enough to explain the problem!
86. Improvements
• Make the analysis specific to the type of considered faults
– Radar: failure analysis of regression problems
• Produce outputs that better explain the reason of the failure
– AVA: automatic analysis of anomalies
87. Radar in a Nutshell
[Figure labels: V1, V2, TEST SUITE, TRACE, chainItems.size > 0, availableQty, and source line numbers]
Failed because initItems() has not been invoked and chainItem.size = 0
92. Anomaly Detection
File.open File.write File.close sortFile File.delete
File.open File.write … sortFile File.delete
Could the path of the file be the problem?
Could the sorting be the problem?
Could the content of the file be the problem?
Maybe the file has not been closed!!!
…
102. Expected: File.open File.write File.close sortFile File.delete
Observed: File.open File.write - sortFile File.delete
"The application failed because File.close has not been executed" is BETTER THAN "sortFile is anomalous"
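One simple way to obtain the more descriptive message is to align the expected and the observed event sequences and report the differences. The sketch below uses Python's standard `difflib.SequenceMatcher` for the alignment; it illustrates the idea only and is not the interpretation technique implemented by AVA.

```python
from difflib import SequenceMatcher


def explain(expected, observed):
    """Describe how the observed sequence deviates from the expected one."""
    messages = []
    matcher = SequenceMatcher(None, expected, observed)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":       # expected events that are missing
            for event in expected[i1:i2]:
                messages.append(f"{event} has not been executed")
        elif tag == "insert":     # extra events that were not expected
            for event in observed[j1:j2]:
                messages.append(f"{event} has been executed unexpectedly")
        elif tag == "replace":    # one block executed instead of another
            messages.append(
                f"{' '.join(observed[j1:j2])} has been executed instead of "
                f"{' '.join(expected[i1:i2])}"
            )
    return messages


expected = ["File.open", "File.write", "File.close", "sortFile", "File.delete"]
observed = ["File.open", "File.write", "sortFile", "File.delete"]
print(explain(expected, observed))
```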
113. ICSE DOCTORAL SYMPOSIUM
S.C. Cheung and L. Mariani
Submission deadline: Nov 22, 2013
Notification: Feb 17, 2014
Camera Ready: Mar 14, 2014
Event Date: Jun 3, 2014