Paper: Localizing Failure-Inducing Program Edits Based on Spectrum Information.
Authors: Lingming Zhang, Miryung Kim, Sarfraz Khurshid.
Session: Research Track Session 1: Faults and Regression Testing
1. Localizing Failure-Inducing Program Edits Based on Spectrum Information
Lingming Zhang, Miryung Kim, Sarfraz Khurshid
The University of Texas at Austin
ICSM2011, September 27th 2011
2. Overview
Change impact analysis is effective at finding
suspicious edits but lacks precise ranking.
Spectrum-based fault localization is effective at
ranking but does not scale well.
Our insight: combine change impact analysis and
spectrum-based fault localization.
• Identify suspicious edits using extended call graphs.
• Rank suspicious edits using dynamic program
spectrum information.
L. Zhang: Localizing failure-inducing program edits based on spectrum information 2
3. Summary of our results
FaultTracer localizes failure-inducing edits with
high precision:
• Identifying suspicious edits: outperforms
Chianti by 19.37%.
• Ranking all suspicious edits: ranks real
regression faults within top 3 edits for 14 of
the 22 studied real-world failures.
• Ranking method-level suspicious edits:
outperforms existing heuristic by 56.25%.
5. Example
Program P:
    public class A {
        public static int f1=0;
        public static int f2=0;
        ...
    }
    class B {
        int f1=0; int f2=0; int f3=0;
        public int foo(){return f1;}
        ...
    }
    class C extends B{
        ...
    }
Program P':
    public class A {
        public static int f1=1;
        public static int f2=1;
        ...
    }
    class B {
        int f1=0; int f2=1; int f3=1;
        int f4=1;
        public int foo()
        { if(f1>=0) return f1;
          else return f4;
        }
        ...
    }
    class C extends B{
        public int f1=3;
        public void bar(int f) {f3=f+f1;}
        ...
    }
Regression test suite T:
    public void test1()
    { A.bar(1); }
    public void test2() { ... }
    public void test3() { ... }
    public void test4() {
        C c = new C();
        int f = c.foo();
    }
    public void test5() { ... }
Slide annotations: "evolve" (P to P'), "Test", "Re-Test", "Bug!".
6. FaultTracer overview
Diagram (inputs: P, P', T):
① Detecting changes and dependences (P, P' → ∆)
② Selecting tests based on Extended Call Graph analysis (T → T')
③ Identifying suspicious edits based on Extended Call Graph analysis (→ δt)
④ Rank suspicious edits based on program spectrum information (→ δt')
7. Extended Call Graph representation
    public void test1() { A.bar(1); }
    public void test4() {
        C c = new C();
        int f = c.foo();
    }
Figure: call graphs for test1 and test4, side by side.
• Traditional Call Graph (used by Chianti): method nodes A.Clinit(), A.bar(), C.C(), C.foo(), B.B(); edge label <C,C.foo()>.
• Extended Call Graph (used by FaultTracer): the same method nodes plus field nodes A.f2 and B.f1; edge labels <C,C.foo()>, <SFW,A.f2>, <FR,C.f1>.
8. Step 1. Detecting atomic changes and dependences
Atomic Change Types
Change type   Description
CM            Change method
AM            Add method
DM            Delete method
AF            Add field
DF            Delete field
CFI           Change instance field
CSFI          Change static field
LCm           Method look-up change
LCf           Field look-up change
Figure: Change dependences inference rules
9. Step 2. Test selection based on Extended Call Graph (ECG) analysis
FaultTracer directly matches all changes with test ECGs
before edits to select the influenced tests.
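The matching step can be pictured as a set intersection. The following is only a minimal sketch, not FaultTracer's implementation: it assumes each test's ECG (before edits) is flattened to a set of element names and each change is reduced to the name of the element it touches. The class name, method names, the test name testOther, and the example element strings are all hypothetical.

```java
import java.util.*;

// Hypothetical sketch: select tests whose pre-edit ECG touches a changed element.
class TestSelectionSketch {

    // A test is "influenced" if its old ECG shares at least one element
    // with the set of changed program elements (assumed representation).
    static Set<String> selectInfluencedTests(Map<String, Set<String>> oldEcgs,
                                             Set<String> changedElements) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : oldEcgs.entrySet()) {
            if (!Collections.disjoint(entry.getValue(), changedElements)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    // Illustrative ECG contents only; not taken from the tool's output.
    static Map<String, Set<String>> exampleEcgs() {
        Map<String, Set<String>> ecgs = new HashMap<>();
        ecgs.put("test1", new HashSet<>(Arrays.asList("A.bar()", "A.f2")));
        ecgs.put("test4", new HashSet<>(Arrays.asList("C.C()", "B.B()", "B.foo()", "B.f1")));
        ecgs.put("testOther", new HashSet<>(Arrays.asList("X.unrelated()")));
        return ecgs;
    }

    public static void main(String[] args) {
        Set<String> changed = new HashSet<>(Arrays.asList("A.f2", "B.foo()"));
        System.out.println(selectInfluencedTests(exampleEcgs(), changed));
    }
}
```

Because field nodes are part of the ECG, a change to a field such as A.f2 can select a test in this sketch even when no method body on its call path changed.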
10. Step 3. Suspicious edit identification based on Extended Call Graph analysis
FaultTracer directly selects the non-look-up changes
that appear on test ECGs after edits as suspicious edits.
FaultTracer selects method or field edits that have caused
look-up changes on test ECGs as suspicious edits.
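The two selection rules can be sketched as a single pass over the atomic changes. This is an illustration under assumptions, not the tool's code: it assumes a test's ECG after edits is a set of element names, that a look-up change (LCm or LCf) is keyed by the ECG element it affects, and that each look-up change records the edits that caused it. The Change class, the element strings, and the "<C,f1>" key format are hypothetical.

```java
import java.util.*;

// Hypothetical sketch of the two rules for picking suspicious edits for one test.
class SuspiciousEditSketch {

    // Assumed shape of an atomic change; causes is used only for look-up changes.
    static final class Change {
        final String kind;
        final String element;
        final List<String> causes;

        Change(String kind, String element, String... causes) {
            this.kind = kind;
            this.element = element;
            this.causes = Arrays.asList(causes);
        }

        boolean isLookup() {
            return kind.equals("LCm") || kind.equals("LCf");
        }

        String label() {
            return kind + "(" + element + ")";
        }
    }

    static Set<String> suspiciousEdits(Set<String> ecgAfterEdits, List<Change> changes) {
        Set<String> result = new LinkedHashSet<>();
        for (Change c : changes) {
            if (!ecgAfterEdits.contains(c.element)) {
                continue; // the change does not appear on this test's ECG
            }
            if (c.isLookup()) {
                result.addAll(c.causes); // rule 2: blame the edits behind the look-up change
            } else {
                result.add(c.label());   // rule 1: non-look-up change on the ECG
            }
        }
        return result;
    }

    // Illustrative data loosely modeled on the running example.
    static Set<String> demo() {
        Set<String> ecg = new HashSet<>(Arrays.asList("B.foo()", "<C,f1>", "C.bar(int)"));
        List<Change> changes = Arrays.asList(
            new Change("CM", "B.foo()"),
            new Change("AM", "C.bar(int)"),
            new Change("LCf", "<C,f1>", "AF(C.f1)"),
            new Change("CSFI", "A.f1"));
        return suspiciousEdits(ecg, changes);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the CSFI change is dropped because its element is not on the test's ECG, while the added field is reported through the look-up change it caused.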
11. Step 4. Spectrum-based fault localization for program edits
Correlation between suspicious edits and tests
Edits        test2  test3  test4  test5
CSFI(A.f1)
CM(B.foo)
AF(C.f1)
AM(C.bar)
out          Pass   Pass   Pass   Fail
Suspiciousness score computation
Edits        Tarantula  SBI   Jaccard  Ochiai  Tie Break
CSFI(A.f1)   0.00       0.00  0.00     0.00    -
CM(B.foo)    0.75       0.50  0.50     0.71    1
AF(C.f1)     0.75       0.50  0.50     0.71    0
AM(C.bar)    1.00       1.00  1.00     1.00    -
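The four scores can be reproduced with the commonly used definitions of these metrics, sketched below. Two things are assumptions: the formulas are the standard textbook forms rather than anything quoted from the slides, and the coverage counts are inferred, since the per-test marks of the correlation table are not preserved here. With one failing and three passing tests, the counts used are: CSFI(A.f1) covered by one passing test only, CM(B.foo) and AF(C.f1) each by the failing test and one passing test, and AM(C.bar) by the failing test only. The class and parameter names are hypothetical.

```java
// Hypothetical sketch: standard spectrum-based suspiciousness formulas.
// ef / ep = failing / passing tests that cover the edit;
// totalFail / totalPass = failing / passing tests overall.
class SpectrumScores {

    private static double ratio(int part, int whole) {
        return whole == 0 ? 0.0 : (double) part / whole;
    }

    static double tarantula(int ef, int ep, int totalFail, int totalPass) {
        double failRatio = ratio(ef, totalFail);
        double passRatio = ratio(ep, totalPass);
        double sum = failRatio + passRatio;
        return sum == 0.0 ? 0.0 : failRatio / sum;
    }

    static double sbi(int ef, int ep) {
        return ratio(ef, ef + ep);
    }

    static double jaccard(int ef, int ep, int totalFail) {
        return ratio(ef, totalFail + ep);
    }

    static double ochiai(int ef, int ep, int totalFail) {
        double denom = Math.sqrt((double) totalFail * (ef + ep));
        return denom == 0.0 ? 0.0 : ef / denom;
    }

    public static void main(String[] args) {
        int totalFail = 1, totalPass = 3;            // test5 fails; test2..test4 pass
        String[] edits = {"CSFI(A.f1)", "CM(B.foo)", "AF(C.f1)", "AM(C.bar)"};
        int[][] counts = {{0, 1}, {1, 1}, {1, 1}, {1, 0}}; // assumed {ef, ep} per edit
        for (int i = 0; i < edits.length; i++) {
            int ef = counts[i][0], ep = counts[i][1];
            System.out.println(String.format("%-11s %.2f %.2f %.2f %.2f", edits[i],
                tarantula(ef, ep, totalFail, totalPass), sbi(ef, ep),
                jaccard(ef, ep, totalFail), ochiai(ef, ep, totalFail)));
        }
    }
}
```

Under these assumed counts, CM(B.foo) and AF(C.f1) receive identical scores on all four metrics, which is the situation the Tie Break column addresses.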
13. Research Questions
RQ1: How does FaultTracer compare to Chianti in
identifying suspicious edits?
RQ2: How effective is FaultTracer in ranking
suspicious edits?
14. Subjects: overview
Subjects from Software-artifact Infrastructure
Repository (SIR).
Project        Version   Program Size (KLoC)   Number of Tests
Jtopas         0.0-3.0   1.83 ~ 5.36           95-209
Xml-Security   0.0-3.0   17.44 ~ 18.99         84-106
JMeter         0.0-5.0   31.01 ~ 41.05         70-97
Ant            0.0-8.0   17.20 ~ 80.44         112-878
15. Subjects: change statistics
Figure: Number of changes for each version pair. Bar chart with one bar per version pair (Jtopas0.0-1.0 through Jtopas2.0-3.0, XmlSec0.0-1.0 through XmlSec2.0-3.0, JMeter0.0-1.0 through JMeter4.0-5.0, Ant0.0-1.0 through Ant7.0-8.0), broken down by change type (AM, DM, CM, AF, DF, CFI, CSFI, LCm, LCf); axis range 0 to 7000.
16. RQ1: How does FaultTracer compare to Chianti in identifying suspicious edits?
FaultTracer achieves 19.37% improvement in the
precision of identifying suspicious edits.
Figure: bar chart comparing Chianti and FaultTracer for each version pair (XmlSec0.0-1.0 through XmlSec2.0-3.0, Ant0.0-1.0 through Ant7.0-8.0, Jtopas0.0-1.0 through Jtopas2.0-3.0, JMeter0.0-1.0 through JMeter4.0-5.0); axis range 0 to 160.
17. RQ2: How effective is FaultTracer in ranking suspicious edits?
Ranks all types of edits:
• Average performance.
                            Tarantula  SBI    Jaccard  Ochiai  Suspicious edit num.  Edit number
Average                     8.50       8.50   10.83    14.66   68.83                 3932
Percentage to edit number   0.22%      0.22%  0.28%    0.37%   1.75%                 --
• Example (Ant5.0-6.0)
Test: ant.taskdefs.optional.EchoPropertiesTest.testEchoToBadFile
Tarantula  SBI  Jaccard  Ochiai  Suspicious edit num.  Edit number
1          1    1        10      182                   5019
18. RQ2: How effective is FaultTracer in ranking suspicious edits?
Ranks method edits (FaultTracer vs. Heuristic)
• Achieves 56.25% improvement in the precision of
localizing method-level failure-inducing edits
19. Limitations
Does not currently filter out refactorings (e.g., by using
RefFinder [Prete+2010]).
Uses only four spectrum-based fault localization
techniques.
The experimental evaluation is limited by the small
number of real regression faults.
20. Related work
Change-impact analysis
• Chianti [Ren+2004]
• Crisp [Chesley+2005]
• Heuristic ranking [Ren+2007]
Fault localization
• Spectrum-based
  • E.g., Tarantula [Jones+2002], SBI [Liblit+2005], Jaccard
    [Abreu+2007], Ochiai [Abreu+2007].
• Delta debugging [Zeller1999]
• Model-based
  • E.g., Bayesian diagnosis [Kleer+1987]
21. Conclusion
FaultTracer combines change impact analysis with
dynamic spectra.
FaultTracer improves change impact analysis based on
extended call graph analysis.
Experimental evaluation shows FaultTracer:
• Performs 19.37% better than Chianti in determining
affecting changes.
• Localizes failure-inducing edits within top 3 edits for
14 of the 22 regression failures.
• Performs 56.25% better than previous heuristic for
localizing failure-inducing program edits.
zhanglm10@gmail.com