ICPC08b.ppt
1.
2. Motivation
□ >50% of maintenance time spent trying to
understand the program
□ Where are the features,
reqs, etc. in the code?
□ What is this code for?
□ Why is it hard to
understand and change
the program?
Marc Eaddy ICPC 2008 2
6. What is a “concern”?
Anything that affects the implementation of a program
□ Feature, requirement, design pattern,
coding idiom, etc.
□ Raison d'être for code
□ Every line of code exists to satisfy some concern
9. Concern location problem
Concern–code relationship hard to obtain
[Figure: mapping between concerns and program elements]
□ Concern–code relationship undocumented
□ Reverse engineer the relationship
□ (but, which one?)
10. Software pruning
□ Remove code that supports certain features,
reqs, etc.
□ Reduce program’s footprint
□ Support different platforms
□ Simplify program
11. Prune dependency rule [ACOM’07]
□ Code is prune dependent on concern if
□ Pruning the concern requires removing or
altering the code
□ Must alter code that depends on removed
code
□ Prevent compile errors
□ Eliminate “dead code”
□ Easy to determine/approximate
12. Automated concern location
Concern–code relationship predicted by an “expert”
□ Experts mine clues in code, docs, etc.
□ Existing techniques use 1 or 2 experts only
□ Our solution: Cerberus
1. Information retrieval
2. Execution tracing
3. Prune dependency analysis
13. IR-based concern location
□ i.e., Google for code
□ Program entities are documents
□ Requirements are queries
[Figure: the requirement “Array.join” used as a query against source code elements such as Id_join, join, and js_join()]
14. Vector space model [Salton]
□ Parse code and reqs doc to extract term vectors
□ NativeArray.js_join() method → “native,” “array,” “join”
□ “Array.join” requirement → “array,” “join”
□ Our contributions
□ Expand abbreviations
□ numconns → number, connections, numberconnections
□ Index fields
□ Weigh terms (tf · idf)
□ Term frequency (tf)
□ Inverse document frequency (idf)
□ Similarity = cosine of the angle between document and
query vectors
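The vector space steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Cerberus implementation: the entity names other than NativeArray.js_join are hypothetical, documents are built from identifier names alone, the idf formula is one common smoothed variant, and abbreviation expansion and field indexing are left out.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split a code identifier into lowercase terms (camelCase, '_' and '.' boundaries)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def tfidf_vectors(docs):
    """docs maps a name to its list of terms; returns (name -> {term: weight}, idf)."""
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    # Assumed weighting: log(N / df) + 1, so terms shared by all documents keep some weight.
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = {
        name: {term: tf * idf[term] for term, tf in Counter(terms).items()}
        for name, terms in docs.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, entity_names):
    """Rank program entities (documents) against a requirement (query)."""
    docs = {name: split_identifier(name) for name in entity_names}
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the code carry no information and are dropped.
    q_counts = Counter(t for t in split_identifier(query) if t in idf)
    q_vec = {term: tf * idf[term] for term, tf in q_counts.items()}
    scores = [(name, cosine(q_vec, vec)) for name, vec in vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical entity list; only NativeArray.js_join appears in the slides.
ENTITIES = ["NativeArray.js_join", "NativeArray.js_sort", "NativeString.js_concat"]

if __name__ == "__main__":
    for name, score in rank("Array.join", ENTITIES):
        print(f"{score:.3f}  {name}")
```

With this toy data the method that shares both “array” and “join” with the requirement ranks first, and an entity sharing no terms scores zero. A fuller version would also index comments and documentation fields.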
16. Tracing-based concern location
□ Observe elements activated when concern is
exercised
□ Unit tests for each concern
□ e.g., find elements uniquely activated by a concern
Unit test for “Array.join”:
var a = new Array(1, 2);
if (a.join(',') == "1,2")
{
    print("Test passed");
}
else {
    print("Test failed");
}
Call graph: js_construct, js_join
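The "uniquely activated" heuristic can be sketched with plain set operations. This is a minimal illustration, assuming the traces have already been collected as one set of activated elements per concern; the “Array.sort” concern and the js_sort element are hypothetical and serve only to give the comparison a second trace.

```python
def uniquely_activated(traces):
    """Map each concern to the elements activated by its tests and by no other concern.

    traces: dict mapping a concern name to the set of elements its unit tests activated.
    """
    result = {}
    for concern, elements in traces.items():
        # Union of everything activated by the other concerns' tests.
        others = set().union(
            *(other for name, other in traces.items() if name != concern)
        )
        result[concern] = elements - others
    return result


# Hypothetical trace data; js_construct and js_join come from the call graph on the slide.
TRACES = {
    "Array.join": {"js_construct", "js_join"},
    "Array.sort": {"js_construct", "js_sort"},
}

if __name__ == "__main__":
    for concern, elements in uniquely_activated(TRACES).items():
        print(concern, sorted(elements))
```

Here js_construct is activated by both tests, so it is attributed to neither concern, while js_join is attributed to “Array.join” alone.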
18. Prune dependency analysis
□ Infer relevant elements based on structural
relationship to relevant element e (seed)
□ Assumes we already have some seeds
□ Prune dependency analysis
□ Determines prune dependency rule using
program analysis
□ Find references to e
□ Find superclasses and subclasses of e
19. PDA example
Source code:
interface A {
    public void foo();
}
public class B implements A {
    public void foo() { ... }
    public void bar() { ... }
}
public class C {
    public static void main() {
        B b = new B();
        b.bar();
    }
}
[Figure: program dependency graph with nodes A, B, C, main, bar, foo, foo and edges labeled inherits, refs, contains, calls]
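The two PDA steps named earlier (find references to e, find superclasses and subclasses of e) can be sketched as a one-step expansion over a small dependency graph. This is an assumption-laden illustration rather than the author's analysis: the edge sets below are read off the A/B/C source code by hand, the qualified element names such as "C.main" are a naming choice made here, and the function name prune_dependents is hypothetical.

```python
# (referrer, referenced) pairs taken from the example source code:
# C.main instantiates B and calls B.bar.
REFERENCES = {
    ("C.main", "B"),
    ("C.main", "B.bar"),
}

# (subtype, supertype) pairs: B implements A.
INHERITS = {
    ("B", "A"),
}


def prune_dependents(seed):
    """Return elements related to the seed by a reference or inheritance edge.

    One expansion step only: elements that reference the seed, plus the
    seed's direct supertypes and subtypes.
    """
    referrers = {src for src, dst in REFERENCES if dst == seed}
    supertypes = {sup for sub, sup in INHERITS if sub == seed}
    subtypes = {sub for sub, sup in INHERITS if sup == seed}
    return referrers | supertypes | subtypes


if __name__ == "__main__":
    for seed in ("B", "B.bar", "A"):
        print(seed, "->", sorted(prune_dependents(seed)))
```

Starting from the seed B.bar, the sketch reports C.main, which would have to be altered if bar were pruned in order to avoid a compile error.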
28. Future work
□ Improve PDA
□ Reimplement using Soot and Polyglot
□ Generalize using prune dependency predicates
□ Improve precision using points-to analysis
□ Improve accuracy using
□ Dominator heuristic
□ Variable liveness analysis
□ Improve accuracy of Cerberus
□ Combine experts using matrix linear regression
29. Cerberus contributions
□ Effectively combined 3
concern location techniques
□ PDA boosts performance of
other techniques
[Figure: PDA example source code and program dependency graph, repeated from slide 19]
30. Questions?
Marc Eaddy
Columbia University
eaddy@cs.columbia.edu