This document describes a technique for detecting refactorings between two versions of a program using heuristic search. Refactorings are detected by generating intermediate program states through applying refactorings, and finding a path from the original to modified program that minimizes differences. Structural differences are used to identify likely refactorings. Candidate refactorings are evaluated and applied to generate new states, with the search terminating when the state matches the modified program. A supporting tool was developed and a case study found the technique could correctly detect an actual series of refactorings between program versions.
Detecting Occurrences of Refactoring with Heuristic Search
1. APSEC 2008 (2008.12.5)
Shinpei Hayashi, Yasuyuki Tsuda, Motoshi Saeki
Department of Computer Science, Tokyo Institute of Technology, Japan
2. Abstract
Detecting Occurrences of Refactoring
− Detecting where and what kinds of refactorings
were performed between two versions of a program
− From version archives such as CVS/Subversion
with Heuristic Search
− We use a graph search technique for detecting
impure refactorings
Results
− (Semi-)automated tools have been developed
− We conducted a simple case study
3. Background
It is important to reduce the cost of
understanding changes to a program
that is continually modified
[Figure: Program S uses Program L, a library from a FOSS project (FOSS: Free/Open Source Software). L is modified from ver. 123 to ver. 124, and the developer of S asks: "I have to follow the changes of L... What do the changes mean?"]
4. Aim
Categorization of modifications is useful
− Detecting refactorings between two versions of a
program
[Figure: Program S uses Program L, a library from a FOSS project (FOSS: Free/Open Source Software). L is modified from ver. 123 to ver. 124, and the changes are categorized for the developer of S: "The changes are: Extract Method + Move Method + α!"]
5. Detecting Refactorings
Related work
− with software metrics [Demeyer 2000]
− by checking pre/post-conditions of source
code properties [Weißgerber 2006]
Key issue
− detecting impure refactorings [Görg 2005]
• refactoring + refactoring
• refactoring + other modifications
6. Motivating Scenario
Example from Fowler’s book [Fowler 1999]
[Figure: n0 --Extract Method--> n1 --Move Method--> nm
n0: Customer { statement() }, Rental
n1: Customer { statement(), getCharge(Rental) }, Rental
nm: Customer { statement() }, Rental { getCharge() }]
7. Scenario: Loss of States
[Figure: only Rold (= n0) and Rnew (= n2) are committed to the version archive; the intermediate state n1 is lost.
n0 --Extract Method--> n1 --Move Method--> n2
n0: Customer { statement() }, Rental
n1: Customer { statement(), getCharge(Rental) }, Rental
n2: Customer { statement() }, Rental { getCharge() }]
8. Scenario: Diffs between revs
Hard to detect refactorings via mixed differences
− Considering intermediate states is required
[Figure: the version archive holds only Rold and Rnew. The mixed differences between them:
1. Some code fragments in Customer#statement are removed
2. A method invocation to Rental#getCharge is added to Customer#statement
3. Rental#getCharge is added]
9. Our Approach
Generating intermediate states by actually
applying refactorings to the program
Finding an appropriate path from Rold (= n0) to
Rnew (= nm) by a graph search technique
States (nodes): versions of the program
Transitions (edges): refactoring operations
[Figure: search graph with the initial state n0 (Rold) and the final state nm (Rnew).]
10. Procedure
1. Find likely refactorings
2. Evaluate the distance to nm
3. Apply the best one and generate new state
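[Figure: search graph from the initial state n0 (Rold) toward the final state nm (Rnew); candidate transitions from n0 are labeled with evaluation values 6, 4, and 8.]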
11. Procedure
1. Find likely refactorings
2. Evaluate the distance to nm
3. Apply the best one and generate new state
Terminate if the new state is almost equal to Rnew
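[Figure: search graph from the initial state n0 (Rold) toward the final state nm (Rnew) after a new state has been generated; edges are labeled with evaluation values 6, 3, 2, 8, and 3.]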
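The three steps and the termination check can be sketched as a small loop. This is only an illustration under strong assumptions, not the authors' tool: a program state is modeled as a set of element descriptions, a refactoring as a pair of removed and added elements, and the distance as the size of the symmetric difference. The names `Refactoring`, `apply_op`, `distance`, `detect`, and the distractor "Rename Method" entry are all hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

State = FrozenSet[str]  # toy program state: a set of element descriptions


class Refactoring(NamedTuple):
    name: str
    removes: FrozenSet[str]  # elements that must exist and are taken away
    adds: FrozenSet[str]     # elements introduced by the operation


def apply_op(state: State, op: Refactoring) -> State:
    """Generate the next state by applying one refactoring."""
    return (state - op.removes) | op.adds


def distance(state: State, goal: State) -> int:
    """Toy distance: number of elements present in only one of the states."""
    return len(state ^ goal)


def detect(start: State, goal: State, catalog: List[Refactoring],
           tolerance: int = 0, max_steps: int = 20) -> Tuple[List[str], State]:
    state, path = start, []
    for _ in range(max_steps):
        # Terminate if the current state (almost) equals the goal.
        if distance(state, goal) <= tolerance:
            break
        # 1. Find likely refactorings (here: those whose precondition holds).
        candidates = [op for op in catalog if op.removes <= state]
        if not candidates:
            break
        # 2. Evaluate the distance to the goal for each candidate.
        best = min(candidates, key=lambda op: distance(apply_op(state, op), goal))
        # 3. Apply the best one and generate a new state.
        state = apply_op(state, best)
        path.append(best.name)
    return path, state


# Toy encoding of the Customer/Rental scenario (element names are illustrative).
INLINE = "Customer#statement (charge code inline)"
n0 = frozenset({INLINE})
nm = frozenset({"Customer#statement", "Rental#getCharge()"})
catalog = [
    Refactoring("Rename Method", frozenset({INLINE}),
                frozenset({"Customer#report (charge code inline)"})),
    Refactoring("Extract Method", frozenset({INLINE}),
                frozenset({"Customer#statement", "Customer#getCharge(Rental)"})),
    Refactoring("Move Method", frozenset({"Customer#getCharge(Rental)"}),
                frozenset({"Rental#getCharge()"})),
]
path, final = detect(n0, nm, catalog)
```

In this toy run the loop picks Extract Method and then Move Method. The sketch is greedy and never backtracks; a search that keeps several open states would replace the single `state` variable with a priority queue.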
12. Efficient Search
1. How to find candidates of refactorings?
− They should be similar to the changes for nm
2. How to evaluate them?
− The best one should generate a new state closer to nm
We use structural differences between two states
[Figure: search graph from the initial state n0 (Rold) toward the final state nm (Rnew); edges are labeled with evaluation values 6, 3, 2, 8, and 3.]
13. Structural Differences
Calculated by comparing two ASTs
− 4 types: add, remove, change, move
[Figure: two versions of public class Customer shown side by side. Both declare String name; the int field is named id in one version and customerID in the other; int[] phoneNum appears in only one version.]
change(Customer, the name customerID, the name id)
add(FieldDeclaration int[] phoneNum, Customer)
…
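A minimal sketch of how the four difference types could be computed, assuming a drastically simplified "AST" that maps each class name to its member declarations. Real AST differencing is considerably more involved. The function name `structural_diff` and the tuple layouts are hypothetical, declarations are assumed to be unique across the program, and the argument order of `change` (old name, then new name) is an assumption about the slide's notation.

```python
from typing import Dict, List, Tuple

Decl = Tuple[str, str]           # (declared type, name), e.g. ("int", "id")
Program = Dict[str, List[Decl]]  # class name -> member declarations


def structural_diff(old: Program, new: Program) -> List[tuple]:
    """Return add / remove / change / move records between two toy programs."""
    diffs: List[tuple] = []
    old_at = {d: cls for cls, decls in old.items() for d in decls}
    new_at = {d: cls for cls, decls in new.items() for d in decls}

    # Same declaration under a different class: move.
    for d, cls in old_at.items():
        if d in new_at and new_at[d] != cls:
            diffs.append(("move", d, cls, new_at[d]))

    gone = [d for d in old_at if d not in new_at]
    fresh = [d for d in new_at if d not in old_at]

    # A removed and an added declaration of the same type in the same class
    # are paired and reported as a change of name (a heuristic assumption).
    for d in list(gone):
        match = next((e for e in fresh
                      if e[0] == d[0] and new_at[e] == old_at[d]), None)
        if match is not None:
            diffs.append(("change", old_at[d], d[1], match[1]))
            gone.remove(d)
            fresh.remove(match)

    diffs += [("remove", d, old_at[d]) for d in gone]
    diffs += [("add", d, new_at[d]) for d in fresh]
    return diffs


# Which version is the older one is assumed here for illustration.
old = {"Customer": [("int", "customerID"), ("String", "name")]}
new = {"Customer": [("int", "id"), ("String", "name"), ("int[]", "phoneNum")]}
diffs = structural_diff(old, new)
```

Under these assumptions `diffs` holds one `change` record for the renamed field and one `add` record for `phoneNum`, mirroring the two differences listed on the slide.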
14. Find New Refactorings
By matching between diffs. (D) and modifications
representing a refactoring operation (R)
− if a subset of D matches a subset of R, the refactoring
operation is expanded.
− Matching likelihood: (# of matched modifications) / (# of R)
The differences (D): nm
remove(Block, Customer#statement)
add(MethodInvocation, Rental#statement)
change(Rental, the name “n1”, the name “n2”)
…
Extract Method (R):
remove(Block, ClassA#method1)
add(MethodDeclaration, ClassA)
Matching likelihood: 0.5 (1/2)
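The likelihood computation can be sketched as follows. This is a simplified illustration: a modification is a tuple of (kind, node type, location), and pattern locations such as ClassA#method1 are treated as wildcards rather than consistently bound variables, which is an assumption made to keep the sketch short. The function name `matching_likelihood` is hypothetical.

```python
from typing import List, Tuple

Modification = Tuple[str, str, str]  # (kind, AST node type, location)


def matching_likelihood(diffs: List[Modification],
                        pattern: List[Modification]) -> float:
    """(# of matched modifications) / (# of modifications in the pattern R)."""
    unused = list(diffs)  # each observed difference may be matched only once
    matched = 0
    for kind, node_type, _placeholder in pattern:
        hit = next((d for d in unused
                    if d[0] == kind and d[1] == node_type), None)
        if hit is not None:
            unused.remove(hit)
            matched += 1
    return matched / len(pattern)


# The first two differences (D) from the slide; further entries are elided there.
D = [
    ("remove", "Block", "Customer#statement"),
    ("add", "MethodInvocation", "Rental#statement"),
]
# The Extract Method pattern (R) from the slide.
extract_method = [
    ("remove", "Block", "ClassA#method1"),
    ("add", "MethodDeclaration", "ClassA"),
]
likelihood = matching_likelihood(D, extract_method)
```

Only the `remove(Block, ...)` entry of R finds a counterpart in D, so one of two pattern entries matches, in line with the 0.5 shown on the slide.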
15. Evaluation
Using f(n, o) = g(n) + h(n) / α(o)
− g(n): # of applied refactorings for obtaining n
− h(n): the size of differences between n and nm
− α(o): likelihood of o
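Example: g(n2) = 2, h(n2) = 3, α(o6) = 1/2
f(n2, o6) = 2 + 3 / (1/2) = 8
[Figure: search graph from the initial state n0 (Pold) to the goal state nm (Pnew), with intermediate states n1 and n2 and operations o1 to o6; o6 is a candidate at n2.]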
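The evaluation function and its use for prioritizing candidates can be sketched as below. Only the formula and the o6 figures come from the slide. The function names `evaluate` and `rank`, the candidates oX and oY, and the reading that smaller values are preferred (as in A*-style search) are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def evaluate(g: int, h: int, alpha: float) -> float:
    """f(n, o) = g(n) + h(n) / alpha(o)."""
    if alpha <= 0:
        raise ValueError("likelihood must be positive")
    return g + h / alpha


def rank(g: int, h: int, likelihoods: Dict[str, float]) -> List[Tuple[float, str]]:
    """Order candidate operations at one state, smallest f first (assumed best)."""
    return sorted((evaluate(g, h, a), name) for name, a in likelihoods.items())


# Values from the slide: g(n2) = 2, h(n2) = 3, alpha(o6) = 1/2.
f_n2_o6 = evaluate(2, 3, 0.5)

# oX and oY are hypothetical candidates added to show the ordering.
ranking = rank(2, 3, {"o6": 0.5, "oX": 1.0, "oY": 0.25})
```

Dividing h(n) by the likelihood penalizes poorly matching candidates: with the same g and h, halving α doubles the distance term.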
16. Supporting Tools
Expanding and evaluating candidates of
refactorings with REUSAR (implemented)
− Input: Two versions of Java source code
− Output: Candidates of refactorings with priority
− Calculating differences with XMLdiff (an existing tool)
Applying refactorings with Eclipse
− Checking pre/post-conditions
− Modifying source code
17. Case Study
Applying our technique to an
existing version archive, REUSAR
(supporting tool in this study)
Refactorings performed between R1796 and R1799:
1. Rename Method
2. Remove Parameter
R1796: DistanceCalculator#calculateDistance(List<Diff>)
R1799: DistanceCalculator#calcDistance()
18. Case Study: Result
[Figure: search graph from n0 (R1796) via n1 to n2 (R1799). Candidate edges are labeled with evaluation values: Rename Method (4), Remove Parameter (4, on two edges), Extract Method (7 and 8).]
Our technique is effective
− It can correctly detect impure refactorings which were
actually performed
− Prioritizing the candidates reduces the number of
refactoring operations that have to be applied
19. Conclusion
Summary
− Detecting refactorings by a graph search,
based on heuristics with structural
differences between two programs
− Our technique can detect impure
refactorings
Future Work
− Tool integration
− Evaluation: larger-scale case study