Accurate and Efficient Refactoring
Detection in Commit History
May 31, 2018
Danny Dig, Nikolaos Tsantalis, Matin Mansouri, Laleh Eshkevari, Davood Mazinanian
1
Refactoring is noise in evolution analysis
• Bug-inducing analysis (SZZ): flag refactoring edits as bug-introducing
changes
• Tracing requirements to code: miss traceability links due to
refactoring
• Regression testing: unnecessary execution of tests for refactored
code with no behavioral changes
• Code review/merging: refactoring edits tangled with the actual
changes intended by developers
2
There are many refactoring detection tools
• Demeyer et al. [OOPSLA’00]
• UMLDiff + JDevAn [Xing & Stroulia ASE’05]
• RefactoringCrawler [Dig et al. ECOOP’06]
• Weißgerber and Diehl [ASE’06]
• Ref-Finder [Kim et al. ICSM’10, FSE’10]
• RefDiff [Silva & Valente, MSR’17]
3
Limitations of previous approaches
• Dependence on similarity thresholds
  • thresholds need calibration for projects with different characteristics
• Dependence on built versions
  • only 38% of the change history can be successfully compiled [Tufano et al., 2017]
• Unreliable oracles for evaluating precision/recall
  • Incomplete (refactorings found in release notes or commit messages)
  • Biased (applying a single tool with two different similarity thresholds)
  • Artificial (seeded refactorings)
4
Why do we need better accuracy?
5
[Figure: "Refactoring detection" linked to "Empirical studies", "Library adaptation", and "Framework migration", annotated "poor accuracy"]
6
Contributions
1. First refactoring detection algorithm operating without any
code similarity thresholds
2. RefactoringMiner open-source tool with an API
3. Oracle comprising 3,188 refactorings found in 538 commits
from 185 open-source projects
4. Evaluation of precision/recall and comparison with previous
state-of-the-art
5. Tool infrastructure for comparing multiple refactoring
detection tools
7
Approach in a nutshell
AST-based statement matching algorithm
• Input: code fragments T1 from parent commit and T2 from child commit
• Output:
  • M: set of matched statement pairs
  • UT1: set of unmatched statements from T1
  • UT2: set of unmatched statements from T2
• Code changes due to refactoring mechanics: abstraction, argumentization
• Code changes due to overlapping refactorings or bug fixes:
syntax-aware AST node replacements
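The inputs and outputs listed above can be illustrated with a minimal sketch. It is not RefactoringMiner's implementation: statements are plain strings rather than AST nodes, abstraction is reduced to comparing a returned expression with the right-hand side of an assignment, and argumentization and AST node replacements are both modeled as one ordered map of whole-word substitutions applied to T2 statements. The class and method names (`StatementMatchingSketch`, `match`, `substitute`, `sameStatement`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StatementMatchingSketch {

    /** Output of the matching: M, UT1 and UT2 as named on the slide. */
    public static final class Result {
        public final List<String[]> matched = new ArrayList<>();   // M: {T1 statement, T2 statement}
        public final List<String> unmatchedT1 = new ArrayList<>(); // UT1
        public final List<String> unmatchedT2 = new ArrayList<>(); // UT2
    }

    /** Applies whole-word substitutions in map order (assumed stand-in for argumentization and replacements). */
    static String substitute(String statement, Map<String, String> substitutions) {
        String result = statement;
        for (Map.Entry<String, String> e : substitutions.entrySet()) {
            result = result.replaceAll("\\b" + Pattern.quote(e.getKey()) + "\\b",
                                       Matcher.quoteReplacement(e.getValue()));
        }
        return result;
    }

    /** True if both statements are identical, or s2 returns the expression that s1 assigns (abstraction). */
    static boolean sameStatement(String s1, String s2) {
        String a = s1.trim().replaceAll("\\s+", " ");
        String b = s2.trim().replaceAll("\\s+", " ");
        if (a.equals(b)) {
            return true;
        }
        if (b.startsWith("return ") && b.endsWith(";")) {
            String returned = b.substring("return ".length(), b.length() - 1);
            return a.endsWith(" = " + returned + ";");
        }
        return false;
    }

    /** Greedy matching: each T1 statement takes the first still-unmatched T2 statement it agrees with. */
    public static Result match(List<String> t1, List<String> t2, Map<String, String> substitutions) {
        Result result = new Result();
        List<String> remainingT2 = new ArrayList<>(t2);
        for (String s1 : t1) {
            String partner = null;
            for (String s2 : remainingT2) {
                if (sameStatement(s1, substitute(s2, substitutions))) {
                    partner = s2;
                    break;
                }
            }
            if (partner == null) {
                result.unmatchedT1.add(s1);
            } else {
                result.matched.add(new String[] {s1, partner});
                remainingT2.remove(partner);
            }
        }
        result.unmatchedT2.addAll(remainingT2);
        return result;
    }

    public static void main(String[] args) {
        // T1: a method body from the parent commit; T2: a method body from the child commit.
        List<String> t1 = Arrays.asList(
                "try",
                "addresses[i] = new Address(\"127.0.0.1\", PORTS.incrementAndGet());",
                "return addresses;");
        List<String> t2 = Arrays.asList(
                "try",
                "return new Address(host, port);",
                "return null;");
        // Parameters become arguments first, then one identifier replacement is applied.
        Map<String, String> substitutions = new LinkedHashMap<>();
        substitutions.put("host", "\"127.0.0.1\"");
        substitutions.put("port", "ports.incrementAndGet()");
        substitutions.put("ports", "PORTS");

        Result r = match(t1, t2, substitutions);
        for (String[] pair : r.matched) {
            System.out.println("matched: " + pair[0] + "  <->  " + pair[1]);
        }
        System.out.println("UT1 = " + r.unmatchedT1);
        System.out.println("UT2 = " + r.unmatchedT2);
    }
}
```

The substitution map is ordered on purpose: a later entry may rewrite text produced by an earlier one, which is how the sketch chains a parameter-to-argument step with an identifier replacement.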
8
9
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
Before | After
10
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(int count) {
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
Before | After
11
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
Before | After
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", ports.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
12
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
}
return addresses;
}
try {
addresses[i] =
new Address("127.0.0.1", ports.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
Before | After
14
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address(host, port);
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
15
protected static Address createAddress(String host, int port) {
try {
return new Address(host, port);
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
Before | After
textual similarity ≈ 30%
16
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address(host, port);
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
(1) Abstraction
17
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address(host, port);
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
(2) Argumentization
19
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address("127.0.0.1", ports.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
(2) Argumentization
20
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address("127.0.0.1", ports.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
(3) AST Node Replacements
21
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
(3) AST Node Replacements
22
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
textual similarity = 100%
23
private static Address[] createAddresses(int count) {
Address[] addresses = new Address[count];
for (int i = 0; i < count; i++) {
try {
addresses[i] =
new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
}
return addresses;
}
private static List<Address> createAddresses(AtomicInteger ports, int count){
List<Address> addresses = new ArrayList<Address>(count);
for (int i = 0; i < count; i++) {
addresses.add(createAddress("127.0.0.1", ports.incrementAndGet()));
}
return addresses;
}
protected static Address createAddress(String host, int port) {
try {
return new Address("127.0.0.1", PORTS.incrementAndGet());
}
catch (UnknownHostException e) {
e.printStackTrace();
}
return null;
}
Before | After
[Figure labels: statements A to G (Before) and 1 to 9 (After)]
Extract Method detection rule
(M, UT1, UT2) = statement-matching(createAddresses, createAddress)
M = {(C, 4), (D, 5), (E, 6), (F, 7)}
UT1 = {A, B, G}
UT2 = {8}
• createAddress is a newly added method in the child commit
• createAddresses in the parent commit does not call createAddress
• createAddresses in the child commit calls createAddress
• |M| > |UT2|
⇒ createAddress has been extracted from createAddresses
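The rule above is a conjunction of four conditions, which can be written as a small predicate. This is a sketch under stated assumptions: the class name `ExtractMethodRuleSketch`, the method `isExtracted`, and its boolean parameters are hypothetical, and the facts about call sites and added methods are passed in directly rather than computed from source code.

```java
public class ExtractMethodRuleSketch {

    /**
     * Evaluates the four conditions of the Extract Method rule from the slide.
     *
     * @param addedInChild        the candidate method exists only in the child commit
     * @param parentCallsIt       the original method called the candidate in the parent commit
     * @param childCallsIt        the original method calls the candidate in the child commit
     * @param matchedCount        |M|, the number of matched statement pairs
     * @param unmatchedT2Count    |UT2|, the number of unmatched statements in the candidate
     */
    public static boolean isExtracted(boolean addedInChild,
                                      boolean parentCallsIt,
                                      boolean childCallsIt,
                                      int matchedCount,
                                      int unmatchedT2Count) {
        return addedInChild
                && !parentCallsIt
                && childCallsIt
                && matchedCount > unmatchedT2Count;
    }

    public static void main(String[] args) {
        // Values from the slide: M has 4 pairs, UT2 = {8} has 1 element.
        System.out.println(isExtracted(true, false, true, 4, 1)); // prints "true"
    }
}
```

The last condition compares counts only, so no similarity threshold has to be tuned: the candidate qualifies as soon as more of its statements are matched than left unmatched.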
24
Evaluation
RQ1: What is the accuracy of RefactoringMiner and how does it
compare to the state-of-the-art?
RQ2: What is the execution time of RefactoringMiner and how does
it compare to the state-of-the-art?
25
Oracle construction
• Public dataset with validated refactoring instances: 538 commits from
185 open-source projects [Silva et al., FSE’2016 Distinguished Artifact]
• We executed two tools:
RefactoringMiner and RefDiff [Silva & Valente, MSR’2017]
• We manually validated all 4,108 detected instances with 3 validators
for a period of 3 months
• 3,188 true positives and 920 false positives
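The validated counts above feed the usual precision and recall formulas. The sketch below is illustrative only: the class name `OracleMetricsSketch` is hypothetical, and the single figure it computes is the share of true positives among all validated instances from both tools pooled together, which is not a per-tool result reported in the slides. It also assumes that a tool's false negatives are the oracle instances it did not report.

```java
import java.util.Locale;

public class OracleMetricsSketch {

    /** Precision = TP / (TP + FP). */
    public static double precision(int truePositives, int falsePositives) {
        return (double) truePositives / (truePositives + falsePositives);
    }

    /** Recall = TP / (TP + FN). */
    public static double recall(int truePositives, int falseNegatives) {
        return (double) truePositives / (truePositives + falseNegatives);
    }

    public static void main(String[] args) {
        int validated = 4108;      // instances reported by the two tools (from the slide)
        int truePositives = 3188;  // confirmed by the validators (from the slide)
        int falsePositives = 920;  // rejected by the validators (from the slide)

        // Sanity check: the two validated groups add up to the total.
        System.out.println("counts consistent: " + (truePositives + falsePositives == validated));

        // Pooled share of true positives across both tools, not a per-tool precision.
        System.out.println(String.format(Locale.ROOT, "pooled precision = %.3f",
                precision(truePositives, falsePositives)));
    }
}
```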
26
Comparison with state-of-the-art
RefDiff [Silva & Valente, MSR’2017]
• Commit-based refactoring detection tool
• Evaluation on 448 seeded refactorings in 20 open-source projects
• RefDiff has much higher precision/recall than
Ref-Finder [Kim et al. 2010] and RefactoringCrawler [Dig et al. 2006]
• Ref-Finder and RefactoringCrawler need fully built Eclipse projects
as input
27
28
• RefactoringMiner (RMiner) has better precision in all refactoring types
• RefDiff has better recall in half of the types
• Overall, RMiner has +22% precision and +1.5% recall
Advantage of RefDiff
• Treats code fragments as bags of tokens and ignores the structure
29
- private void startScanner() throws Exception
+ public void startScanner() throws Exception
  {
- // check if scanning is enabled
- if (scanIntervalSeconds <= 0) return;
- if ( "manual".equalsIgnoreCase( reload ) )
+ if (!isScanningEnabled())
  return;
+ public boolean isScanningEnabled ()
+ {
+ if (scanIntervalSeconds <=0 || "manual".equalsIgnoreCase( reload ))
+ return false;
+ return true;
+ }
Disadvantage of RefDiff
• Inability to deal with changes in the tokens
30
- if (eventBus != null) {
- StatusChangeEvent event = new StatusChangeEvent(lastRemoteInstanceStatus,
- currentRemoteInstanceStatus);
- eventBus.publish(event);
- }
+ onRemoteStatusChanged(lastRemoteInstanceStatus, currentRemoteInstanceStatus);
+ protected void onRemoteStatusChanged(InstanceInfo.InstanceStatus oldStatus,
+ InstanceInfo.InstanceStatus newStatus) {
+ if (eventBus != null) {
+ StatusChangeEvent event = new StatusChangeEvent(oldStatus, newStatus);
+ eventBus.publish(event);
+ }
+ }
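The two RefDiff slides above turn on one property of bag-of-tokens comparison: token order does not matter, but token identity does. The sketch below shows that property with an unweighted multiset Jaccard similarity. This measure is a swapped-in stand-in chosen for brevity, so RefDiff's actual similarity function is not reproduced here, and the class name `BagOfTokensSketch` and its simple regex tokenizer are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BagOfTokensSketch {

    // String literals, identifiers, numbers, then any other single non-space character.
    private static final Pattern TOKEN =
            Pattern.compile("\"[^\"]*\"|[A-Za-z_][A-Za-z_0-9]*|\\d+|\\S");

    /** Counts how often each token occurs; structure and order are discarded. */
    static Map<String, Integer> bag(String code) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = TOKEN.matcher(code);
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    /** Multiset Jaccard: sum of minimum counts divided by sum of maximum counts. */
    public static double similarity(String code1, String code2) {
        Map<String, Integer> b1 = bag(code1);
        Map<String, Integer> b2 = bag(code2);
        Set<String> tokens = new HashSet<>(b1.keySet());
        tokens.addAll(b2.keySet());
        int shared = 0;
        int total = 0;
        for (String t : tokens) {
            int c1 = b1.getOrDefault(t, 0);
            int c2 = b2.getOrDefault(t, 0);
            shared += Math.min(c1, c2);
            total += Math.max(c1, c2);
        }
        return total == 0 ? 1.0 : (double) shared / total;
    }

    public static void main(String[] args) {
        // Restructured conditions keep most tokens, so the bags stay close.
        String before = "if (scanIntervalSeconds <= 0) return; "
                + "if (\"manual\".equalsIgnoreCase(reload)) return;";
        String after = "if (scanIntervalSeconds <= 0 || \"manual\".equalsIgnoreCase(reload)) "
                + "return false; return true;";
        System.out.println(String.format(Locale.ROOT, "restructured: %.2f", similarity(before, after)));

        // Renamed variables change the tokens themselves, so the bags drift apart.
        String original = "new StatusChangeEvent(lastRemoteInstanceStatus, currentRemoteInstanceStatus);";
        String renamed = "new StatusChangeEvent(oldStatus, newStatus);";
        System.out.println(String.format(Locale.ROOT, "renamed: %.2f", similarity(original, renamed)));
    }
}
```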
Execution time per commit [ms]
• On median, RefactoringMiner is 7 times faster than RefDiff
31
Limitations + Future work
• Missing context: Pull Up reported as Move, if a class between the
source and destination is unchanged.
• Nested refactorings: unable to detect Extract Method applied
within an extracted method
• Unsupported refactorings: refactoring types, such as Rename
Variable/Parameter/Field, Extract/Inline Variable can be supported
from the analysis of AST replacements.
• Oracle bias: plan to add more tools for constructing the oracle
(challenge: make tools work without binding information)
32
Conclusions
• RefactoringMiner: commit-based refactoring detection
  • No similarity thresholds
  • High accuracy: 98% precision, 87% recall
  • Ultra-fast: median of 58 ms per commit
  • Better than competing tools (RefDiff): +22% precision, 7 times faster
• Largest and least biased refactoring oracle to date
  • 3,188 true refactoring instances
  • 538 commits
  • 185 open-source projects
  • 3 validators over 3 months (9 person-months)
33
http://refactoring.encs.concordia.ca/oracle/
https://github.com/tsantalis/RefactoringMiner

 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 

Accurate and Efficient Refactoring Detection in Commit History

  • 1. Accurate and Efficient Refactoring Detection in Commit History • May 31, 2018 • Nikolaos Tsantalis, Matin Mansouri, Laleh Eshkevari, Davood Mazinanian, Danny Dig
  • 2. Refactoring is noise in evolution analysis • Bug-inducing analysis (SZZ): flag refactoring edits as bug-introducing changes • Tracing requirements to code: miss traceability links due to refactoring • Regression testing: unnecessary execution of tests for refactored code with no behavioral changes • Code review/merging: refactoring edits tangled with the actual changes intended by developers
  • 3. There are many refactoring detection tools • Demeyer et al. [OOPSLA’00] • UMLDiff + JDevAn [Xing & Stroulia ASE’05] • RefactoringCrawler [Dig et al. ECOOP’06] • Weißgerber and Diehl [ASE’06] • Ref-Finder [Kim et al. ICSM’10, FSE’10] • RefDiff [Silva & Valente, MSR’17]
  • 4. Limitations of previous approaches • Dependence on similarity thresholds: thresholds need calibration for projects with different characteristics • Dependence on built versions: only 38% of the change history can be successfully compiled [Tufano et al., 2017] • Unreliable oracles for evaluating precision/recall: incomplete (refactorings found in release notes or commit messages), biased (applying a single tool with two different similarity thresholds), or artificial (seeded refactorings)
  • 5. Why do we need better accuracy? [Diagram labels: Refactoring detection, Empirical studies, Library adaptation, Framework migration, poor accuracy]
  • 6. Why do we need better accuracy?
  • 7. Contributions 1. First refactoring detection algorithm operating without any code similarity thresholds 2. RefactoringMiner open-source tool with an API 3. Oracle comprising 3,188 refactorings found in 538 commits from 185 open-source projects 4. Evaluation of precision/recall and comparison with previous state-of-the-art 5. Tool infrastructure for comparing multiple refactoring detection tools
  • 8. Approach in a nutshell AST-based statement matching algorithm • Input: code fragments T1 from parent commit and T2 from child commit • Output: • M set of matched statement pairs • UT1 set of unmatched statements from T1 • UT2 set of unmatched statements from T2 • Code changes due to refactoring mechanics: abstraction, argumentization • Code changes due to overlapping refactorings or bug fixes: syntax-aware AST node replacements
  • 9. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; }
  • 10. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(int count) { List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; }
  • 11. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", ports.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; }
  • 12. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", ports.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; }
  • 13. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { } return addresses; } Detached block: try { addresses[i] = new Address("127.0.0.1", ports.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); }
  • 14. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address(host, port); } catch (UnknownHostException e) { e.printStackTrace(); } return null; }
  • 15. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address(host, port); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } textual similarity ≈ 30%
  • 16. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address(host, port); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } (1) Abstraction
  • 17. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address(host, port); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } (1) Abstraction
  • 18. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address(host, port); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } (2) Argumentization
  • 19. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address("127.0.0.1", ports.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } (2) Argumentization
  • 20. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address("127.0.0.1", ports.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } (3) AST Node Replacements
  • 21. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } (3) AST Node Replacements
  • 22. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } textual similarity = 100%
  • 23. Before: private static Address[] createAddresses(int count) { Address[] addresses = new Address[count]; for (int i = 0; i < count; i++) { try { addresses[i] = new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } } return addresses; } After: private static List<Address> createAddresses(AtomicInteger ports, int count){ List<Address> addresses = new ArrayList<Address>(count); for (int i = 0; i < count; i++) { addresses.add(createAddress("127.0.0.1", ports.incrementAndGet())); } return addresses; } protected static Address createAddress(String host, int port) { try { return new Address("127.0.0.1", PORTS.incrementAndGet()); } catch (UnknownHostException e) { e.printStackTrace(); } return null; } Statement labels: A to G in Before; 1, 2, 3, 9 in createAddresses and 4 to 8 in createAddress in After • M = {(C, 4), (D, 5), (E, 6), (F, 7)} • UT1 = {A, B, G} • UT2 = {8}
  • 24. Extract Method detection rule • (M, UT1, UT2) = statement-matching(createAddresses, createAddress), with M = {(C, 4), (D, 5), (E, 6), (F, 7)}, UT1 = {A, B, G}, UT2 = {8} • createAddress is a newly added method in child commit ∧ createAddresses in parent commit does not call createAddress ∧ createAddresses in child commit calls createAddress ∧ |M| > |UT2| ⇒ createAddress has been extracted from createAddresses
  • 25. Evaluation • RQ1: What is the accuracy of RefactoringMiner and how does it compare to the state-of-the-art? • RQ2: What is the execution time of RefactoringMiner and how does it compare to the state-of-the-art?
  • 26. Oracle construction • Public dataset with validated refactoring instances: 538 commits from 185 open-source projects [Silva et al., FSE’2016 Distinguished Artifact] • We executed two tools: RefactoringMiner and RefDiff [Silva & Valente, MSR’2017] • We manually validated all 4,108 detected instances with 3 validators over a period of 3 months • 3,188 true positives and 920 false positives
  • 27. Comparison with state-of-the-art: RefDiff [Silva & Valente, MSR’2017] • Commit-based refactoring detection tool • Evaluation on 448 seeded refactorings in 20 open-source projects • RefDiff has much higher precision/recall than Ref-Finder [Kim et al. 2010] and RefactoringCrawler [Dig et al. 2006] • Ref-Finder and RefactoringCrawler need fully built Eclipse projects as input
  • 28. • RMiner has better precision in all refactoring types • RefDiff has better recall in half of the types • Overall, RMiner has +22% precision and +1.5% recall
  • 29. Advantage of RefDiff • Treats code fragments as bags of tokens and ignores the structure
      - private void startScanner() throws Exception
      + public void startScanner() throws Exception
        {
      -   // check if scanning is enabled
      -   if (scanIntervalSeconds <= 0) return;
      -   if ( "manual".equalsIgnoreCase( reload ) ) return;
      +   if (!isScanningEnabled())
      +     return;
      + public boolean isScanningEnabled ()
      + {
      +   if (scanIntervalSeconds <=0 || "manual".equalsIgnoreCase( reload ))
      +     return false;
      +   return true;
      + }
  • 30. Disadvantage of RefDiff • Inability to deal with changes in the tokens
      - if (eventBus != null) {
      -   StatusChangeEvent event = new StatusChangeEvent(lastRemoteInstanceStatus,
      -       currentRemoteInstanceStatus);
      -   eventBus.publish(event);
      - }
      + onRemoteStatusChanged(lastRemoteInstanceStatus, currentRemoteInstanceStatus);
      + protected void onRemoteStatusChanged(InstanceInfo.InstanceStatus oldStatus, InstanceInfo.InstanceStatus newStatus) {
      +   if (eventBus != null) {
      +     StatusChangeEvent event = new StatusChangeEvent(oldStatus, newStatus);
      +     eventBus.publish(event);
      +   }
      + }
  • 31. Execution time per commit [ms] • On median, RefactoringMiner is 7 times faster than RefDiff
  • 32. Limitations + Future work • Missing context: Pull Up is reported as Move if a class between the source and destination is unchanged • Nested refactorings: unable to detect Extract Method applied within an extracted method • Unsupported refactorings: types such as Rename Variable/Parameter/Field and Extract/Inline Variable could be supported by analyzing AST replacements • Oracle bias: plan to add more tools for constructing the oracle (challenge: make tools work without binding information)
  • 33. Conclusions • RefactoringMiner: commit-based refactoring detection • No similarity thresholds • High accuracy: 98% precision, 87% recall • Ultra-fast: 58 ms on median per commit • Better than competitive tools (RefDiff): +22% precision, 7 times faster • Largest and least biased refactoring oracle to date • 3,188 true refactoring instances • 538 commits • 185 open-source projects • 3 validators over 3 months (9 person-months) • http://refactoring.encs.concordia.ca/oracle/ • https://github.com/tsantalis/RefactoringMiner
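
The input/output contract on slide 8 (fragments T1 and T2 in; M, UT1 and UT2 out) can be illustrated with a deliberately simplified sketch. It pairs statements only when their whitespace-normalized text is identical, which is an assumption made for illustration: the algorithm described in the slides additionally handles abstraction, argumentization and AST node replacements. The class and method names below are hypothetical and are not part of RefactoringMiner's API.

```java
import java.util.ArrayList;
import java.util.List;

public class StatementMatchingSketch {

    /** Output of the matching: M, UT1 and UT2 as named on slide 8. */
    static final class Result {
        final List<String[]> matched = new ArrayList<>();    // M: pairs {statement in T1, statement in T2}
        final List<String> unmatchedT1 = new ArrayList<>();  // UT1
        final List<String> unmatchedT2 = new ArrayList<>();  // UT2
    }

    /** Collapses whitespace so that formatting differences do not block a match. */
    static String normalize(String statement) {
        return statement.trim().replaceAll("\\s+", " ");
    }

    /** Pairs each T1 statement with the first still-unpaired T2 statement that has the same normalized text. */
    static Result match(List<String> t1, List<String> t2) {
        Result result = new Result();
        List<String> remaining = new ArrayList<>(t2);
        for (String s1 : t1) {
            int hit = -1;
            for (int i = 0; i < remaining.size(); i++) {
                if (normalize(remaining.get(i)).equals(normalize(s1))) {
                    hit = i;
                    break;
                }
            }
            if (hit >= 0) {
                result.matched.add(new String[] {s1, remaining.remove(hit)});
            } else {
                result.unmatchedT1.add(s1);
            }
        }
        result.unmatchedT2.addAll(remaining);
        return result;
    }

    public static void main(String[] args) {
        Result r = match(
                List.of("e.printStackTrace();", "return addresses;"),
                List.of("e.printStackTrace();", "return null;"));
        // prints: 1 matched, UT1=[return addresses;], UT2=[return null;]
        System.out.println(r.matched.size() + " matched, UT1=" + r.unmatchedT1 + ", UT2=" + r.unmatchedT2);
    }
}
```

Exact-text matching alone would leave most of the createAddresses example unmatched, which is why the slides apply abstraction, argumentization and replacements before comparing statements.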

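
The Extract Method rule on slide 24 is a conjunction of four conditions, which the following sketch encodes as a single predicate. It assumes the three call/novelty facts have already been computed elsewhere and are passed in as booleans, and it represents M as a map from parent statement labels to candidate statement labels. All names are hypothetical and chosen for illustration only.

```java
import java.util.Map;
import java.util.Set;

public class ExtractMethodRuleSketch {

    /**
     * Returns true when the slide-24 rule holds:
     * the candidate method is new in the child commit, the parent version of the
     * source method does not call it, the child version does, and |M| > |UT2|.
     */
    static boolean isExtracted(boolean candidateAddedInChild,
                               boolean parentVersionCallsCandidate,
                               boolean childVersionCallsCandidate,
                               Map<String, String> matched,   // M
                               Set<String> unmatchedT2) {     // UT2
        return candidateAddedInChild
                && !parentVersionCallsCandidate
                && childVersionCallsCandidate
                && matched.size() > unmatchedT2.size();
    }

    public static void main(String[] args) {
        // Values taken from the createAddresses / createAddress example in the slides.
        Map<String, String> m = Map.of("C", "4", "D", "5", "E", "6", "F", "7");
        Set<String> ut2 = Set.of("8");
        // prints: true
        System.out.println(isExtracted(true, false, true, m, ut2));
    }
}
```

With the example's values, |M| = 4 and |UT2| = 1, so the size condition holds without any similarity threshold being tuned.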
Editor's notes

  1. My name is Nikolaos Tsantalis, associate professor at Concordia University, and I will present our work on refactoring detection. The other co-authors are Matin (Master's student in my group), Laleh (postdoctoral fellow in my group), Davood (PhD student in my group), and Danny from Oregon State University.