Dipenta msr2011-renaming
1. An Exploratory Study of Identifier Renamings
Laleh Eshkevari, Venera Arnaoudova, Max Di Penta, Rocco Oliveto, Yann-Gaël Guéhéneuc, Giulio Antoniol
SOCCER Lab. & Ptidej Team, École Polytechnique de Montréal, Canada
University of Sannio, Italy
University of Molise, Italy
MSR’2011, May 22, 2011
2. Outline
Introduction
Study goal
Term renaming taxonomy
Approach
Empirical study
Discussion
Conclusions and future directions
3. Introduction
Identifiers: names of classes/interfaces, attributes, methods, formal parameters.
◮ Identifiers are important because they
  ◮ reflect developer’s perception of the problem domain model
  ◮ capture developer’s understanding
  ◮ convey ideas
  ◮ are a way to communicate with other developers
◮ Often composed of terms, where a term is
  ◮ word
  ◮ abbreviation
  ◮ acronym
  ◮ jargon
4. Exploratory study on Identifier Renaming
Understanding Identifier Renaming
Identifiers evolve as the source code evolves.
◮ Existence and frequency
◮ Who does renaming
◮ Study different types of renaming
5. Term renaming taxonomy (1/2)
D1: Entity
◮ class, interface, field, method, constructor, formal parameter, local variable
D2: Semantic
◮ Add a meaning
◮ Remove a meaning
◮ Keep the meaning
  ◮ Same meaning (synonym, abbreviation/expansion, typo)
  ◮ More general/specific meaning (hypernym/hyponym)
◮ Change the meaning
  ◮ Opposite meaning (antonym)
  ◮ Whole/part relation (holonym/meronym)
  ◮ Unrelated meaning
6. Term renaming taxonomy (2/2)
D3: String distance
We used the normalized Levenshtein distance.
◮ Low distance (≤ 0.40)
◮ High distance (otherwise)
D4: Grammar
◮ noun
◮ verb
◮ adverb
◮ adjective
◮ none
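The D3 classification can be sketched in a few lines of Python. The slides do not say how the Levenshtein distance is normalized, so this sketch assumes division by the sum of the two term lengths; that choice is a guess, made because it keeps the deck's own low-distance examples (e.g., statement → stmt, 5/13 ≈ 0.38) at or under the 0.40 threshold. The function names are illustrative, not taken from the study's tooling.

```python
LOW_DISTANCE_THRESHOLD = 0.40  # threshold stated on the slide


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """ASSUMPTION: normalize by len(a) + len(b); the slides do not specify this."""
    total = len(a) + len(b)
    return 0.0 if total == 0 else levenshtein(a, b) / total


def distance_class(old_term: str, new_term: str) -> str:
    """Map a term renaming to the 'low' or 'high' category of dimension D3."""
    score = normalized_levenshtein(old_term, new_term)
    return "low" if score <= LOW_DISTANCE_THRESHOLD else "high"


if __name__ == "__main__":
    for old, new in [("warining", "warning"), ("statement", "stmt"), ("is", "are")]:
        print(old, "->", new, distance_class(old, new))
```

Under this assumed normalization, a typo fix such as warining → warning scores 1/15, while replacing one short term by another, such as is → are, scores 3/5 and falls in the high category.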
7. Identification and classification of renamings
Identification
◮ Map lines of code of two consecutive versions of each file (diff)
◮ Identify declarations of entities (Java parser)
◮ Select potential renamings as declarations of same type and high similarity (Perl scripts)
Manual validation reports less than 20% of false positives for the studied systems.
Classification
◮ Split identifiers into terms
◮ Classify term renaming
  ◮ semantic and grammar (WordNet)
  ◮ string distance (normalized Levenshtein distance)
  ◮ entity (Java AST)
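The "split identifiers into terms" step, and the pairing of old and new terms that the classification then works on, might look like the sketch below. The splitting rule (camelCase humps, acronym runs, digit runs, underscores as separators) and the use of `difflib.SequenceMatcher` to align the two term lists are assumptions made for illustration; the slides name neither, and the authors' actual tooling (Perl scripts) may differ. A purely case-based splitter also cannot break same-case compounds such as authtype.

```python
import re
from difflib import SequenceMatcher

# ASSUMED splitting rule: an acronym run, a capitalized or lowercase word,
# or a run of digits; anything else (e.g. "_") acts as a separator.
_TERM = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_identifier(identifier: str) -> list[str]:
    """Split an identifier into lowercase terms."""
    return [term.lower() for term in _TERM.findall(identifier)]


def term_changes(old: str, new: str) -> list[tuple[str, list[str], list[str]]]:
    """Align the term lists of two identifiers and report the non-equal spans.

    Each result is (tag, old_terms, new_terms) with tag in
    {'replace', 'insert', 'delete'}, as produced by SequenceMatcher.
    """
    old_terms, new_terms = split_identifier(old), split_identifier(new)
    matcher = SequenceMatcher(a=old_terms, b=new_terms, autojunk=False)
    return [(tag, old_terms[i1:i2], new_terms[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


if __name__ == "__main__":
    print(split_identifier("fTypeBinding"))
    print(term_changes("isOverriddenMethod", "areOverriddenMethods"))
    print(term_changes("resource", "visitedResource"))
```

In this sketch an `insert` span corresponds to a term being added, a `delete` span to a term being removed, and a `replace` span to a term pair that would still need the WordNet and string-distance checks listed above.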
8. Empirical study (1/7)
Research questions
◮ RQ1: When do identifier renamings happen?
◮ RQ2: Who are the developers that mostly perform identifier renamings?
◮ RQ3: What kind of changes occur in the terms composing renamed identifiers according to our taxonomy?
Context
◮ Eclipse-JDT
◮ Tomcat
9. Empirical study (2/7)
Results - RQ1 (When?)
[Figure: Eclipse-JDT, # of renamings per month (2001-05 to 2006-09, y-axis 0 to 350), annotated with releases 1.0, 2.0, 2.1, 2.1.3, 3.0, and 3.1]
[Figure: Tomcat, # of renamings per month (1999-10 to 2006-06, y-axis 0 to 35), annotated with releases 4.0b1, 5.0, 5.5.9, and 5.5.16]
◮ Renamings are concentrated in specific time frames.
10. Empirical study (3/7)
Results - RQ2 (Who?)
Eclipse-JDT                           Tomcat
ID                  # of renamings    ID                  # of renamings
pmulet              792 (3%)          costin              139 (1%)
othomann            269 (3%)          luehe               107 (3%)
jlanneluc           263 (3%)          remm                89 (1%)
maeschli            260 (1%)          fhanik              78 (3%)
jdesrivieres        197 (12%)         craigmcc            51 (1%)
darin               158 (1%)          kinman              29 (1%)
ptff                150 (7%)          markt               27 (0%)
daudel              127 (3%)          amyroh              22 (1%)
maeschlimann        127 (5%)          pier                22 (1%)
kmaetzel            123 (6%)          billbarker          15 (1%)
Total top 10        2,466             Total top 10        579
Total renamings     4,500             Total renamings     724
% renamings top 10  55%               % renamings top 10  80%
◮ Renamings are performed by a subset of committers: 36 out of 50 (72%) for Eclipse and 39 out of 84 (49%) for Tomcat.
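The top-10 totals and shares in the table can be re-derived from the per-developer counts. The short check below uses only the figures reported on this slide; the variable and function names are illustrative.

```python
# Renaming counts of the ten most active renamers, copied from the RQ2 table.
TOP10 = {
    "Eclipse-JDT": [792, 269, 263, 260, 197, 158, 150, 127, 127, 123],
    "Tomcat": [139, 107, 89, 78, 51, 29, 27, 22, 22, 15],
}
# Total number of renamings detected per system, also from the table.
TOTAL_RENAMINGS = {"Eclipse-JDT": 4500, "Tomcat": 724}


def top10_share(system: str) -> int:
    """Percentage of all renamings made by the top-10 developers, rounded."""
    return round(100 * sum(TOP10[system]) / TOTAL_RENAMINGS[system])


if __name__ == "__main__":
    for system in TOP10:
        print(system, sum(TOP10[system]), f"{top10_share(system)}%")
```

The sums come out at 2,466 and 579, and the shares at 55% and 80%, matching the table.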
11. Empirical study (4/7)
Results - RQ3 (What? - D1: Entity)
[Figure: bar chart of the percentage of renamings per entity kind (x-axis 0% to 50%), Tomcat vs. Eclipse; the bar labels give the counts below]
Entity            Eclipse   Tomcat
Method            2,003     361
Field             1,230     202
Local variable    648       72
Parameter         508       69
Class             65        14
Constructor       44        6
Interface         2         0
◮ Most renamings occur on class interfaces.
12. Empirical study (5/7)
Results - RQ3 (What? - D2: Semantic)
Renaming                        Eclipse   Tomcat   Example
add meaning                     3,333     357      type → authtype (T)
                                                   resource → visitedResource (E)
remove meaning                  2,580     326      copyJAR → copy (T)
                                                   fTypeBinding → fBinding (E)
same meaning                    436       42       committed → commited (T)
                                                   methodsBuffer → methodsBuffered (E)
generalization/specialization   24        0        scanCurrentPosition → scanCurrentLine (E)
                                                   thrownExceptionSize → thrownExceptionLength (E)
opposite meaning                17        0        findNextLevelChildrenByElementName → findNextLevelParentByElementName (E)
                                                   hasClosingBracket → hasOpeningBracket (E)
whole/part relation             0         0
unrelated meaning               989       207      createContents → createControl (E)
                                                   getClusterReceiver → getChannelReceiver (T)
Total                           7,379     932
(E) = Eclipse, (T) = Tomcat
◮ We observe renamings towards antonyms and meronyms. Are they performed to correct wrong semantics?
13. Empirical study (6/7)
Results - RQ3 (What? - D3: String Distance)
Distance   Eclipse-JDT   Tomcat   Examples
low        1,433         249      isOverriddenMethod → areOverriddenMethods (E)
                                  statement → stmt (E)
                                  parameters → params (E)
                                  warining → warning (E)
                                  message → msg (T)
high       5,946         683      isOverriddenMethod → areOverriddenMethods (E)
Total      7,379         932
◮ Small string changes are often due to typos, expansions, or contractions.
14. Empirical study (7/7)
Results - RQ3 (What? - D4: Grammatical changes)
Renaming              Eclipse-JDT   Tomcat   Example
noun to verb          4             0        editor → edit (E)
noun to adjective     7             0        qualificationPattern → qualifiedPattern (E)
verb to noun          4             2        preparedAuthenticate → preparedCredentials (T)
verb to adjective     5             0        fReconcileListeners → fReconcilingListeners (E)
adjective to noun     5             0        fLayoutHierarchicalAction → fShowTestHierarchyAction (E)
adjective to verb     2             0        isValidClassFile → validateClassFile (E)
adverb to adjective   0             0
Other changes         347           27       filterStatic (n;a) → filterStatics (n)
No change             230           27
◮ Grammar forms of terms tend not to change often.
15. Discussion
Why are identifiers renamed?
◮ Formatting: e.g., appbase → appBase, SC_A_SSL_KEYSIZE → SC_A_SSL_KEY_SIZE.
◮ Improving abbreviations: e.g., TYPE_CONF_APPLIC → TYPE_CONF_ENUM_APPL.
◮ Never satisfied with a term: e.g., list → roleList → roles.
◮ Propagation to different packages/artifacts: e.g., size → capacity in WarpTable.java in two different folders.
◮ Declarations consistent with comments: e.g., list → roleList, where the comment is: “// Accumulate the user’s roles”.
16. Conclusions and future directions
Conclusions
◮ Renamings are concentrated in specific time frames
◮ Renamings are performed by a subset of committers
◮ Most renamings occur on class interfaces (method and field identifiers)
◮ Renamings occur not only towards synonyms, but also towards antonyms and meronyms
◮ Grammar forms of terms tend not to change often
Future directions
◮ Aggregate term taxonomy into identifier taxonomy
◮ Finding inconsistencies/bad smells in source code using the renaming taxonomy
17. Thank you!
Questions?