SlideShare une entreprise Scribd logo
1  sur  17
Télécharger pour lire hors ligne
Mining the Lexicon Used by
Programmers during Software
          Evolution

         Giuliano Antoniol, Yann-Gaël Guéhéneuc,
         Ettore Merlo, Paolo Tonella
         From:
         École Polytechnique de Montréal, Canada
         University of Montreal, Canada
         FBK-IRST, Trento, Italy
                                               1
Is this one the class you
                          are looking for?
public class T01<T02,T03> extends T04<T02,T03>
  implements T05<T02,T03>, T06, T07 {

    public T03 m01(T02 x01, T03 x02) {
      if (x01 == null)
        return m02(x02);
      int x03 = m03(x01.m04());
      int x04 = m05(x03, x05.x06);
      for (T08<T02,T03> x07 = x08[x04]; x07 != null; x07 = x07.x09) {
        T09 x10;
        if (x07.x11 == x03 && ((x10 = x07.x12) == x01 || x01.m06(x10))) {
          T03 x13 = x07.x14;
          x07.x14 = x02;
          x07.m07(this);
          return x13;
        }
      }
      x15++;
      m08(x03, x01, x02, x04);
      return null;
    }                                                                 2
}
Is this one the class you
                         are looking for?
public class HashMap<K,V> extends AbstractMap<K,V>
  implements Map<K,V>, Cloneable, Serializable {

    public V put(K key, V value) {
      if (key == null)
        return putForNullKey(value);
      int hash = hash(key.hashCode());
      int i = indexFor(hash, table.length);
      for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
          V oldValue = e.value;
          e.value = value;
          e.recordAccess(this);
          return oldValue;
        }
      }
      modCount++;
      addEntry(hash, key, value, i);
      return null;
    }                                                                3
}
Self-documenting
                              identifiers
Good identifiers:
  provide concise clues on the semantics of labeled entities;
  save programmers from reading the entire code segment;
  speed up knowledge acquisition;
  support program understanding (code queries, grep, etc.).

To some extent, we know how the structure of a program evolve.
  How does the lexicon of identifiers evolve?


Corollary: When we teach programming, we should never let our students use
names such as foo or bar (pippo, pluto) for any program entity.
                                                                         4
Research questions


RQ1: How does the stability of the lexicon of identifiers
compare to the stability of the program structure as the
program evolves?

RQ2: What is the frequency of changes to program
entities (in particular renaming) due to identifier
refactoring?




                                                            5
Data model
(Sub)System or (Sub)Directory    class HashMap<K, V> {
                                   Entry<K, V> table[];
       File         Function       V put(K key, V value) {…}
                    name         }
                    struct
                    lexicon      Full lexicon:
                                 <hash, map, table, put, key, value>
      Class         Attribute    Function
                    name         Name: put
                    struct       Struct: <10, 1, 2, 31, 0, 0, 24>
                    lexicon      Lexicon: <0, 0, 0, 1, 1, 1>
                                 Attribute
                                 Name: table
                                 Struct: <1, …>
                                 Lexicon: <0, 0, 1, 0, 0, 0>


                                                                       6
Stability metrics
For leaf entities, cosine similarity:

  StructSim(Ei, Ej) = <struct(Ei), struct(Ej)> / |struct(Ei)| |struct(Ej)|

  LexicalSim(Ei, Ej) = <lexicon(Ei), lexicon(Ej)> / |lexicon(Ei)| |lexicon(Ej)|


Function                                    StructSim(put, put’) = 0.998
Name: put                                   LexicalSim(put, put’) = 1
Struct: <10, 1, 2, 31, 0, 0, 24>
Lexicon: <0, 0, 0, 1, 1, 1>                  For container entities,
                                             average similarity
Function
Name: put’                                   Similarity between
Struct: <11, 2, 1, 30, 0, 0, 22>             corresponding entities in the
Lexicon: <0, 0, 0, 1, 1, 1>                  history ► entity traceability
                                             required!
                                                                                  7
Entity traceability

By name:

       f()      f()   f()      f()       f()

                      (1) Traceability by name fails
By structure:         (2) There is no entity g() in previous release
                      (3) StructSim(f, g) ≥ T (=1)



       f()      f()   f()      g()       g()


                Renaming detected!
                                                                       8
Metrics and analysis

RQ1 (struct vs. lexicon evolution):

Null hypothesis: there is no statistically significant difference
between the probability distribution of lexical vs. structural
stability.
Statistical test: non-parametric Wilcoxon paired test.

RQ2 (frequency of renamings):

RenFreq = DetectedRenamings / Total Entities


                                                                    9
Subject systems

System       Language      Size      Versions   Identifiers
Eclipse        Java     2.9 MLOC       19        124187
Mozilla        C++      4.4 MLOC       24         55244
CERN/Alice     C++      0.825 MLOC     13         9002




                                                          10
Stability

          0.75                 0.80   0.85        0.90   0.95   1.00



1.0−2.0


  2.0.1


  2.0.2


    2.1


  2.1.1


  2.1.2


    3.0


  3.0.1


  3.0.2


    3.1


  3.1.1


  3.1.2


    3.2


  3.2.1


 3.3M1
           P-value < 0.05




 3.3M2
           for every release




 3.3M3
                                                                       Stability plot: Eclipse




 3.3M4
  11
Stability

          0.980                       0.985       0.990   0.995   1.000



1.0−1.1

    1.2

  1.2.1

    1.3

    1.4

    1.5

  1.5a

  1.5b

    1.6

  1.6a

  1.6b

    1.7

  1.7a

  1.7b

  1.7.2

  1.7.3

  1.7.5

  1.7.6

  1.7.7
                  P-value < 0.05




  1.7.8
                  for every release




 1.7.10

 1.7.11
                                                                          Stability plot: Mozilla




 1.7.12
  12
Stability

            0.70   0.75                0.80    0.85       0.90   0.95   1.00



3.01−3.02




     3.03




     3.04




     3.05




     3.06




     3.07




     3.08




     3.09




     4.01




     4.02
                   P-value < 0.05
                   for every release




     4.03
                                                                               Stability plot: Alice




     4.04
    13
Renaming

Eclipse (Java):   AvgRenFreq = 7 / 106760 = 0.000065
Mozilla (C++):    AvgRenFreq = 0 / 51981 = 0
Alice (C++):      AvgRenFreq = 0 / 6736 = 0




                                                   14
Summary
RQ1 (struct vs. lexicon evolution):

   Lexical and structural changes have different
distributions over time; they probably obey different rules.
   Lexicon is always more stable than structure.
   Both structural and lexical stabilities tend to increase
over time and tend to have correlated instabilities.

RQ2 (frequency of renamings):

  Renamings are rare during the evolution of a software
system.
                                                               15
Discussion
                         (our interpretations)
• A different change process holds for lexicon and structure.
• Programmers are generally reluctant to change the lexicon.
  Some possible reasons:
   – Optimistically, there is no need to do it (domain perfectly modeled by
     lexicon).
   – High cognitive burden associated with this kind of change.
   – No dedicated tool available.
• The development environment seems to have an influence
  on the evolution of the lexicon. A renaming tool available in
  the IDE may help (Java vs. C++ in our study).
   – Other tools that may help: glossaries, cross-referencing tools,
     abbreviation expansion tools, documentation tools (possible using
     ontologies).
 Corollary: A program written with a bad lexicon (foo, bar, pippo, pluto and
 the like) tends to keep its poor identifiers forever.                     16
 Programmers must adapt to them; the inverse rarely happens.
Conclusions and
                     future work
RQ1: The lexicon is more stable than the structure.
RQ2: Identifier restructuring is rare.

Drafting a future work agenda:
• The lexicon of a program represents a substantial
investment for a company.
• However, almost no support is available to
preserve and increase such value over time.
• Research on techniques and tools for program
lexicon analysis and manipulation is strongly
needed.                                           17

Contenu connexe

Tendances

Java programming lab_manual_by_rohit_jaiswar
Java programming lab_manual_by_rohit_jaiswarJava programming lab_manual_by_rohit_jaiswar
Java programming lab_manual_by_rohit_jaiswarROHIT JAISWAR
 
Dart - web_ui & Programmatic components - Paris JUG - 20130409
Dart - web_ui & Programmatic components - Paris JUG - 20130409Dart - web_ui & Programmatic components - Paris JUG - 20130409
Dart - web_ui & Programmatic components - Paris JUG - 20130409yohanbeschi
 
Introduction to dart - So@t - 20130410
Introduction to dart - So@t - 20130410Introduction to dart - So@t - 20130410
Introduction to dart - So@t - 20130410yohanbeschi
 
How and why I turned my old Java projects into a first-class serverless compo...
How and why I turned my old Java projects into a first-class serverless compo...How and why I turned my old Java projects into a first-class serverless compo...
How and why I turned my old Java projects into a first-class serverless compo...Mario Fusco
 
Pragmatic metaprogramming
Pragmatic metaprogrammingPragmatic metaprogramming
Pragmatic metaprogrammingMårten Rånge
 
Deep Learning - Exploring The Magical World of Neural Network
Deep Learning - Exploring The Magical World of Neural NetworkDeep Learning - Exploring The Magical World of Neural Network
Deep Learning - Exploring The Magical World of Neural NetworkMinhas Kamal
 
The Ring programming language version 1.9 book - Part 10 of 210
The Ring programming language version 1.9 book - Part 10 of 210The Ring programming language version 1.9 book - Part 10 of 210
The Ring programming language version 1.9 book - Part 10 of 210Mahmoud Samir Fayed
 
GR8Conf 2011: Grails 1.4 Update by Peter Ledbrook
GR8Conf 2011: Grails 1.4 Update by Peter LedbrookGR8Conf 2011: Grails 1.4 Update by Peter Ledbrook
GR8Conf 2011: Grails 1.4 Update by Peter LedbrookGR8Conf
 
Lecture06 methods for-making_data_structures_v2
Lecture06 methods for-making_data_structures_v2Lecture06 methods for-making_data_structures_v2
Lecture06 methods for-making_data_structures_v2Hariz Mustafa
 
Solving real world problems with Hadoop
Solving real world problems with HadoopSolving real world problems with Hadoop
Solving real world problems with Hadoopsynctree
 
Object Oriented Solved Practice Programs C++ Exams
Object Oriented Solved Practice Programs C++ ExamsObject Oriented Solved Practice Programs C++ Exams
Object Oriented Solved Practice Programs C++ ExamsMuhammadTalha436
 
Refactoring Jdbc Programming
Refactoring Jdbc ProgrammingRefactoring Jdbc Programming
Refactoring Jdbc Programmingchanwook Park
 
Introduction à dart
Introduction à dartIntroduction à dart
Introduction à dartyohanbeschi
 
GoshawkDB: Making Time with Vector Clocks
GoshawkDB: Making Time with Vector ClocksGoshawkDB: Making Time with Vector Clocks
GoshawkDB: Making Time with Vector ClocksC4Media
 
M sc it_modelling_biological_systems2005
M sc it_modelling_biological_systems2005M sc it_modelling_biological_systems2005
M sc it_modelling_biological_systems2005Aafaq Malik
 
What makes a good bug report?
What makes a good bug report?What makes a good bug report?
What makes a good bug report?Rahul Premraj
 

Tendances (20)

Java programming lab_manual_by_rohit_jaiswar
Java programming lab_manual_by_rohit_jaiswarJava programming lab_manual_by_rohit_jaiswar
Java programming lab_manual_by_rohit_jaiswar
 
Ad java prac sol set
Ad java prac sol setAd java prac sol set
Ad java prac sol set
 
Java Lab Manual
Java Lab ManualJava Lab Manual
Java Lab Manual
 
Dart - web_ui & Programmatic components - Paris JUG - 20130409
Dart - web_ui & Programmatic components - Paris JUG - 20130409Dart - web_ui & Programmatic components - Paris JUG - 20130409
Dart - web_ui & Programmatic components - Paris JUG - 20130409
 
Introduction to dart - So@t - 20130410
Introduction to dart - So@t - 20130410Introduction to dart - So@t - 20130410
Introduction to dart - So@t - 20130410
 
How and why I turned my old Java projects into a first-class serverless compo...
How and why I turned my old Java projects into a first-class serverless compo...How and why I turned my old Java projects into a first-class serverless compo...
How and why I turned my old Java projects into a first-class serverless compo...
 
Pragmatic metaprogramming
Pragmatic metaprogrammingPragmatic metaprogramming
Pragmatic metaprogramming
 
Plc (1)
Plc (1)Plc (1)
Plc (1)
 
Deep Learning - Exploring The Magical World of Neural Network
Deep Learning - Exploring The Magical World of Neural NetworkDeep Learning - Exploring The Magical World of Neural Network
Deep Learning - Exploring The Magical World of Neural Network
 
Plc (1)
Plc (1)Plc (1)
Plc (1)
 
The Ring programming language version 1.9 book - Part 10 of 210
The Ring programming language version 1.9 book - Part 10 of 210The Ring programming language version 1.9 book - Part 10 of 210
The Ring programming language version 1.9 book - Part 10 of 210
 
GR8Conf 2011: Grails 1.4 Update by Peter Ledbrook
GR8Conf 2011: Grails 1.4 Update by Peter LedbrookGR8Conf 2011: Grails 1.4 Update by Peter Ledbrook
GR8Conf 2011: Grails 1.4 Update by Peter Ledbrook
 
Lecture06 methods for-making_data_structures_v2
Lecture06 methods for-making_data_structures_v2Lecture06 methods for-making_data_structures_v2
Lecture06 methods for-making_data_structures_v2
 
Solving real world problems with Hadoop
Solving real world problems with HadoopSolving real world problems with Hadoop
Solving real world problems with Hadoop
 
Object Oriented Solved Practice Programs C++ Exams
Object Oriented Solved Practice Programs C++ ExamsObject Oriented Solved Practice Programs C++ Exams
Object Oriented Solved Practice Programs C++ Exams
 
Refactoring Jdbc Programming
Refactoring Jdbc ProgrammingRefactoring Jdbc Programming
Refactoring Jdbc Programming
 
Introduction à dart
Introduction à dartIntroduction à dart
Introduction à dart
 
GoshawkDB: Making Time with Vector Clocks
GoshawkDB: Making Time with Vector ClocksGoshawkDB: Making Time with Vector Clocks
GoshawkDB: Making Time with Vector Clocks
 
M sc it_modelling_biological_systems2005
M sc it_modelling_biological_systems2005M sc it_modelling_biological_systems2005
M sc it_modelling_biological_systems2005
 
What makes a good bug report?
What makes a good bug report?What makes a good bug report?
What makes a good bug report?
 

En vedette

Quality and Software Design Patterns
Quality and Software Design PatternsQuality and Software Design Patterns
Quality and Software Design PatternsPtidej Team
 
Software Design Patterns in Theory
Software Design Patterns in TheorySoftware Design Patterns in Theory
Software Design Patterns in TheoryPtidej Team
 
Software Design Patterns in Practice
Software Design Patterns in PracticeSoftware Design Patterns in Practice
Software Design Patterns in PracticePtidej Team
 
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future Challenges
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future ChallengesAsianPLoP'14: How and Why Design Patterns Impact Quality and Future Challenges
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future ChallengesPtidej Team
 

En vedette (9)

MSR Asia Summit
MSR Asia SummitMSR Asia Summit
MSR Asia Summit
 
Mribp13.ppt
Mribp13.pptMribp13.ppt
Mribp13.ppt
 
Wcre12b.ppt
Wcre12b.pptWcre12b.ppt
Wcre12b.ppt
 
Wcre13a.ppt
Wcre13a.pptWcre13a.ppt
Wcre13a.ppt
 
Quality and Software Design Patterns
Quality and Software Design PatternsQuality and Software Design Patterns
Quality and Software Design Patterns
 
Software Design Patterns in Theory
Software Design Patterns in TheorySoftware Design Patterns in Theory
Software Design Patterns in Theory
 
Wcre13b.ppt
Wcre13b.pptWcre13b.ppt
Wcre13b.ppt
 
Software Design Patterns in Practice
Software Design Patterns in PracticeSoftware Design Patterns in Practice
Software Design Patterns in Practice
 
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future Challenges
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future ChallengesAsianPLoP'14: How and Why Design Patterns Impact Quality and Future Challenges
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future Challenges
 

Similaire à ICSM07.ppt

Clojure for Data Science
Clojure for Data ScienceClojure for Data Science
Clojure for Data ScienceMike Anderson
 
Getting started with Clojure
Getting started with ClojureGetting started with Clojure
Getting started with ClojureJohn Stevenson
 
JavaScript Editions ES7, ES8 and ES9 vs V8
JavaScript Editions ES7, ES8 and ES9 vs V8JavaScript Editions ES7, ES8 and ES9 vs V8
JavaScript Editions ES7, ES8 and ES9 vs V8Rafael Casuso Romate
 
Extracting Executable Transformations from Distilled Code Changes
Extracting Executable Transformations from Distilled Code ChangesExtracting Executable Transformations from Distilled Code Changes
Extracting Executable Transformations from Distilled Code Changesstevensreinout
 
Fuzzy logic and fuzzy time series edited
Fuzzy logic and fuzzy time series   editedFuzzy logic and fuzzy time series   edited
Fuzzy logic and fuzzy time series editedProf Dr S.M.Aqil Burney
 
Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Coen De Roover
 
Gabriele Petronella - FP for front-end development: should you care? - Codemo...
Gabriele Petronella - FP for front-end development: should you care? - Codemo...Gabriele Petronella - FP for front-end development: should you care? - Codemo...
Gabriele Petronella - FP for front-end development: should you care? - Codemo...Codemotion
 
Functional Operations - Susan Potter
Functional Operations - Susan PotterFunctional Operations - Susan Potter
Functional Operations - Susan Potterdistributed matters
 
Loops and functions in r
Loops and functions in rLoops and functions in r
Loops and functions in rmanikanta361
 
Clojure for Java developers - Stockholm
Clojure for Java developers - StockholmClojure for Java developers - Stockholm
Clojure for Java developers - StockholmJan Kronquist
 
A Recommender System for Refining Ekeko/X Transformation
A Recommender System for Refining Ekeko/X TransformationA Recommender System for Refining Ekeko/X Transformation
A Recommender System for Refining Ekeko/X TransformationCoen De Roover
 
Transfer Learning for Performance Analysis of Highly-Configurable Software
Transfer Learning for Performance Analysis of Highly-Configurable SoftwareTransfer Learning for Performance Analysis of Highly-Configurable Software
Transfer Learning for Performance Analysis of Highly-Configurable SoftwarePooyan Jamshidi
 
Structure and interpretation of computer programs modularity, objects, and ...
Structure and interpretation of computer programs   modularity, objects, and ...Structure and interpretation of computer programs   modularity, objects, and ...
Structure and interpretation of computer programs modularity, objects, and ...bdemchak
 
Clojure - A new Lisp
Clojure - A new LispClojure - A new Lisp
Clojure - A new Lispelliando dias
 

Similaire à ICSM07.ppt (20)

Icsm07.ppt
Icsm07.pptIcsm07.ppt
Icsm07.ppt
 
Clojure for Data Science
Clojure for Data ScienceClojure for Data Science
Clojure for Data Science
 
Getting started with Clojure
Getting started with ClojureGetting started with Clojure
Getting started with Clojure
 
JavaScript Editions ES7, ES8 and ES9 vs V8
JavaScript Editions ES7, ES8 and ES9 vs V8JavaScript Editions ES7, ES8 and ES9 vs V8
JavaScript Editions ES7, ES8 and ES9 vs V8
 
Extracting Executable Transformations from Distilled Code Changes
Extracting Executable Transformations from Distilled Code ChangesExtracting Executable Transformations from Distilled Code Changes
Extracting Executable Transformations from Distilled Code Changes
 
Clojure And Swing
Clojure And SwingClojure And Swing
Clojure And Swing
 
Fuzzy logic and fuzzy time series edited
Fuzzy logic and fuzzy time series   editedFuzzy logic and fuzzy time series   edited
Fuzzy logic and fuzzy time series edited
 
Pune Clojure Course Outline
Pune Clojure Course OutlinePune Clojure Course Outline
Pune Clojure Course Outline
 
Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012
 
Gabriele Petronella - FP for front-end development: should you care? - Codemo...
Gabriele Petronella - FP for front-end development: should you care? - Codemo...Gabriele Petronella - FP for front-end development: should you care? - Codemo...
Gabriele Petronella - FP for front-end development: should you care? - Codemo...
 
Functional Operations - Susan Potter
Functional Operations - Susan PotterFunctional Operations - Susan Potter
Functional Operations - Susan Potter
 
Loops and functions in r
Loops and functions in rLoops and functions in r
Loops and functions in r
 
Enter The Matrix
Enter The MatrixEnter The Matrix
Enter The Matrix
 
Clojure for Java developers - Stockholm
Clojure for Java developers - StockholmClojure for Java developers - Stockholm
Clojure for Java developers - Stockholm
 
Clojure
ClojureClojure
Clojure
 
A Recommender System for Refining Ekeko/X Transformation
A Recommender System for Refining Ekeko/X TransformationA Recommender System for Refining Ekeko/X Transformation
A Recommender System for Refining Ekeko/X Transformation
 
55j7
55j755j7
55j7
 
Transfer Learning for Performance Analysis of Highly-Configurable Software
Transfer Learning for Performance Analysis of Highly-Configurable SoftwareTransfer Learning for Performance Analysis of Highly-Configurable Software
Transfer Learning for Performance Analysis of Highly-Configurable Software
 
Structure and interpretation of computer programs modularity, objects, and ...
Structure and interpretation of computer programs   modularity, objects, and ...Structure and interpretation of computer programs   modularity, objects, and ...
Structure and interpretation of computer programs modularity, objects, and ...
 
Clojure - A new Lisp
Clojure - A new LispClojure - A new Lisp
Clojure - A new Lisp
 

Plus de Ptidej Team

From IoT to Software Miniaturisation
From IoT to Software MiniaturisationFrom IoT to Software Miniaturisation
From IoT to Software MiniaturisationPtidej Team
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel BriandPtidej Team
 
Manel Abdellatif
Manel AbdellatifManel Abdellatif
Manel AbdellatifPtidej Team
 
Azadeh Kermansaravi
Azadeh KermansaraviAzadeh Kermansaravi
Azadeh KermansaraviPtidej Team
 
CSED - Manel Grichi
CSED - Manel GrichiCSED - Manel Grichi
CSED - Manel GrichiPtidej Team
 
Cristiano Politowski
Cristiano PolitowskiCristiano Politowski
Cristiano PolitowskiPtidej Team
 
Will io t trigger the next software crisis
Will io t trigger the next software crisisWill io t trigger the next software crisis
Will io t trigger the next software crisisPtidej Team
 
Thesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptThesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptPtidej Team
 
Thesis+of+nesrine+abdelkafi.ppt
Thesis+of+nesrine+abdelkafi.pptThesis+of+nesrine+abdelkafi.ppt
Thesis+of+nesrine+abdelkafi.pptPtidej Team
 

Plus de Ptidej Team (20)

From IoT to Software Miniaturisation
From IoT to Software MiniaturisationFrom IoT to Software Miniaturisation
From IoT to Software Miniaturisation
 
Presentation
PresentationPresentation
Presentation
 
Presentation
PresentationPresentation
Presentation
 
Presentation
PresentationPresentation
Presentation
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel Briand
 
Manel Abdellatif
Manel AbdellatifManel Abdellatif
Manel Abdellatif
 
Azadeh Kermansaravi
Azadeh KermansaraviAzadeh Kermansaravi
Azadeh Kermansaravi
 
Mouna Abidi
Mouna AbidiMouna Abidi
Mouna Abidi
 
CSED - Manel Grichi
CSED - Manel GrichiCSED - Manel Grichi
CSED - Manel Grichi
 
Cristiano Politowski
Cristiano PolitowskiCristiano Politowski
Cristiano Politowski
 
Will io t trigger the next software crisis
Will io t trigger the next software crisisWill io t trigger the next software crisis
Will io t trigger the next software crisis
 
MIPA
MIPAMIPA
MIPA
 
Thesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptThesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.ppt
 
Thesis+of+nesrine+abdelkafi.ppt
Thesis+of+nesrine+abdelkafi.pptThesis+of+nesrine+abdelkafi.ppt
Thesis+of+nesrine+abdelkafi.ppt
 
Medicine15.ppt
Medicine15.pptMedicine15.ppt
Medicine15.ppt
 
Qrs17b.ppt
Qrs17b.pptQrs17b.ppt
Qrs17b.ppt
 
Icpc11c.ppt
Icpc11c.pptIcpc11c.ppt
Icpc11c.ppt
 
Icsme16.ppt
Icsme16.pptIcsme16.ppt
Icsme16.ppt
 
Msr17a.ppt
Msr17a.pptMsr17a.ppt
Msr17a.ppt
 
Icsoc15.ppt
Icsoc15.pptIcsoc15.ppt
Icsoc15.ppt
 

Dernier

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 

Dernier (20)

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 

ICSM07.ppt

  • 1. Mining the Lexicon Used by Programmers during Software Evolution Giuliano Antoniol, Yann-Gaël Guéhéneuc, Ettore Merlo, Paolo Tonella From: École Polytechnique de Montréal, Canada University of Montreal, Canada FBK-IRST, Trento, Italy 1
  • 2. Is this one the class you are looking for? public class T01<T02,T03> extends T04<T02,T03> implements T05<T02,T03>, T06, T07 { public T03 m01(T02 x01, T03 x02) { if (x01 == null) return m02(x02); int x03 = m03(x01.m04()); int x04 = m05(x03, x05.x06); for (T08<T02,T03> x07 = x08[x04]; x07 != null; x07 = x07.x09) { T09 x10; if (x07.x11 == x03 && ((x10 = x07.x12) == x01 || x01.m06(x10))) { T03 x13 = x07.x14; x07.x14 = x02; x07.m07(this); return x13; } } x15++; m08(x03, x01, x02, x04); return null; } 2 }
  • 3. Is this one the class you are looking for? public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable { public V put(K key, V value) { if (key == null) return putForNullKey(value); int hash = hash(key.hashCode()); int i = indexFor(hash, table.length); for (Entry<K,V> e = table[i]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; addEntry(hash, key, value, i); return null; } 3 }
  • 4. Self-documenting identifiers Good identifiers: provide concise clues on the semantics of labeled entities; save programmers from reading the entire code segment; speed up knowledge acquisition; support program understanding (code queries, grep, etc.). To some extent, we know how the structure of a program evolve. How does the lexicon of identifiers evolve? Corollary: When we teach programming, we should never let our students use names such as foo or bar (pippo, pluto) for any program entity. 4
  • 5. Research questions RQ1: How does the stability of the lexicon of identifiers compare to the stability of the program structure as the program evolves? RQ2: What is the frequency of changes to program entities (in particular renaming) due to identifier refactoring? 5
  • 6. Data model (Sub)System or (Sub)Directory class HashMap<K, V> { Entry<K, V> table[]; File Function V put(K key, V value) {…} name } struct lexicon Full lexicon: <hash, map, table, put, key, value> Class Attribute Function name Name: put struct Struct: <10, 1, 2, 31, 0, 0, 24> lexicon Lexicon: <0, 0, 0, 1, 1, 1> Attribute Name: table Struct: <1, …> Lexicon: <0, 0, 1, 0, 0, 0> 6
  • 7. Stability metrics For leaf entities, cosine similarity: StructSim(Ei, Ej) = <struct(Ei), struct(Ej)> / |struct(Ei)| |struct(Ej)| LexicalSim(Ei, Ej) = <lexicon(Ei), lexicon(Ej)> / |lexicon(Ei)| |lexicon(Ej)| Function StructSim(put, put’) = 0.998 Name: put LexicalSim(put, put’) = 1 Struct: <10, 1, 2, 31, 0, 0, 24> Lexicon: <0, 0, 0, 1, 1, 1> For container entities, average similarity Function Name: put’ Similarity between Struct: <11, 2, 1, 30, 0, 0, 22> corresponding entities in the Lexicon: <0, 0, 0, 1, 1, 1> history ► entity traceability required! 7
  • 8. Entity traceability By name: f() f() f() f() f() (1) Traceability by name fails By structure: (2) There is no entity g() in previous release (3) StructSim(f, g) ≥ T (=1) f() f() f() g() g() Renaming detected! 8
  • 9. Metrics and analysis RQ1 (struct vs. lexicon evolution): Null hypothesis: there is no statistically significant difference between the probability distribution of lexical vs. structural stability. Statistical test: non-parametric Wilcoxon paired test. RQ2 (frequency of renamings): RenFreq = DetectedRenamings / Total Entities 9
  • 10. Subject systems System Language Size Versions Identifiers Eclipse Java 2.9 MLOC 19 124187 Mozilla C++ 4.4 MLOC 24 55244 CERN/Alice C++ 0.825 MLOC 13 9002 10
  • 11. Stability 0.75 0.80 0.85 0.90 0.95 1.00 1.0−2.0 2.0.1 2.0.2 2.1 2.1.1 2.1.2 3.0 3.0.1 3.0.2 3.1 3.1.1 3.1.2 3.2 3.2.1 3.3M1 P-value < 0.05 3.3M2 for every release 3.3M3 Stability plot: Eclipse 3.3M4 11
  • 12. Stability 0.980 0.985 0.990 0.995 1.000 1.0−1.1 1.2 1.2.1 1.3 1.4 1.5 1.5a 1.5b 1.6 1.6a 1.6b 1.7 1.7a 1.7b 1.7.2 1.7.3 1.7.5 1.7.6 1.7.7 P-value < 0.05 1.7.8 for every release 1.7.10 1.7.11 Stability plot: Mozilla 1.7.12 12
  • 13. Stability 0.70 0.75 0.80 0.85 0.90 0.95 1.00 3.01−3.02 3.03 3.04 3.05 3.06 3.07 3.08 3.09 4.01 4.02 P-value < 0.05 for every release 4.03 Stability plot: Alice 4.04 13
  • 14. Renaming Eclipse (Java): AvgRenFreq = 7 / 106760 = 0.000065 Mozilla (C++): AvgRenFreq = 0 / 51981 = 0 Alice (C++): AvgRenFreq = 0 / 6736 = 0 14
  • 15. Summary RQ1 (struct vs. lexicon evolution): Lexical and structural changes have different distributions over time; they probably obey different rules. Lexicon is always more stable than structure. Both structural and lexical stabilities tend to increase over time and tend to have correlated instabilities. RQ2 (frequency of renamings): Renamings are rare during the evolution of a software system. 15
  • 16. Discussion (our interpretations) • A different change process holds for lexicon and structure. • Programmers are generally reluctant to change the lexicon. Some possible reasons: – Optimistically, there is no need to do it (domain perfectly modeled by lexicon). – High cognitive burden associated with this kind of change. – No dedicated tool available. • The development environment seems to have an influence on the evolution of the lexicon. A renaming tool available in the IDE may help (Java vs. C++ in our study). – Other tools that may help: glossaries, cross-referencing tools, abbreviation expansion tools, documentation tools (possible using ontologies). Corollary: A program written with a bad lexicon (foo, bar, pippo, pluto and the like) tends to keep its poor identifiers forever. 16 Programmers must adapt to them; the inverse rarely happens.
  • 17. Conclusions and future work RQ1: The lexicon is more stable than the structure. RQ2: Identifier restructuring is rare. Drafting a future work agenda: • The lexicon of a program represents a substantial investment for a company. • However, almost no support is available to preserve and increase such value over time. • Research on techniques and tools for program lexicon analysis and manipulation is strongly needed. 17