SlideShare une entreprise Scribd logo
1  sur  55
Automatically Documenting Program Changes Ray Buse Wes Weimer De  ltaDoc diffs ASE  9.22.2010 Antwerp, Belgium Commit Messages
Change
So Much Change
So Much Change
So Much Change
Managing Change Developers Evaluation Managers
Understanding change remains difficult Peter Hallam. What Do Programmers Really Do Anyway? Microsoft Developer Network (MSDN) – C# Compiler. Jan 2006.
Diff 19c19 , 22 < else return ""; --- > else return pageParts[0]; > // else return ""; JabRef Revision 3066
Side-by-side
Commit Messages Free-form text which may describe What the change was. Why the change was made. Phex 3542 Minor change jfreechart rev 3405     (start): Changed from Date to long,     (end): Likewise,     (getStartMillis): New method,     (getEndMillis): Likewise,     (getStart): Returns new date instance,     (getEnd): Likewise. Jabref rev 2917 Fixed NullPointerException when downloading external file and file directory is undefined.
Toby, Going forward, could I ask you to be more descriptive in your commit messages? Ideally you should state what you've changed and also why (unless it's obvious)... I know you're busy and this takes more time, but it will help anyone who looks through the log ... http://lists.macosforge.org/pipermail/macports-dev/2009-June/008881.html Subject: An appeal for more descriptive commit messages I know there is a lot going on but please can we be a bit more descriptive when committing changes. Recent log messages have included: "some cleanup" "more external service work" "Fixed a bug in wiring" which are a lot less informative than others... http://osdir.com/ml/apache.webservices.tuscany.devel/2006-02/msg00227.html Sorry to be a pain in the neck about this, but could we please use more descriptive commit messages? I do try to read the commit emails, but since the vast majority of comments are "CAY-XYZ", I can't really tell what's going on unless I then look it up. http://osdir.com/ml/java.cayenne.devel/2006-10/msg00044.html
DeltaDoc Describes the observable EFFECT of a change Conditions that trigger the changed code How the change impacts functional behavior and program state Symbolic Execution Summarization Transformations
DeltaDoc Diff 19c19 , 22 < else return ""; --- > else return pageParts[0]; > // else return ""; DeltaDoc When calling LastPageformat(String s)     If s is not nulland s.split ("[-]+").length != 2 returns.split ("[-]+")[0] instead of ""
DeltaDoc Commit message Temporary removed the trade routes from the game menu. DeltaDoc When calling FreeColMenuBarbuildOrdersMenu   No longer     call JMenu.add(getMenuItem("assignTradeRouteAction")) When calling FreeColMenuBarbuildViewMenu   No longer     call JMenu.add(getMenuItem("tradeRouteAction")) Freecol rev 2085
Documenting Change Why was it changed? What was changed?
Commit Messages Why? What? Current Practice
DeltaDoc DeltaDoc Commit Messages Why? What?
Hypothesis This area is large. DeltaDoc Commit Messages Why? What?
With DeltaDoc DeltaDoc Commit Messages Why? What?
The rest of this talk Approach: How DeltaDoc works. Evaluation: Comparing DeltaDoc to Commit messages.
DeltaDoc Architecture
DeltaDoc Architecture Compute symbolic path predicates for each statement.
DeltaDoc Architecture When X,   Do Y Instead of Z Identify statements that have been added, removed, or have a different predicate.
DeltaDoc Architecture Apply summarization transformations until result is sufficiently concise.
Predicate Generation StringsayHello(String name)  { String ret = “”; if(name != null)   { if(System.Lang == ENG)      ret = ret + “Hello ”; else      ret = ret + “Bonjour ”;     ret = ret + name;  } returnret; }
Predicate Generation Enumerate loop-free control flow paths. String ret = “”; name != null System.Lang == ENG ret = ret + “Hello ”; ret = ret + name; returnret; String ret = “”; name != null System.Lang != ENG ret = ret + “Bonjour ”; ret = ret + name; returnret; String ret = “”; name == null returnret;
Predicate Generation Symbolic Execution
Predicate Generation Symbolic Execution
Predicate Generation Symbolic Execution
Store important stmts
Change 31 StringsayHello(String name)  { String ret = “”; if(name != null)   { if(System.Lang == ENG)      ret = ret + “Hello ”; else      ret = ret + “Bonjour ”;     ret = ret + name;  } returnret; } StringsayHello(String name)  { String ret = “”; if(name != null)   { if(System.Lang == ENG)      ret = ret + “Hi ”; else      ret = ret + “Bonjour ”;     ret = ret + name;  } returnret; }
Predicate Generation
Enumerate Changes
Generate Documentation When calling sayHello(String name) Ifname != null ANDSystem.Lang == ENG return“Hi ” + name Instead of“Hello ” + name
Summarization Remove irrelevant path predicates. If s != nulland a is true and b is true and c is true returns If s != null  returns
Summarization Re-arrange terms Simplification Readability enhancements based on Java idioms If P andQ,             DoX  IfR,  DoY If P andQ, DoX IfP andQ andR, DoY
Evaluation Quality Content Size
Benchmarks
Size Comparison Diffs are about 35 lines DeltaDocs are about 9 lines Commit messages are always less than 10 lines
Content Comparison How large is this area? DeltaDoc Commit Messages Why? What?
Content Comparison DeltaDoc CommitMessages
Content Comparison DeltaDoc Commit Messages RelationalForm RelationalForm
Relational Form
Relational Form Example has an insufficient amount of gold getPriceForBuilding() > getOwner().getGold() getPriceForBuilding() > getOwner().getGold() ?    >   gold
Score Metric Conservatively assume only relations from commit messages are important. Reward precision. Used 16 human annotators to validate. Score of 0.5 indicates that the DeltaDoc contained all the information in the commit message.
Example Score = 0.5 Commit Message no need to call clear() DeltaDoc When calling PdfContentByte reset()     If stateList .isEmpty(),       No longer call stateList .clear() iText Rev 3837
Example Score > 0.5 Commit Message Commented unused constant DeltaDoc removed field : EuropePanel : int TITLE_FONT_SIZE Freecol rev 2054
Example Score < 0.5 Commit Message Fixed bug: content selector for ‘editor‘ field uses ‘,' instead of ‘and' as delimiter. DeltaDoc When calling EntryEditorgetExtra()     If ed.getFieldName().equals("editor")         call contentSelectors.add(FieldContentSelector) JabRef Rev 3111
Results
Results
Results About 89% coverage. DeltaDoc Commit Messages Why? What?
Qualitative Evaluation “very useful" “highly useful” "would be a great supplement" “definitely a useful supplement" “can help make the logic clear” “often easier to understand“ “more accurate” “easy to read" “provides more information"
DeltaDoc Limitations Intraprocedural Handling of loops not fully-precise Less precise for large changes Does not address reason for change
DeltaDoc Advantages Cheap Can be computed in about a second on average. Suitable for quick adoption Can supplement or replace many existing commit messages. Structured Suitable for search. Reliable
Questions? DeltaDoc Commit Messages Why? What?

Contenu connexe

Tendances

Let the alpakka pull your stream
Let the alpakka pull your streamLet the alpakka pull your stream
Let the alpakka pull your streamEnno Runne
 
Jetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no AndroidJetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no AndroidNelson Glauber Leal
 
Сценарій свята "Моя бабуся".Джереджа Р.О.
Сценарій свята "Моя бабуся".Джереджа Р.О.Сценарій свята "Моя бабуся".Джереджа Р.О.
Сценарій свята "Моя бабуся".Джереджа Р.О.dzheredzha_raisa
 
Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14
Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14
Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14laskosd
 
[C++ Korea] Effective Modern C++ Study item14 16 +신촌
[C++ Korea] Effective Modern C++ Study item14 16 +신촌[C++ Korea] Effective Modern C++ Study item14 16 +신촌
[C++ Korea] Effective Modern C++ Study item14 16 +신촌Seok-joon Yun
 
Grafici na pravi paralelni so x ili y oska
Grafici na pravi paralelni so x ili y oskaGrafici na pravi paralelni so x ili y oska
Grafici na pravi paralelni so x ili y oskaSneze Zlatkovska
 
80542458 безбедност-и-здравље-на-раду
80542458 безбедност-и-здравље-на-раду80542458 безбедност-и-здравље-на-раду
80542458 безбедност-и-здравље-на-радуDaniel Paskov
 
Ενημέρωση γονέων για τον τρόπο λειτουργίας του Γυμνασίου
Ενημέρωση γονέων για τον τρόπο λειτουργίας του ΓυμνασίουΕνημέρωση γονέων για τον τρόπο λειτουργίας του Γυμνασίου
Ενημέρωση γονέων για τον τρόπο λειτουργίας του ΓυμνασίουΒασίλης Μπαρέκος
 
Το Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην Κοινωνία
Το Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην ΚοινωνίαΤο Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην Κοινωνία
Το Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην ΚοινωνίαSofoklis Sotiriou
 
Automatizacija paljenja brodskog parnog kotla
Automatizacija paljenja brodskog parnog kotlaAutomatizacija paljenja brodskog parnog kotla
Automatizacija paljenja brodskog parnog kotlaToni Ku?i?
 
Porting Motif Applications to Qt - Webinar
Porting Motif Applications to Qt - WebinarPorting Motif Applications to Qt - Webinar
Porting Motif Applications to Qt - WebinarICS
 
RxSwift 활용하기 - Let'Swift 2017
RxSwift 활용하기 - Let'Swift 2017RxSwift 활용하기 - Let'Swift 2017
RxSwift 활용하기 - Let'Swift 2017Wanbok Choi
 
Jeux pour l’intérieur pour l’animation d’un groupe en colonie de vacances
Jeux pour l’intérieur pour l’animation d’un groupe en colonie de vacancesJeux pour l’intérieur pour l’animation d’un groupe en colonie de vacances
Jeux pour l’intérieur pour l’animation d’un groupe en colonie de vacancesIzeedor
 
Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...
Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...
Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...Liridon Muqaku
 
Regular Expressions Cheat Sheet
Regular Expressions Cheat SheetRegular Expressions Cheat Sheet
Regular Expressions Cheat SheetAkash Bisariya
 
петар пан квиз
петар пан квизпетар пан квиз
петар пан квизZivko Petrovski
 
Serializing EMF models with Xtext
Serializing EMF models with XtextSerializing EMF models with Xtext
Serializing EMF models with Xtextmeysholdt
 
Matematika за сите теми за 5 одд print
Matematika за сите теми за 5 одд print Matematika за сите теми за 5 одд print
Matematika за сите теми за 5 одд print Isabelle Parker
 

Tendances (20)

Let the alpakka pull your stream
Let the alpakka pull your streamLet the alpakka pull your stream
Let the alpakka pull your stream
 
Jetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no AndroidJetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no Android
 
Сценарій свята "Моя бабуся".Джереджа Р.О.
Сценарій свята "Моя бабуся".Джереджа Р.О.Сценарій свята "Моя бабуся".Джереджа Р.О.
Сценарій свята "Моя бабуся".Джереджа Р.О.
 
Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14
Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14
Τελική 'Εκθεση Αξιολόγησης Εκπαιδευτικού 'Εργου 2013-14
 
[C++ Korea] Effective Modern C++ Study item14 16 +신촌
[C++ Korea] Effective Modern C++ Study item14 16 +신촌[C++ Korea] Effective Modern C++ Study item14 16 +신촌
[C++ Korea] Effective Modern C++ Study item14 16 +신촌
 
Grafici na pravi paralelni so x ili y oska
Grafici na pravi paralelni so x ili y oskaGrafici na pravi paralelni so x ili y oska
Grafici na pravi paralelni so x ili y oska
 
80542458 безбедност-и-здравље-на-раду
80542458 безбедност-и-здравље-на-раду80542458 безбедност-и-здравље-на-раду
80542458 безбедност-и-здравље-на-раду
 
Ενημέρωση γονέων για τον τρόπο λειτουργίας του Γυμνασίου
Ενημέρωση γονέων για τον τρόπο λειτουργίας του ΓυμνασίουΕνημέρωση γονέων για τον τρόπο λειτουργίας του Γυμνασίου
Ενημέρωση γονέων για τον τρόπο λειτουργίας του Γυμνασίου
 
Το Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην Κοινωνία
Το Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην ΚοινωνίαΤο Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην Κοινωνία
Το Σχολείο του Μέλλοντος - Ένα Σχολείο Ανοιχτό στην Κοινωνία
 
Automatizacija paljenja brodskog parnog kotla
Automatizacija paljenja brodskog parnog kotlaAutomatizacija paljenja brodskog parnog kotla
Automatizacija paljenja brodskog parnog kotla
 
Porting Motif Applications to Qt - Webinar
Porting Motif Applications to Qt - WebinarPorting Motif Applications to Qt - Webinar
Porting Motif Applications to Qt - Webinar
 
RxSwift 활용하기 - Let'Swift 2017
RxSwift 활용하기 - Let'Swift 2017RxSwift 활용하기 - Let'Swift 2017
RxSwift 활용하기 - Let'Swift 2017
 
Jeux pour l’intérieur pour l’animation d’un groupe en colonie de vacances
Jeux pour l’intérieur pour l’animation d’un groupe en colonie de vacancesJeux pour l’intérieur pour l’animation d’un groupe en colonie de vacances
Jeux pour l’intérieur pour l’animation d’un groupe en colonie de vacances
 
Ερ. Εργασία στην Τεχνολογία Α ΕΠΑΛ
Ερ. Εργασία στην Τεχνολογία Α ΕΠΑΛΕρ. Εργασία στην Τεχνολογία Α ΕΠΑΛ
Ερ. Εργασία στην Τεχνολογία Α ΕΠΑΛ
 
Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...
Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...
Zgjidhja e detyrave te kapitullit “Fizika Relativiste” te libri “Fizika 12”, ...
 
Regular Expressions Cheat Sheet
Regular Expressions Cheat SheetRegular Expressions Cheat Sheet
Regular Expressions Cheat Sheet
 
Περιγραφική αξιολόγηση
Περιγραφική αξιολόγησηΠεριγραφική αξιολόγηση
Περιγραφική αξιολόγηση
 
петар пан квиз
петар пан квизпетар пан квиз
петар пан квиз
 
Serializing EMF models with Xtext
Serializing EMF models with XtextSerializing EMF models with Xtext
Serializing EMF models with Xtext
 
Matematika за сите теми за 5 одд print
Matematika за сите теми за 5 одд print Matematika за сите теми за 5 одд print
Matematika за сите теми за 5 одд print
 

En vedette

Automatically Describing Program Structure and Behavior (PhD Defense)
Automatically Describing Program Structure and Behavior (PhD Defense)Automatically Describing Program Structure and Behavior (PhD Defense)
Automatically Describing Program Structure and Behavior (PhD Defense)Ray Buse
 
Documentation Inference for Exceptions
Documentation Inference for ExceptionsDocumentation Inference for Exceptions
Documentation Inference for ExceptionsRay Buse
 
PhD Proposal talk
PhD Proposal talkPhD Proposal talk
PhD Proposal talkRay Buse
 
The Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyThe Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyRay Buse
 
Synthesizing API Usage Examples
Synthesizing API Usage Examples Synthesizing API Usage Examples
Synthesizing API Usage Examples Ray Buse
 
A Metric for Code Readability
A Metric for Code ReadabilityA Metric for Code Readability
A Metric for Code ReadabilityRay Buse
 
MSR End of Internship Talk
MSR End of Internship TalkMSR End of Internship Talk
MSR End of Internship TalkRay Buse
 
Analytics for Software Development
Analytics for Software DevelopmentAnalytics for Software Development
Analytics for Software DevelopmentRay Buse
 
Information Needs for Software Development Analytics
Information Needs for Software Development AnalyticsInformation Needs for Software Development Analytics
Information Needs for Software Development AnalyticsRay Buse
 

En vedette (9)

Automatically Describing Program Structure and Behavior (PhD Defense)
Automatically Describing Program Structure and Behavior (PhD Defense)Automatically Describing Program Structure and Behavior (PhD Defense)
Automatically Describing Program Structure and Behavior (PhD Defense)
 
Documentation Inference for Exceptions
Documentation Inference for ExceptionsDocumentation Inference for Exceptions
Documentation Inference for Exceptions
 
PhD Proposal talk
PhD Proposal talkPhD Proposal talk
PhD Proposal talk
 
The Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyThe Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency Statically
 
Synthesizing API Usage Examples
Synthesizing API Usage Examples Synthesizing API Usage Examples
Synthesizing API Usage Examples
 
A Metric for Code Readability
A Metric for Code ReadabilityA Metric for Code Readability
A Metric for Code Readability
 
MSR End of Internship Talk
MSR End of Internship TalkMSR End of Internship Talk
MSR End of Internship Talk
 
Analytics for Software Development
Analytics for Software DevelopmentAnalytics for Software Development
Analytics for Software Development
 
Information Needs for Software Development Analytics
Information Needs for Software Development AnalyticsInformation Needs for Software Development Analytics
Information Needs for Software Development Analytics
 

Similaire à Automatically Documenting Program Changes

Cleaning your architecture with android architecture components
Cleaning your architecture with android architecture componentsCleaning your architecture with android architecture components
Cleaning your architecture with android architecture componentsDebora Gomez Bertoli
 
Clean code _v2003
 Clean code _v2003 Clean code _v2003
Clean code _v2003R696
 
Evolving a Clean, Pragmatic Architecture at JBCNConf 2019
Evolving a Clean, Pragmatic Architecture at JBCNConf 2019Evolving a Clean, Pragmatic Architecture at JBCNConf 2019
Evolving a Clean, Pragmatic Architecture at JBCNConf 2019Victor Rentea
 
Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)
Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)
Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)David Gómez García
 
Ida python intro
Ida python introIda python intro
Ida python intro小静 安
 
Can't Dance The Lambda
Can't Dance The LambdaCan't Dance The Lambda
Can't Dance The LambdaTogakangaroo
 
C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2Hammad Rajjoub
 
C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2Hammad Rajjoub
 
Evolutionary db development
Evolutionary db development Evolutionary db development
Evolutionary db development Open Party
 
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache FlinkMaximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache FlinkFlink Forward
 
Core .NET Framework 4.0 Enhancements
Core .NET Framework 4.0 EnhancementsCore .NET Framework 4.0 Enhancements
Core .NET Framework 4.0 EnhancementsRobert MacLean
 
Go 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX GoGo 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX GoRodolfo Carvalho
 
닷넷 개발자를 위한 패턴이야기
닷넷 개발자를 위한 패턴이야기닷넷 개발자를 위한 패턴이야기
닷넷 개발자를 위한 패턴이야기YoungSu Son
 
The Ring programming language version 1.6 book - Part 184 of 189
The Ring programming language version 1.6 book - Part 184 of 189The Ring programming language version 1.6 book - Part 184 of 189
The Ring programming language version 1.6 book - Part 184 of 189Mahmoud Samir Fayed
 
Category theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) DataCategory theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) Datagreenwop
 
The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202Mahmoud Samir Fayed
 
The Ring programming language version 1.7 book - Part 7 of 196
The Ring programming language version 1.7 book - Part 7 of 196The Ring programming language version 1.7 book - Part 7 of 196
The Ring programming language version 1.7 book - Part 7 of 196Mahmoud Samir Fayed
 
Ruslan Platonov - Transactions
Ruslan Platonov - TransactionsRuslan Platonov - Transactions
Ruslan Platonov - TransactionsDmitry Buzdin
 
Visual studio 2008
Visual studio 2008Visual studio 2008
Visual studio 2008Luis Enrique
 

Similaire à Automatically Documenting Program Changes (20)

Cleaning your architecture with android architecture components
Cleaning your architecture with android architecture componentsCleaning your architecture with android architecture components
Cleaning your architecture with android architecture components
 
Clean code _v2003
 Clean code _v2003 Clean code _v2003
Clean code _v2003
 
Evolving a Clean, Pragmatic Architecture at JBCNConf 2019
Evolving a Clean, Pragmatic Architecture at JBCNConf 2019Evolving a Clean, Pragmatic Architecture at JBCNConf 2019
Evolving a Clean, Pragmatic Architecture at JBCNConf 2019
 
Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)
Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)
Midiendo la calidad de código en WTF/Min (Revisado EUI Abril 2014)
 
Ida python intro
Ida python introIda python intro
Ida python intro
 
Can't Dance The Lambda
Can't Dance The LambdaCan't Dance The Lambda
Can't Dance The Lambda
 
C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2
 
C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2C:\Fakepath\Combating Software Entropy 2
C:\Fakepath\Combating Software Entropy 2
 
Evolutionary db development
Evolutionary db development Evolutionary db development
Evolutionary db development
 
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache FlinkMaximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
 
Core .NET Framework 4.0 Enhancements
Core .NET Framework 4.0 EnhancementsCore .NET Framework 4.0 Enhancements
Core .NET Framework 4.0 Enhancements
 
Go 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX GoGo 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX Go
 
닷넷 개발자를 위한 패턴이야기
닷넷 개발자를 위한 패턴이야기닷넷 개발자를 위한 패턴이야기
닷넷 개발자를 위한 패턴이야기
 
The Ring programming language version 1.6 book - Part 184 of 189
The Ring programming language version 1.6 book - Part 184 of 189The Ring programming language version 1.6 book - Part 184 of 189
The Ring programming language version 1.6 book - Part 184 of 189
 
Refactoring
RefactoringRefactoring
Refactoring
 
Category theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) DataCategory theory, Monads, and Duality in the world of (BIG) Data
Category theory, Monads, and Duality in the world of (BIG) Data
 
The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202
 
The Ring programming language version 1.7 book - Part 7 of 196
The Ring programming language version 1.7 book - Part 7 of 196The Ring programming language version 1.7 book - Part 7 of 196
The Ring programming language version 1.7 book - Part 7 of 196
 
Ruslan Platonov - Transactions
Ruslan Platonov - TransactionsRuslan Platonov - Transactions
Ruslan Platonov - Transactions
 
Visual studio 2008
Visual studio 2008Visual studio 2008
Visual studio 2008
 

Dernier

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 

Dernier (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Automatically Documenting Program Changes

  • 1. Automatically Documenting Program Changes Ray Buse Wes Weimer De ltaDoc diffs ASE 9.22.2010 Antwerp, Belgium Commit Messages
  • 6. Managing Change Developers Evaluation Managers
  • 7. Understanding change remains difficult Peter Hallam. What Do Programmers Really Do Anyway? Microsoft Developer Network (MSDN) – C# Compiler. Jan 2006.
  • 8. Diff 19c19 , 22 < else return ""; --- > else return pageParts[0]; > // else return ""; JabRef Revision 3066
  • 10. Commit Messages Free-form text which may describe What the change was. Why the change was made. Phex 3542 Minor change jfreechart rev 3405 (start): Changed from Date to long, (end): Likewise, (getStartMillis): New method, (getEndMillis): Likewise, (getStart): Returns new date instance, (getEnd): Likewise. Jabref rev 2917 Fixed NullPointerException when downloading external file and file directory is undefined.
  • 11. Toby, Going forward, could I ask you to be more descriptive in your commit messages? Ideally you should state what you've changed and also why (unless it's obvious)... I know you're busy and this takes more time, but it will help anyone who looks through the log ... http://lists.macosforge.org/pipermail/macports-dev/2009-June/008881.html Subject: An appeal for more descriptive commit messages I know there is a lot going on but please can we be a bit more descriptive when committing changes. Recent log messages have included: "some cleanup" "more external service work" "Fixed a bug in wiring" which are a lot less informative than others... http://osdir.com/ml/apache.webservices.tuscany.devel/2006-02/msg00227.html Sorry to be a pain in the neck about this, but could we please use more descriptive commit messages? I do try to read the commit emails, but since the vast majority of comments are "CAY-XYZ", I can't really tell what's going on unless I then look it up. http://osdir.com/ml/java.cayenne.devel/2006-10/msg00044.html
  • 12. DeltaDoc Describes the observable EFFECT of a change Conditions that trigger the changed code How the change impacts functional behavior and program state Symbolic Execution Summarization Transformations
  • 13. DeltaDoc Diff 19c19 , 22 < else return ""; --- > else return pageParts[0]; > // else return ""; DeltaDoc When calling LastPageformat(String s) If s is not nulland s.split ("[-]+").length != 2 returns.split ("[-]+")[0] instead of ""
  • 14. DeltaDoc Commit message Temporary removed the trade routes from the game menu. DeltaDoc When calling FreeColMenuBarbuildOrdersMenu No longer call JMenu.add(getMenuItem("assignTradeRouteAction")) When calling FreeColMenuBarbuildViewMenu No longer call JMenu.add(getMenuItem("tradeRouteAction")) Freecol rev 2085
  • 15. Documenting Change Why was it changed? What was changed?
  • 16. Commit Messages Why? What? Current Practice
  • 17. DeltaDoc DeltaDoc Commit Messages Why? What?
  • 18. Hypothesis This area is large. DeltaDoc Commit Messages Why? What?
  • 19. With DeltaDoc DeltaDoc Commit Messages Why? What?
  • 20. The rest of this talk Approach: How DeltaDoc works. Evaluation: Comparing DeltaDoc to Commit messages.
  • 22. DeltaDoc Architecture Compute symbolic path predicates for each statement.
  • 23. DeltaDoc Architecture When X, Do Y Instead of Z Identify statements that have been added, removed, or have a different predicate.
  • 24. DeltaDoc Architecture Apply summarization transformations until result is sufficiently concise.
  • 25. Predicate Generation StringsayHello(String name) { String ret = “”; if(name != null) { if(System.Lang == ENG) ret = ret + “Hello ”; else ret = ret + “Bonjour ”; ret = ret + name; } returnret; }
  • 26. Predicate Generation Enumerate loop-free control flow paths. String ret = “”; name != null System.Lang == ENG ret = ret + “Hello ”; ret = ret + name; returnret; String ret = “”; name != null System.Lang != ENG ret = ret + “Bonjour ”; ret = ret + name; returnret; String ret = “”; name == null returnret;
  • 31. Change 31 StringsayHello(String name) { String ret = “”; if(name != null) { if(System.Lang == ENG) ret = ret + “Hello ”; else ret = ret + “Bonjour ”; ret = ret + name; } returnret; } StringsayHello(String name) { String ret = “”; if(name != null) { if(System.Lang == ENG) ret = ret + “Hi ”; else ret = ret + “Bonjour ”; ret = ret + name; } returnret; }
  • 34. Generate Documentation When calling sayHello(String name) Ifname != null ANDSystem.Lang == ENG return“Hi ” + name Instead of“Hello ” + name
  • 35. Summarization Remove irrelevant path predicates. If s != nulland a is true and b is true and c is true returns If s != null returns
  • 36. Summarization Re-arrange terms Simplification Readability enhancements based on Java idioms If P andQ, DoX IfR, DoY If P andQ, DoX IfP andQ andR, DoY
  • 39. Size Comparison Diffs are about 35 lines DeltaDocs are about 9 lines Commit messages are always less than 10 lines
  • 40. Content Comparison How large is this area? DeltaDoc Commit Messages Why? What?
  • 41. Content Comparison DeltaDoc CommitMessages
  • 42. Content Comparison DeltaDoc Commit Messages RelationalForm RelationalForm
  • 44. Relational Form Example has an insufficient amount of gold getPriceForBuilding() > getOwner().getGold() getPriceForBuilding() > getOwner().getGold() ? > gold
  • 45. Score Metric Conservatively assume only relations from commit messages are important. Reward precision. Used 16 human annotators to validate. Score of 0.5 indicates that the DeltaDoc contained all the information in the commit message.
  • 46. Example Score = 0.5 Commit Message no need to call clear() DeltaDoc When calling PdfContentByte reset() If stateList .isEmpty(), No longer call stateList .clear() iText Rev 3837
  • 47. Example Score > 0.5 Commit Message Commented unused constant DeltaDoc removed field : EuropePanel : int TITLE_FONT_SIZE Freecol rev 2054
  • 48. Example Score < 0.5 Commit Message Fixed bug: content selector for ‘editor‘ field uses ‘,' instead of ‘and' as delimiter. DeltaDoc When calling EntryEditorgetExtra() If ed.getFieldName().equals("editor") call contentSelectors.add(FieldContentSelector) JabRef Rev 3111
  • 51. Results About 89% coverage. DeltaDoc Commit Messages Why? What?
  • 52. Qualitative Evaluation “very useful" “highly useful” "would be a great supplement" “definitely a useful supplement" “can help make the logic clear” “often easier to understand“ “more accurate” “easy to read" “provides more information"
  • 53. DeltaDoc Limitations Intraprocedural Handling of loops not fully-precise Less precise for large changes Does not address reason for change
  • 54. DeltaDoc Advantages Cheap Can be computed in about a second on average. Suitable for quick adoption Can supplement or replace many existing commit messages. Structured Suitable for search. Reliable
  • 55. Questions? DeltaDoc Commit Messages Why? What?

Notes de l'éditeur

  1. Not a great deal is certain when it comes to software development. But one thing that is certain is …
  2. … change. There are changes to … requirements and specificationswork items and bug lists. documentation. development teams and their managementAnd of course there are changes to code.
  3. … and there are a lot of changes …
  4. Many open source projects had over 1,000 commits last year. Large projects can have more. Mozilla for example had over 10,000 commits, and about 28,000 new bug reports.
  5. To manage and track all these changes developers use many kinds of tools. Bug databases, work item databases, and of course source control repositories. More and more we see these tools integrated together in systems like IBM’s Jazz and Microsoft’s Team Foundation Server. Clearly managing change is critical …
  6. It’s important for facilitating collaboration between developers, especially when they are geographically distributed. Developers might wish to validate changes, locate and triage defects, or simply understand modifications.It’s important for helping managers understand what going on, something that critical to the success of a project. And ultimately, understanding the history of a project is essential to evaluating it’s success.
  7. But even with available tools understanding how a program is changing over time is not a simple proposition. And at it’s core, the problem is that program source code is difficult to understand. Professional developers spend the majority of their time trying to understand code.Let’s take a look at an example of this …
  8. Suppose I’d like to understand this change, maybe I’m looking for a bug, or maybe I’m responsible for this particular file and I want to make sure this change, from some other developer is ok. What we have here is the standard diff output for a change to some java code showing exactly which lines changed. The problem is that diff output is not always easy to understand because we only get a slice of the program. Here we can see that the first element of a local array called pageParts is being returned instead of an empty string … but we don’t know when that happens, and its also not obvious what pageParts contains, or whether this is a reasonable thing to do.Because this tends to happen often there is also “context diff” which prints out some extra lines above and below the changed lines. But in this case pageParts is defined six lines up, so what we probably need to do is use a tool that gives a side-by side comparison …
  9. The problem is that now we’re back to reading the code which is that difficult and time consuming task we want to avoid.
  10. This is one of the reasons we have “commit messages” in version control systems. These are free form text fields that enable developers to summarize their changes so that others can see what they’ve done without having to puzzle through diffs or read code. In many projects they are required. The problem is that not only do they take time to make, these messages aren&apos;t always as complete and accurate as we would like …
  11. Take a look at a few developer message board posts. These comments allude to the observation that although commit messages are useful, they are also burdensome for developers to create. Take the example on the top right which says …So perhaps what we would really like is a tool that’s automatic like diff, but that produces output that’s more to-the-point or summarized and easier to understand like commit messages.
  12. What I’ll talk about today is a tool that we developed called “DeltaDoc” which automatically documents code changes in a way that is often more concise and more readable than existing tools like diff. In fact, we’ll show that it’s really more comparable to commit messages.DeltaDoc describes the EFFECT of a change on the runtime behavior of a program. That is to say “what conditions trigger the change?” and “what are the observable differences in program behavior?” DeltaDoc is based on a combination of symbolic execution and also a set of transformation summarizations that I’ll describe in a bit.In returning to our previous example…
  13. … by applying DeltaDoc We can see the change in the functions behavior phrased in terms of the argument: String s.The DeltaDoc tells us that if s is not null and the result of spiting the string on hyphens is not length 2, then it returns the first item instead of returning an empty string.
  14. Here’s another short example, this time comparing to a human written commit message which says “Temporarily removed the trade routes from the game menu”Below that, the deltadoc shows that trade route menu items are no longer added to these two menus: buildOrders and build view. Of course, the deltadoc doesn&apos;t say that it’s a temporary change, which alludes to the fact that there are really two kinds of information we might document …
  15. In particular we distinguish between information that describes what has changed (for example added function x) from why it was changed (for example, fixed bug number 42) and other information like: “this is temporary.”
  16. Both types of information are important, and a majority of commit messages contain some of both.
  17. The goal of deltadoc is to cover as much of the important “what” information as possible, while still being concise.Note that by design, deltadocdosen’t cover all information that can be documented, and there are some reasons for that including that it is INTRA-procedural, it only documents external visible effects, and it makes some summarizations.
  18. But given that, If the intersection between the information in commit messages and the information in the corresponding deltadoc is large (as we hypothesize). Then a significant portion of the documentation work done by developers could be avoided with the use of deltadoc.
  19. If that’s true then developers could spend more time explaining why the change was made, or they could simply use that time to do other things, like make more changes.
  20. In the rest of this talk I’ll discuss how deltadoc works, and how we evaluated it .. by comparing it directly to commit messages with the help of some human annotators.
  21. First I’ll overview the DeltaDoc algorithm which takes as input two versions of a file and outputs human-readable text describing the change.
  22. The first step is compute symbolic path predicates for each statement of the program before and after the change.
  23. Then identify statements that have been added, removed, or have a different predicate and distill from that a documentation on the form: When X, Do Y, instead of Z
  24. Finally, because our goal is to mimic human-written documentation, we apply a set of transformations which trade-off precision for space. For example, we re-arrange terms to make the final documentation more concise.
  25. Our first step is to obtain intraprocedural path predicates, and we do that with symbolic execution.For an example, consider this function which produces a greeting prefixed by hello or bonjour depending on which language the system is using. Choosing between languages being especially relevant since we are in begium.
  26. In this example there are just three paths through the function to consider: one where the argument name is null, and then one for each of the language options.
  27. The next step is to execute each path symbolically. For each stmt we compute a symbolic description as well as the path predicate which is the conjunction of all the branch guards.
  28. Here’s the same table for the second path where we note, for example, in the last row we return “hello” + name if name !=null and the system language is English.
  29. And then the third path is just what you’d expect, similar to the second.
  30. And finally, we only keep around information about stmts which are relevant to the externally visible behavior. So in this case we just need to remember the return stmts.
  31. Now if there is a change to the function, like Hello is changed to Hi …
  32. We would do predicate generation just like before …
  33. … and notice that there is a difference in the functional behavior … here the predicate is the same, but Hello has changed to Hi
  34. From that it’s fairly straight forward to distill a documentation string like this one [read]Now in some cases, this raw documentation is too complicated or verbose to be a good replacement for commit messages …
  35. … so in an effort to better mimic human-written documentation -- which is rarely longer then a few lines --we apply a set of potentially precision reducing transformations.In one such transformation, we drop parts of the predicate that are not relevant the statements documented. Here, since we’re talking about returning s we guess that the important part of the path predicate is s != null, so we hide the rest of it.
  36. We also preform changes like this one where we reduce redundancy by re-arranging terms into a more hierarchical structure.Other types of transformations simplify programmic expressions or make them more readable.But there are others, so for a more complete description of all the summarization please have a look at the paper.
  37. Now that we’ve seen how deltadoc works, I’ll talk about how we evaluated it in three dimensions …we looked at the size of deltadoc output, the information content, and we used human annotators to qualitatively judge it’s usefulness.
  38. We looked at a total of 1000 recent revisions to 5 popular open-source java projects across several domains. In addition to the size of the projects in lines of code we also note the number of active authors since one of the main reasons for using change documentation is to facilitate collaboration between developers.
  39. This chart compares the size of deltadoc to standard diffs and also commit messages.Our takeaway is that while diffs averaged 35 lines, deltadocs were only about 9 lines, which is a lot closer to human-written commit messages which are usually only a few lines long and never more than 10 in this sample.While this makes deltadoc potentially suitable for supplementing commit messages in plain text, we can also imagine a tool that can allow developers to interact with the documentation … something that could allow them to drill down on demand to find out what’s going on more precisely but without being overwhelmed with too much detail initially.
  40. Returning to our earlier figure, we hypothesized that much of the information contained in commit messages could be found in deltadoc.
  41. In order to test this we need to establish a way to compare the information content in deltadoc to the information contained in commit messages. But because commit messages are natural language and DeltaDoc is not we need to be a little bit clever to do this scientifically …
  42. DeltaDoc and human written commit messages arent directly comparable, so we introduce a third form which we call the “relational form” which represents the information content in each type of documentation, and it is in this form that we can make a precise comparison.
  43. Relational form is so-named because it consists of a very specific set of relations between program variables. This makes translating to relational form (even from human-written text) straightforward and it also makes comparing two artifacts in relational form easily quantifiable.
  44. For a quick example consider comparing the deltadoc “getpriceforbuilding &gt; getowner.getgold” with the human written message “has an insufficient amount of gold”In this case the deltadoc is essentially already in relational form but in the case of the human doc we must manually extract the relation “something” is greater than goldAfter we extract all such relations we manually look for a mapping between them, which ultimately allows us to quantify the difference in information using what we call the “score metric”
  45. For the details, I refer you to our paper. But at a high level we designed the metric to be tough on deltadoc. In particular, we assume that only relations from commit messages are important, and we don’t reward the deltadoc even when it produces other potentially valuable information.The only way for deltadoc to score better than a commit message to be more precise.So something like “updated to revision 5” gets slightly more points than “updated revision” or just “minor change”We used 16 students from a graduate programming language class to validate the comparison.The score can range from 0 to 1, where A score of at least 0.5 indicates that deltadoc contained all of the (what-type) information present in the commit message … that’s the primarly goal.
  46. Add another example
  47. This graph gives the average score metric for the changes sampled from each benchmark.With an average greater than 0.5, we conclude that deltadoc contained at least as much information as corresponding commit messages most of the time.
  48. Here’s another view of the same data, which breaks down the distribution a bit showing how deltadoc is less complete or precise than commit messages only about 10% of them time. For the vast majority of commits, the deltadoc scores 0.5 or better.
  49. So returning to our earlier hypothesis, we find that this intersection is indeed large. But is DeltaDoc useful?
  50. The answer we got from our annotators, who had the chance to read many commit messages and deltadocs, what generally “yes”. In about two thirds of cases they either preferred deltadoc or had no preference in comparison to the existing commit message.
  51. To conclude, like any program analysis deltadoc has both theoretical and practical limitations.
  52. We believe that documentation of this form has a number of advantages. It’s fast and fully automatic, its suitable for supplementing commit messages right now, the output is structured – so it’s amenable to search, and it’s both predictable and reliable.