SlideShare une entreprise Scribd logo
1  sur  20
Télécharger pour lire hors ligne
On the Use of
     Relevance Feedback in
   IR-Based Concept Location


Gregory Gay*, Sonia Haiduc**, Andrian Marcus**, Tim
                     Menzies*

   * West Virginia University, Morgantown, WV, USA
      ** Wayne State University, Detroit, MI, USA
Software change
IR-based concept location
      Query




Ranked list of results   Source code
Challenge: the query




• Text in the query needs to match the text in the
  source code
• Difficult to formulate good queries
  - unfamiliar source code
  - unknown target
  -> hard to describe something that you do not know
Eclipse bug #13926

Bug description:
 JFace Text Editor Leaves a Black
 Rectangle on Content Assist text insertion.
 Inserting a selected completion proposal
 from the context information popup causes
 a black rectangle to appear on top of the
 display.
Queries
• Q1: jface text editor black rectangle insert text

• Q2: jface text editor rectangle insert context
  information

• Q3: jface text editor content assist

• Q4: jface insert selected completion proposal
  context information
Queries and results
• Q1: jface text editor black rectangle insert text
  – position of modified method: 7496
• Q2: jface text editor rectangle insert context
  information
  – position of modified method: 258
• Q3: jface text editor content assist
  – position of modified method: 119
• Q4: jface insert selected completion proposal
  context information
  – position of modified method: 723

  Whole change request: 54
IR CL in unfamiliar software

Developers:
• Rarely begin with a good query: hard to choose
  the right words
• Analyze very briefly list of results before
  reformulating query
• Even after reformulation, vague idea of what to
  look for -> queries not always better
• Can recognize whether the results retrieved are
  relevant or not to the problem
Questions

• Is there a way to make the query formulation
  easier on the developers?

• Is there a way to ensure that the subsequent
  queries lead us in the right direction?

• Can we do this by following the common
  practices of the developers?

• Can we improve IR-based CL using this
  approach?
Relevance feedback

•   Uses developer feedback about relevancy of
    returned results to automatically reformulate
    queries
•   Queries are reformulated by:
    –   Adding terms from relevant documents
    –   Removing terms from irrelevant documents
•   Iterative process
•   Common technique in text retrieval
•   Used also in SE
JFace Text Editor Leaves a Black Rectangle on Content Assist text
   insertion. Inserting a selected completion proposal from the
   context information popup causes a black rectangle to appear on
   top of the display.

1. createContextInfoPopup() in
   org.eclipse.jface.text.contentassist.ContextInformationPopup
2. configure() in
   org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference
3. showContextProposals() in
   org.eclipse.jface.text.contentassist.ContextInformationPopup


     + words in      documents          - words in      documents


                               New Query
IRRF tool
• IR Engine: Lucene
  – based on the Vector Space Model (VSM)
  – input: methods, query
  – output: a ranked list of methods ordered by their
    textual similarity to the query

• Relevance feedback: Rocchio algorithm
  – the classic algorithm for RF; used also in SE
  – models a way of incorporating relevance
    feedback information into the VSM
Evaluation

• Extracted bug descriptions and set of methods
  modified in the bug fixes from bug tracking
  systems
• Consider bug descriptions as initial queries for IR
• Measure #methods investigated until reaching a
  modified method before and after using RF
• Relevance feedback:
  – one developer provides feedback
  – feedback round ends after marking N methods as
    relevant or irrelevant (N = 1, 3 ,5)
Stop criteria

• Target method in top N results

• More than 50 methods analyzed

• Position of target methods in the ranked list
  of results increases for 2 consecutive rounds
  -> query moving away from wanted methods
Systems


 System     Vers.   LOC    Methods      Classes
 Eclipse     2.0 2,500,000 74,996        7,500
  jEdit     4.2      300,000   5,366     750
Adempiere   3.1.0    330,000   28,622   1,900
Results

 System     RF improves   RF does not
                 IR       improve IR
 Eclipse          6            1

  jEdit         3             3

Adempiere       4             1

   All          13            5
Results
• Eclipse:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
  19686        428        453 (5r)   48 (16r)    46 m(9r)


• jEdit:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 1607211       354        103(5r)     36 (12r)    28 (6r)


• Adempiere:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 1628050        52         3 (3r)      5 (2r)      7 (2r)
Questions – revisited (1)
• Is there a way to make the query formulation
  easier on the developers?
  – automatic query formulation


• Is there a way to ensure that the subsequent
  queries lead us in the right direction?
  – add terms from relevant documents, remove terms
    from irrelevant documents
  – stop when we move away from the target (results
    worsen for 2 consecutive rounds)
Questions – revisited (2)
• Can we do this by following the common
  practices of the developers?
  – developers still analyze only a few results in
    the result list before reformulation


• Can we improve IR-based CL using
  relevance feedback?
  – in some cases yes
Current and future work
• Studies involving more systems and more
  developers

• Automatically calibrating the parameters
  for a specific system and a specific set of
  change requests

• Study the circumstances when RF does
  not improve IR

Contenu connexe

Tendances

Softwaretestingstrategies
SoftwaretestingstrategiesSoftwaretestingstrategies
Softwaretestingstrategiessaieswar19
 
SYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler ConstructionSYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler ConstructionZain Abid
 
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...ijaia
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Praveen Penumathsa
 
Fundamentals of Software Engineering
Fundamentals of Software Engineering Fundamentals of Software Engineering
Fundamentals of Software Engineering Madhar Khan Pathan
 
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...Twitter Inc.
 
Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)SungdoGu
 
Dynamic analysis in Software Testing
Dynamic analysis in Software TestingDynamic analysis in Software Testing
Dynamic analysis in Software TestingSagar Pednekar
 
Mca se chapter_9_formal_methods
Mca se chapter_9_formal_methodsMca se chapter_9_formal_methods
Mca se chapter_9_formal_methodsAman Adhikari
 
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...CS, NcState
 
Introduction To Algorithms
Introduction To AlgorithmsIntroduction To Algorithms
Introduction To AlgorithmsKM Bappi
 
Software testing- an introduction
Software testing- an introductionSoftware testing- an introduction
Software testing- an introductionSanthi Priyan
 
Unit 5 testing -software quality assurance
Unit 5  testing -software quality assuranceUnit 5  testing -software quality assurance
Unit 5 testing -software quality assurancegopal10scs185
 

Tendances (19)

Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Softwaretestingstrategies
SoftwaretestingstrategiesSoftwaretestingstrategies
Softwaretestingstrategies
 
SYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler ConstructionSYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler Construction
 
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
 
Cyclomatic complexity
Cyclomatic complexityCyclomatic complexity
Cyclomatic complexity
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
 
Formal Methods
Formal MethodsFormal Methods
Formal Methods
 
Complexity metrics and models
Complexity metrics and modelsComplexity metrics and models
Complexity metrics and models
 
DISE - Programming Concepts
DISE - Programming ConceptsDISE - Programming Concepts
DISE - Programming Concepts
 
Fundamentals of Software Engineering
Fundamentals of Software Engineering Fundamentals of Software Engineering
Fundamentals of Software Engineering
 
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
 
Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)
 
Dynamic analysis in Software Testing
Dynamic analysis in Software TestingDynamic analysis in Software Testing
Dynamic analysis in Software Testing
 
Mca se chapter_9_formal_methods
Mca se chapter_9_formal_methodsMca se chapter_9_formal_methods
Mca se chapter_9_formal_methods
 
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
 
Introduction To Algorithms
Introduction To AlgorithmsIntroduction To Algorithms
Introduction To Algorithms
 
Software testing- an introduction
Software testing- an introductionSoftware testing- an introduction
Software testing- an introduction
 
Testing ppt
Testing pptTesting ppt
Testing ppt
 
Unit 5 testing -software quality assurance
Unit 5  testing -software quality assuranceUnit 5  testing -software quality assurance
Unit 5 testing -software quality assurance
 

En vedette

Functions of information retrival system(1)
Functions of information retrival system(1)Functions of information retrival system(1)
Functions of information retrival system(1)silambu111
 
Information retrieval concept, practice and challenge
Information retrieval   concept, practice and challengeInformation retrieval   concept, practice and challenge
Information retrieval concept, practice and challengeGan Keng Hoon
 
Information Retrieval Challenges
Information Retrieval ChallengesInformation Retrieval Challenges
Information Retrieval ChallengesBruno Pedro
 
Data Mining and Information Retrival
Data Mining and Information RetrivalData Mining and Information Retrival
Data Mining and Information Retrivalketan shete
 
Information retrival system and PageRank algorithm
Information retrival system and PageRank algorithmInformation retrival system and PageRank algorithm
Information retrival system and PageRank algorithmRupali Bhatnagar
 
Information retrieval system!
Information retrieval system!Information retrieval system!
Information retrieval system!Jane Garay
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval ssilambu111
 

En vedette (7)

Functions of information retrival system(1)
Functions of information retrival system(1)Functions of information retrival system(1)
Functions of information retrival system(1)
 
Information retrieval concept, practice and challenge
Information retrieval   concept, practice and challengeInformation retrieval   concept, practice and challenge
Information retrieval concept, practice and challenge
 
Information Retrieval Challenges
Information Retrieval ChallengesInformation Retrieval Challenges
Information Retrieval Challenges
 
Data Mining and Information Retrival
Data Mining and Information RetrivalData Mining and Information Retrival
Data Mining and Information Retrival
 
Information retrival system and PageRank algorithm
Information retrival system and PageRank algorithmInformation retrival system and PageRank algorithm
Information retrival system and PageRank algorithm
 
Information retrieval system!
Information retrieval system!Information retrieval system!
Information retrieval system!
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval s
 

Similaire à Concept Location using Information Retrieval and Relevance Feedback

Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationSonia Haiduc
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsAravind Sesagiri Raamkumar
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Lucidworks
 
Building a Meta-search Engine
Building a Meta-search EngineBuilding a Meta-search Engine
Building a Meta-search EngineAyan Chandra
 
Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1arthi v
 
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...David Zibriczky
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenPoo Kuan Hoong
 
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Harald Steck
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET-  	  Question-Answer Text Mining using Machine LearningIRJET-  	  Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET Journal
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET Journal
 
Test Driven Development
Test Driven DevelopmentTest Driven Development
Test Driven Developmentnikhil sreeni
 
IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12BakerTilly US
 
Managing software project, software engineering
Managing software project, software engineeringManaging software project, software engineering
Managing software project, software engineeringRupesh Vaishnav
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia VoulibasiISSEL
 
Software Measurement and Metrics.pptx
Software Measurement and Metrics.pptxSoftware Measurement and Metrics.pptx
Software Measurement and Metrics.pptxubaidullah75790
 
Effective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionsEffective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionseXascale Infolab
 

Similaire à Concept Location using Information Retrieval and Relevance Feedback (20)

Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept Location
 
Apsec 2014 Presentation
Apsec 2014 PresentationApsec 2014 Presentation
Apsec 2014 Presentation
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender Systems
 
Query processing
Query processingQuery processing
Query processing
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
 
Building a Meta-search Engine
Building a Meta-search EngineBuilding a Meta-search Engine
Building a Meta-search Engine
 
Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1
 
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
 
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET-  	  Question-Answer Text Mining using Machine LearningIRJET-  	  Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine Learning
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine Learning
 
Test Driven Development
Test Driven DevelopmentTest Driven Development
Test Driven Development
 
IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12
 
Managing software project, software engineering
Managing software project, software engineeringManaging software project, software engineering
Managing software project, software engineering
 
Development Guideline
Development GuidelineDevelopment Guideline
Development Guideline
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
 
Software Measurement and Metrics.pptx
Software Measurement and Metrics.pptxSoftware Measurement and Metrics.pptx
Software Measurement and Metrics.pptx
 
Effective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionsEffective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web Collections
 

Dernier

Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfChris Hunter
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docxPoojaSen20
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 

Dernier (20)

Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 

Concept Location using Information Retrieval and Relevance Feedback

  • 1. On the Use of Relevance Feedback in IR-Based Concept Location Gregory Gay*, Sonia Haiduc**, Andrian Marcus**, Tim Menzies* * West Virginia University, Morgantown, WV, USA ** Wayne State University, Detroit, MI, USA
  • 3. IR-based concept location Query Ranked list of results Source code
  • 4. Challenge: the query • Text in the query needs to match the text in the source code • Difficult to formulate good queries - unfamiliar source code - unknown target -> hard to describe something that you do not know
  • 5. Eclipse bug #13926 Bug description: JFace Text Editor Leaves a Black Rectangle on Content Assist text insertion. Inserting a selected completion proposal from the context information popup causes a black rectangle to appear on top of the display.
  • 6. Queries • Q1: jface text editor black rectangle insert text • Q2: jface text editor rectangle insert context information • Q3: jface text editor content assist • Q4: jface insert selected completion proposal context information
  • 7. Queries and results • Q1: jface text editor black rectangle insert text – position of modified method: 7496 • Q2: jface text editor rectangle insert context information – position of modified method: 258 • Q3: jface text editor content assist – position of modified method: 119 • Q4: jface insert selected completion proposal context information – position of modified method: 723 Whole change request: 54
  • 8. IR CL in unfamiliar software Developers: • Rarely begin with a good query: hard to choose the right words • Analyze very briefly list of results before reformulating query • Even after reformulation, vague idea of what to look for -> queries not always better • Can recognize whether the results retrieved are relevant or not to the problem
  • 9. Questions • Is there a way to make the query formulation easier on the developers? • Is there a way to ensure that the subsequent queries lead us in the right direction? • Can we do this by following the common practices of the developers? • Can we improve IR-based CL using this approach?
  • 10. Relevance feedback • Uses developer feedback about relevancy of returned results to automatically reformulate queries • Queries are reformulated by: – Adding terms from relevant documents – Removing terms from irrelevant documents • Iterative process • Common technique in text retrieval • Used also in SE
  • 11. JFace Text Editor Leaves a Black Rectangle on Content Assist text insertion. Inserting a selected completion proposal from the context information popup causes a black rectangle to appear on top of the display. 1. createContextInfoPopup() in org.eclipse.jface.text.contentassist.ContextInformationPopup 2. configure() in org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference 3. showContextProposals() in org.eclipse.jface.text.contentassist.ContextInformationPopup + words in documents - words in documents New Query
  • 12. IRRF tool • IR Engine: Lucene – based on the Vector Space Model (VSM) – input: methods, query – output: a ranked list of methods ordered by their textual similarity to the query • Relevance feedback: Rocchio algorithm – the classic algorithm for RF; used also in SE – models a way of incorporating relevance feedback information into the VSM
  • 13. Evaluation • Extracted bug descriptions and set of methods modified in the bug fixes from bug tracking systems • Consider bug descriptions as initial queries for IR • Measure #methods investigated until reaching a modified method before and after using RF • Relevance feedback: – one developer provides feedback – feedback round ends after marking N methods as relevant or irrelevant (N = 1, 3 ,5)
  • 14. Stop criteria • Target method in top N results • More than 50 methods analyzed • Position of target methods in the ranked list of results increases for 2 consecutive rounds -> query moving away from wanted methods
  • 15. Systems System Vers. LOC Methods Classes Eclipse 2.0 2,500,000 74,996 7,500 jEdit 4.2 300,000 5,366 750 Adempiere 3.1.0 330,000 28,622 1,900
  • 16. Results System RF improves RF does not IR improve IR Eclipse 6 1 jEdit 3 3 Adempiere 4 1 All 13 5
  • 17. Results • Eclipse: Report # Baseline IRRF N=1 IRRF N=3 IRRF N=5 19686 428 453 (5r) 48 (16r) 46 m(9r) • jEdit: Report # Baseline IRRF N=1 IRRF N=3 IRRF N=5 1607211 354 103(5r) 36 (12r) 28 (6r) • Adempiere: Report # Baseline IRRF N=1 IRRF N=3 IRRF N=5 1628050 52 3 (3r) 5 (2r) 7 (2r)
  • 18. Questions – revisited (1) • Is there a way to make the query formulation easier on the developers? – automatic query formulation • Is there a way to ensure that the subsequent queries lead us in the right direction? – add terms from relevant documents, remove terms from irrelevant documents – stop when we move away from the target (results worsen for 2 consecutive rounds)
  • 19. Questions – revisited (2) • Can we do this by following the common practices of the developers? – developers still analyze only a few results in the result list before reformulation • Can we improve IR-based CL using relevance feedback? – in some cases yes
  • 20. Current and future work • Studies involving more systems and more developers • Automatically calibrating the parameters for a specific system and a specific set of change requests • Study the circumstances when RF does not improve IR