Regression Testing Minimisation, Selection
and Prioritisation: A Survey
S. Yoo, M. Harman
John Reese, Senior Software Engineer
Introduction
Survey of 159 papers on test suite minimization, regression test selection, and test
case prioritization.
Intention is not to undertake a systematic review, but rather to provide a broad
state-of-the-art view on these related fields.
Note: I’m going to go back and forth between spelling minimization and
prioritisation with s’s and z’s.
Regression Testing: Provide confidence that the newly introduced changes do not
obstruct the behaviors of the existing, unchanged part of the software.
Difficulties Include:
• Black-box development with 3rd party applications
• Agile development
Note: Most straightforward approach is “retest-all”, but may not be viable in all
scenarios
A number of different approaches have been studied to aid the regression testing
process. Three major branches include:
Test Suite Minimization: Process that seeks to identify and then eliminate the
obsolete or redundant test cases from the test suite.
Test Case Selection: Select a subset of test cases that will be used to test the
changed parts of the software.
Test Case Prioritization: Identify the “ideal” ordering of test cases that maximizes the
desirable properties, such as early fault detection.
Overview
1. Motivation
2. Background
3. Test Case Selection
4. Test Suite Minimization
5. Test Case Prioritization
6. Summary and Conclusion
7. Suggestions
8. Lessons Learned
Motivation
Why is this the right set of topics for a survey?
• Each topic is related by a common thread of optimization of already existing test
cases.
• All differ from areas that focus on test case generation.
• Intimate relationship between the topics (e.g. minimization could be performed
by prioritizing a set of cases and choosing the first N).
Is there already a recent survey in this area?
• Most similar paper was a survey on Regression Test Selection techniques in 1996.
• No previous survey paper considers Prioritization, Selection, and Minimization
collectively.
Background
Redefine regression testing and further elaborate on the distinction of each
optimization technique defined in the introduction.
Classification of Test Cases
Reusable – Only execute parts of the program that remain unchanged. Not valuable
for new changes, but assist with future regression checks.
Retestable – Test cases that are still valid after a set of changes and can validate if
any regression has occurred.
Obsolete – Rendered obsolete because their input/output is no longer relevant
and/or they no longer test the desired specification (i.e. requirements changed).
Test Case Selection
Which tests are relevant to be executed
Compare Test Case Selection vs. Test Case Minimisation
• Very similar to one another; both revolve around choosing a subset of test cases
from the test suite.
• Test suite minimization often based on metrics (e.g. code coverage) of an entire
application.
• Test case selection based on finding relevant tests to be run.
Integer Programming Approach
Optimization program in which all of the variables are restricted to integers.
• Heavily relies on two matrices that describe the relation between program
segments and test cases. (program segment can be defined as a single-entry,
single exit block of code)
• Each constraint has the form $a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n \ge b_m$
($a_{ij}$ equals 1 if the relation between segment i and test case j exists, 0 otherwise)
• Results in a decision vector (subset of selected test cases) $\langle x_1, \dots, x_n \rangle$
where $x_j$ equals 1 if the j-th test case is included.
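Problematic with control-flow changes: the matrices depend on the program's control-flow
structure, so after such a change the entire test suite has to be run again to rebuild them.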
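As a rough illustration of the formulation above, the sketch below brute-forces a tiny instance: pick the fewest test cases such that every constraint row is satisfied. The matrix `a`, the vector `b`, and the function name `select_tests` are made up for this example; a realistic setup would hand the constraints to an integer programming solver instead of enumerating every vector.

```python
from itertools import product


def select_tests(a, b):
    """Return a 0/1 decision vector x that minimises sum(x) subject to
    sum_j a[i][j] * x[j] >= b[i] for every segment i (brute force)."""
    n = len(a[0])
    best = None
    for x in product((0, 1), repeat=n):
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) >= b_i
            for row, b_i in zip(a, b)
        )
        if feasible and (best is None or sum(x) < sum(best)):
            best = x
    return best


# Hypothetical data: 3 program segments (rows), 4 test cases (columns).
a = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
b = [1, 1, 0]  # segments 0 and 1 must be exercised; segment 2 is unaffected
x = select_tests(a, b)
print(x)
```

Brute force is exponential in the number of test cases, so this only shows the shape of the problem, not a practical way to solve it.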
Data-flow Analysis Approach
Technique for gathering information about the possible set of values calculated at
various points in a computer program. (i.e. How does input flow through the
application)
Seek to identify new, modified or deleted definition-use pairs in the new version of
the program; then select those cases that exercise these pairs. (Does the new code
impact the test data being used?)
Problematic with modifications that are unrelated to data-flow change: test cases
affected only by such modifications will not be selected for testing.
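A minimal sketch of the selection step, assuming the definition-use pairs of both versions and the pairs each test exercises in the old version have already been computed by some analysis. The tuple layout `(variable, defining statement, using statement)` and all names are assumptions for illustration; a modified pair shows up here as one deleted pair plus one new pair.

```python
def changed_du_pairs(old_pairs, new_pairs):
    """Pairs that exist in only one version (new, deleted, or modified)."""
    return old_pairs ^ new_pairs  # symmetric difference


def select_by_dataflow(exercised, old_pairs, new_pairs):
    """Select tests that exercise at least one changed definition-use pair."""
    affected = changed_du_pairs(old_pairs, new_pairs)
    return sorted(t for t, pairs in exercised.items() if pairs & affected)


# Hypothetical pairs: (variable, defining statement, using statement).
old_pairs = {
    ("x", "x = read()", "y = x + 1"),
    ("y", "y = x + 1", "print(y)"),
    ("z", "z = 0", "log(z)"),
}
new_pairs = {
    ("x", "x = read()", "y = x * 2"),
    ("y", "y = x * 2", "print(y)"),
    ("z", "z = 0", "log(z)"),
}
exercised = {
    "t1": {("x", "x = read()", "y = x + 1")},
    "t2": {("z", "z = 0", "log(z)")},
    "t3": {("y", "y = x + 1", "print(y)"), ("z", "z = 0", "log(z)")},
}
print(select_by_dataflow(exercised, old_pairs, new_pairs))
```

Test `t2` only touches the unchanged pair for `z`, so it is left out, which is exactly the blind spot noted above when a change does not alter any definition-use pair.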
Symbolic Execution Approach
A means of analyzing a program to determine what inputs cause each part of a
program to execute.
def check(f):
    # two input partitions: f == 2 and f != 2
    if f == 2:
        return fail()
    return success()
1. Find all input partitions.
2. Produce test cases so that each input partition is executed at least once.
3. Given information on where the code has been modified (e.g. a diff), return
modified code segments and the test cases that execute these segments.
Drawback is the algorithmic complexity of symbolic execution as well as how
expensive it can be to execute.
Graph-Walk Approach
1. Parse P and P’ into graph data structures.
2. Traverse each graph and compare the nodes
3. If a node in P is not the same as the node in P’, select all the test cases that
execute the code within that node.
Problematic: because data dependence is not considered, the approach could include
test cases that provide little to no value.
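The three steps above can be sketched roughly as follows. The graph encoding (`{node_id: (statement, [successor ids])}`), the pairing of successors by position, and the `coverage` map are assumptions made for this example rather than the representation used in the surveyed papers.

```python
def graph_walk_select(p, p_new, coverage, entry="entry"):
    """Walk P and P' in lockstep; when a node differs, select every test
    that executes that node in P.

    p, p_new: {node_id: (statement, [successor ids])}
    coverage: {test: set of node ids executed in P}
    """
    selected, visited = set(), set()
    stack = [(entry, entry)]
    while stack:
        n, n_new = stack.pop()
        if (n, n_new) in visited:
            continue
        visited.add((n, n_new))
        stmt, succs = p[n]
        stmt_new, succs_new = p_new[n_new]
        if stmt != stmt_new or len(succs) != len(succs_new):
            selected |= {t for t, nodes in coverage.items() if n in nodes}
            continue  # do not walk past a modified node
        stack.extend(zip(succs, succs_new))
    return sorted(selected)


# Hypothetical control-flow graphs for P and P'.
p = {
    "entry": ("start", ["n1"]),
    "n1": ("if x > 0", ["n2", "n3"]),
    "n2": ("y = 1", ["exit"]),
    "n3": ("y = 2", ["exit"]),
    "exit": ("return y", []),
}
p_new = dict(p, n3=("y = 3", ["exit"]))
coverage = {
    "t1": {"entry", "n1", "n2", "exit"},
    "t2": {"entry", "n1", "n3", "exit"},
}
print(graph_walk_select(p, p_new, coverage))
```

Only `t2` reaches the changed node `n3`, so only `t2` is selected; whether the changed value actually influences the test's outcome is not checked.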
Textual Difference Approach
A very similar approach to the Graph-Walk approach
• Uses the diff tool provided by Unix.
• Code sanitized to remove any characters that would not introduce change (e.g.
whitespace)
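A small sketch of the idea, using Python's standard `difflib` in place of the Unix diff tool. The normalisation rule, the choice to attribute an insertion to the preceding old line, and the `coverage` map (test to indices of normalised old lines) are assumptions for illustration only.

```python
import difflib


def normalise(source):
    """Drop blank lines and collapse whitespace so layout-only edits vanish."""
    return [" ".join(line.split()) for line in source.splitlines() if line.strip()]


def changed_old_lines(old_src, new_src):
    """Indices (into the normalised old source) touched by the textual diff."""
    old, new = normalise(old_src), normalise(new_src)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changed.update(range(i1, i2))
        elif tag == "insert":
            # Assumption: charge an insertion to the old line just before it.
            changed.add(max(i1 - 1, 0))
    return changed


def select_by_diff(coverage, old_src, new_src):
    changed = changed_old_lines(old_src, new_src)
    return sorted(t for t, lines in coverage.items() if lines & changed)


old_src = "x = read()\ny   =  x + 1\nprint(y)\n"
new_src = "x = read()\n\ny = x + 2\nprint(y)\n"
coverage = {"t1": {0, 1, 2}, "t2": {0}}
print(select_by_diff(coverage, old_src, new_src))
```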
Path Analysis
• Construct exemplar paths from P and P’
• Paths in P’ are categorized as new, modified, cancelled, or unmodified.
• Since all test cases and the paths they execute in P are known, the test cases that
traverse the modified paths in P’ are selected.
The authors had a poor definition of “modified”. Test cases that executed new or
cancelled code were not chosen. However, these paths could lead to regression.
Modification-based Technique
Yet another similar approach to Graph-Walking
• Introduced a testing framework called TestTube.
• Partitions P and P’ into program entities (nodes), then monitors the test cases to
find out the code that each test case executes.
• Those entities that were different are selected.
Since the entities include not only functions, but variables, any test case that
executes modified functions will be selected.
This differs from the data-independent Graph-Walking approach described
previously. Modification-based technique encompasses data as well.
Firewall Approach
Draw a firewall around the modules of the system that need to be retested.
A given module M can be represented as:
• No change: NoCh(M)
• Code change only: CodeCh(M)
• Specification change: SpecCh(M)
Considering integrations between module A and module B
• Ignore NoCh(A) ^ NoCh(B)
• If A and B are modified in either code or in spec (CodeCh or SpecCh respectively)
the tests should be rerun.
Design-based Approach
• Black-box, design level regression test selection that used UML-based designs.
• Requires traceability between design and test cases
• Leverages the obsolete, retestable, and reusable classification highlighted in the
background
Possible to select test cases that provide no value, as a UML diagram does not
capture all code interactions (e.g. a method is changed, but the diagram only shows
that it exists, not whether it is ever called).
Test Suite Minimization
Techniques that aim to identify redundant test cases
Heuristics
Essential test cases
• If a test requirement can be satisfied by one and only one test case
Redundant test cases
• If a test case satisfies only a subset of the test requirements satisfied by another
test case.
GE Heuristic
First select all essential test cases, then repeatedly select the test case that
satisfies the maximum number of still-unsatisfied test requirements.
GRE Heuristic
Remove all redundant test cases in the test suite (which may make some test cases
essential). Then run the GE heuristic.
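A compact sketch of both heuristics over a made-up mapping from test cases to the requirements they satisfy. Here GE is taken to pick the essential test cases first and then apply the greedy step, and "redundant" is read as a strict subset so that two identical tests do not eliminate each other; both readings, and all names, are assumptions of this example.

```python
def all_requirements(tests):
    return set().union(*tests.values())


def essential(tests):
    """Tests that are the only one satisfying some requirement."""
    out = set()
    for req in all_requirements(tests):
        holders = [t for t, reqs in tests.items() if req in reqs]
        if len(holders) == 1:
            out.add(holders[0])
    return out


def greedy(tests, unsatisfied):
    """Repeatedly take the test covering the most unsatisfied requirements."""
    chosen, unsatisfied = [], set(unsatisfied)
    while unsatisfied and tests:
        best = max(sorted(tests), key=lambda t: len(tests[t] & unsatisfied))
        if not tests[best] & unsatisfied:
            break  # the remaining requirements cannot be satisfied
        chosen.append(best)
        unsatisfied -= tests[best]
    return chosen


def ge(tests):
    picked = sorted(essential(tests))
    covered = set().union(*(tests[t] for t in picked))
    rest = {t: r for t, r in tests.items() if t not in picked}
    return picked + greedy(rest, all_requirements(tests) - covered)


def gre(tests):
    # Drop tests whose requirements are a strict subset of another test's.
    kept = {
        t: r for t, r in tests.items()
        if not any(r < other for o, other in tests.items() if o != t)
    }
    return ge(kept)


tests = {
    "t1": {"r1", "r2"},
    "t2": {"r2"},
    "t3": {"r2", "r3"},
    "t4": {"r3", "r4"},
    "t5": {"r1"},
}
print(ge(tests), gre(tests))
```

In this data only `t4` is essential at first; once the redundant `t2` and `t5` are removed, `t1` becomes the sole test for `r1` and so becomes essential too.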
Empirical evidence suggests no single approach is better
• Concerned with heuristics more so than preciseness.
Vast majority of presented literature focused on the minimal hitting set problem.
Most minimization techniques are based on coverage criteria, but there were
exceptions:
• Minimizing the test itself (start with a failed test).
• Black-box approach to program input/output (research in state machines).
Different inputs may not flow through different branches.
Test Case Prioritisation
Test case prioritisation seeks to find the ideal ordering
of test cases for testing.
Coverage-based Prioritisation (code)
Structural coverage is often used as the prioritization criterion. The more code a
test executes, the higher the chance of finding a fault.
Approaches include:
• Branch-total (number of branches covered by test cases)
• Branch-additional (number of additional branches a test case would execute)
• Statement-total
• Statement-additional
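The difference between the "total" and "additional" strategies can be sketched as below; the same code works for branches or statements since it only sees sets of covered items. The coverage data, the tie-break by test name, and the reset once nothing new can be added are assumptions of this example.

```python
def total_order(coverage):
    """Order tests by how many items each covers, largest first."""
    return sorted(coverage, key=lambda t: (-len(coverage[t]), t))


def additional_order(coverage):
    """Repeatedly pick the test that adds the most not-yet-covered items."""
    remaining = dict(coverage)
    covered, order = set(), []
    while remaining:
        best = min(remaining, key=lambda t: (-len(remaining[t] - covered), t))
        if not remaining[best] - covered and covered:
            covered = set()  # nothing new to gain: reset and rank the rest
            continue
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical branch coverage per test case.
coverage = {
    "t1": {"b1", "b2", "b3"},
    "t2": {"b1", "b2", "b3", "b4"},
    "t3": {"b5", "b6"},
}
print(total_order(coverage))
print(additional_order(coverage))
```

Here `t1` ranks second under the total strategy but last under the additional strategy, because everything it covers is already covered by `t2`.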
Interaction Testing (black box)
Necessary when the system under test involves multiple combinations of different
components. (consider the application environment, Operating System)
Research focused on finding those interactions that impact a larger user base (e.g.
prioritize Windows testing over Linux).
Additional research done in GUI-based programs.
• Take a sequence of inputs and find the case that executes the most code.
• Consider user interaction data for prioritisation (heat map).
Distribution-based Approach
Profile test cases based on a dissimilarity metric, a real number representing the
degree of dissimilarity between two inputs.
Cluster test cases according to their similarities which can reveal:
• Similar profiles may indicate a group of redundant test cases
• Isolated clusters may contain test cases in unusual conditions (fault-proneness)
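A toy sketch of the idea, assuming each test's profile is simply the set of items it executes and using Jaccard distance as the dissimilarity metric. Both choices, the one-pass clustering rule, and the threshold are assumptions for illustration; the surveyed work may use different profiles and metrics.

```python
def dissimilarity(p, q):
    """Jaccard distance between two profiles: 0.0 identical, 1.0 disjoint."""
    union = p | q
    return 1.0 - len(p & q) / len(union) if union else 0.0


def cluster(profiles, threshold):
    """One-pass clustering: a test joins the first cluster in which every
    member is within the threshold, otherwise it starts a new cluster."""
    clusters = []
    for name in sorted(profiles):
        for members in clusters:
            if all(
                dissimilarity(profiles[name], profiles[m]) <= threshold
                for m in members
            ):
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical execution profiles.
profiles = {
    "t1": {"a", "b", "c"},
    "t2": {"a", "b", "c", "d"},
    "t3": {"a", "b"},
    "t4": {"x", "y"},
}
print(cluster(profiles, threshold=0.5))
```

The large cluster hints at redundancy among `t1`, `t2` and `t3`, while the isolated `t4` is the kind of unusual test that might be worth running early.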
History-based Approach
• Based on association clusters of software artifacts.
• If two files are often modified together, they will be clustered together.
• Each file is also associated with test cases that impact or execute it.
Non-source file (e.g. media, documentation) defects can be as severe as source
code defects.
Requirement-based Approach
• Test cases are mapped to software requirements
• Prioritisation driven by customer-assigned priority and/or implementation
complexity.
Makes the prioritization very subjective (customers will have conflicting priorities).
Model-based Approach
• Test cases classified into a high priority set TSH and a low priority set TSL
• Initial prioritization was randomly assigned
• Test case is assigned high priority if it is relevant to the modification made in the
model.
Similar approach to the UML based approach when selecting test cases.
Session-based Approach
• Recorded user sessions from the previous version of the (web) application.
• Thought to be ideal for testing web applications as it reflects actual use.
• Metrics such as number of HTTP requests, frequency of visits.
• Better than random selection, but no single prioritization criterion is always the
best.
Cost-Aware Approach
Typical prioritization approaches assume equal fault level and cost.
Areas of focus similarly categorized:
• Time based (tests that take a long time, need a way to fit X tests in Y units of time)
• Fault level (prioritize most catastrophic tests first, not necessarily any fault)
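One simple way to combine both concerns is a greedy pass that ranks tests by value per unit of time and keeps whatever fits in the budget. This is only a sketch: the `(cost, value)` pairs, where value stands for something like a fault-severity weight, are hypothetical, costs are assumed to be positive, and a greedy pass is not guaranteed to find the best possible subset.

```python
def fit_in_budget(tests, budget):
    """tests: {name: (cost, value)}. Greedy by value/cost ratio, skipping
    any test that would exceed the time budget."""
    order = sorted(tests, key=lambda t: (-tests[t][1] / tests[t][0], t))
    chosen, used = [], 0
    for name in order:
        cost = tests[name][0]
        if used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen


# Hypothetical (cost in minutes, severity-weighted value) per test.
tests = {
    "t1": (10, 5),
    "t2": (2, 4),
    "t3": (5, 5),
    "t4": (20, 6),
}
print(fit_in_budget(tests, budget=15))
```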
Meta-Empirical Studies
• Empirical evaluation is post-hoc: the faults are already known. Without prior
knowledge of the faults, it is not possible to perform a controlled experiment.
• Studies compared seeded and real faults (concluding that seeded faults can be
safely used in place of real faults).
• Frequency of regression testing has a significant impact on the cost-effectiveness
of RTS techniques. The longer the window between tests, the more tests are
selected, lowering the value-add.
• Research efforts attempt to match the RTS technique to the type of program
(no silver bullet; e.g. Session-based for web applications, Model-based for
programs whose source was generated from UML).
Summary and Discussion
Analysis of Current Global Trends
Consider the graph as not a representation of the number of publications, but
trends of research popularity (single publication can count towards two categories)
• 60% of studies included fewer than 10,000 lines of code.
• 70% of studies included fewer than 1,000 test cases.
State of the Art
• Among the class of RTS techniques, the graph walk approach is the most
predominant. Intuitive and incredibly generic.
• Two ideas played essential roles: test case classification and safe regression test
selection (if a modification occurred, it will be selected).
• Greedy algorithms are prominent in the selected literature (as much as possible,
as soon as possible).
Trends
• Emphasis on models (early adoption was very code focused)
• Increased domains (e.g. web applications)
• Cost-awareness – more and more of the literature is starting to consider test time
and fault severity.
Issues / Limitations
Limited subjects (60% from the SIR repository). Hard to prove the proposed research
techniques can be generalized.
Solutions
• Design a method that will allow a realistic simulation of real software faults.
• Engage with open source and Industry
Technology Transfer: observations of the literature suggest the community may have
reached maturity and it’s time to transfer.
Out of 159 papers, only 31 of them have an author involved in industry.
Out of 159 papers, only 12 consider industrial software.
Future Direction
Orchestrating regression testing techniques with test data generation
• Self healing tests
Multi-Objective Regression Testing
• Group tests requiring a given environment together, reduce cost.
Consideration of Other Domains
• Most were white-box
Tool Support
• No readily available tools means practical adoption will remain limited
Conclusion
Trends in the literature show:
• The community is focused on prioritization and, within selection, on Graph-Walking.
• The community is moving towards assessment of complex trade-offs (cost and
value).
• More researchers are becoming interested in the area. The number of publications
continues to rise.
Suggestions
• More literature on Minimization and/or clearer content.
• Briefly describe the references used. Felt a lot of the references forced you to
read the paper.
For a paper meant to give an overview of the state of the art, it did just that.
Lessons Learned
• How big of an area of research regression testing is
• Symbolic execution
• Consider binaries to be a source of fault
Questions / Feedback?

The Current State of the Art of Regression Testing

  • 1. Regression Testing Minimisation, Selection and Prioritisation: A Survey S. Yoo, M. Harman 1JOHN REESE
  • 3. Introduction 3 Survey of 159 papers on test suite minimization, regression test selection, and test case prioritization. Intention is not to undertake a systematic review, but rather to provide a broad state-of-the-art view on these related fields. Note: I’m going to go back and forth between spelling minimization and prioritisation with s’ and z’s
  • 4. Introduction 4 Regression Testing: Provide confidence that the newly introduced changes do not obstruct the behaviors of the existing, unchanged part of the software. Difficulties Include: • Black-box development with 3rd party applications • Agile development Note: Most straightforward approach is “retest-all”, but may not be viable in all scenarios
  • 5. Introduction 5 A number of different approaches have been studied to aid the regression testing process. Three major branches include: Test Suite Minimization: Process that seeks to identify and then eliminate the obsolete or redundant test cases from the test suite. Test Case Selection: Select a subset of test cases that will be used to test the changed parts of the software. Test Case Prioritization: Identify the “ideal” ordering of test cases that maximizes the desirable properties, such as early fault detection.
  • 6. Overview 6 1. Motivation 2. Background 3. Test Case Selection 4. Test Suite Minimization 5. Test Case Prioritization 6. Summary and Conclusion 7. Suggestions 8. Lessons Learned
  • 7. Motivation 7 Why is this the right set of topics for a survey? • Each topic is related by a common thread of optimization of already existing test cases. • All differ from areas that focus on test case generation. • Intimate relationship between the topics (e.g. minimization could be performed by prioritizing a set of cases and choosing the first N). Is there already a recent survey in this area? • Most similar paper was a survey on Regression Test Selection techniques in 1996. • No previous survey paper that consider Prioritization, Selection, and Minimization collectively.
  • 8. Background 8 Redefine regression testing and further elaborate on the distinction of each optimization technique defined in the introduction. Classification of Test Cases Reusable – Only execute parts of the program that remain unchanged. Not valuable for new changes, but assist with future regression checks. Retestable – Test cases that are still valid after a set of changes and can validate if any regression has occurred. Obsolete – Could be rendered obsolete due to: input/output is no longer relevant and/or no longer test the desired specification (i.e. requirements changed).
  • 9. 9 Test Case Selection Which tests are relevant to be executed
  • 10. Test Case Selection 10 Compare Test Case Selection vs. Test Case Minimisation • Very similar to one another; both revolve around choosing a subset of test cases from the test suite. • Test suite minimization often based on metrics (e.g. code coverage) of an entire application. • Test case selection based on finding relevant tests to be run.
  • 11. Test Case Selection 11 Integer Programming Approach Optimization program in which all of the variables are restricted to integers. • Heavily relies on two matrices that describe the relation between program segments and test cases. (program segment can be defined as a single-entry, single-exit block of code) • Matrix function represented as $a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n \ge b_m$ ($a_{ij}$ equal to 1 if the segment-test case relation exists, 0 otherwise) • Results in a decision vector (subset of selected test cases) $\langle x_1, \dots, x_n \rangle$ where $x_i$ is equal to 1 if the $i$-th test case is included. Problematic with control flow changes. Entire test suite has to be run again.
  • 12. Test Case Selection 12 Data-flow Analysis Approach Technique for gathering information about the possible set of values calculated at various points in a computer program. (i.e. How does input flow through the application) Seek to identify new, modified or deleted definition-use pairs in the new version of the program; then select those cases that exercise these pairs. (Does the new code impact the test data being used?) Problematic with modifications that are unrelated to data-flow change. These test scenarios will not be selected for testing.
  • 13. Test Case Selection 13 Symbolic Execution Approach A means of analyzing a program to determine what inputs cause each part of a program to execute. function(f) { if f == 2 then return fail(); else return success(); } 1. Find all input partitions. 2. Produce test cases so that each input partition is executed at least once. 3. Given information on where the code has been modified (e.g. a diff), return modified code segments and the test cases that execute these segments. Drawback is the algorithmic complexity of symbolic execution as well as how expensive it can be to execute.
  • 14. Test Case Selection 14 Graph-Walk Approach 1. Parse P and P’ into graph data structures. 2. Traverse each graph and compare the nodes. 3. If a node in P is not the same as the node in P’, select all the test cases that execute the code within that node. Problematic: because the approach does not take data dependence into account, it could include test cases that provide little to no value.
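The three steps above can be sketched as a simultaneous walk over two small control-flow graphs. This is a simplified illustration, not the published algorithm: the graph encoding (node name mapped to a label and a successor list), the `trace` mapping, and all node names are assumptions made for the example.

```python
def graph_walk_select(cfg_old, cfg_new, trace, entry="entry"):
    """cfg_*: node -> (label, [successor nodes]);
    trace: test name -> set of nodes the test executed in P."""
    selected, visited, stack = set(), set(), [(entry, entry)]
    while stack:
        old, new = stack.pop()
        if (old, new) in visited:
            continue
        visited.add((old, new))
        old_label, old_succ = cfg_old[old]
        new_label, new_succ = cfg_new[new]
        if old_label != new_label or len(old_succ) != len(new_succ):
            # Node differs between P and P': select every test that reaches
            # it in P and stop walking along this path.
            selected |= {t for t, nodes in trace.items() if old in nodes}
            continue
        stack.extend(zip(old_succ, new_succ))
    return selected


# Hypothetical program P and its modified version P'.
cfg_old = {
    "entry": ("x = read()", ["check"]),
    "check": ("if x > 0", ["pos", "neg"]),
    "pos": ("y = x", ["exit"]),
    "neg": ("y = -x", ["exit"]),
    "exit": ("return y", []),
}
cfg_new = dict(cfg_old, neg=("y = 0", ["exit"]))
trace = {
    "t1": {"entry", "check", "pos", "exit"},
    "t2": {"entry", "check", "neg", "exit"},
}
selected = graph_walk_select(cfg_old, cfg_new, trace)
```

Only `t2` is selected, because it is the only test that executes the modified `neg` node.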
  • 15. Test Case Selection 15 Textual Difference Approach A very similar approach to the Graph-Walk approach • Uses the diff tool provided by Unix. • Code sanitized to remove any characters that would not introduce change (e.g. whitespace)
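A rough sketch of the same idea, using Python's `difflib` as a stand-in for the Unix diff tool. The whitespace normalisation, the per-line coverage map, and the test names are assumptions for illustration. Pure insertions have no corresponding old line, so this sketch does not flag them.

```python
import difflib


def changed_old_lines(old_src, new_src):
    """Return 0-based indices of lines in the old version that were
    replaced or deleted, ignoring whitespace-only differences."""
    old = [" ".join(line.split()) for line in old_src.splitlines()]
    new = [" ".join(line.split()) for line in new_src.splitlines()]
    changed = set()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changed.update(range(i1, i2))
    return changed


old_src = "def f(x):\n    if x > 0:\n        return x\n    return -x\n"
new_src = "def f(x):\n  if x > 0:\n    return x\n  return 0\n"  # re-indented, last line changed

# Hypothetical coverage: test name -> old-version line indices it executes.
coverage = {"t_pos": {0, 1, 2}, "t_neg": {0, 1, 3}}
changed = changed_old_lines(old_src, new_src)
selected = {t for t, lines in coverage.items() if lines & changed}
```

The re-indentation is ignored, so only the last line counts as changed and only `t_neg` is selected.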
  • 16. Test Case Selection 16 Path Analysis • Construct exemplar paths from P and P’ • Paths in P’ are categorized as new, modified, cancelled, or unmodified. • Since all test cases and the paths they execute in P are known, the test cases that traverse the modified paths in P’ are selected. The authors had a poor definition of “modified”. Test cases that executed new or cancelled code were not chosen. However, these paths could lead to regressions.
  • 17. Test Case Selection 17 Modification-based Technique Yet another similar approach to Graph-Walking • Introduced a testing framework called TestTube. • Partitions P and P’ into program entities (nodes), then monitors the test cases to find out the code that each test case executes. • Those entities that were different are selected. Since the entities include not only functions, but variables, any test case that executes modified functions will be selected. This differs from the data-independent Graph-Walking approach described previously. Modification-based technique encompasses data as well.
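A minimal sketch of entity-based selection in the spirit of the description above. It is not TestTube's actual interface: the entity maps, the coverage map, and every name below are hypothetical.

```python
def entity_based_select(entities_old, entities_new, coverage):
    """entities_*: entity name -> definition text (functions and variables);
    coverage: test name -> set of entity names the test uses."""
    # An entity counts as changed if its definition differs or it was deleted.
    changed = {
        name
        for name, body in entities_old.items()
        if entities_new.get(name) != body
    }
    return {test for test, used in coverage.items() if used & changed}


# Hypothetical entities: two functions and one variable.
entities_old = {"parse": "v1", "render": "v1", "MAX_LEN": "80"}
entities_new = {"parse": "v1", "render": "v1", "MAX_LEN": "120"}
coverage = {
    "t_parse": {"parse", "MAX_LEN"},
    "t_render": {"render"},
}
selected = entity_based_select(entities_old, entities_new, coverage)
```

Because variables are entities too, changing `MAX_LEN` selects `t_parse` even though no function body changed.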
  • 18. Test Case Selection 18 Firewall Approach Draw a firewall around the modules of the system that need to be retested. A given module M can be represented as: • No Change NoCh(M) • Only Code CodeCh(M) • Spec Change SpecCh(M) Considering integrations between module A and module B • Ignore NoCh(A) ^ NoCh(B) • If A and B are modified in either code or in spec (CodeCh or SpecCh respectively) the tests should be rerun.
  • 19. Test Case Selection 19 Design-based Approach • Black-box, design level regression test selection that used UML-based designs. • Requires traceability between design and test cases • Leveraged obsolete, retestable, and reusable as highlighted in the background Possible to select test cases that provide no value as a UML diagram does not encapsulate all code interactions. (e.g. change a method, but diagram doesn’t dictate it is ever called, just exists)
  • 20. 20 Test Suite Minimization Techniques that aim to identify redundant test cases
  • 21. Test Suite Minimization 21 Heuristics Essential test cases • If a test requirement can be satisfied by one and only one test case Redundant test cases • If a test case satisfies only a subset of the test requirements satisfied by another test case. GE Heuristic Select the test case that satisfies the maximum number of unsatisfied test requirements. GRE Heuristic Remove all redundant test cases in the test suite (which may make some test cases essential). Then run the GE heuristic.
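A hedged sketch of the two heuristics over a hypothetical suite. It assumes GE picks the essential test cases first and then proceeds greedily, and that redundancy is checked pairwise; the suite contents and function names are made up for illustration.

```python
def essential_tests(suite):
    """Tests that are the only one satisfying some requirement."""
    essentials = set()
    for req in set().union(*suite.values()):
        satisfying = [t for t, reqs in suite.items() if req in reqs]
        if len(satisfying) == 1:
            essentials.add(satisfying[0])
    return essentials


def ge(suite):
    """Essential tests first, then greedy on the unsatisfied requirements."""
    selected = sorted(essential_tests(suite))
    covered = set().union(*(suite[t] for t in selected))
    remaining = set().union(*suite.values()) - covered
    while remaining:
        # Ties are broken by test name so the result is deterministic.
        best = max(sorted(suite), key=lambda t: len(suite[t] & remaining))
        selected.append(best)
        remaining -= suite[best]
    return selected


def remove_redundant(suite):
    """Drop tests whose requirements are a subset of another kept test's."""
    kept = dict(suite)
    for t in sorted(suite):
        if any(kept[t] <= kept[o] for o in kept if o != t):
            del kept[t]
    return kept


def gre(suite):
    return ge(remove_redundant(suite))


# Hypothetical suite: test name -> requirements it satisfies.
suite = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r3"},
    "t4": {"r1", "r4"},
}
ge_result = ge(suite)    # ['t4', 't2']
gre_result = gre(suite)  # ['t2', 't4']
```

In this example `t3` is redundant with respect to `t2`. Removing it leaves `t2` as the only test satisfying `r3`, which makes `t2` essential.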
  • 22. Test Suite Minimization 22 Heuristics Empirical evidence suggests no single approach is better • Concerned with heuristics more so than precision. Vast majority of presented literature focused on the minimal hitting set problem. Most minimization techniques are based on coverage criteria, but there were exceptions. • Minimizing the test itself (start with a failed test). • Black-box approach to program input/output (research in state machines). Different inputs may not flow through different branches.
  • 23. 23 Test Case Prioritisation Test case prioritisation seeks to find the ideal ordering of test cases for testing.
  • 24. Test Case Prioritisation 24 Coverage-based Prioritisation (code) Structural coverage often used as prioritization criterion. The more code a test executes, the higher chance of finding a fault. Approaches include: • Branch-total (number of branches covered by test cases) • Branch-additional (number of additional branches a test case would execute) • Statement-total • Statement-additional
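The difference between the "total" and "additional" strategies can be sketched on a hypothetical branch coverage map. Appending the leftover tests in total order once nothing adds new coverage is an assumption of this sketch.

```python
def total_order(coverage):
    """Order tests by how many branches each covers, most first."""
    return sorted(coverage, key=lambda t: (-len(coverage[t]), t))


def additional_order(coverage):
    """Repeatedly pick the test that covers the most not-yet-covered branches."""
    remaining = dict(coverage)
    uncovered = set().union(*coverage.values())
    order = []
    while remaining:
        best = max(sorted(remaining), key=lambda t: len(remaining[t] & uncovered))
        if not remaining[best] & uncovered:
            # No remaining test adds coverage: fall back to total order.
            order.extend(total_order(remaining))
            break
        order.append(best)
        uncovered -= remaining.pop(best)
    return order


# Hypothetical data: test name -> branches it executes.
coverage = {
    "tA": {"b1", "b2", "b3", "b4"},
    "tB": {"b1", "b2", "b3"},
    "tC": {"b5", "b6"},
    "tD": {"b4"},
}
by_total = total_order(coverage)            # ['tA', 'tB', 'tC', 'tD']
by_additional = additional_order(coverage)  # ['tA', 'tC', 'tB', 'tD']
```

Branch-total runs `tB` second although it adds nothing beyond `tA`, while branch-additional promotes `tC`, which reaches the two branches not yet executed.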
  • 25. Test Case Prioritisation 25 Interaction Testing (black box) Necessary when the system under test involves multiple combinations of different components. (consider the application environment, Operating System) Research focused on finding those interactions that impact a higher user base. (e.g. prioritize Windows testing over Linux). Additional research done in GUI-based programs. • Take a sequence of inputs and find the case that executes the most code. • Consider user interaction data for prioritisation (heat map).
  • 26. Test Case Prioritisation 26 Distribution-based Approach Profile test cases based on a dissimilarity metric, a real number representing the degree of dissimilarity between two inputs. Cluster test cases according to their similarities which can reveal: • Similar profiles may indicate a group of redundant test cases • Isolated clusters may contain test cases in unusual conditions (fault-proneness)
  • 27. Test Case Prioritisation 27 History-based Approach • Based on association clusters of software artifacts. • If two files are often modified together, they will be clustered together. • Each file is also associated with test cases that impact or execute it. Non-source file (e.g. media, documentation) defects can be as severe as source code defects.
  • 28. Test Case Prioritisation 28 Requirement-based Approach • Test cases are mapped to software requirements • Prioritisation mapped by customer-assigned priority and/or implementation complexity. Makes the prioritization very subjective (customers will have conflicting priorities)
  • 29. Test Case Prioritisation 29 Model-based Approach • Test cases classified into a high priority set TSH and a low priority set TSL • Initial prioritization was randomly assigned • Test case is assigned high priority if it is relevant to the modification made in the model. Similar approach to the UML based approach when selecting test cases.
  • 30. Test Case Prioritisation 30 Session-based Approach • Recorded user sessions from the previous version of the (web) application. • Thought to be ideal for testing web applications as it reflects actual use. • Metrics such as number of HTTP requests, frequency of visits. • Better than random selection, but no single prioritization criterion is always the best.
  • 31. Test Case Prioritisation 31 Cost-Aware Approach Typical prioritization approaches assume equal fault level and cost. Areas of focus similarly categorized: • Time based (tests that take a long time, need a way to fit X tests in Y units of time) • Fault level (prioritize most catastrophic tests first, not necessarily any fault)
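One way to combine the two concerns above is a greedy ranking by fault-level weight per unit of run time under a time budget. This is an assumed heuristic for illustration only, not a technique taken from the survey; the test names, run times, and weights are hypothetical, and run times are assumed to be positive.

```python
def prioritise_within_budget(tests, budget):
    """tests: name -> (run_time, fault_level_weight).
    Returns the tests that fit in the budget, ranked by weight per unit time."""
    ranked = sorted(tests, key=lambda t: (-tests[t][1] / tests[t][0], t))
    chosen, used = [], 0.0
    for name in ranked:
        run_time = tests[name][0]
        # Skip any test that would push the total past the budget.
        if used + run_time <= budget:
            chosen.append(name)
            used += run_time
    return chosen


# Hypothetical suite: (minutes to run, weight of the faults the test guards against).
tests = {
    "login": (2.0, 10.0),
    "report": (5.0, 5.0),
    "search": (1.0, 3.0),
    "export": (4.0, 8.0),
}
plan = prioritise_within_budget(tests, budget=7.0)  # ['login', 'search', 'export']
```

With a 7-minute budget, `report` is left out because it offers the least weight per minute and no longer fits.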
  • 32. Meta-Empirical Studies 32 • Empirical evaluation is considered post-hoc: the faults are already known. Without previous knowledge of faults, it is not possible to perform a controlled experiment. • Studies done in regards to seeded vs. real faults (concluded seeded faults can be safely used in place of hand-seeded faults). • Frequency of regression testing has a significant impact on the cost-effectiveness of RTS techniques. The longer the window between tests, the more tests are selected, lowering the value-add. • Research efforts attempting to apply an RTS technique based on the type of program (no silver bullet; Session-based for web applications, Model-based for programs whose source was generated from UML)
  • 34. Analysis of Current Global Trends 34 Consider the graph not as a representation of the number of publications, but as trends of research popularity (a single publication can count towards two categories)
  • 35. Analysis of Current Global Trends 35 • 60% of studies included fewer than 10,000 lines of code. • 70% of studies included fewer than 1,000 test cases.
  • 36. Analysis of Current Global Trends 36
  • 37. State of the Art 37 • Among the class of RTS techniques, the graph walk approach is the most predominant. Intuitive and incredibly generic. • Two ideas played essential roles: test case classification and safe regression test selection (if a modification occurred, it will be selected). • Greedy algorithms are prominent in the selected literature (as much as possible, as soon as possible).
  • 38. Trends 38 • Emphasis on models (early adoption was very code focused) • Wider range of domains (e.g. web applications) • Cost-awareness: more and more of the literature is starting to consider test time and fault level.
  • 39. Issues / Limitations 39 Limited subjects (60% from the SIR repository). Hard to prove the proposed research techniques can be generalized. Solutions • Design a method that will allow a realistic simulation of real software faults. • Engage with open source and industry. Technology Transfer: observations of the literature suggest the community may have reached maturity and it’s time to transfer. Out of 159 papers, only 31 have an author involved in industry. Out of 159 papers, only 12 consider industrial software.
  • 40. Future Direction 40 Orchestrating regression testing techniques with test data generation • Self-healing tests Multi-Objective Regression Testing • Group tests requiring a given environment together, reduce cost. Consideration of Other Domains • Most were white-box Tool Support • No readily available tools means practical adoption will remain limited
  • 41. Conclusion 41 Trends in literature show: • The community is focused on prioritization, especially Graph-Walking. • The community is moving towards assessment of complex trade-offs (cost and value) • More are becoming interested in the research area. Number of publications continues to rise.
  • 42. Suggestions 42 • More literature on Minimization and/or clearer content. • Briefly describe the references used. Felt a lot of the references forced you to read the paper. For a paper meant to give an overview of the state of the art, it did just that.
  • 43. Lessons Learned 43 • How big of an area of research regression testing is • Symbolic execution • Consider binaries to be a source of fault