3. Goals
• Overview of coverage as a white-box testing technique
• Learn a variety of coverage metrics
• Get familiar with how coverage is implemented in real tools
4. But first of all…
• Do you know about The Heartbleed Bug?
This is a serious vulnerability in the popular OpenSSL cryptographic
software library. This weakness allows stealing the information
protected, under normal conditions, by the SSL/TLS encryption used
to secure the Internet.
6. What is Code Coverage?
•What is code coverage?
•White box and black box testing techniques
•Assertions improve code quality a lot, but don't
improve coverage at all
7. What do we need coverage for?
• To know how well our tests actually test our code
• To determine areas where additional tests are
required
• To identify the dead code
• To maintain the test quality over the lifecycle of a
project (avoid test entropy)
8. Coverage Percentage
Coverage generally follows an 80-20 rule
•60-70% - poorly tested software
•80-90% - good enough
•100% - not profitable
Which metric should this requirement apply to? (people often use line coverage)
Which tests should this requirement apply to? (people often use a unit-test coverage threshold)
9. Don’t strive for 100% Coverage
Why not 100% code coverage?
1) You should strive for well tested code, not 100% code
coverage
2) A trade-off between development effort and uncovered bugs
3) Some parts are generally better left untested:
– Accessor methods / value objects
– Overloaded constructors
– User interface code
10. Do strive for 100% coverage
Make your goal 100% coverage, nothing less:
• Even simple getters and setters are tested
• Use mocks
• From time to time, count how many lines of test code you have: in well-tested projects, test code is 40 to 50 percent of the total code base
• Treat the unit test code with the same level of
importance as the production code
• Design for testability
http://homepage.mac.com/hey.you/lessons.html [24]
11. Fixed Minimum Code Coverage Goal
• 85% is a common number; where did it come from?
• Don’t pursue a fixed minimum code coverage goal
• Managers should expect a high level of coverage, not require one
12. Recommendations
• Test the interface, not the private implementation – black box (first step)
• Check the result of the method in the test
• Write Javadoc
• Formulas and calculations are a prime target for testing
• Write tests before refactoring
13. Recommendations
• A bug in production -> new tests
• Remove unused methods together with their tests
• Gather team-wide coverage statistics
• Exclude some modules from the statistics (UI, utility code, etc.)
• Use reflection to call many production methods (a hack; see the sketch below)
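A minimal sketch of the reflection "hack" mentioned in the last bullet (class and method names here are hypothetical): it blindly invokes public no-argument methods of a production object, which inflates line coverage while asserting nothing, so the resulting numbers carry little value as a test.

import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class ReflectionCoverageHack {
    // Invokes every public no-argument method of the given object.
    // This drives line coverage up, but verifies nothing about the results.
    public static void touchAllMethods(Object target) throws Exception {
        for (Method method : target.getClass().getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers()) && method.getParameterCount() == 0) {
                method.invoke(target); // return value is ignored: no assertions at all
            }
        }
    }
}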
15. Coverage Metrics
The topic of which coverage metric is "better" could be somewhat religious
Metrics:
• Method coverage (function coverage)
• Class coverage
• Statement coverage (line coverage, basic block)
• Decision coverage (branch coverage)
• Conditional coverage
• Path coverage (predicate coverage)
Coverage tool terminology can be confusing
16. Statement Coverage (Line Coverage, Basic Block)
Statement coverage:
Has each executable statement of source code been
executed?
Line coverage:
Has each line of source code been executed?
Basic block coverage:
Has each basic block of source code been executed?
(a basic block is a sequence of bytecode instructions
without any jumps)
17. Statement Coverage Disadvantages
Experts generally recommend using statement coverage only if nothing else is available; any other metric is better.
Disadvantages:
• Statement coverage doesn’t take branching into
account
• Sensitivity to block length
• Loop termination conditions aren’t checked
See demo 17
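The referenced demo is not included here; as a stand-in, a minimal Java sketch (hypothetical code) of the first disadvantage: a single test with applyDiscount == true executes every statement, so statement coverage reports 100%, yet the untested false path dereferences null.

public class StatementCoverageGap {
    static int labelLength(boolean applyDiscount) {
        String label = null;
        if (applyDiscount) {
            label = "discounted";
        }
        // With applyDiscount == true every statement above has run (100% statement coverage),
        // but this line throws a NullPointerException as soon as applyDiscount is false.
        return label.length();
    }
}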
18. Decision Coverage (Branch Coverage)
Decision coverage:
Has each control structure (such as an if statement)
evaluated both to true and false?
Features:
+ 100% decision coverage implies 100% statement
coverage
+ Decision coverage is more focused on branches in the algorithm than on statements
– Problems due to short-circuit operators (see later)
See demo 19, 20
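Again as a stand-in for the demo, a small Java sketch (hypothetical code) of the short-circuit problem: two tests that drive the whole condition once to true and once to false already give 100% decision coverage, even though isRemote is never evaluated.

import java.util.function.BooleanSupplier;

public class ShortCircuitGap {
    static boolean accessAllowed(boolean loggedIn, boolean admin, BooleanSupplier isRemote) {
        // Test 1: loggedIn = true,  admin = true  -> whole decision is true,  isRemote is skipped
        // Test 2: loggedIn = false                -> whole decision is false, isRemote is skipped
        // Decision coverage is 100%, yet isRemote.getAsBoolean() never ran,
        // so any bug on that part of the expression stays invisible.
        return loggedIn && (admin || isRemote.getAsBoolean());
    }
}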
19. Conditional Coverage
Conditional coverage:
Has each boolean sub-expression evaluated both to true
and false?
Features:
+ Conditional coverage has better sensitivity to the
control flow than decision coverage
– Though, full condition coverage does not guarantee full
decision coverage
See demo 22
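A small Java sketch (hypothetical code) of why full condition coverage does not guarantee full decision coverage; the non-short-circuit '&' is used so that both sub-expressions are always evaluated.

public class ConditionVsDecision {
    static boolean bothPositive(int x, int y) {
        return x > 0 & y > 0; // non-short-circuit '&': both sub-conditions always evaluated
    }

    public static void main(String[] args) {
        bothPositive(1, -1);  // x > 0 is true,  y > 0 is false
        bothPositive(-1, 1);  // x > 0 is false, y > 0 is true
        // Each sub-condition has now been both true and false (full condition coverage),
        // but the decision as a whole was false in both calls,
        // so the 'true' outcome of the decision is still uncovered.
    }
}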
20. Path Coverage (Predicate Coverage)
Path Coverage:
Has every possible path from start (method entry) to
finish (return statement, thrown exception) been
executed?
Features:
+ 100% path coverage implies 100% decision coverage.
– Number of paths is exponential to the number of
branches
– Many paths are impossible to exercise due to
relationships of data
See demo 24. 28
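A short Java sketch (hypothetical code) of both drawbacks: each independent if doubles the number of paths, and relationships between data make some of those paths infeasible.

public class PathExplosion {
    static int process(boolean success) {
        int result = 0;
        if (success) {       // decision 1
            result += 1;
        }
        result += 10;
        if (success) {       // decision 2 repeats the same condition
            result += 100;
        }
        // Nominally 2 decisions -> 4 paths, but only true/true and false/false are feasible.
        // With 10 independent if-statements there would be 2^10 = 1024 paths to cover.
        return result;
    }
}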
21. Coding Practices for Path Coverage
• Keep your code simple (avoid methods with cyclomatic
complexity greater than 10)
• Avoid duplicate decisions
• Avoid data dependencies
See demo extra
23. How does coverage work in practice?
1) Instrumentation
2) Running the use cases (test cases) on the application
3) Getting the statistics
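As a concrete (hedged) illustration for a JVM application: steps 1 and 2 are often combined by running the use cases under a coverage agent that instruments classes on the fly, for example JaCoCo's Java agent with something like java -javaagent:jacocoagent.jar=destfile=jacoco.exec -jar application.jar (the jar names are placeholders for your own layout); step 3 is then generating an HTML or XML report from the recorded jacoco.exec file with the JaCoCo reporting tools or a build-tool plugin.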
27. Lightweight Coverage Implementation
IDE-integrated coverage tools:
• Eclipse: EclEmma, Coverlipse, EclipsePro Test, CoViewDeveloper, Clover, …
• IntelliJ IDEA: bundled IDEA coverage runner, EMMA, JaCoCo, Clover
• NetBeans: TikiOne JaCoCoverage, Unit Tests Code Coverage Plugin, Maven Test Coverage
• Visual Studio: embedded coverage support, Clover.NET
28. General Recommendations
• Good code coverage does not release you from the responsibility to write good tests
• Don't use code coverage as the basis for test design
• Coverage is for better tests, not for managers
• Strive for well tested code, not 100% code coverage
29. General Recommendations
• Don’t pursue a fixed minimum code coverage goal
• Start with simple metrics (statement) and move on to the
more powerful ones later (branch, path).
• Provide developers with an IDE-integrated coverage tool
34. Links
[11] Design for testability:
1. http://en.wikipedia.org/wiki/Design_For_Test
2. http://www.swqual.com/SQGNE/presentations/2006-07/Rakitin%20Oct%202006.pdf
3. http://www.io.com/~wazmo/papers/design_for_testability_PNSQC.pdf
[12] Additional information about coverage: http://www.coveragemeter.com/faq.html
[13] “Software Unit Test Coverage and Adequacy” article: http://www.cs.bris.ac.uk/Teaching/Resources/COMS30114/testAdequacy.pdf
[14] “Leveraging disposable instrumentation to reduce Coverage Collection Overhead” article; more about instrumenting coverage, particularly useful for coverage tool developers: http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=7F5416016A08E31E150E888E7FD3393A?doi=10.1.1.61.5166&rep=rep1&type=pdf
[15] “Is Code Coverage Important?” article: http://architects.dzone.com/articles/is-code-coverage-important
[16] “Find software bugs, defects using code coverage” article: http://searchsoftwarequality.techtarget.com/news/article/0,289142,sid92_gci1244258,00.html
[17] “The effectiveness of code coverage tools in software testing” article: http://searchsoftwarequality.techtarget.com/tip/0,289483,sid92_gci1306495,00.html
[18] “Code Coverage, what is it good for?” article: http://blog.schauderhaft.de/2008/10/20/code-coverage-what-is-it-good-for/
35. Links
[19] “Increasing Code Coverage May Be Harmful” article: http://www.dcmanges.com/blog/increasing-code-coverage-may-be-harmful
[20] “Experience with the Cost of Different Coverage Goals for Testing” article: http://www.exampler.com/testing-com/writings/experience.pdf
[21] “How to Misuse Code Coverage” article: http://www.exampler.com/testing-com/writings/coverage.pdf
[22] “In pursuit of code quality: Don't be fooled by the coverage report” article: http://www-128.ibm.com/developerworks/java/library/j-cq01316/?ca=dnw-704
[23] An example of what code CoverageMeter includes into source code: http://www.coveragemeter.com/codecoverage.html
[24] “Lessons learned on the road to 100% code coverage” article; recommends 100% coverage as a goal: http://homepage.mac.com/hey.you/lessons.html
[25] “Software Negligence and Testing Coverage” article; it lists 110 coverage metrics: http://www.kaner.com/coverage.htm
[26] Some other valuable docs: http://www.coveragemeter.com/faq.html and http://firstclassthoughts.co.uk/ant/code_coverage.htm
[27] “What is Wrong with Statement Coverage” article: http://www.bullseye.com/statementCoverage.html#a4
Speaker notes
Code coverage measurement simply determines which statements in a body of code have been executed through a test run and which have not. In general, a code coverage system collects information about the running program and then combines that with source information to generate a report on the test suite's code coverage.
Code coverage is part of a feedback loop in the development process. As tests are developed, code coverage highlights aspects of the code which may not be adequately tested and which require additional testing. This loop will continue until coverage meets some specified target.
What does "executed" mean here?
It means you exercise your application code using a driver of some kind. The driver could be a JUnit test suite, a test suite in some other test framework, or it could even be a human sitting in front of your application and clicking on buttons according to some use case script. The driver thus induces coverage. The application code itself runs more or less according to how it is meant to run: as a standalone application, as a J2EE app, inside a J2EE container simulator, maybe even distributed over several machines.
Structural Testing and Functional Testing
Code coverage analysis is a structural testing technique (AKA glass box testing and white box testing). Structural testing compares test program behavior against the apparent intention of the source code. This contrasts with functional testing (AKA black-box testing), which compares test program behavior against a requirements specification. Structural testing examines how the program works, taking into account possible pitfalls in the structure and logic. Functional testing examines what the program accomplishes, without regard to how it works internally.
Structural testing is also called path testing since you choose test cases that cause paths to be taken through the structure of the program. Do not confuse path testing with the path coverage metric, explained later.
At first glance, structural testing seems unsafe. Structural testing cannot find errors of omission. However, requirements specifications sometimes do not exist, and are rarely complete. This is especially true near the end of the product development time line when the requirements specification is updated less frequently and the product itself begins to take over the role of the specification. The difference between functional and structural testing blurs near release time.
The fact that this is a structural methodology imposes many limitations: coverage by itself does not show how well our tests test the product, it only shows how completely the production code is invoked during the tests.
We might write tests that just call our production code without checking the logic of the program or the return values, and these tests will show good coverage while being almost useless (they can still find abnormal termination points, i.e. exceptions).
TDD is great, but code coverage is just one aspect: it helps finding dead and untested code. But high coverage doesn't mean good software. If all you want is good code coverage, write randomized tests (fuzz tests). Randomized tests help a lot, but the problem is: they don't usually test correctness. Assertions improve code quality a lot, but don't improve coverage at all. So I think that most tests should be feature driven, not coverage driven.
I could write 1 unit test with lots of reflection that will give me 100% code coverage, yet will not give me any value as a test.
To know how well our tests actually test our code
To know whether we have enough testing in particular code part
To identify the dead code
Zero coverage of some methods or classes may indicate (the developer still decides for themselves) that the code is dead and not used anywhere.
To maintain the test quality over the lifecycle of a project
Avoid test entropy: as your code goes through multiple release cycles, there can be a tendency for unit tests to atrophy
60-70% correspond to poorly tested software. Expect undiscovered bugs in such software (to be discovered by your users, of course). Because of this, "good" software companies instill internal processes whereby a team cannot release a piece of software unless it passes release gates like "line coverage must be 80% or higher", etc.
Reaching for 100.0% coverage isn't profitable either. You just get a lot less quality improvement for considerably more effort to reach such perfection.
Coverage close to 85-90% is "good enough" for all practical purposes. Some customers require good coverage results.
Code coverage is not a panacea. Coverage generally follows an 80-20 rule. Increasing coverage values becomes difficult with new tests delivering less and less incrementally. If you follow defensive programming principles where failure conditions are often checked at many levels in your software, some code can be very difficult to reach with practical levels of testing. Coverage measurement is not a replacement for good code review and good programming practices.
It’s usually a good idea to start with the most simple metric (usually statement coverage) and move on to the more powerful ones later (branch, path).
The coverage we achieved for the CPE module when all the use cases are executed is as follows:
Function coverage -- 100%
Line coverage -- 82%
Why not 100% code coverage
So why is it not worth striving for 100% test coverage? The main argument is that it's a waste of time. Tests take time to write. Tests take time to refactor when the code changes. Tests take time to execute. Hence, you should only test the parts of your code where you gain leverage. The following are examples of code that I in general prefer not to test:
Accessor methods / value objects.
Overloaded constructors
User interface code
1. Designing your initial test suite to achieve 100% coverage is an even worse idea. It’s a
sure way to create a test suite weak at finding those all-important faults of omission.
Further, the coverage tool can no longer tell you where your test suite is weak - because
it's uniformly weak in precisely the way that coverage can't directly detect.
1. Occasionally I would manually test the production results just to confirm. And occasionally I would encounter a bug that wasn't picked up in the unit tests even though I had 100% coverage. But, this was far far far less than the number of bugs I was used to with more traditional / heavyweight methodologies.
2. Problems on a way to 100% coverage:
The project started unit testing an existing code base of thousands of lines of java code with no automated unit tests. This means that when we wrote our first automated unit test, we had coverage of 0.01% (or less). Partly because of this code base, we never got anywhere near even 33% code coverage although some classes and packages did reach 100%.
My own skills in unit testing patterns were not well developed.
Many developers seemed resistant to unit testing, I regularly heard excuses about why a particular method or approach could not be unit tested. Developers generally did not / do not design for testability. This is especially true if they are not developing test first.
3. This means that even simple getters and setters are tested.
In principle, some programmer could well stuff business logic into a getter or setter (in that case the method name would no longer match its contract). On this point the author is actually right.
4. Typically, when testing a database application, you need to set up a database (often not a trivial task), load seed data, load test data and finally run the unit tests.
The author does not understand the difference between unit tests and other kinds of tests.
5. Sometimes count how many lines of test code you have
I attended an excellent course several years back on Agile development by Martin Fowler. He provided an interesting statistic. He stated that the "best practice" projects he has seen where the bug count is extremely low by industry standards (no more than a few bugs every few months), the percentage of test code is between 40 to 50 percent of the total code base. A number of times I've counted the total number of lines (either in a class or a project) and the number of lines of test code is pretty darn close to 50%. See what your count is!
1. I expect a high level of coverage. Sometimes managers require one. There's a subtle
difference.
Suppose a manager requires some level of coverage, perhaps 85%, as a "shipping gate".
The product is not done - and you can't ship - until you have 85% coverage.
The problem with this approach is that people optimize their performance according to
how they’re measured. You can get 85% coverage by looking at the coverage conditions,
picking the ones that seem easiest to satisfy, writing quick tests for them, and iterating
until done. That's faster than thinking of coverage conditions as clues pointing to
weaknesses in the test design. It's especially faster because thinking about test design
might lead to "redundant" tests that don't increase coverage at all. They only find bugs.
2. Coverage numbers (like many numbers) are dangerous because they're objective but incomplete.
They too often distort sensible action.
3. 85% is a common number. People seem to pick it because that's the number other respectable companies use. I
once asked someone from one of those other respectable companies why they used 85%. He said, "When our
division started using coverage, we needed a number. Division X has a good reputation, so we thought we'd use
the number they use." I didn't follow the trail back through Division X. I have the horrible feeling that, if I traced
it all the way back to the Dawn of Time, I'd find someone who pulled 85% out of a hat. I don't know of any
particular reasons to prefer one high number over another. Some claim that 85% is the point at which achieving
higher coverage gets too hard. Based on my experience, I'd say that whether and where a "knee" in the effort graph
appears is highly dependent on the particular application and the approach to test implementation.
The topic of which coverage metric is "better" could be somewhat religious. There have been academic studies showing that, for example, path coverage at a certain level detects somewhat more bugs than, say, line coverage at the same level. I personally think the actual metric definition is not that important. I'd rather empower all developers on my team with a free and fast tool so that they can track their own coverage (of some kind) early and frequently. An experienced developer will look at the coverage report that links to the source code, drill down a bit, look at the "red" areas, and figure out which, if any, areas of the product he left somewhat under-tested. This is why EMMA opts for a set of simple metrics that are easy to obtain without a lot of runtime overhead.
Function coverage is a measure for verifying that each function (method) is invoked during test execution. In all its simplicity, function coverage is a very easy way to spot the biggest gaps in your code coverage.
Statement Coverage
This metric reports whether each executable statement is encountered.
Also known as: line coverage, segment coverage [Ntafos1988], C1 [Beizer1990 p.75] and basic block coverage. Basic block coverage is the same as statement coverage except the unit of code measured is each sequence of non-branching statements.
I highly discourage using the undescriptive name C1. People sometimes incorrectly use the name C1 to identify decision coverage. Therefore this term has become ambiguous.
The chief advantage of this metric is that it can be applied directly to object code and does not require processing source code. Performance profilers commonly implement this metric.
The chief disadvantage of statement coverage is that it is insensitive to some control structures. For example, consider the following C/C++ code fragment:
int* p = NULL;
if (condition)
    p = &variable;
*p = 123;
Without a test case that causes condition to evaluate false, statement coverage rates this code fully covered. In fact, if condition ever evaluates false, this code fails. This is the most serious shortcoming of statement coverage. If-statements are very common.
Statement coverage does not report whether loops reach their termination condition - only whether the loop body was executed. With C, C++, and Java, this limitation affects loops that contain break statements.
Since do-while loops always execute at least once, statement coverage considers them the same rank as non-branching statements.
Statement coverage is completely insensitive to the logical operators (|| and &&).
Statement coverage cannot distinguish consecutive switch labels.
Test cases generally correlate more to decisions than to statements. You probably would not have 10 separate test cases for a sequence of 10 non-branching statements; you would have only one test case. For example, consider an if-else statement containing one statement in the then-clause and 99 statements in the else-clause. After exercising one of the two possible paths, statement coverage gives extreme results: either 1% or 99% coverage. Basic block coverage eliminates this problem.
One argument in favor of statement coverage over other metrics is that bugs are evenly distributed through code; therefore the percentage of executable statements covered reflects the percentage of faults discovered. However, one of our fundamental assumptions is that faults are related to control flow, not computations. Additionally, we could reasonably expect that programmers strive for a relatively constant ratio of branches to statements.
In summary, this metric is affected more by computational statements than by decisions.
Basic block coverage considers each sequence of non-branching statements as its unit of code instead of individual statements.
Does CoverageMeter support line coverage?
CoverageMeter does not support line coverage because this kind of measurement and statistic is not accurate.
This metric depends on how you format the code. For example, take the following function:

int main()
{
    if (true) return 1;
    foo();
    return 0;
}

Execute it and the line code coverage will produce:

        int main()
        {
HIT     if (true) return 1;
MIS     foo();
MIS     return 0;
        }

Line Coverage: 33%

Reformat the code as follows:

int main()
{
    if (true)
        return 1;
    foo();
    return 0;
}

Execute it and the line code coverage will produce:

        int main()
        {
HIT     if (true)
HIT         return 1;
MIS     foo();
MIS     return 0;
        }

Line Coverage: 50%

Reformat the code as follows:

int main()
{
    if (true)
        return 1;
    foo(); return 0;
}

Execute it and the line code coverage will produce:

        int main()
        {
HIT     if (true)
HIT         return 1;
MIS     foo(); return 0;
        }

Line Coverage: 66%

Reformat the code as follows:

int main()
{
    if (true) return 1; foo(); return 0;
}

Execute it and the line code coverage will produce:

        int main()
        {
HIT     if (true) return 1; foo(); return 0;
        }

Line Coverage: 100%

This small example shows that line coverage produces very different results depending on how the source code is formatted. The decision coverage provided by CoverageMeter is independent of the coding style.
As an example of a loop termination bug: an IndexOutOfBoundsException; change < to <= and you get an exception when accessing an array element.
Loop Termination Decisions
Statement coverage does not call for testing loop termination decisions. Statement coverage only calls for executing loop bodies. In a loop that stops with a C++/C break statement, this deficiency hides test cases needed to expose bugs related to boundary checking and off-by-one mistakes.
An off-by-one error is a range error in programming: an error that occurs when the number of times some action is performed is one too low or one too high.
Loop Termination Decision Example
The C++ function below copies a string from one buffer to another.
char output[100];
for (int i = 0; i <= sizeof(output); i++) {
    output[i] = input[i];
    if (input[i] == '\0') {
        break;
    }
}
The main loop termination decision, i <= sizeof(output), intends to prevent overflowing the output buffer. You can achieve full statement coverage without testing this condition. The overflow decision really ought to use operator < rather than operator <=, so a buffer overflow could occur post-release. You get full statement coverage of this code with any input string of length 100 or less, without exposing the bug.
100% decision coverage implies 100% statement coverage.
Decision Coverage
This metric reports whether boolean expressions tested in control structures (such as the if-statement and while-statement) evaluated to both true and false. The entire boolean expression is considered one true-or-false predicate regardless of whether it contains logical-and or logical-or operators. Additionally, this metric includes coverage of switch-statement cases, exception handlers, and interrupt handlers.
Also known as: branch coverage, all-edges coverage [Roper1994 p.58], basis path coverage [Roper1994 p.48], C2 [Beizer1990 p.75], decision-decision-path testing [Roper1994 p.39]. "Basis path" testing selects paths that achieve decision coverage.
I discourage using the undescriptive name C2 because of the confusion with the term C1.
This metric has the advantage of simplicity without the problems of statement coverage.
A disadvantage is that this metric ignores branches within boolean expressions which occur due to short-circuit operators. For example, consider the following C/C++/Java code fragment:
if (condition1 && (condition2 || function1()))
    statement1;
else
    statement2;
This metric could consider the control structure completely exercised without a call to function1. The test expression is true when condition1 is true and condition2 is true, and the test expression is false when condition1 is false. In this instance, the short-circuit operators preclude a call to function1.
Decision coverage is more focused on branches in the algorithm than on statements. Say you have two branches, the first with one statement and the second with 100 statements. If the tests hit just the first branch, line coverage will show about 1% while branch coverage shows 50%.
The downside of decision coverage is that the measure doesn't take into consideration how the boolean value was gotten -- whether a logical OR was short-circuited or not, for example, leaving whatever code was in the latter part of the statement unexecuted.
Condition Coverage
Condition coverage reports the true or false outcome of each boolean sub-expression, separated by logical-and and logical-or if they occur. Condition coverage measures the sub-expressions independently of each other.
This metric is similar to decision coverage but has better sensitivity to the control flow.
However, full condition coverage does not guarantee full decision coverage. For example, consider the following C++/Java fragment.
bool f(bool e) { return false; }
bool a[2] = { false, false };
if (f(a && b)) ...
if (a[int(a && b)]) ...
if ((a && b) ? false : false) ...
All three of the if-statements above branch false regardless of the values of a and b. However if you exercise this code with a and b having all possible combinations of values, condition coverage reports full coverage.
Condition coverage is not a true superset of decision coverage because it considers each sub-expression independently, not minding about whether the complete expression is evaluated both ways.
100% path coverage implies 100% decision coverage.
Path coverage measures whether each possible path from start (method entry) to finish (return statement, thrown exception) is covered.
Path Coverage
This metric reports whether each of the possible paths in each function have been followed. A path is a unique sequence of branches from the function entry to the exit.
Also known as predicate coverage. Predicate coverage views paths as possible combinations of logical conditions [Beizer1990 p.98].
Since loops introduce an unbounded number of paths, this metric considers only a limited number of looping possibilities. A large number of variations of this metric exist to cope with loops. Boundary-interior path testing considers two possibilities for loops: zero repetitions and more than zero repetitions [Ntafos1988]. For do-while loops, the two possibilities are one iteration and more than one iteration.
Path coverage has the advantage of requiring very thorough testing. Path coverage has two severe disadvantages. The first is that the number of paths is exponential to the number of branches. For example, a function containing 10 if-statements has 1024 paths to test. Adding just one more if-statement doubles the count to 2048. The second disadvantage is that many paths are impossible to exercise due to relationships of data. For example, consider the following C/C++ code fragment:
if (success)
    statement1;
statement2;
if (success)
    statement3;
Path coverage considers this fragment to contain 4 paths. In fact, only two are feasible: success=false and success=true.
Researchers have invented many variations of path coverage to deal with the large number of paths. For example, n-length sub-path coverage reports whether you exercised each path of length n branches. Others variations include linear code sequence and jump (LCSAJ) coverage and data flow coverage.
Keep your code simple. Avoid methods with cyclomatic complexity greater than 10. Not only does this reduce the number of basis paths that you need to test, but it reduces the number of decisions along each path.
Avoid duplicate decisions.
Avoid data dependencies.
Consider the following example:
{see picture on the slide}
The variable x depends indirectly on the object1 parameter, but the intervening code makes it difficult to see the relationship. As a method grows more complex, it may be nearly impossible to see the relationship between the method's input and the decision expression.
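The slide picture is not reproduced here; the following is a hypothetical Java sketch of the kind of hidden data dependency the note describes (all names invented):

public class HiddenDependency {
    static int compute(Widget object1) {
        Config cfg = object1.getConfig();   // step 1: pull configuration out of the parameter
        int limit = cfg.getLimit();         // step 2: intervening derivation
        int x = limit * 2;                  // x now depends indirectly on object1
        if (x > 100) {                      // the decision is driven by object1, but that is hard to see
            return 1;
        }
        return 0;
    }

    // Minimal stand-in types so the sketch compiles on its own
    static class Widget { Config getConfig() { return new Config(); } }
    static class Config { int getLimit() { return 60; } }
}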
How do they work?
Instrumentation: The coverage tools will first instrument the application-under-test DLLs and EXEs. Instrumentation is a process of inserting additional code into the compiled program for the purpose of collecting measurement data while the program is running.
Running the use cases (test cases) on the application: Once the instrumentation process is over, the coverage tool will bring up the application under test. The next step is to execute all the different use cases on the application under test. These use cases can be either automated or manual. While the user runs different use cases on the application, in the background the coverage tool will analyze the coverage of the application code with respect to the use cases we ran.
Getting the statistics
Source Code Instrumentation - This approach adds instrumentation statements to the source code and compiles the code with the normal compile tool chain to produce an instrumented assembly.
If it's done to source code, most tools don't change the original source; instead, they modify a copy
that's then passed to the compiler in a way that makes it look like the original
Intermediate code instrumentation - Here the compiled class files are instrumented by adding new bytecodes, and a new instrumented class is generated.
Runtime Information collection- This approach collects information from the runtime environment as the code executes to determine coverage information
The approaches differ in the stage at which the coverage-collection code is inserted.
The first two require recompilation.
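As an illustration of the source (or bytecode) instrumentation idea, a hand-written Java sketch of probe insertion; real tools generate this automatically, use far more compact probe structures, and write the collected data to a file when the JVM exits, so treat this only as a conceptual sketch:

public class InstrumentedAbs {
    // One flag per probe; a real coverage tool dumps this array to a data file after the run.
    static final boolean[] PROBES = new boolean[2];

    static int abs(int x) {
        PROBES[0] = true;       // probe 0: method entry / first basic block executed
        if (x < 0) {
            PROBES[1] = true;   // probe 1: the true-branch of the decision was taken
            return -x;
        }
        return x;
    }
}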
More times than I can count, I have heard someone say "We're trying for XX percent code coverage with our unit tests". If XX is 100% are you guaranteed to have no defects?
The first thing that comes to my mind when people start rambling on about unit tests is: What about integration and system testing? There are a number of blog entries out there that talk about the difference between the types of testing. It suffices to say that pure unit testing, even at 100% code coverage, is only going to reveal a small percentage of an application's defects.
So what are you actually trying to do?
Test for a reason -- don't just test to test. You can get into a cycle where you're writing unit test upon unit test and still releasing applications that are defect riddled. First you need to define a quality metric (a way to measure the quality of the application). "As few bugs as possible" isn't it! Ask yourself questions such as: What's a bug or defect? and How is the priority and severity of a defect determined? (e.g. Are spelling errors bugs? They may not be to a developer, but they may be show stoppers to a client.) Everyone wants to jump in and test up the wazoo but the problem is, if you test the wrong things and you don't know what a bug or defect is then your testing is potentially wasting time.
Let's put this another way: let's say you have 80% coverage with all of your tests. What's preventing that 20% that you're not testing from containing 90% of the defects and 100% of the P1 show-stoppers?
Your goal is to identify the cross product of the areas of highest quality risks and the areas that are most important to the user of the application. These areas must have enough testing to ensure that the desired level of quality is met. For example, what if the 20% that you did not test just so happens to be the login page? The client cannot log into the application! So what's the point of testing or even writing the rest of the app? This is an obvious over-trivialization, as the login page is an obvious element to test. But in reality, the show stoppers commonly fall into a trivial category. A number of projects that I've performed triage on in the past suffered from this problem to an alarming degree. Months of work and testing went into components of the product that, at the end of the day, were not high on the client's list and were overshadowed by trivial defects.
Plan ahead. Identify what determines quality in your application. Test effectively.