Computer Science classes don't teach testing. Testing is as critical to software engineering as writing code. Here I show what CS programs should have taught, but didn't.
10. Without Tests, Design and Enhanceability Degrade and Refactoring Is Unsafe
[Figure: cost of change compared across three cases: C, C + n, and C x n]
Example: Dave Nicolette
16. The Waterfall Ideal Writes All the Code and Then Tests It
[Figure: Code, then Test]
Ref: “BDD in 5 minutes” video by Corey Haines
"How test-driven development works" blog by J. B. Rainsberger
29. But what if our initial vision was flawed?
[Figure: waterfall stages labeled Business Vision, Product Concept, High Level Requirements, Requirements Analysis, Design, Code, Test, and Deployed, with the starting point marked "wrong !!!"]
Entire effort fails at great expense !!!
31. Who Are Tests For?
• Ourselves
• Our team
• Other devs who use our code
• Anyone who has to maintain our code later
• Other stakeholders (security, production support,
call center, managers, etc.)
• Our company or organization
• Our end users
33. What Types of Tests
• Tests with a business focus
• Tests with a technology focus
• Tests that support software development
• Tests that critique the product we built
• Most types of tests should be automated or supported by tools
• A few types of tests must be manual
36. Strive For Testing Pyramid
[Figure: testing pyramid. From top to bottom: Automated GUI Tests, BDD Tests, Integration Tests, Automated Unit Tests. Test speed runs from painfully slow at the top to lightning fast at the bottom.]
37. Where Automated Tests are Run
• Our Desktops or Laptops
• The Continuous Integration Server
• A Performance Test Server
• A Security Test Server
• Reliability and Failover Test Server
• Etc.
38. CI Testing Cycle with Additional Testing
[Figure: developers, a version control repository, a CI server, a performance test server, and a deploy server, connected by the numbered steps below]
1. Run Tests Locally
2. Check-in
3. Monitor changes (CI Server)
4. Update
5. Build
6. Run Tests
7. Send Build & Test Report
8a. Frequently: Performance, Security, etc. Test Apps
8b. Deploy App at Release
39. Black Box vs. White Box Testing
[Figure: a black box and a white box, each with Input and Output]
• Use Black Box Tests.
• White Box Tests are extremely brittle !!!
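A minimal sketch of why white box tests are brittle, assuming a hypothetical Cart class that is later refactored (neither class comes from the deck). The black box check uses only inputs and outputs and survives the refactoring. The white box check reads a private attribute and breaks even though the behavior is unchanged.

```python
class Cart:
    """Hypothetical shopping cart; items are kept in a private list."""

    def __init__(self):
        self._items = []

    def add(self, name, price, quantity=1):
        self._items.append((name, price, quantity))

    def total(self):
        return sum(price * quantity for _, price, quantity in self._items)


class RefactoredCart:
    """Same public behavior, different internals (a running total only)."""

    def __init__(self):
        self._total = 0.0

    def add(self, name, price, quantity=1):
        self._total += price * quantity

    def total(self):
        return self._total


def black_box_check(cart_class):
    # Uses only the public interface: inputs in, output out.
    cart = cart_class()
    cart.add("book", 12.5, 2)
    cart.add("pen", 1.5)
    return cart.total() == 26.5


def white_box_check(cart_class):
    # Reaches into a private attribute, so it depends on the implementation.
    cart = cart_class()
    cart.add("book", 12.5, 2)
    return getattr(cart, "_items", None) == [("book", 12.5, 2)]
```

The black box check passes for both classes. The white box check passes only for the original Cart.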
40. Test Driven Development vs. Test After
TDD
• Build code in working baby steps.
• Check-in frequently.
• Deploy frequently, if desired.
• Constant testing forces loose coupling & better design.
• Frequent refactoring further improves design.
• Adding features is simplified.
41. Test Driven Development vs. Test After
Test After
• Test after often creates false positives, where tests pass for the wrong reasons.
• It is harder to write tests afterward, because the code wasn’t designed for testability.
• There is never enough time to write the tests afterward.
• Fixing bugs is a nightmare.
42. Behavior Driven Development vs. Test After
BDD
• Similar to the TDD process, but at a higher level.
• Active Product Owner.
• Unambiguous spec.
• Automates the spec, which is better than untestable documentation.
• Catches broken features.
• Proves when a feature is complete.
• Adding features is simplified.
43. Behavior Driven Development vs. Test After
Test After
• Requirements unclear until too late.
• Code written to wrong requirements.
• Never know when you are really done.
• Often manual and arbitrary.
• Never enough time to write the tests afterward.
• Bugs undetected.
• Fixing bugs is a nightmare.
45. Don’t Use Manual Tests - Automate
[Figure: time needed vs. time available for testing across iterations, split into new features and regression]
Simplification: e.g. doesn't include interaction of new & existing features
46. Bug Growth in Software without Automated Regression Testing
[Figure: bugs plotted against time across releases 1 through 7, with each release labeled by its code and fix phases]
47. Delay in Testing (both manual & capture/playback)
[Figure: development followed by testing, with the delay between them shown as months, weeks, or days]
Even in the best case, testing comes after development coding.
48. Test Frequently ! ! !
Release Testing vs. Continuous Testing
Ref: “Lean from the Trenches” by Henrik Kniberg
49. End Testing vs. Continuous Testing
Ref: “Lean from the Trenches” by Henrik Kniberg
The more often you test, the more time you save.
Use TDD & BDD to test every few minutes.
52. In BDD, Team and Business Collaborate on Acceptance Criteria
User Story: Vetting Sighting
In order to show confirmation of a sighting,
As a US-CERT analyst,
I want to mark a sighting as vetted
• WannaCry sighting is viewed by Charley, a US-CERT analyst
• Charley marks sighting as vetted
• Tracy from Treasury searches for sightings vetted today
• Tracy sees WannaCry as a confirmed sighting
53. Acceptance Criteria Become Executable Tests
[Figure: stacked feature files: Trusted partner finds un-vetted sighting; Trusted partner finds vetted sighting; Analyst vets a sighting (front window, shown in full)]
Feature: Analyst vets a sighting
  In order to show confirmation of a sighting,
  As a US-CERT analyst,
  I want to mark a sighting as vetted
  Scenario: Un-vetted Sighting
    Given that "WannaCry" is a sighting
    And an analyst has not vetted that sighting
    When the status of "WannaCry" is checked
    Then the status of "WannaCry" should be un-vetted
  Scenario: Vetted Sighting
    Given that "WannaCry" is a sighting
    And an analyst has vetted that sighting
    When the status of "WannaCry" is checked
    Then the status of "WannaCry" should be vetted
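A rough sketch of how the two scenarios above could turn into executable tests, written as plain pytest-style functions rather than with a Gherkin runner. SightingRegistry and its methods are hypothetical names invented for this illustration; the real application's API is not shown in the deck.

```python
class SightingRegistry:
    """Hypothetical in-memory model of sightings and their vetting status."""

    def __init__(self):
        self._vetted = {}

    def add_sighting(self, name):
        self._vetted[name] = False

    def vet(self, name):
        if name not in self._vetted:
            raise KeyError(name)
        self._vetted[name] = True

    def status(self, name):
        return "vetted" if self._vetted[name] else "un-vetted"


def test_unvetted_sighting():
    # Given that "WannaCry" is a sighting and no analyst has vetted it
    registry = SightingRegistry()
    registry.add_sighting("WannaCry")
    # When the status of "WannaCry" is checked
    status = registry.status("WannaCry")
    # Then the status should be un-vetted
    assert status == "un-vetted"


def test_vetted_sighting():
    # Given that "WannaCry" is a sighting and an analyst has vetted it
    registry = SightingRegistry()
    registry.add_sighting("WannaCry")
    registry.vet("WannaCry")
    # When the status of "WannaCry" is checked
    status = registry.status("WannaCry")
    # Then the status should be vetted
    assert status == "vetted"
```

Each Given, When, Then line maps to one commented step, so the test reads in the same order as the business-language scenario.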
54. Automated Acceptance Tests Results
[Figure: stacked test result windows, one per feature: Imports TEWI sightings; CERT analyst finds a recent TEWI sighting; Trusted partner finds un-vetted sighting; Trusted partner finds vetted sighting; Analyst vets a sighting]
The first four windows each show the same RSpec example:
context "a simple line item should know its name, quantity and price" do
  let(:purchase) { LineItem.new("1 book at 12.49") }
  it "should have a name of 'book'" do
    purchase.name.should == 'book'
  end
  it "should have a quantity of 1" do
    purchase.quantity.should == 1
  end
  it "should have a price of 12.49" do
    purchase.price.should == 12.49
  end
end
Feature: Analyst vets a sighting
  In order to show confirmation of a sighting,
  As a US-CERT analyst,
  I want to mark a sighting as vetted
  Scenario: Un-vetted Sighting
    Given that "WannaCry" is a sighting
    And an analyst has not vetted that sighting
    When the status of "WannaCry" is checked
    Then the status of "WannaCry" should be un-vetted
  Scenario: Vetted Sighting
    Given that "WannaCry" is a sighting
    And an analyst has vetted that sighting
    When the status of "WannaCry" is checked
    Then the status of "WannaCry" should be vetted
55. An ATM App User Stories
Cash Withdrawal:
In order to get money when the bank is closed
As a bank customer
I want to withdraw cash at the ATM
Check Deposit:
In order to deposit my checks when the bank is closed
As a bank customer
I want to deposit checks at the ATM
Transfer to Savings:
In order to earn interest even when the bank is closed
As a bank customer
I want to transfer money from checking to savings at the ATM
Transfer to Checking:
In order to not overdraw my account when the bank is closed
As a bank customer
I want to transfer money from savings to checking at the ATM
56. High Level Scenario Titles Become Acceptance Criteria When They Have Details
Feature: Cash Withdrawal
In order to get money when the bank is closed
As a bank customer
I want to withdraw cash at the ATM
First Cut Acceptance Criteria
• Successful Withdrawal
– Less than balance
– Equal to balance
• Withdrawal Failed Due to Insufficient Funds
• Withdrawal Failed Because Cash Dispenser Doesn’t Dispense One Dollar Bills
• Withdrawal Failed Because Account Closed
57. Detailed Acceptance Criteria
• Successful Withdrawal: less than balance
Given my account has a starting balance of $100
When I withdraw $20
Then $20 should be dispensed
And the ending balance of my account should be $80
Given, When, Then format: business language with unambiguous detail
These detailed acceptance criteria are concrete and become executable tests.
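The detailed criterion above might translate into tests like the following sketch. Account, InsufficientFunds, and withdraw are hypothetical names chosen for illustration, and the insufficient funds behavior (raise an error and leave the balance untouched) is an assumption, since the deck only names that scenario.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal exceeds the balance (assumed behavior)."""


class Account:
    """Hypothetical account model used only for this illustration."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise InsufficientFunds(f"cannot withdraw {amount} from {self.balance}")
        self.balance -= amount
        return amount  # the amount dispensed


def test_successful_withdrawal_less_than_balance():
    account = Account(100)            # Given a starting balance of $100
    dispensed = account.withdraw(20)  # When I withdraw $20
    assert dispensed == 20            # Then $20 should be dispensed
    assert account.balance == 80      # And the ending balance should be $80


def test_successful_withdrawal_equal_to_balance():
    account = Account(100)
    assert account.withdraw(100) == 100
    assert account.balance == 0


def test_withdrawal_fails_due_to_insufficient_funds():
    account = Account(100)
    try:
        account.withdraw(120)
    except InsufficientFunds:
        pass
    else:
        raise AssertionError("expected InsufficientFunds")
    assert account.balance == 100  # balance is unchanged after the failure
```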
61. In TDD Develop Everything Including Tests in Baby Steps
• Write a tiny test
• Watch it fail
• Write only enough code to make it pass - KISS
• Watch it pass
• If the code gets messy, refactor in baby steps
• Repeat
62. In TDD First Write a New Test
pytest (Python Test Tool) Example
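The slide's original pytest listing did not survive extraction, so the following is only a stand-in sketch of one red/green cycle. The function name word_count is hypothetical. In practice the test is written and run first, while word_count does not exist yet, and the implementation is added afterward.

```python
# Step 1 (RED): write a tiny test first. Run on its own, it fails with a
# NameError because word_count has not been written yet.
def test_counts_words_separated_by_spaces():
    assert word_count("hello world") == 2


# Step 2 (GREEN): write only enough code to make that test pass.
def word_count(text):
    return len(text.split())


# Step 3: the next tiny test drives the next baby step.
def test_empty_string_has_no_words():
    assert word_count("") == 0
```

Saved in a file such as test_word_count.py (an assumed file name), these functions would be collected by pytest because their names start with test_.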
69. Short of the TDD ideal: Bugs
Problem: Bugs
• The code already exists
• Bug behavior is what the code shouldn’t do
Solution: Write a failing “RED” test for the bug
• Write an automated test before doing anything else (do NOT touch the code yet)
• The test would pass if the code worked, but it won’t, because the code doesn’t work
– The automated test may be a unit test, integration test, acceptance test, etc.
– Mostly this will be a higher level test, because you don’t know where the bug lives
• Run the test (which will fail “RED”, because the bug still exists)
• Find the cause of the bug and fix it
• Run the test (which will pass “GREEN” now)
• Check everything in and add the test to the automated regression test suite (CI)
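The steps above can be sketched with a made-up bug report: a name formatter that leaves a double space when the middle name is missing. Both function names are hypothetical, and the buggy and fixed versions sit side by side only so the example can show RED and GREEN in one place.

```python
def full_name_buggy(first, middle, last):
    # Existing code: produces "Ada  Lovelace" (two spaces) when middle is "".
    return f"{first} {middle} {last}"


def full_name_fixed(first, middle, last):
    # The fix, written only after the failing test exists.
    return " ".join(part for part in (first, middle, last) if part)


def check_missing_middle_name(full_name):
    # The regression test: describes what the code should do.
    assert full_name("Ada", "", "Lovelace") == "Ada Lovelace"


def outcome(full_name):
    # Reports RED or GREEN instead of raising, for demonstration only.
    try:
        check_missing_middle_name(full_name)
    except AssertionError:
        return "RED"
    return "GREEN"
```

Here outcome(full_name_buggy) reports RED and outcome(full_name_fixed) reports GREEN. The check then stays in the regression suite so the bug cannot return unnoticed.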
70. Test Data, Expected Results, Actual Results and Pass/Fail Outcome in a Simple Division Example
Test Input Data (Numerator) | Test Input Data (Denominator) | Expected Result (Quotient) | Actual Result (Quotient) | Pass / Fail
10                          | 2                             | 5                          | 5                        | Pass
100                         | 4                             | 25                         | 25                       | Pass
12.6                        | 3                             | 4.2                        | 4                        | Fail
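A table like the one above maps directly onto a table-driven test. The sketch below assumes the failing row comes from floor division that drops the fraction; the deck does not show the code under test, so buggy_divide is a guess made for illustration.

```python
import math

# (numerator, denominator, expected quotient), copied from the table.
CASES = [
    (10, 2, 5),
    (100, 4, 25),
    (12.6, 3, 4.2),
]


def buggy_divide(numerator, denominator):
    # Assumed bug: floor division discards the fractional part.
    return numerator // denominator


def divide(numerator, denominator):
    return numerator / denominator


def run_cases(function):
    """Return one Pass/Fail outcome per row of the table."""
    outcomes = []
    for numerator, denominator, expected in CASES:
        actual = function(numerator, denominator)
        # Compare within a tolerance rather than for exact float equality.
        outcomes.append("Pass" if math.isclose(actual, expected) else "Fail")
    return outcomes
```

With these definitions, run_cases(buggy_divide) fails only on the 12.6 / 3 row, and run_cases(divide) passes every row.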
71. Erroneous Conditions to Check
• Bogus data, such as file names like “!*W:X&Gi/w~WQ@”
• Poorly formatted data, like an email address without a top level domain (e.g. “fred@foobar.”)
• Empty or missing values (0, 0.0, “”, null)
• Values far in excess of reasonable expectations, such as a person’s age of 10,000 years
• Duplicates in lists that shouldn’t have duplicates
• Ordered lists that aren’t and vice-versa (can a sort method handle a pre-sorted list or one in reverse sorted order?)
• Things that happen out of expected order, such as using functionality that requires login without logging in
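A few of these conditions, sketched as checks against small hypothetical validators. The email pattern is deliberately simplistic and the age limit of 130 is an assumed bound; both exist only to give the tests something to exercise.

```python
import re

# Simplistic pattern for illustration; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s.]+(\.[^@\s.]+)+$")
MAX_REASONABLE_AGE = 130  # assumed upper bound


def is_valid_email(text):
    return bool(text) and EMAIL_PATTERN.match(text) is not None


def is_valid_age(age):
    return age is not None and 0 <= age <= MAX_REASONABLE_AGE


def has_duplicates(items):
    return len(set(items)) != len(items)


def test_erroneous_inputs():
    assert not is_valid_email("fred@foobar.")   # no top level domain
    assert not is_valid_email("")               # empty value
    assert not is_valid_email(None)             # missing value
    assert is_valid_email("fred@foobar.com")    # well-formed control case
    assert not is_valid_age(10_000)             # far beyond a reasonable age
    assert not is_valid_age(None)
    assert has_duplicates(["a", "b", "a"])
    assert sorted([1, 2, 3]) == [1, 2, 3]       # pre-sorted input
    assert sorted([3, 2, 1]) == [1, 2, 3]       # reverse-sorted input
```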
72. State Based Testing
The next state is dependent on the current state of the system.
Use transition tables to think about test data and expected results.
Current Color | Next Color
Red           | Green
Green         | [not recovered]
[not recovered] | Red
73. State Based Testing with Ruby’s RSpec
Use the table for the test data and expected results.
[Figure: RSpec listing alongside the Current Color / Next Color transition table]
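The RSpec listing is not recoverable from the slide, so here is a comparable sketch in Python. It assumes a three-color traffic light (red to green to yellow to red). That full cycle is an assumption, since only part of the transition table survives, and TrafficLight is a hypothetical class.

```python
# Transition table: current color -> next color (assumed three-color cycle).
TRANSITIONS = {
    "red": "green",
    "green": "yellow",
    "yellow": "red",
}


class TrafficLight:
    """Hypothetical state machine driven by the table above."""

    def __init__(self, color="red"):
        if color not in TRANSITIONS:
            raise ValueError(f"unknown color: {color}")
        self.color = color

    def advance(self):
        self.color = TRANSITIONS[self.color]
        return self.color


def test_every_transition_in_the_table():
    # The table supplies both the test data and the expected results.
    for current, expected_next in TRANSITIONS.items():
        light = TrafficLight(current)
        assert light.advance() == expected_next
```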
74. Test Only Critical Boolean Combinations
• Consider the requirements in the State of Maryland for using a learner’s permit to learn to drive.
• 4 Conditions
– Cell phone not on
– Passenger must be licensed
– Passenger must be 21 or older
– Passenger must have 3 years as a licensed driver
To be legal, a driver with a Learner’s Permit must not use his or her cell phone in any way, so to be safe the cell phone shouldn’t be on. Also, the learner must be accompanied by a licensed driver 21 years of age or older, who has at least 3 years of fully licensed driving experience.
76. Simplify to 5 Critical Combinations
OK to Drive? | Cell Phone On? | Licensed Passenger? | Passenger 21 Or Older? | Passenger Driving 3 or more Years
False        | True           | Don’t Care          | Don’t Care             | Don’t Care
False        | Don’t Care     | False               | Don’t Care             | Don’t Care
False        | Don’t Care     | Don’t Care          | False                  | Don’t Care
False        | Don’t Care     | Don’t Care          | Don’t Care             | False
True         | False          | True                | True                   | True
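The five critical combinations can be written as a small data-driven test. The function ok_to_drive and its parameter names are hypothetical. Each "Don't Care" cell is filled with the value that would otherwise allow driving, so that exactly one condition is responsible for each False outcome.

```python
def ok_to_drive(cell_phone_on, passenger_licensed,
                passenger_21_or_older, passenger_3_plus_years):
    """Hypothetical rule check for a learner's permit holder."""
    return (not cell_phone_on
            and passenger_licensed
            and passenger_21_or_older
            and passenger_3_plus_years)


# (expected result, (cell phone on, licensed, 21 or older, 3 or more years))
CRITICAL_CASES = [
    (False, (True, True, True, True)),     # cell phone on
    (False, (False, False, True, True)),   # passenger not licensed
    (False, (False, True, False, True)),   # passenger under 21
    (False, (False, True, True, False)),   # passenger has under 3 years
    (True, (False, True, True, True)),     # every condition satisfied
]


def test_critical_combinations():
    for expected, inputs in CRITICAL_CASES:
        assert ok_to_drive(*inputs) == expected
```

Five cases stand in for the 16 rows a full truth table of four Booleans would need.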
78. Data Focused Tests (On, Off, In, Out)
• On Point – value is on a boundary
• Off Point – value is not on a boundary
• In Point – value within, but not on a boundary
• Out Point – value is outside all boundaries
79. An Integer Between 5 and 15 Inclusive
[Figure: number line marking the On, Off, In, and Out points for the range 5 to 15]
Never check a floating point number for exact equality.
Always check that a floating point number is within some range, e.g.
x <= 5.49, x >= 9.99, 5.0 <= x && x <= 15.0
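A sketch of both ideas for the 5 to 15 range. The specific off, in, and out values are choices made for this illustration (the off points here are the values adjacent to each boundary), and in_range is a hypothetical function under test.

```python
import math

LOWER, UPPER = 5, 15


def in_range(value):
    return LOWER <= value <= UPPER


# (test value, expected result, kind of point)
BOUNDARY_CASES = [
    (5, True, "on point"),
    (15, True, "on point"),
    (4, False, "off point"),
    (16, False, "off point"),
    (10, True, "in point"),
    (-100, False, "out point"),
    (100, False, "out point"),
]


def failed_cases():
    """Return the cases whose actual result differs from the expectation."""
    return [(value, kind) for value, expected, kind in BOUNDARY_CASES
            if in_range(value) != expected]


def floats_equal(a, b, tolerance=1e-9):
    # Compare within a tolerance instead of using ==.
    return math.isclose(a, b, abs_tol=tolerance)
```

Exact equality is unreliable for floats: 0.1 + 0.2 == 0.3 is False in Python, while floats_equal(0.1 + 0.2, 0.3) is True.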
80. What’s a Test Double
• Think of a Test Double as a stand-in for something that is difficult or expensive to test
• Similar to a stunt double who takes the place of a starring actor
81. Test Doubles: Dummies and Fakes
Gerard Meszaros’ Test Doubles come in four basic types: Dummies, Fakes, Stubs and Mocks.
Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists. For instance, when testing controller logic in isolation, you don’t need a real Session object to test the behavior. Even statically typed languages like Java only require a type match. Instead of using a real session, create a dummy class that implements the HttpSession interface with empty methods and pass that.
Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in-memory database is a good example).
82. Test Doubles: Stubs and Mocks
Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test. Stubs may also record information about calls, such as an email gateway stub that remembers the messages it 'sent', or maybe only how many messages it 'sent'.
Mocks are pre-programmed with expectations which form a specification of the calls they are expected to receive. They can throw an exception if they receive a call they don't expect and are checked during verification to ensure they got all the calls they were expecting.
Many people don’t differentiate between the types of Test Doubles and use the word Mock for all of them.
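The four kinds of doubles can be sketched in Python with the standard library's unittest.mock. The register function and the store, mailer, and audit log collaborators are hypothetical names made up for this example.

```python
from unittest.mock import Mock


class InMemoryUserStore:
    """Fake: a working implementation with a shortcut (a dict, not a database)."""

    def __init__(self):
        self._users = {}

    def save(self, user_id, name):
        self._users[user_id] = name

    def find(self, user_id):
        return self._users.get(user_id)


class StubMailGateway:
    """Stub: gives a canned answer and remembers the messages it 'sent'."""

    def __init__(self):
        self.sent = []

    def send(self, recipient, body):
        self.sent.append((recipient, body))
        return True


def register(user_id, name, store, mailer, audit_log):
    """Hypothetical code under test. It never touches audit_log."""
    store.save(user_id, name)
    mailer.send(name, "Welcome!")
    return store.find(user_id)


def test_with_dummy_fake_and_stub():
    dummy_audit_log = object()  # Dummy: only fills the parameter list
    mailer = StubMailGateway()
    assert register(1, "fred", InMemoryUserStore(), mailer, dummy_audit_log) == "fred"
    assert mailer.sent == [("fred", "Welcome!")]


def test_with_mock():
    mailer = Mock()  # Mock: the expected call is verified after the run
    register(2, "wilma", InMemoryUserStore(), mailer, object())
    mailer.send.assert_called_once_with("wilma", "Welcome!")
```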
83. When to Use Test Doubles
• Use Mocks or other Test Doubles to test classes and methods in isolation
• Use Mocks or other Test Doubles to make unit tests fast
– e.g. Mock out a database interaction
• Use Mocks or other Test Doubles to simulate environment error conditions
– e.g. Handling a network failure
• Use Mocks or other Test Doubles to simulate timing related issues
– e.g. Midnight, January 1, 2000 – test handling of Y2K
84. When NOT to Use Test Doubles
• Don’t use Mocks or other Test Doubles for integration testing
• Don’t use Mocks or other Test Doubles for smoke testing
• Don’t use Mocks or other Test Doubles for performance testing
• Don’t use Mocks or other Test Doubles when you need to test the real system
85. Funny Time Class Uses Real Time of Day
• This class uses a helper method to get the time.
• That helper uses the Date class to call system time.
• System time will constantly vary, causing testing challenges.
86. Mock Object Substituted for the Real Time
• Automated tests can’t be dependent on the time of day to pass.
• So here Mockito is used to mock out 10:23 and Noon.
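The deck's Java and Mockito listings are not in the extracted text. As a stand-in, the sketch below shows the same idea in Python with unittest.mock instead of Mockito. FunnyTime, its clock parameter, and the message wording are all hypothetical.

```python
from datetime import datetime, time
from unittest.mock import Mock


class FunnyTime:
    """Hypothetical class that reports the time through a replaceable clock."""

    def __init__(self, clock=datetime.now):
        # The clock is injected so tests can substitute a fixed time.
        self._clock = clock

    def message(self):
        now = self._clock().time()
        if now == time(12, 0):
            return "It is high noon"
        return f"It is {now.hour}:{now.minute:02d}"


def test_message_at_10_23():
    fixed_clock = Mock(return_value=datetime(2020, 1, 1, 10, 23))
    assert FunnyTime(clock=fixed_clock).message() == "It is 10:23"


def test_message_at_noon():
    fixed_clock = Mock(return_value=datetime(2020, 1, 1, 12, 0))
    assert FunnyTime(clock=fixed_clock).message() == "It is high noon"
```

Production code calls FunnyTime() and gets the real system time. Tests pass a mock clock, so they give the same result at any time of day.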
87. Legacy Critical to your Organization, But …
• Not under source control
• No tests = unsafe to change
• Comments and documents out of date
• Very hard to understand
88. Legacy Critical to your Organization, But …
• Very hard to change – big ball of mud
• Buggy and insecure
• In languages few understand or very old versions of existing languages
• Need to migrate to web, cloud, mobile, etc., but can’t
91. Legacy Code Issues
• What about legacy code that has no tests?
• Can’t stop writing new features to write tests for millions of lines of code
• But, if parts of the code work and are never changed, leave those parts alone.
92. Write Tests Around Code to be Changed
• Write tests for the code you need to modify to create new features.
• Writing tests around the code you need to change makes that code easier to understand.
• It can be much faster to write tests around changing legacy code than to try to understand it just by reading it.
• And with tests, the code is safe to enhance and refactor.
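One common way to write tests around existing code is a characterization test: run the code, record what it actually returns, and pin those values down before changing anything. The legacy function below is invented for illustration and is not from the deck.

```python
def legacy_shipping_cost(weight, express):
    # Hypothetical legacy code whose rules nobody remembers.
    cost = 5
    if weight > 10:
        cost = cost + (weight - 10) * 2
        if express:
            cost = cost * 2 + 1
    elif express:
        cost = cost + 7
    return cost


# Observed behavior, recorded by running the code as it stands today.
CHARACTERIZATION = [
    ((1, False), 5),
    ((1, True), 12),
    ((10, False), 5),
    ((11, False), 7),
    ((11, True), 15),
    ((20, True), 51),
]


def test_current_behavior_is_preserved():
    for arguments, observed in CHARACTERIZATION:
        assert legacy_shipping_cost(*arguments) == observed
```

The expected values document what the code does now, not what it should do. A refactoring is safe as long as this test keeps passing.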
93. Find Seams to Break Code Apart
• In Legacy Code a seam is:
– “A place you can change the code’s behavior without changing the source code” (Michael Feathers)
• For testing, isolate some code from the other code that it depends on.
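A small sketch of one kind of seam, where a method is overridden in a test-only subclass so the class under test is not edited. InvoiceCalculator and the 10% rate are hypothetical.

```python
class InvoiceCalculator:
    """Hypothetical legacy class with a hard-wired external dependency."""

    def _current_tax_rate(self):
        # In the legacy code this would call a remote tax service.
        raise RuntimeError("remote tax service is not available in tests")

    def total(self, net_amount):
        return round(net_amount * (1 + self._current_tax_rate()), 2)


class TestableInvoiceCalculator(InvoiceCalculator):
    """Uses the seam: overrides the dependency without touching the original."""

    def _current_tax_rate(self):
        return 0.10  # fixed rate chosen for the test


def test_total_adds_tax():
    assert TestableInvoiceCalculator().total(100) == 110.0
```

The calculation in total is exercised as written. Only the call that reaches outside the class is replaced.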
94. Starting Gilded Rose Legacy Code
• Confusing mess.
• It’s actually worse than it looks and it looks bad.
• To change it is to break it.
98. Code Coverage for Gilded Rose Tests
• Getting tests fully around the code now makes it safe to refactor and to enhance.
• Having written the tests, the developers now understand the code.
99. Smoke Test the Application Infrastructure
Lightweight, non-exhaustive tests, which prove the infrastructure your app needs is there. Test:
– Are all needed databases available?
– Are needed sockets, ports and networks available?
– Is the web server up?
– Are all needed services available?
– Are login services and other credential services up and available?
– Is the CI server, Source Control, etc. running and available?
– Is the app running?
– etc.
If there is a DevOps team, you may write smoke tests together.
101. Use the Best Code Coverage Available
C0 – Line coverage analysis measures which lines of code have been executed. C0 coverage is typically used to find the areas of your program that have not been sufficiently tested, i.e. those that were not run by any of your test cases.
C1 – Branch coverage analysis measures which of the possible branches of conditional statements have been tested. It is easy to have 100% C0 line coverage and only partial C1 branch coverage, because in many languages an if-then or if-then-else statement may be written entirely on one line.
C2 – Path coverage measures which of the possible execution paths through your code were tested. Paths combine linear code execution with alternate branches. Since each new conditional gives rise to new path choices, the number of possible unique paths gets huge.
102. Code Coverage Tools
C0 – Every automated code coverage tool measures at least Line Coverage.
C1 – Most automated code coverage tools (both open source and commercial) measure Branch coverage. For instance, the open source tools rcov for Ruby and Cobertura and Emma for Java measure both Line and Branch coverage, as does Clover, a commercial Java test tool.
C2 – Complete path coverage throughout an entire large application is almost unheard of because of its complexity and cost. Path coverage through multiple limited sections of code is much more feasible. All path coverage tools also include line and branch coverage. All are commercial.
103. C0 – Line Coverage
public class CoverageExample {
    // This code is logically equivalent to the next example, but because of the
    // code structure, untested branches are detected by line coverage.
    public int echo(int x, boolean state1, boolean state2, boolean state3) {
        if (state1) {
            x++;
        }
        if (state2) {
            x--;
        }
        if (state3) {
            x = x;
        }
        return x;
    }
}

import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.Test;

// With every condition false, the bodies of the three if statements never run,
// so line coverage reports those lines as untested.
public class CoverageTestLineOnly {
    CoverageExample lineBranchPath;

    @Before
    public void runBeforeEveryTest() {
        lineBranchPath = new CoverageExample();
    }

    @Test
    public void testReturnInput0FalseFalseFalse() {
        assertEquals(0, lineBranchPath.echo(0, false, false, false));
    }
}
104. C0 – Line Coverage Limitations
public class CoverageExample {
    // This code is logically equivalent to the prior example, but because of the
    // code structure, untested branches are not detected by line coverage.
    public int echo(int x, boolean state1, boolean state2, boolean state3) {
        if (state1) { x++; }
        if (state2) { x--; }
        if (state3) { x = x; }
        return x;
    }
}

import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.Test;

// This test provides 100% line coverage, but it only exercises the false
// outcome of each condition, so branch coverage is partial.
public class CoverageTestLineOnly {
    CoverageExample lineBranchPath;

    @Before
    public void runBeforeEveryTest() {
        lineBranchPath = new CoverageExample();
    }

    @Test
    public void allFalse() {
        assertEquals(0, lineBranchPath.echo(0, false, false, false));
    }
}
105. C1 – Branch Coverage
public class CoverageExample {
    // This code is identical to the prior example.
    public int echo(int x, boolean state1, boolean state2, boolean state3) {
        if (state1) { x++; }
        if (state2) { x--; }
        if (state3) { x = x; }
        return x;
    }
}

import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.Test;

// Adding the new test provides 100% branch coverage, but doesn’t detect the bug.
public class CoverageTestLineAndBranch {
    CoverageExample lineBranchPath;

    @Before
    public void runBeforeEveryTest() {
        lineBranchPath = new CoverageExample();
    }

    @Test
    public void allFalse() {
        assertEquals(0, lineBranchPath.echo(0, false, false, false));
    }

    @Test
    public void allTrue() {
        // x++ and x-- cancel out, so the result is still 0.
        assertEquals(0, lineBranchPath.echo(0, true, true, true));
    }
}
106. C2 – Path Coverage Explosion
state1 | state2 | state3
false  | false  | false
false  | false  | true
false  | true   | false
false  | true   | true
true   | false  | false
true   | false  | true
true   | true   | false
true   | true   | true
Testing all possible paths through N Boolean conditionals takes 2 ^ N tests. For our example of 3 Booleans that’s 2x2x2 or 8. As the table above shows, that’s not too bad.
However, the set of all possible paths grows exponentially, so a combination of 10 Booleans becomes 2x2x2x2x2x2x2x2x2x2 or 1024 possible paths, an unmanageable number of tests.
107. C2 – Basis Path Simplification
state1 | state2 | state3
false  | false  | false
true   | false  | false
false  | true   | false
false  | false  | true
Basis path testing is a simplification of path testing that significantly lowers the number of needed tests.
A complete basis set contains the number of Boolean decisions + 1 paths. It grows linearly instead of exponentially. That’s 3+1 or 4 instead of 8 for our example, and 10+1 or 11 instead of 1024.
Path 1: Any path will do for the baseline, so pick all trues or all falses for simplicity. We are picking false. This is the first path in our basis set.
Path 2: To find the next basis path, we flip the first decision (only) in our baseline.
Path 3: Next we flip the second decision (only) in our baseline path. The first baseline decision remains fixed with the false outcome.
Path 4: Finally, we flip the third decision in our baseline path. Again, the first baseline decision remains fixed with the false outcome.
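The two counting rules can be checked with a short sketch. The function names all_paths and basis_paths are hypothetical, and the basis set is built exactly as described: an all-false baseline, then one decision flipped at a time.

```python
from itertools import product


def all_paths(decision_count):
    """Every combination of outcomes: 2 ** decision_count paths."""
    return list(product([False, True], repeat=decision_count))


def basis_paths(decision_count):
    """Baseline of all False, plus one path per flipped decision."""
    baseline = [False] * decision_count
    paths = [tuple(baseline)]
    for index in range(decision_count):
        flipped = list(baseline)
        flipped[index] = True  # flip only this decision
        paths.append(tuple(flipped))
    return paths
```

For 3 decisions, all_paths returns 8 paths and basis_paths returns the 4 rows listed above. For 10 decisions, the counts are 1024 and 11.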
108. Code Coverage: Line, Branch & Basis Path
import static org.junit.Assert.assertEquals;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

// JUnit 4 tests with 100% line, 100% branch, and 100% basis path coverage.
public class CoverageTestPath {
    CoverageExample lineBranchPath;

    @Before
    public void runBeforeEveryTest() {
        lineBranchPath = new CoverageExample();
    }

    @After
    public void runAfterEveryTest() {
        lineBranchPath = null;
    }

    @Test
    // First basis path.
    public void allFalse() {
        assertEquals("All false inputs ", 0, lineBranchPath.echo(0, false, false, false));
    }

    @Test
    // Second basis path: only x++ runs, so echo(0, ...) returns 1.
    public void trueFalseFalse() {
        assertEquals("True-False-False inputs ", 1, lineBranchPath.echo(0, true, false, false));
    }

    @Test
    // Third basis path: only x-- runs, so echo(0, ...) returns -1.
    public void falseTrueFalse() {
        assertEquals("False-True-False inputs ", -1, lineBranchPath.echo(0, false, true, false));
    }

    @Test
    // Fourth basis path: only x = x runs, which leaves x unchanged.
    public void falseFalseTrue() {
        assertEquals("False-False-True inputs ", 0, lineBranchPath.echo(0, false, false, true));
    }
}
110. Talk to Me
• To ask about free talks or workshops, to chat, or to have me come and help your teams, contact me.
• Email: cbell@CamilleBellConsulting.com
(best way to contact me)
• Twitter: @agilecamille
• Slideshare: camille_bell
• LinkedIn: https://www.linkedin.com/in/camillebell/