Automated Developer Testing: Achievements and Challenges

Tao Xie
North Carolina State University
contact: taoxie@gmail.com

Automation in Developer Testing
• Background on developer testing
– http://www.developertesting.com/
– Kent Beck’s 2004 talk on “Future of Developer Testing”
http://www.itconversations.com/shows/detail301.html
• This talk focuses on developer testing
– Not system testing etc. conducted by testers
• “Unit test automation” commonly refers to writing unit
test cases manually and executing them automatically
• Automation here is broad, including automatic test
generation

Software Testing Setup
[Diagram: Program + Test inputs produce Outputs; Test Oracles check Outputs =? Expected Outputs]

Software Testing Problems
• Faster: How can tools help developers create and run tests faster?
• Better Test Inputs: How can tools help generate new better test inputs?
• Better Test Oracles: How can tools help generate better test oracles?

Example Unit Test Case
void addTest() {
    ArrayList a = new ArrayList(1);
    Object o = new Object();
    a.add(o);
    assertTrue(a.get(0) == o);
}
• Appropriate method sequence
• Appropriate primitive argument values
• Appropriate assertions
Test Case = Test Input + Test Oracle

Levels of Test Oracles
• Expected output for an individual test input
– In the form of assertions in test code
• Properties applicable for multiple test inputs
– Crash (uncaught exceptions) or not, related to
robustness issues, supported by most tools
– Properties in production code: Design by Contract
(precondition, postcondition, class invariants)
supported by Parasoft Jtest, Google CodePro AnalytiX
– Properties in test code: Parameterized unit tests
supported by MSR Pex, AgitarOne
X. Xiao, S. Thummalapenta, and T. Xie. Advances on Improving Automation in Developer Testing. In Advances in Computers, 2012.
http://people.engr.ncsu.edu/txie/publications.htm#ac12-devtest

Economics of Test Oracles
• Expected output for an individual test input
– Easy to manually verify for one test input
– Expensive/infeasible to verify for many test inputs
– Limited benefits: only for one test input
• Properties applicable for multiple test inputs
– Not easy to write (need abstraction skills)
– But once written, broad benefits for multiple test
inputs

Assert behavior of multiple test inputs
Design by Contract
• Example tools: Parasoft Jtest, Google CodePro AnalytiX,
MSR Code Contracts, MSR Pex
• Class invariant: properties being satisfied by an object (in
a consistent state) [AgitarOne allows a class invariant
helper method used as test oracles]
• Precondition: conditions to be satisfied (on receiver
object and arguments) before a method can be invoked
• Postcondition: properties being satisfied (on receiver
object and return) after the method has returned
• Other types of specs also exist
http://research.microsoft.com/en-us/projects/contracts/
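
The three kinds of contract listed above can be approximated in plain Java with explicit runtime checks. This is only a sketch under stated assumptions: ContractedList, its fields, and the exception types are hypothetical names chosen for illustration, and none of the tools named above is used.

```java
import java.util.Arrays;

// Hypothetical class for illustration; not taken from any tool named in the slides.
final class ContractedList {
    private Object[] items = new Object[4];
    private int count = 0;

    int size() { return count; }

    // Class invariant: must hold whenever the object is in a consistent state.
    private boolean invariantHolds() {
        return items != null && count >= 0 && count <= items.length;
    }

    // Adds a value and returns the index at which it was stored.
    int add(Object value) {
        // Precondition on the argument.
        if (value == null) throw new IllegalArgumentException("precondition: value != null");
        int oldCount = count; // snapshot of the "old" state for the postconditions

        if (count == items.length) items = Arrays.copyOf(items, count * 2);
        items[count] = value;
        int result = count++;

        // Postconditions on the receiver state and the return value.
        if (count != oldCount + 1) throw new IllegalStateException("postcondition: size grows by one");
        if (result != oldCount) throw new IllegalStateException("postcondition: result is the old size");
        if (!invariantHolds()) throw new IllegalStateException("class invariant violated");
        return result;
    }
}

final class ContractDemo {
    public static void main(String[] args) {
        ContractedList list = new ContractedList();
        System.out.println("stored at index " + list.add("a"));
        System.out.println("size is now " + list.size());
    }
}
```

Because the checks hold for every call, any generated test input that violates them is flagged without a hand-written expected output.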

Microsoft Research Code Contracts
Features
• Language expression syntax
• Type checking / IDE
• Declarative
• Special encodings: Result and Old
public virtual int Add(object value)
{
    Contract.Requires( value != null );
    Contract.Ensures( Count == Contract.OldValue(Count) + 1 );
    Contract.Ensures( Contract.Result<int>() == Contract.OldValue(Count) );
    if (count == items.Length) EnsureCapacity(count + 1);
    items[count] = value;
    return count++;
}
[ContractInvariantMethod]
void ObjectInvariant() {
    Contract.Invariant( items != null );
}
- Slide adapted from MSR RiSE http://research.microsoft.com/en-us/projects/contracts/

Parameterized Unit Testing
void TestAdd(List list, int item) {
Assume.IsTrue(list != null);
var count = list.Count;
list.Add(item);
Assert.AreEqual(count + 1, list.Count);
}
• Parameterized Unit Test =
Unit Test with Parameters
• Separation of concerns
– Data is generated by a tool
– Developer can focus on functional specification
[Tillmann&Schulte ESEC/FSE 05]
http://research.microsoft.com/apps/pubs/default.aspx?id=77419
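
A plain-Java sketch of the same idea follows. The names ParameterizedAddTest, testAdd, and runWithRandomInputs are hypothetical, and the seeded random generator is only a stand-in for a real test generator such as Pex, which derives inputs from the code under test rather than guessing them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class ParameterizedAddTest {
    // The parameterized unit test: a property stated for any list and any item.
    // Returns false when the assumption filters the input out.
    static boolean testAdd(List<Integer> list, int item) {
        if (list == null) return false;          // assumption: list != null
        int count = list.size();
        list.add(item);
        if (list.size() != count + 1)            // assertion
            throw new AssertionError("size did not grow by one");
        return true;
    }

    // Stand-in for a test generator (an assumption of this sketch):
    // random inputs drawn from a fixed seed.
    static int runWithRandomInputs(int trials, long seed) {
        Random random = new Random(seed);
        int checked = 0;
        for (int t = 0; t < trials; t++) {
            List<Integer> list = null;
            if (random.nextInt(10) != 0) {       // occasionally leave the list null
                list = new ArrayList<>();
                int n = random.nextInt(5);
                for (int i = 0; i < n; i++) list.add(random.nextInt());
            }
            if (testAdd(list, random.nextInt())) checked++;
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println("inputs checked: " + runWithRandomInputs(100, 42L));
    }
}
```

The developer writes testAdd once; how the inputs are produced can change without touching the specification.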

Parameterized Unit Tests are
Formal Specifications
Algebraic Specifications
• A Parameterized Unit Test can be read as a
universally quantified, conditional axiom.
void TestReadWrite(Res r, string name, string data) {
    Assume.IsTrue(r != null && name != null && data != null);
    r.WriteResource(name, data);
    Assert.AreEqual(r.ReadResource(name), data);
}
∀ string name, string data, Res r:
r ≠ null ⋀ name ≠ null ⋀ data ≠ null ⇒
equals(
ReadResource(WriteResource(r, name, data).state, name),
data)

http://research.microsoft.com/pex/
Parameterized Unit Tests in Pex

Parameterized Unit Testing
Getting Popular
Parameterized Unit Tests (PUTs) are commonly supported by
various test frameworks
• .NET: Supported by .NET test frameworks
– http://www.mbunit.com/
– http://www.nunit.org/
– …
• Java: Supported by JUnit 4.X
– http://www.junit.org/
Generating test inputs for PUTs is supported by tools
• .NET: Supported by Microsoft Research Pex
– http://research.microsoft.com/Pex/
• Java: Supported by Agitar AgitarOne
– http://www.agitar.com/

Parameterized
Test-Driven Development
[Diagram: a cycle]
• Write/refine Contract as PUT
• Write/refine Code of Implementation
• Run Pex
• On failures (Bug in PUT or Bug in Code): Fix-it (with Pex), Debug with generated tests
• On no failures: Use Generated Tests for Regression

Software Agitation in AgitarOne
[Diagram: Code, then Software Agitation, then Observations on code behavior, plus Test Coverage data]
• If an Observation reveals a bug, fix it
• If it describes desired behavior, click to create a Test Assertion
[Diagram: cycle of Code, Compile, Review, Agitate]
- Slide adapted from Agitar Software Inc.
http://www.agitar.com/
Image from http://www.agitar.com/

Automated Test Generation
• Recent advanced technique: Dynamic Symbolic Execution/Concolic Testing
• Instrument code to explore feasible paths
• Example tool: Pex from Microsoft Research (for .NET programs)
P. Godefroid, N. Klarlund, and K. Sen. DART: directed automated random testing. In Proc. PLDI 2005
K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In Proc. ESEC/FSE 2005
N. Tillmann and J. de Halleux. Pex - White Box Test Generation for .NET. In Proc. TAP 2008

Dynamic Symbolic Execution in Pex
http://pex4fun.com/HowDoesPexWork
void CoverMe(int[] a)
{
    if (a == null) return;
    if (a.Length > 0)
        if (a[0] == 1234567890)
            throw new Exception("bug");
}
[Diagram: execution tree with branch nodes a==null, a.Length>0, a[0]==123…, each with T and F edges]
Loop: Choose next path, Solve, Execute&Monitor

Constraints to solve                            Input     Observed constraints
(initial input)                                 null      a==null
a!=null                                         {}        a!=null && !(a.Length>0)
a!=null && a.Length>0                           {0}       a!=null && a.Length>0 && a[0]!=1234567890
a!=null && a.Length>0 && a[0]==1234567890       {123…}    a!=null && a.Length>0 && a[0]==1234567890

Done: There is no path left.
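
The four inputs found in the walkthrough above can be replayed against a Java port of CoverMe to confirm that each one drives a different path. This sketch does not implement the constraint solving or monitoring that Pex performs; CoverMeReplay and the path labels it returns are hypothetical names for illustration.

```java
final class CoverMeReplay {
    // Java port of CoverMe that also reports which path was taken.
    static String coverMe(int[] a) {
        if (a == null) return "a==null";
        if (a.length > 0) {
            if (a[0] == 1234567890) throw new IllegalStateException("bug");
            return "a!=null && a.length>0 && a[0]!=1234567890";
        }
        return "a!=null && !(a.length>0)";
    }

    // Runs one input and turns the exception path into a label as well.
    static String run(int[] input) {
        try {
            return coverMe(input);
        } catch (IllegalStateException e) {
            return "a!=null && a.length>0 && a[0]==1234567890 (bug)";
        }
    }

    public static void main(String[] args) {
        // One input per path, in the order the walkthrough discovers them.
        int[][] inputs = { null, {}, {0}, {1234567890} };
        for (int[] input : inputs) System.out.println(run(input));
    }
}
```

Random testing would be unlikely to hit the last path, since it requires one specific 32-bit value; solving the negated branch condition produces it directly.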

Automating Test Generation
• Method sequences
– MSeqGen/Seeker [Thummalapenta et al. OOPSLA 11, ESEC/FSE 09],
Covana [Xiao et al. ICSE 2011], OCAT [Jaygarl et al. ISSTA 10],
Evacon [Inkumsah et al. ASE 08], Symclat [d'Amorim et al. ASE 06]
• Environments e.g., db, file systems, network, …
– DBApp Testing [Taneja et al. ESEC/FSE 11], [Pan et al. ASE 11]
– CloudApp Testing [Zhang et al. IEEE Soft 12]
• Loops
– Fitnex [Xie et al. DSN 09]
@NCSU ASE
http://people.engr.ncsu.edu/txie/publications.htm

Pex on MSDN DevLabs
Incubation Project for Visual Studio
Download counts (20 months)
(Feb. 2008 - Oct. 2009 )
Academic: 17,366
Devlabs: 13,022
Total: 30,388
http://research.microsoft.com/projects/pex/

Open Source Pex extensions
http://pexase.codeplex.com/
Publications: http://research.microsoft.com/en-us/projects/pex/community.aspx#publications

Writing Test Oracles → Learning Formal Methods!?
• Parameterized Unit Test =
Unit Test with Parameters
• Separation of concerns
– Data is generated by a tool
– Developer can focus on functional specification
void TestAdd(List list, int item) {
Assume.IsTrue(list != null);
var count = list.Count;
list.Add(item);
Assert.AreEqual(count + 1, list.Count);
}

Automatic Test Generation → Human Assistance to Test Generation?!
Running Symbolic PathFinder ...
…
====================================================== results
no errors detected
====================================================== statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4, end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884
…

Challenges
Faced by Test Generation Tools
• Example: Dynamic Symbolic Execution/Concolic Testing (Pex)
• Challenge: path explosion
• Object-creation problems (OCP): 65%
• External-method call problems (EMCP): 27%
Total block coverage achieved is 50%, lowest coverage 16%.

• A graph example from the QuickGraph library
• Includes two classes: Graph and DFSAlgorithm
• Graph
  – AddVertex
  – AddEdge: requires both vertices to be in graph
00: class Graph : IVEListGraph { …
03: public void AddVertex (IVertex v) {
04: vertices.Add(v); // B1 }
06: public Edge AddEdge (IVertex v1, IVertex v2) {
07: if (!vertices.Contains(v1))
08: throw new VNotFoundException("");
09: // B2
12: // B3
14: Edge e = new Edge(v1, v2);
15: edges.Add(e); } }
//DFS:DepthFirstSearch
18: class DFSAlgorithm { …
23: public void Compute (IVertex s) { ...
24: if (graph.GetEdges().Size() > 0) { // B4
25: isComputed = true;
26: foreach (Edge e in graph.GetEdges()) {
27: ... // B5
28: }
29: } } } [Thummalapenta et al. OOPSLA 11]
• Test target: Cover true branch (B4) of Line 24
• Desired object state: graph should include at least one edge
• Target sequence:
Graph ag = new Graph();
Vertex v1 = new Vertex(0);
Vertex v2 = new Vertex(1);
ag.AddVertex(v1);
ag.AddVertex(v2);
ag.AddEdge(v1, v2);
DFSAlgorithm algo = new DFSAlgorithm(ag);
algo.Compute(v1);
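
To make the object-state requirement concrete, here is a self-contained Java sketch of the target sequence. The classes are simplified stand-ins written for this example, not the QuickGraph code: they keep only the members the sequence touches, and TargetSequenceDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the QuickGraph classes (assumptions of this sketch).
final class Vertex {
    final int id;
    Vertex(int id) { this.id = id; }
}

final class Edge {
    final Vertex source;
    final Vertex target;
    Edge(Vertex source, Vertex target) { this.source = source; this.target = target; }
}

final class Graph {
    private final List<Vertex> vertices = new ArrayList<>();
    private final List<Edge> edges = new ArrayList<>();

    void addVertex(Vertex v) { vertices.add(v); }

    Edge addEdge(Vertex v1, Vertex v2) {
        // Both end points must already be in the graph.
        if (!vertices.contains(v1) || !vertices.contains(v2))
            throw new IllegalArgumentException("vertex not found");
        Edge e = new Edge(v1, v2);
        edges.add(e);
        return e;
    }

    List<Edge> getEdges() { return edges; }
}

final class DfsAlgorithm {
    private final Graph graph;
    boolean isComputed = false;

    DfsAlgorithm(Graph graph) { this.graph = graph; }

    void compute(Vertex start) {
        if (graph.getEdges().size() > 0) { // the branch the test wants to cover
            isComputed = true;
        }
    }
}

final class TargetSequenceDemo {
    // The method sequence from the slide: two vertices, one edge, then compute.
    static DfsAlgorithm buildAndRun() {
        Graph ag = new Graph();
        Vertex v1 = new Vertex(0);
        Vertex v2 = new Vertex(1);
        ag.addVertex(v1);
        ag.addVertex(v2);
        ag.addEdge(v1, v2);
        DfsAlgorithm algo = new DfsAlgorithm(ag);
        algo.compute(v1);
        return algo;
    }

    public static void main(String[] args) {
        System.out.println("branch covered: " + buildAndRun().isComputed);
    }
}
```

Dropping either addVertex call makes addEdge throw, so a generator has to discover both the calls and their order before the branch becomes reachable.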

Example External-Method Call
Problems (EMCP)
• Example 1:
  – File.Exists has data dependencies on program input
  – Subsequent branch at Line 1 uses the return value of File.Exists
• Example 2:
  – Path.GetFullPath has data dependencies on program input
  – Path.GetFullPath throws exceptions
• Example 3: String.Format does not cause any problem

Human Can Help!
Object Creation Problems (OCP)
Tackle object-creation problems with Factory Methods

Human Can Help!
External-Method Call Problems (EMCP)
Tackle external-method call problems with Mock Methods or
Method Instrumentation
Mocking System.IO.File.ReadAllText
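
A minimal Java sketch of the mock-method idea is shown below. It does not use Moles, Fakes, or any mocking library; TextFiles, InMemoryTextFiles, and ConfigLoader are hypothetical names, and the interface is an assumed seam around file access that lets a test decide what the environment returns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical seam around file access so that tests can replace it.
interface TextFiles {
    boolean exists(String path);
    String readAllText(String path);
}

// In-memory mock: no real file system is touched.
final class InMemoryTextFiles implements TextFiles {
    private final Map<String, String> contents = new HashMap<>();

    InMemoryTextFiles with(String path, String text) {
        contents.put(path, text);
        return this;
    }

    public boolean exists(String path) { return contents.containsKey(path); }
    public String readAllText(String path) { return contents.get(path); }
}

final class ConfigLoader {
    // Code under test: its branches depend on what the environment returns.
    static String load(TextFiles files, String path) {
        if (!files.exists(path)) return "missing";
        String text = files.readAllText(path);
        return text.isEmpty() ? "empty" : "loaded:" + text;
    }
}

final class MockDemo {
    public static void main(String[] args) {
        TextFiles files = new InMemoryTextFiles().with("a.cfg", "x=1").with("b.cfg", "");
        System.out.println(ConfigLoader.load(files, "a.cfg"));
        System.out.println(ConfigLoader.load(files, "b.cfg"));
        System.out.println(ConfigLoader.load(files, "c.cfg"));
    }
}
```

With the mock in place, every branch of load is reachable from values the test controls, which is what a test generator needs in order to explore it.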

State-of-the-Art/Practice
Testing Tools
Running Symbolic PathFinder ...
…
====================================================== results
no errors detected
====================================================== statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4, end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884
…
Tools typically don’t
communicate challenges faced
by them to enable cooperation
between tools and users.
We typically don’t teach people
how to cooperate with tools.
X. Xiao, T. Xie, N. Tillmann, and J. de Halleux. Precise Identification of Problems for Structural
Test Generation. In Proc. ICSE 2011
http://people.engr.ncsu.edu/txie/publications/icse11-covana.pdf

Coding Duels
1,206,095 clicked 'Ask Pex!'

Coding Duels
Pex computes a “semantic diff” in the cloud:
code written in the browser vs.
a secret reference implementation
You win when Pex finds no differences

Behind the Scene of Pex for Fun
Secret Implementation
class Secret {
public static int Puzzle(int x) {
if (x <= 0) return 1;
return x * Puzzle(x-1);
}
}
Player Implementation
class Player {
public static int Puzzle(int x) {
return x;
}
}
class Test {
public static void Driver(int x) {
if (Secret.Puzzle(x) != Player.Puzzle(x))
throw new Exception("Mismatch");
}
}
Behavior: Secret Impl == Player Impl
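
The driver idea can be sketched in Java as follows. Where Pex searches for a mismatching input by dynamic symbolic execution, this sketch assumes a brute-force scan over a small range as a stand-in; CodingDuel and firstMismatch are hypothetical names.

```java
import java.util.OptionalInt;
import java.util.function.IntUnaryOperator;

final class CodingDuel {
    // Secret reference implementation from the slide.
    static int secret(int x) {
        if (x <= 0) return 1;
        return x * secret(x - 1);
    }

    // The player's current attempt from the slide.
    static int player(int x) {
        return x;
    }

    // Brute-force stand-in for the tool: the first input in [from, to]
    // on which the two implementations disagree, if there is one.
    static OptionalInt firstMismatch(IntUnaryOperator reference,
                                     IntUnaryOperator candidate,
                                     int from, int to) {
        for (int x = from; x <= to; x++) {
            if (reference.applyAsInt(x) != candidate.applyAsInt(x)) {
                return OptionalInt.of(x);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt mismatch = firstMismatch(CodingDuel::secret, CodingDuel::player, -3, 10);
        System.out.println(mismatch.isPresent()
                ? "mismatch at x = " + mismatch.getAsInt()
                : "no differences found in range");
    }
}
```

Each mismatching input is feedback to the player, who refines the implementation until no difference is found.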

Coding Duels
Fun and Engaging
Iterative gameplay
Adaptive
Personalized
No cheating
Clear winning criterion

Example User Feedback
“It really got me *excited*. The part that got me most is
about spreading interest in teaching CS: I do think that it’s
REALLY great for teaching | learning!”
“I used to love the first person shooters and the
satisfaction of blowing away a whole team of
Noobies playing Rainbow Six, but this is far more
fun.”
“I’m afraid I’ll have to constrain myself to spend just an hour
or so a day on this really exciting stuff, as I’m really stuffed
with work.”
Released since 2010

Coding Duel Competition
@ICSE 2011
http://pexforfun.com/icse2011

Coding Duels for Automatic Grading
@Grad Software Engineering Course
http://pexforfun.com/gradsofteng

Coding Duels for Training Testing
public static string Puzzle(int[] elems, int capacity, int elem) {
    if ((capacity <= 0) || (elems == null) || (elems.Length > (capacity + 1)))
        return "Assumption Violation!";
    // Set up a stack with some elements
    Stack s = new Stack(capacity);
    for (int i = 0; i < elems.Length; i++)
        s.Push(elems[i]);
    // Cache values used in assertions
    int origSize = s.GetNumOfElements();
    //Please fill in below test scenario on the s stack
    //The lines below include assertions to assert the program behavior
    PexAssert.IsTrue(s.GetNumOfElements() == origSize + 1);
    PexAssert.IsTrue(s.Top() == elem); PexAssert.IsTrue(!s.IsEmpty());
    PexAssert.IsTrue(s.IsMember(elem));
    return s.GetNumOfElements().ToString() + "; " + s.Top().ToString() + "; "
        + s.IsMember(elem).ToString() + "; " + s.IsEmpty();
}

Usage Scenarios of Pex4Fun
• Massive Open Online Courses (MOOC): Challenges
– Grading, addressed by Pex4Fun
– Cheating [Open Challenge]
• Course assignments (students/professionals)
– E.g., intro programming, software engineering
• Student/professional competitions
– E.g., coding-duel competition at ICSE 2011
• Assessment of testing/programming/problem
solving skills for job applicants
– Not just final results of problem solving but also process!

More Reading
Nikolai Tillmann, Jonathan De Halleux, Tao Xie, Sumit Gulwani, and Judith Bishop.
Teaching and Learning Programming and Software Engineering via Interactive Gaming.
In Proceedings of the 35th International Conference on Software Engineering (ICSE 2013), Software Engineering Education (SEE), San Francisco, CA, May 2013.
http://people.engr.ncsu.edu/txie/publications/icse13see-pex4fun.pdf

Conclusion
• Software testing is important and yet costly; needs
automation
• Better Test Inputs: help generate new better test
inputs
– Generate method arguments
– Generate method sequences
• Better Test Oracles: help generate better test oracles
– Assert behavior of individual test inputs
– Assert behavior of multiple test inputs
• Software Testing → Educational Gaming
– http://www.pexforfun.com/

Example Industrial
Developer Testing Tools
• Agitar AgitarOne http://www.agitar.com/
• Parasoft Jtest http://www.parasoft.com/
• Google CodePro AnalytiX https://developers.google.com/java-dev-tools/codepro/doc/
• SilverMark Test Mentor http://www.silvermark.com/
• Microsoft Research Pex (for .NET)
http://research.microsoft.com/Pex/
• Microsoft Research Spec Explorer (for .NET)
http://research.microsoft.com/specexplorer/

Trends in Practice
• Regression Test Selection/Prioritization
• Cloud Computing for Test Execution, e.g.,
http://www.skytap.com/
• Crowdsourcing for Testing, e.g.,
http://www.utest.com/
• Mocking Environments
– Google: EasyMock
– Microsoft VS: Fakes/Moles
http://research.microsoft.com/en-us/projects/pex/
• Automatic Test Generation
– Microsoft: Pex, SAGE http://research.microsoft.com/en-us/um/people/pg/

Q & A
Thank you!
contact: taoxie@gmail.com
Acknowledgments: NSF grants CCF-0845272, CCF-0915400, CNS-0958235, CNS-1160603,
a Microsoft Research SEIF Award, and a Microsoft Research Award.

Automated Combinatorial Testing
Goals – reduce testing cost, improve cost-benefit ratio
Accomplishments – huge increase in performance,
scalability, 200+ users, most major IT firms and others
Also non-testing applications – modelling and
simulation, genome
http://csrc.nist.gov/groups/SNS/acts/index.html

Failure-triggering Interactions
• Additional studies consistent
• > 4,000 failure reports analyzed
• Conclusion: failures triggered by few variables

NIST ACTS Tool
• Covering array generator
• Coverage analysis - what is the combinatorial coverage of
existing test set?
• .NET configuration file generator
• Fault characterization - ongoing
Current users: approximately 200 users as of July 2009, in IT, defense, finance, telecom, and many other industries
http://csrc.nist.gov/groups/SNS/acts/documents/comparison-report.html
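
The coverage-analysis question above can be illustrated with a small Java sketch that measures 2-way (pairwise) coverage of an existing test set. This is not how ACTS computes it; it assumes the simplest definition, namely the fraction of all value pairs over all parameter pairs that appear in at least one test, and PairwiseCoverage is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Set;

final class PairwiseCoverage {
    // domainSizes[p] is the number of values of parameter p.
    // Each test assigns one value index per parameter; indices are
    // assumed to lie within the domain (not validated in this sketch).
    static double twoWayCoverage(int[] domainSizes, int[][] tests) {
        long total = 0;
        Set<String> covered = new HashSet<>();
        for (int i = 0; i < domainSizes.length; i++) {
            for (int j = i + 1; j < domainSizes.length; j++) {
                total += (long) domainSizes[i] * domainSizes[j];
                for (int[] t : tests) {
                    // Key identifies one value pair for one parameter pair.
                    covered.add(i + ":" + t[i] + "," + j + ":" + t[j]);
                }
            }
        }
        return total == 0 ? 1.0 : (double) covered.size() / total;
    }

    public static void main(String[] args) {
        int[] threeBooleans = {2, 2, 2};
        int[][] twoTests = { {0, 0, 0}, {1, 1, 1} };
        int[][] coveringArray = { {0, 0, 0}, {0, 1, 1}, {1, 0, 1}, {1, 1, 0} };
        System.out.println(twoWayCoverage(threeBooleans, twoTests));
        System.out.println(twoWayCoverage(threeBooleans, coveringArray));
    }
}
```

For three boolean parameters, four well-chosen tests cover every value pair, whereas exhaustive testing would need eight; that gap is the cost reduction combinatorial testing aims for.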
