Testing, fixing, and proving with contracts

Carlo A. Furia
Chair of Software Engineering, ETH Zurich
bugcounting.net @bugcounting

The (AlpTransit) Gotthard tunnel
The tunnel
• 57 km long
• construction at both ends
• underneath the Gotthard massif
Erstfeld
• canton Uri
• German-speaking
• weather probably cloudy
Bodio
• canton Ticino
• Italian-speaking
• weather probably sunny

Users with different requirements
Joe the programmer
• little or no background in formal techniques
• weak and simple (incomplete) specifications
• design not optimal for verification
• bugs: full verification is unattainable
• looks for the low-hanging fruit of verification
Verification expert
• fluent in formal logic techniques
• strong, often complete, specifications
• design for full verification
• could use automation of simpler steps
• aims at the holy grail of verified software

The Eiffel Verification Environment
[Diagram: EVE components: GUI, Verification Assistant, Inspector, AutoTest, AutoFix, AutoProof]

The Eiffel Verification Environment
[Diagram: EVE components: GUI, CLI, ComCom (web), Verification Assistant, Inspector, AutoTest, AutoFix, AutoProof]

A key ingredient: contracts
Contracts are a form of lightweight specification:
• Assertions (pre- and postconditions, invariants)
• Contract language = Boolean expressions
• Executable: bring immediate benefits for testing,
debugging, and so on
Verification tools in EVE take advantage of
(simple) functional specifications
in the form of contracts.

Auto-active user/tool interaction
1. Code + Annotations
2. Push button
3. Verification outcome
4. Correct/Revise

Roadmap
1. AutoTest: find faults automatically
2. AutoFix: patch faults automatically
3. Verification assistant: combine tests & proofs
4. Two-step verification: help debug failed proofs
5. AutoProof: prove realistic programs

Next stop: AutoTest

AutoTest in a nutshell
AutoTest is a push-button generator of unit tests
• Test = sequence of method calls on objects
• Contracts as oracles: target call o.m
– Invalid test: o does not satisfy m’s precondition
– Passing test: all contracts evaluate to True
– Failing test: some contract evaluates to False
Similar tools:
• Korat (Java + assertions)
• QuickCheck (Haskell)
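The three-way classification above can be sketched in a few lines. This is a minimal Python illustration, not AutoTest's implementation: the names classify, Outcome, good_pop and buggy_pop are hypothetical, and the method under test (removing the last element of a list) is chosen only because it has an obvious precondition and postcondition.

```python
from enum import Enum


class Outcome(Enum):
    INVALID = "invalid"
    PASSING = "passing"
    FAILING = "failing"


def classify(target, precondition, call, postcondition):
    """Run one test o.m and classify it, using the contracts as the oracle."""
    if not precondition(target):
        return Outcome.INVALID      # the test is at fault, not the method
    old = list(target)              # snapshot for 'old' expressions
    call(target)
    if postcondition(old, target):
        return Outcome.PASSING
    return Outcome.FAILING


# Hypothetical contracts for "remove the last element".
def non_empty(xs):
    return len(xs) > 0


def one_shorter(old, xs):
    return len(xs) == len(old) - 1


def good_pop(xs):
    xs.pop()


def buggy_pop(xs):
    pass                            # forgets to remove anything
```

Calling classify([], non_empty, good_pop, one_shorter) yields an invalid test, while the same call on a non-empty list with buggy_pop yields a failing one.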

How AutoTest works
1. Pick a random object o:
• Existing object from object pool
• Fresh object of primitive type (e.g. random integer)
• New object of class type (call constructor)
2. Pick a random method m
3. Call o.m
4. Classification based on runtime contract checking:
• Invalid test
• Failing test: bug found
• Passing test
5. Add any new objects to object pool

Test generation strategies
AutoTest is a push-button generator of unit tests
• Basic generation strategy: random
• Other strategies as extensions:
– Random+
– Adaptive-random (object distance)
– Precondition satisfaction
– Stateful testing

Demo example: Bank Account
class ACCOUNT
    balance: INTEGER
    deposit (amount: INTEGER)
        require 0 <= amount
        ensure balance = old balance + amount
    withdraw (amount: INTEGER)
        require 0 <= amount
        ensure
            balance_set:
                amount <= old balance implies balance = old balance - amount
            balance_not_set:
                amount > old balance implies balance = old balance
    invariant
        balance_nonnegative: balance >= 0

Demo 1: bug finding
AutoTest finds a bug in the implementation of
withdraw that violates postcondition
balance_not_set.
withdraw (amount: INTEGER)
    require 0 <= amount
    do
        balance := balance + amount
    ensure
        balance_set:
            amount <= old balance implies balance = old balance - amount
        balance_not_set:
            amount > old balance implies balance = old balance
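To make the demo concrete, here is a rough Python sketch of random testing against the faulty withdraw. It is an illustration under assumptions, not AutoTest itself: the Account class is a hand-written Python rendering of the slide's ACCOUNT, and random_test, ContractViolation, the 10% object-creation rate and the argument range are all hypothetical choices.

```python
import random


class ContractViolation(Exception):
    pass


class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        old_balance = self.balance
        self.balance = self.balance + amount   # the faulty update from the slide
        # Postconditions, checked at run time.
        if amount <= old_balance and self.balance != old_balance - amount:
            raise ContractViolation("balance_set")
        if amount > old_balance and self.balance != old_balance:
            raise ContractViolation("balance_not_set")


def random_test(seed, max_calls=1000):
    """Return (method, argument, violated clause) for the first failing test."""
    rng = random.Random(seed)
    pool = [Account()]                          # object pool
    for _ in range(max_calls):
        if rng.random() < 0.1:
            pool.append(Account())              # new object of class type
        target = rng.choice(pool)               # existing object from the pool
        name = rng.choice(["deposit", "withdraw"])
        amount = rng.randint(-5, 20)            # fresh primitive value
        if not 0 <= amount:
            continue                            # invalid test: precondition fails
        try:
            getattr(target, name)(amount)       # passing test if nothing is raised
        except ContractViolation as violation:
            return name, amount, str(violation)  # failing test: bug found
    return None
```

In this sketch any withdraw call with a positive amount fails, so depending on the random inputs the first reported violation may be balance_set or balance_not_set.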

Next stop: AutoFix

AutoFix in a nutshell
AutoFix is a push-button generator of fixes
[Diagram: a cycle between Coding and AutoFix. Coding passes code + contracts to AutoFix; AutoFix returns bugs + patches.]
Similar tools:
• GenProg, Kali (C)
• PAR (Java)

How AutoFix works
Pipeline:
Program + Contracts → AutoTest → Test suite → Analysis → Suspicious states → Synthesis → Candidate fixes → Validation & ranking → Valid fixes
Example states: count = 1, count = 2, count = 0
Example candidate fix: count = 0 @ L4: if count = 0 then ...

AutoFix: Components
Program state abstraction:
• snapshots: location, predicate, value
Fault localization:
• static information: proximity to failing
location/expression
• dynamic information: number of
failing/passing tests
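The dynamic part of fault localization can be illustrated with a very simple score. The sketch below is an assumption-laden stand-in, not AutoFix's actual formula: it ranks each snapshot (location, predicate, value) by the plain ratio of failing runs among all runs that observed it, and it ignores the static proximity information entirely. The function name and the sample snapshots are hypothetical.

```python
from collections import Counter


def suspiciousness(failing_runs, passing_runs):
    """Rank snapshots by the share of failing runs among runs that observed them."""
    failing = Counter(s for run in failing_runs for s in set(run))
    passing = Counter(s for run in passing_runs for s in set(run))
    scores = {s: failing[s] / (failing[s] + passing[s]) for s in failing}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical snapshots: (location, predicate, value).
count_is_zero = (4, "count = 0", True)
count_is_one = (4, "count = 1", True)

ranking = suspiciousness(
    failing_runs=[[count_is_zero], [count_is_zero, count_is_one]],
    passing_runs=[[count_is_one], [count_is_one]],
)
```

Here the snapshot seen only in failing runs ends up at the top of the ranking, which is the kind of state a conditional fix would then target.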

AutoFix: Components
Program state abstraction:
• snapshots: location, predicate, value
Synthesis:
• enumeration of common replacement expressions and instructions
• conditional execution:
    @ location:
        if predicate = value then some fix action

AutoFix: Components
Validation:
• regression testing with all available tests for
method being fixed
• valid fix: passes all available tests
Ranking:
• based on suspiciousness score of snapshots
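Synthesis and validation together amount to a generate-and-validate loop, which the following Python sketch imitates on the withdraw example. Everything here is an assumption made for illustration: the tiny sets of predicates and actions, the candidate shape "if predicate then action else other action", and the function valid_fixes. Ranking is left out.

```python
from itertools import product

# Hypothetical building blocks for candidate fixes of withdraw.
PREDICATES = {
    "amount > balance": lambda b, a: a > b,
    "amount = 0": lambda b, a: a == 0,
}
ACTIONS = {
    "balance": lambda b, a: b,                # leave balance unchanged
    "balance - amount": lambda b, a: b - a,
    "balance + amount": lambda b, a: b + a,   # the original, faulty update
}


def postcondition(old, amount, new):
    """withdraw's postcondition from the slides (balance_set, balance_not_set)."""
    return new == old - amount if amount <= old else new == old


def valid_fixes(tests):
    """Enumerate candidates and keep those that pass every test (old balance, amount)."""
    fixes = []
    for pred, then, other in product(PREDICATES, ACTIONS, ACTIONS):
        def withdraw(b, a):
            if PREDICATES[pred](b, a):
                return ACTIONS[then](b, a)
            return ACTIONS[other](b, a)
        if all(postcondition(b, a, withdraw(b, a)) for b, a in tests):
            fixes.append((pred, then, other))
    return fixes


weak_tests = [(10, 3), (10, 0)]               # never withdraw more than the balance
strong_tests = weak_tests + [(2, 5)]
```

With the weak test suite several candidates are valid, including ones that still add the amount when it exceeds the balance; adding a test that overdraws leaves only the proper fix. This mirrors the difference between a high-quality fix and one that merely passes the available tests.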

Demo 1b: bug fixing
AutoFix builds fixes for the bug in the implementation of withdraw. The demo shows two of them:
• A “high-quality” (proper, correct) fix
• A fix that just happens to pass all tests

Experiments with AutoFix
Source programs: standard data-structure libraries, text library, card game.
                      LOC of source   # Unique   % Fixed   % High-quality   Time: test + fix
                      + contracts     errors     errors    fixes            [minutes]
Fix implementation:   73’000          204        42%       25%              17 + 3
Fix contracts:        24’500          44         95%       25%              31 + 3

Experiments with AutoFix: comparison
For the “Fix implementation” experiments (73’000 LOC, 204 unique errors, 42% fixed, 17 + 3 minutes), AutoFix produced high-quality fixes for 25% of the errors.
GenProg, according to the analysis by [Qi+, ISSTA’15]: < 2% high-quality fixes

Next stop: Verification assistant

Integrating different tools
A verification assistant manages individual tools
– Select tools and program parts to be verified
– Collect results and aggregate them
[Diagram: the Verification Assistant connects three columns. Classes: C1, C2, …, Cn. Tools: AutoTest, AutoProof, Inspector, AutoFix. Data pool, one result per tool and class: AT1 … ATn, AP1 … APn, AI1 … AIn, AF1 … AFn.]

Scores: aggregated verification results
Each method & class receives a correctness score
• A value in the interval [-1, 1]
• Estimate of evidence for correctness
Scale:
• -1: conclusive evidence of incorrectness
• 0: lack of evidence
• 1: conclusive evidence of correctness

Score for testing
• Failing test case: conclusive evidence of
incorrectness
• Passing test case: increases evidence of correctness
• Absolute value may vary according to other metrics
– used heuristics, coverage, testing time, …
[Figure, built up over five slides: on the scale from -1 to 1, a failing test case sits at -1 (conclusive evidence of incorrectness), while successive passing test cases are added on the positive side of the scale.]

Score for correctness proofs
AutoProof is sound but incomplete:
– Timeout: score 0
– Failed proof: score -0.2
Scale from -1 to 1:
• -1: failed proof for a complete tool
• 1: successful proof for a sound tool

Combining scores of different tools
• Running each tool determines a score for each
method
• Overall score for a class: weighted average
• Weights depend on the relative confidence in
reliability of tools
– may be application and configuration dependent
• Overall score of modules (packages) may also
weigh components differently according to
their criticality

Demo 2: combined testing and proving
The verification assistant runs on the version of
ACCOUNT patched by AutoFix:
deposit does not verify, but passes all tests: reasonable confidence in its correctness.

Next stop: Two-step verification

Modular proofs
Verifiers such as AutoProof perform modular
reasoning
• Effects of a call to method m within the caller
= m’s specification (pre, post, frame)
How we wrote it:
    require
        0 <= amount
    do
        update_balance (amount)
How AutoProof sees it:
    require
        0 <= amount
    do
        assert update_balance.pre
        havoc update_balance.frame
        assume update_balance.post

Modular proofs in practice
Verifiers such as AutoProof perform modular
reasoning
• Necessary for scalability
• Consistent with design-by-contract and
information hiding
• But providing the detailed specifications
necessary for verification may be tedious or
overly complex

Specification writing fatigue
Providing the specification necessary for
verification may be tedious, especially in the
most straightforward cases.
How we wrote it:
    require
        0 <= amount
    do
    ensure
        balance = old balance + amount
How we thought about it:
    require
        0 <= amount
    do
    ensure

Debugging failed verification
When verification fails with verifiers such as
AutoProof (modular, sound, incomplete):
• Is there a bug?
• Or is the program correct, but the specification insufficient?
To help debug failed verification attempts, AutoProof features two-step verification.

Two-step verification
Two-step verification improves user feedback, especially when little specification is available.
1. First verification step
– Standard modular verification
2. Second verification step
– Ignore specification of called routines and loops
– Use inlining and unrolling instead
Feedback: combination of outcomes of 1 & 2

Step 1: modular verification
Caller:
    require
        0 <= amount
    do
    ensure          -- Postcondition violated
Callee:
    update_balance (a: INTEGER)
        do
            balance := balance + a
        end
No postcondition of callee: effect on balance undefined.
Modular verification fails.

Step 2: verification with inlining
Caller:
    require
        0 <= amount
    do
    ensure
Callee:
    update_balance (a: INTEGER)
        do
            balance := balance + a
        end
Verification with inlining succeeds: attribute balance is incremented by amount.
Feedback: change (strengthen) the specification of update_balance.
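The difference between the two steps can be imitated with a bounded check in Python. This is only a sketch: exhaustive enumeration over a small range stands in for the prover, the caller is assumed to be deposit with postcondition balance = old balance + amount and body update_balance (amount) as on the earlier slides, and verify_deposit with its parameters is a hypothetical name.

```python
DOMAIN = range(0, 5)            # small values of old balance and amount
HAVOC_VALUES = range(0, 10)     # values balance may take after a havoc


def update_balance_body(balance, a):
    return balance + a


def deposit_post(old, amount, new):
    return new == old + amount


def verify_deposit(callee_post=None, inline=False):
    """Bounded check of deposit; True if its postcondition holds in every case."""
    for old in DOMAIN:
        for amount in DOMAIN:           # precondition 0 <= amount holds here
            if inline:
                # Step 2: ignore the callee's specification, use its body.
                outcomes = [update_balance_body(old, amount)]
            else:
                # Step 1: havoc the frame, then assume the callee's postcondition.
                outcomes = [new for new in HAVOC_VALUES
                            if callee_post is None or callee_post(old, amount, new)]
            if not all(deposit_post(old, amount, new) for new in outcomes):
                return False
    return True


def strengthened_post(old, a, new):
    return new == old + a
```

Without a callee postcondition the modular check fails, the inlined check succeeds, and the modular check succeeds again once update_balance gets the strengthened postcondition, which is exactly the feedback the slide describes.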

Demo 2b: two-step verification
AutoProof with two-step verification runs on
the version of ACCOUNT patched by AutoFix:
deposit verifies after inlining update_balance
• Provide postcondition to update_balance
or
• Direct AutoProof to use update_balance inlined
Follow this demo at http://bit.do/tap-tutorial
(Switch to tab account2.e)

Two-step verification: feedback
r
    require Pr
    do
        s
    ensure Qr
s
    require Ps
    do
        ...
    ensure Qs

Step 1: modular            Step 2: inlined   Suggestion
Verify r     Verify s      Verify r
Ps fails     Succeeds      Succeeds          Weaken Ps or use inlined
Qr fails     Succeeds      Succeeds          Strengthen Qs or use inlined
Succeeds     Qs fails      Succeeds          Strengthen Ps / Weaken Qs
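Read as data, the feedback table is a lookup from the three verification outcomes to a suggestion. The Python sketch below encodes just the three rows of the table; the outcome strings, the function name, and the fallback message for combinations not listed are assumptions.

```python
# Keys: (step 1 verify r, step 1 verify s, step 2 verify r with s inlined).
SUGGESTIONS = {
    ("Ps fails", "succeeds", "succeeds"): "Weaken Ps or use inlined",
    ("Qr fails", "succeeds", "succeeds"): "Strengthen Qs or use inlined",
    ("succeeds", "Qs fails", "succeeds"): "Strengthen Ps / Weaken Qs",
}


def suggestion(modular_r, modular_s, inlined_r):
    """Combine the outcomes of both verification steps into user feedback."""
    key = (modular_r, modular_s, inlined_r)
    return SUGGESTIONS.get(key, "no suggestion listed for this combination")
```

For the deposit example, where the caller's postcondition fails modularly but verifies after inlining, suggestion("Qr fails", "succeeds", "succeeds") returns the advice to strengthen the callee's postcondition or use it inlined.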

Next stop: AutoProof

AutoProof in a nutshell
AutoProof is an auto-active verifier for Eiffel
• Prover for functional properties
• All-out support of object-oriented idiomatic
structures (e.g. patterns)
– Based on class invariants
• Flexible: incrementality
– Proving simple properties requires few annotations
– Proving complex properties is possible with more
effort

Demo 3: a taste of AutoProof
AutoProof verifies method transfer with suitable
specification
transfer (amount: INTEGER; other: ACCOUNT)
        -- Transfer `amount' from this account to `other'.
    require
        amount_non_negative: 0 <= amount
        amount_available: amount <= balance
    do
        withdraw (amount)
        other.deposit (amount)
    ensure
        deposit_done: other.balance = old other.balance + amount
        withdrawal_done: balance = old balance - amount

Sound program verifiers compared
[Diagram: verifiers placed along two axes, “more complex properties” and “more automation”: static analysis, interactive (KIV), ESC/Java2, OpenJML, Spec#, VCC, Chalice, Dafny, KeY, VeriFast]

Reasoning with class invariants
Class invariants are a natural way to reason
about object-oriented programs:
invariant = consistency of objects
ACCOUNT
    invariant
        balance >= 0

Multi-object structures
Object-oriented programs involve multiple objects (duh!), whose consistency is often mutually dependent
ACCOUNT
    invariant
        balance >= 0
        balance = sum (transactions)
[Diagram: ACCOUNT refers to a LIST through transactions]

Consistency of multi-object structures
Mutually dependent object structures require extra care to enforce, and reason about, consistency (cmp. encapsulation)
ACCOUNT
    invariant
        balance >= 0
[Diagram: ACCOUNT refers to a LIST through transactions; an AUDITOR also appears next to the LIST]

Open and closed objects
When (at which program points) must class
invariants hold? To provide flexibility, objects in
AutoProof can be open or closed
             CLOSED       OPEN
Object:      Consistent   Inconsistent
State:       Stable       Transient
Invariant:   Holds        May not hold
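A run-time analogue of open and closed objects can be sketched in Python. This is an illustration only, not how AutoProof works: AutoProof reasons about these states statically, whereas the sketch checks the invariant dynamically when the object is closed. The class, the closed flag and the wrap and unwrap methods are hypothetical names, and the invariant is the one from the ACCOUNT slides.

```python
class Account:
    def __init__(self):
        self.balance = 0
        self.transactions = []
        self.closed = False
        self.wrap()                     # a new object is closed once consistent

    def invariant(self):
        return self.balance >= 0 and self.balance == sum(self.transactions)

    def unwrap(self):
        """Open the object: its invariant may be broken from now on."""
        assert self.closed, "object is already open"
        self.closed = False

    def wrap(self):
        """Close the object: only allowed when the invariant holds."""
        assert self.invariant(), "invariant must hold before closing"
        self.closed = True

    def deposit(self, amount):
        self.unwrap()
        self.transactions.append(amount)   # invariant temporarily broken here
        self.balance += amount             # ... and restored here
        self.wrap()
```

Between unwrap and wrap the object is in a transient state where balance and transactions may disagree; wrap is the point at which consistency must have been re-established.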

Ownership
For hierarchical object structures, AutoProof offers an ownership protocol
ACCOUNT
    invariant
        balance >= 0
        owns = [ transactions ]
[Diagram: ACCOUNT owns a LIST, referenced through transactions]
[Animation over the following six slides: an AUDITOR appears next to the LIST owned by ACCOUNT; operations shown: add_node, update_balance]

Demo 4: ownership in AutoProof
AutoProof verifies the ACCOUNT with
an owned list of transactions
transactions: SIMPLE_LIST [INTEGER]
        -- History of transactions:
        -- positive integer = deposited amount
        -- negative integer = withdrawn amount
        -- latest transactions in back of list

Semantic collaboration
For collaborative object structures, AutoProof offers a novel protocol: semantic collaboration
[Diagram: several ACCOUNT objects refer to one BANK through bank]
ACCOUNT
    invariant
        interest_rate = bank.rate
• Subjects = objects my consistency depends on
• Observers = objects whose consistency depends on me
ACCOUNT
    invariant
        subjects = [ bank ]
        Current in bank.observers -- Implicit in AutoProof
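The subject/observer relation can be mimicked at run time with a short Python sketch. It is an illustration under assumptions, not AutoProof's protocol: the classes, the set_rate method and its policy of updating every observer are hypothetical, and the invariant is checked dynamically rather than proved.

```python
class Bank:
    def __init__(self, rate):
        self.rate = rate
        self.observers = []             # objects whose consistency depends on me

    def set_rate(self, rate):
        self.rate = rate
        for account in self.observers:  # restore each observer's consistency
            account.interest_rate = rate


class Account:
    def __init__(self, bank):
        self.bank = bank
        self.subjects = [bank]          # objects my consistency depends on
        self.interest_rate = bank.rate
        bank.observers.append(self)

    def invariant(self):
        return (self.interest_rate == self.bank.rate
                and all(self in subject.observers for subject in self.subjects))


bank = Bank(rate=2)
accounts = [Account(bank), Account(bank)]
bank.set_rate(3)
```

Because every account registers itself among the bank's observers, the bank knows which objects a rate change may invalidate and can update them before anyone relies on their invariants again.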

Demo 5: collaboration in AutoProof
AutoProof verifies the ACCOUNT with
a BANK that sets a master interest rate
bank: BANK
        -- Provider of this account
invariant
    non_negative_rate: 0 <= interest_rate
    bank_exists: bank /= Void
    consistent_rate: interest_rate = bank.master_rate
(Switch to tabs account5.e and bank5.e)

AutoProof on realistic software
Verification benchmarks:
# programs   LOC    SPEC/CODE                  Verification time
25           4400   Lines: 1.0, Tokens: 1.9    Total: 3.4 min; longest method: 12 sec; average method: < 1 sec
EiffelBase2 – a realistic container library:
# classes    LOC    SPEC/CODE                  Verification time
46           8400   Lines: 1.4, Tokens: 2.7    Total: 7.2 min; longest method: 12 sec; average method: < 1 sec

Testing, fixing, and proving with contracts: acknowledgements
Julian Tschannen, Nadia Polikarpova, Yu (Max) Pei, Yi (Jason) Wei, Andreas Zeller, Bertrand Meyer, Ilinca Ciupa-Moser, Andreas Leitner

Testing, fixing, and proving with contracts (in Eiffel)
1. AutoTest
2. AutoFix
3. Verif. assist.
4. Two-step
5. AutoProof
http://se.inf.ethz.ch/research/eve/
http://cloudstudio.ethz.ch/comcom/
See TAP 2015’s proceedings for references to technical papers
