1. Splint the C code static checker
Pedro Pereira Ulisses Costa
Formal Methods in Software Engineering
May 28, 2009
Pedro Pereira, Ulisses Costa Splint the C code static checker
2. Outline
1 Introduction
2 Unused variables
3 Types
4 Memory management
5 Control Flow
6 Buffer sizes
7 The Ultimate Test: wu-ftpd
8 Pros and Cons
9 Conclusions
3. Lint for detecting anomalies in C programs
Statically checking C programs
Unused declarations
Type inconsistencies
Use before definition
Unreachable code
Ignored return values
Execution paths with no return
Infinite loops
4. Splint
Specification Lint and Secure Programming Lint
Annotations
Functions
Variables
Parameters
Types
6. Unused variables
Splint detects instances where the value of a location is used
before it is defined.
Annotations can be used to describe what storage must be
defined and what storage may be undefined at interface
points.
Unless annotated otherwise, all storage reachable from the
following is assumed to be defined before and after a function call:
global variable
parameter to a function
function return value
7. Undefined Parameters
Sometimes, function parameters or return values are expected to
reference undefined or partially defined storage.
out annotation denotes a pointer to storage that may be
undefined
in annotation can be used to denote a parameter that must
be completely defined
1  extern void setVal (/*@out@*/ int *x);
2  extern int getVal (/*@in@*/ int *x);
3  extern int mysteryVal (int *x);
4
5  int dumbfunc (/*@out@*/ int *x, int i) {
6    if (i > 3)
7      return *x;
8    else if (i > 1)
9      return getVal (x);
10   else if (i == 0)
11     return mysteryVal (x);
12   else {
13     setVal (x);
14     return *x;
15   }
16 }

> splint usedef.c
usedef.c:7: Value *x used before definition
usedef.c:9: Passed storage x not completely defined (*x is undefined): getVal (x)
usedef.c:11: Passed storage x not completely defined (*x is undefined): mysteryVal (x)
Finished checking --- 3 code warnings
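One way the function above could be rewritten so that every path defines *x before reading it is sketched below. The bodies of setVal and getVal are hypothetical stand-ins (the slide only declares them), and the name safefunc is an assumption made for illustration.

```c
/* Hypothetical stand-in: defines the storage x points to. */
static void setVal(/*@out@*/ int *x)
{
    *x = 42;
}

/* Hypothetical stand-in: reads storage that must already be defined. */
static int getVal(/*@in@*/ int *x)
{
    return *x;
}

/* Every path calls setVal(x) before *x is read, so *x is never
   used before it is defined. */
int safefunc(/*@out@*/ int *x, int i)
{
    setVal(x);
    if (i > 3)
        return *x;
    return getVal(x);
}
```

Defining the out parameter once, at the top, avoids having to reason about each branch separately.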
9. Types
Strong type checking often reveals programming
errors. Splint can check primitive C types more strictly
and flexibly than typical compilers.
Built in C Types
Splint supports stricter checking of built-in C types. The char and
enum types can be checked as distinct types, and the different
numeric types can be type-checked strictly.
Characters
The primitive char type can be type-checked as a distinct type. If
char is used as a distinct type, common errors involving assigning
ints to chars are detected.
If the charint flag is on (+charint), char types are indistinguishable from ints.
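As a small sketch of what treating char as a distinct type encourages, the helpers below keep every conversion between char and int explicit. The function names are hypothetical, and the casts are one possible style rather than something the slides prescribe.

```c
/* Convert a decimal digit value (0-9) to its character. The cast
   marks the int-to-char conversion as intentional. */
char digit_to_char(int d)
{
    return (char) ('0' + d);
}

/* Convert a digit character back to its numeric value, converting
   both operands to int before subtracting. */
int char_to_digit(char c)
{
    return (int) c - (int) '0';
}
```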
10. Types - Enums
An error is reported if:
a value that is not an enumerator member is assigned to the
enum type
an enum type is used as an operand to an arithmetic
operator
If the enumint flag is on, enum and int types may be used
interchangeably.
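A minimal sketch of code that stays within these rules: instead of writing something like c + 1 on an enum value (arithmetic on an enum type), the next value is selected with a switch. The color type and next_color are hypothetical names used only for illustration.

```c
typedef enum { RED, GREEN, BLUE } color;

/* Advance to the next color without doing arithmetic on the enum. */
color next_color(color c)
{
    switch (c) {
    case RED:
        return GREEN;
    case GREEN:
        return BLUE;
    case BLUE:
        return RED;
    }
    return RED; /* not reached for valid enumerators */
}
```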
12. Memory management
About half the bugs in typical C programs can be
attributed to memory management problems.
Some only appear sporadically
And some may only be apparent when compiled on a different
platform
Splint detects many memory management errors at compile time
Using storage that may have been deallocated
Memory leaks
Returning a pointer to stack-allocated storage
13. Memory management - Memory Model
An object is a typed region of storage;
Some objects use a fixed amount of storage (that is allocated
and deallocated by the compiler);
Other objects use dynamic memory storage that must be
managed by the program.
Storage is undefined if it has not been assigned a value
and defined after it has been assigned a value.
An object is completely defined if all storage that may be
reached from it is defined.
14. Memory management - Memory Model (cont.)
What storage is reachable from an object depends on the type and
value of the object.
Example
If p is a pointer to a structure, p is completely defined if the value
of p is NULL, or if every field of the structure p points to is
completely defined.
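The idea can be sketched with a small structure. The record type and both functions are hypothetical; the out and in annotations are the ones introduced on the earlier slide.

```c
typedef struct {
    int id;
    int score;
} record;

/* Assigns every field, so *r is completely defined on return. */
void record_init(/*@out@*/ record *r, int id)
{
    r->id = id;
    r->score = 0;
}

/* Reads both fields, so the caller must pass completely defined
   storage. */
int record_total(/*@in@*/ const record *r)
{
    return r->id + r->score;
}
```

If record_init skipped one of the fields, the structure would be only partially defined and passing it to record_total would be an anomaly.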
15. Memory management - Memory Model (cont.)
Left side of an assignment
When an expression is used as the left side of an assignment
we say it is an lvalue;
Its location in memory is used, but not its value;
Undefined storage may be used as an lvalue since only its
location is needed.
Right side of an assignment
When storage is used in any other way, we say it is used as an
rvalue:
on the right side of an assignment;
as an operand to a primitive operator;
as a function parameter.
It is an anomaly to use undefined storage as an rvalue.
16. Memory management - Deallocation Errors
Deallocating storage when there are other live references to
the same storage
Failing to deallocate storage before the last reference to it is
lost
Solution
Obligation to release storage
This obligation is attached to the reference to which the
storage is assigned
The only annotation is used to indicate that a reference is the
only pointer to the object it points to:
/*@only@*/ /*@null@*/ void *malloc (size_t size);
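A minimal sketch of how the release obligation moves between functions, assuming two hypothetical helpers: make_counter hands the obligation to its caller through an only return value, and release_counter takes it back through an only parameter.

```c
#include <stdlib.h>

/* Returns fresh storage (or NULL); the caller receives the
   obligation to release it. */
/*@only@*/ /*@null@*/ int *make_counter(int start)
{
    int *p = (int *) malloc(sizeof(int));
    if (p != NULL)
        *p = start;
    return p;
}

/* Takes over the obligation: the caller must not use p afterwards.
   Returns the last value, or -1 if p was NULL. */
int release_counter(/*@only@*/ /*@null@*/ int *p)
{
    int last;
    if (p == NULL)
        return -1;
    last = *p;
    free(p);
    return last;
}
```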
17. Memory management - Memory Leaks
extern /*@only@*/ int *glob;

/*@only@*/ int *f (/*@only@*/ int *x, int *y, int *z) {
  int *m = (int *) malloc (sizeof (int));
  glob = y;   // Memory leak
  free (x);
  *m = *x;    // Use after free
  return z;   // Memory leak detected
}

> splint only.c
only.c:4: Only storage glob (type int *) not released before assignment: glob = y
   only.c:1: Storage glob becomes only
only.c:4: Implicitly temp storage y assigned to only: glob = y
only.c:6: Dereference of possibly null pointer m: *m
   only.c:8: Storage m may become null
only.c:6: Variable x used after being released
   only.c:5: Storage x released
only.c:7: Implicitly temp storage z returned as only: z
only.c:7: Fresh storage m not released before return
   only.c:3: Fresh storage m allocated
18. Memory management - Stack References
A memory error occurs if a pointer into the stack is live after the
function returns
Splint detects errors involving stack references exported from
a function through return values or assignments to references
reachable from global variables or actual parameters
No annotations are needed to detect stack reference errors. It is
clear from declarations if storage is allocated on the function stack.
1  int *glob;
2
3  int *f (int **x) {
4    int sa[2] = { 0, 1 };
5    int loc = 3;
6
7    glob = &loc;
8    *x = &sa[0];
9    return &loc;
10 }

> splint stack.c
stack.c:9: Stack-allocated storage &loc reachable from return value: &loc
stack.c:9: Stack-allocated storage *x reachable from parameter x
   stack.c:8: Storage *x becomes stack
stack.c:9: Stack-allocated storage glob reachable from global glob
   stack.c:7: Storage glob becomes stack
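Two common ways to avoid exporting stack references are sketched below, under the assumption that the caller either owns heap storage or supplies the destination itself. Both function names are hypothetical.

```c
#include <stdlib.h>

/* Returns heap storage instead of the address of a local array;
   the caller owns the result and must free it. NULL signals
   allocation failure. */
/*@only@*/ /*@null@*/ int *make_pair(int a, int b)
{
    int *p = (int *) malloc(2 * sizeof(int));
    if (p != NULL) {
        p[0] = a;
        p[1] = b;
    }
    return p;
}

/* Writes into storage supplied by the caller, so nothing on this
   function's stack outlives the call. */
void store_value(/*@out@*/ int *dest)
{
    int loc = 3;
    *dest = loc; /* copies the value, not the address of loc */
}
```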
20. Control Flow - Execution
Many of these checks are possible because of the extra
information that is known in annotations
To avoid spurious errors it is important to know something
about the behaviour of called functions
Without additional information Splint assumes that all
functions return and execution continues normally
21. Control Flow - Execution (cont.)
noreturn annotation is used to denote a function that never
returns.
extern /*@noreturn@*/ void fatalerror (char *s);
Problem!
The maynotreturn and alwaysreturns annotations also exist, but
Splint must assume that a function returns normally when
checking the code, and it does not verify that a function really
returns.
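A short sketch of why noreturn matters to a checker: once fatalerror is known never to return, the dereference after the NULL test can be treated as safe. The body of fatalerror and the function first_element are assumptions made for illustration (the slide only declares fatalerror, and with a plain char * parameter).

```c
#include <stdio.h>
#include <stdlib.h>

/* Prints a message and terminates; control never returns to the
   caller. */
static /*@noreturn@*/ void fatalerror(const char *s)
{
    fprintf(stderr, "fatal: %s\n", s);
    exit(EXIT_FAILURE);
}

/* Because fatalerror never returns, p can be assumed non-NULL at
   the dereference below. */
int first_element(/*@null@*/ const int *p)
{
    if (p == NULL)
        fatalerror("null pointer");
    return p[0];
}
```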
22. Control Flow - Execution (cont.)
The noreturnwhentrue and noreturnwhenfalse annotations describe
functions that never return if their first argument is true or
false, respectively.
/*@noreturnwhenfalse@*/ void assert (/*@sef@*/ bool /*@alt int@*/ pred);
The sef annotation denotes a parameter as side-effect free
The alt int annotation indicates that the parameter may be either
a Boolean or an integer
23. Control Flow - Undefined Behavior
The order in which side effects take place in C is not
entirely defined by the code.
Sequence points
a function call (after the arguments have been evaluated)
the end of an if, while, for or do statement
after the first operand of &&, || and ?:
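As a sketch of how to stay on the safe side of these rules, each side effect can be moved into its own statement so that a sequence point separates it from other uses of the same object. The two functions below are hypothetical, and each picks one plausible intended meaning for an expression such as x++ * x or y[i] = i++.

```c
/* Well-defined counterpart of `x++ * x`, assuming the intent was
   "multiply the old value by itself, then increment". */
int square_then_increment(int *x)
{
    int old = *x;           /* read the value once */
    int result = old * old;
    *x = old + 1;           /* increment in a separate statement */
    return result;
}

/* Well-defined counterpart of `y[i] = i++`, assuming the intent was
   "store at the old index, then advance". Returns the new index. */
int store_and_advance(int y[], int i)
{
    y[i] = i;
    return i + 1;
}
```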
24. Control Flow - Undefined Behavior (cont.)
1  extern int glob;
2  extern int mystery (void);
3  extern int modglob (void) /*@globals glob@*/ /*@modifies glob@*/;
4  int f (int x, int y[]) {
5    int i = x++ * x;
6    y[i] = i++;
7    i += modglob () * glob;
8    i += mystery () * glob;
9    return i;
10 }

> splint order.c +evalorderuncon
order.c:5: Expression has undefined behavior (value of right operand modified by left operand): x++ * x
order.c:6: Expression has undefined behavior (left operand uses i, modified by right operand): y[i] = i++
order.c:7: Expression has undefined behavior (value of right operand modified by left operand): modglob() * glob
order.c:8: Expression has undefined behavior (unconstrained function mystery used in left operand may set global variable glob used in right operand): mystery() * glob
25. Control Flow - Likely Infinite Loops
Splint reports an error if it detects a loop that appears to be
infinite. An error is reported for a loop that does not modify any
value used in its condition test, either inside the body of the loop
or in the condition test itself.
1  extern int glob1, glob2;
2  extern int f (void) /*@globals glob1@*/ /*@modifies nothing@*/;
3  extern void g (void) /*@modifies glob2@*/;
4  extern void h (void);
5
6  void upto (int x) {
7    while (x > f ()) g ();
8    while (f () < 3) h ();
9  }

> splint loop.c +infloopsuncon
loop.c:7: Suspected infinite loop. No value used in loop test (x, glob1) is modified by test or loop body.
loop.c:8: Suspected infinite loop. No condition values modified. Modification possible through unconstrained calls: h
26. Control Flow - Switches
Splint detects case statements with code that may fall through to
the next case. The casebreak flag controls reporting of fall
through cases. The /*@fallthrough@*/ annotation explicitly indicates
that execution falls through to this case.
1  typedef enum {
2    YES, NO, DEFINITELY,
3    PROBABLY, MAYBE } ynm;
4
5  void decide (ynm y) {
6    switch (y) {
7    case PROBABLY:
8    case NO: printf ("No!");
9    case MAYBE: printf ("Maybe");
10   /*@fallthrough@*/
11   case YES: printf ("Yes!");
12   }
13 }

> splint switch.c
switch.c:9: Fall through case (no preceding break)
switch.c:12: Missing case in switch: DEFINITELY
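A sketch of the same switch with both reported problems addressed: every case that has code ends in a break, and DEFINITELY is handled. Grouping DEFINITELY with YES is an assumption about the intended behaviour, and the function returns the text instead of printing it so that the result is easy to check.

```c
typedef enum { YES, NO, DEFINITELY, PROBABLY, MAYBE } ynm;

/* Returns the answer text for y; every enumerator is covered and
   no case with code falls through to the next one. */
const char *decide(ynm y)
{
    const char *answer = "";
    switch (y) {
    case PROBABLY:
    case NO:
        answer = "No!";
        break;
    case MAYBE:
        answer = "Maybe";
        break;
    case DEFINITELY:
    case YES:
        answer = "Yes!";
        break;
    }
    return answer;
}
```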
27. Control Flow - Conclusion
But Splint has more!
Deep Breaks
Complete Logic
29. Buffer sizes
1 Buffer overflow errors are a particularly dangerous type of bug
in C
2 They are responsible for half of all security attacks
3 C does not perform runtime bound checking (for performance
reasons)
4 Attackers can exploit program bugs to gain full access to a
machine
30. Buffer sizes - Checking access
Splint models blocks of memory using two properties:
maxSet
maxSet(b) denotes the highest index of b that can be
safely used as an lvalue. For instance, for
char buffer[MAXSIZE] we have maxSet(buffer) = MAXSIZE - 1
maxRead
maxRead(b) denotes the highest index of b that can be
safely used as an rvalue.
When a buffer is accessed as an lvalue, Splint generates a
precondition constraint involving the maxSet property
When a buffer is accessed as an rvalue, Splint generates a
precondition constraint involving the maxRead property
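The two properties can be mimicked at run time to make the constraints concrete. This is only an illustration, not how Splint works internally: the helper names are hypothetical, and the sketch assumes that maxRead of a NUL-terminated string is the index of its terminator.

```c
#include <string.h>

#define MAXSIZE 8

/* For `char buffer[MAXSIZE]`, the highest index that may be
   written. */
size_t max_set_of_array(void)
{
    return MAXSIZE - 1;
}

/* Assumption: for a NUL-terminated string, the highest readable
   index is the index of the terminating NUL. */
size_t max_read_of_string(const char *s)
{
    return strlen(s);
}

/* Run-time analogue of the constraint maxSet(dst) >= maxRead(src),
   which must hold before src is copied into dst. */
int copy_fits(size_t dst_max_set, const char *src)
{
    return dst_max_set >= max_read_of_string(src);
}
```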
31. Buffer sizes - Annotating Buffer Sizes
1 Function declarations may include requires and ensures
clauses to specify assumptions about buffer sizes for function
preconditions and postconditions
2 When a function with a requires clause is called, the call site
must be checked to satisfy the constraints implied by requires
3 If the +checkpost flag is set, Splint warns if it cannot verify that
a function implementation satisfies its declared postconditions
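A minimal sketch of a requires clause on a hypothetical function fill. The clause is written in the same form as the strcpy precondition shown in Splint's warnings (maxSet on the left of >=); its exact placement and spelling here should be treated as an assumption to check against the Splint manual.

```c
#include <stddef.h>

/* Fills buf[0..n-1] with c. The requires clause states the caller's
   obligation: index n - 1 must be writable. Callers pass n >= 1. */
void fill(char *buf, size_t n, char c)
    /*@requires maxSet(buf) >= (n - 1)@*/
{
    size_t i;
    for (i = 0; i < n; i++)
        buf[i] = c;
}
```

A call such as fill(b, 3, 'x') on char b[4] satisfies the clause, since maxSet(b) is 3 and n - 1 is 2.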
34. Buffer sizes - Warnings
Bound checking is more complex than other checks done by
Splint
So, memory bound warnings contain extensive information
about the unresolved constraint
int buf[10];
buf[10] = 3;

setChar.c:5:4: Likely out-of-bounds store: buf[10]
   Unable to resolve constraint: requires 9 >= 10
   needed to satisfy precondition: requires maxSet(buf @ setChar.c:5:4) >= 10
35. Buffer sizes - Warnings (cont.)
1  void updateEnv (char *str) {
2    char *tmp;
3    tmp = getenv ("MYENV");
4    if (tmp != NULL)
5      strcpy (str, tmp);
6  }

> splint bounds.c +bounds +showconstraintlocation
bounds.c:5: Possible out-of-bounds store: strcpy(str, tmp)
   Unable to resolve constraint:
   requires maxSet(str @ bounds.c:5) >= maxRead(getenv("MYENV") @ bounds.c:3)
   needed to satisfy precondition:
   requires maxSet(str @ bounds.c:5) >= maxRead(tmp @ bounds.c:5)
   derived from strcpy precondition:
   requires maxSet(<parameter 1>) >= maxRead(<parameter 2>)
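One possible repair is to pass the destination size explicitly and bound the copy, as sketched below. The size parameter, the helper copy_bounded and the requires clauses are assumptions added for illustration, so this is not a claim about how wu-ftpd or the authors fixed such code.

```c
#include <stdlib.h>
#include <string.h>

/* Copies at most size - 1 characters of src into dst and always
   terminates the result. Callers pass size >= 1. */
void copy_bounded(char *dst, size_t size, const char *src)
    /*@requires maxSet(dst) >= (size - 1)@*/
{
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}

/* Bounded version of updateEnv: str is left untouched when MYENV
   is not set. */
void updateEnv(char *str, size_t size)
    /*@requires maxSet(str) >= (size - 1)@*/
{
    char *tmp = getenv("MYENV");
    if (tmp != NULL)
        copy_bounded(str, size, tmp);
}
```

The caller now states how much room str has, so the copy no longer depends on the length of an environment variable that an attacker may control.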
37. The Ultimate Test: wu-ftpd
wu-ftpd version 2.5.0
20,000 lines of code
Took less than four seconds to check all of wu-ftpd on a
1.2-GHz Athlon machine
Splint detected the known flaws as well as finding some
previously unknown flaws (!)
38. The Ultimate Test: wu-ftpd (cont.)
Running Splint on wu-ftpd without adding annotations
produced 166 warnings for potential out-of-bounds writes
After adding 66 annotations, it produced 101 warnings: 25 of
these indicated real problems and 76 were false positives
40. Pros and Cons
Pros
Lightweight static analysis detects software vulnerabilities
Splint definitely improves code quality
Suitable for real programs...
Cons
... although it produces many warning messages, which can lead to
confusion
It won’t eliminate all security risks
Hasn’t been developed since 2007; new volunteers are needed
42. Conclusions
No tool will eliminate all security risks
Lightweight static analysis tools (Splint) play an important
role in identifying security vulnerabilities
43. Questions
?