Stale pointers are the new black - white paper
POLITECNICO DI MILANO
Faculty of Information Engineering
Degree Course in Computer Engineering
Department of Electronics and Information

Detecting aliased stale pointers via static analysis: An architecture-independent practical application of pointer analysis and graph theory to find bugs in binary code

Advisor: Prof. Stefano Zanero
Co-advisor: Ing. Federico Maggi

A thesis by:
Giovanni Gola, student ID 717847
Vincenzo Iozzo, student ID 713583

Academic Year 2009-2010
Acknowledgements
The authors would like to thank Thomas Dullien, Julien Vanegue, Ralf-Philipp Weinmann and Tim Kornau for their suggestions and help while researching the topic.
The authors would also like to thank their thesis advisors at Politecnico di Milano, Stefano Zanero and Federico Maggi.
Finally, we want to thank all the people who reviewed this work.
Chapter 1
Introduction
In the era of cloud computing and Internet-connected devices, attacks on such devices are becoming increasingly profitable. Nowadays web clients, such as PCs, mobile phones, UMPCs, tablets and netbooks, are a precious source of information. Although built on diverse architectures, including but not limited to the x86, x86-64, ARM and PowerPC families, they necessarily share a base set of network-oriented software, first of all web browsers. Applications like Firefox, Safari, Internet Explorer and Google Chrome, along with complementary software including JavaScript engines, Adobe Flash, Adobe AIR, Adobe Reader and the Java Runtime Environment, come pre-installed in almost every OS. Given the size of these applications, developers are bound to adopt custom, complex memory management systems built on top of the native memory allocators. Notable examples are NS_Alloc()/NS_Free() in the XPCOM library, PR_NEW/PR_DELETE in the NSPR library, JS_malloc()/JS_free() in the SpiderMonkey JavaScript engine, and TCMalloc in the WebKit and V8 JavaScript engines.
The complexity and heterogeneity of memory management in large code bases result in numerous memory corruption vulnerabilities, such as uninitialized memory, double frees and dangling pointers. The latter are an insidious type of flaw, also known as "use-after-free" or "stale pointers", which occurs when pointer variables reference freed memory areas. A quick search of the Common Vulnerabilities and Exposures (CVE) list yields 162 reports for web browsers and more than 470 results among the three most popular browser add-ons, i.e., Adobe Flash, Adobe Reader and the Java Runtime Environment (between 2006 and 2010).
The astonishing number of use-after-free bugs reported in recent years makes them one of the most appealing attack vectors on client systems. Use-after-free bugs are scattered throughout the code base, and finding them and other memory corruption bugs is arguably more difficult than finding overflows: they are temporal memory errors and, as such, detecting them effectively means understanding how a custom memory management system works internally, in order to identify logical pitfalls. The only effective solution consists in manual code-review efforts, in an attempt to recognize exceptions to the rules that developers were supposed to follow while writing code. This process is clearly cumbersome. A vast variety of approaches has been proposed to automatically spot memory flaws. In particular, an analyst can face the problem of finding dangling pointers with dynamic analysis techniques, such as fuzzing, or by means of static analysis. Even though fuzzing has proved to be an effective method for finding such bugs, it suffers from several intrinsic limitations. For example, relying on input to exercise code paths has the disadvantage of scarce coverage of the application code and limits the depth of exercised code paths, leaving part of the code unexplored. Moreover, the random nature of the generated input makes the running time virtually infinite, and totally unrelated to the code coverage the analysis reaches over time. We propose a practical approach to automatically find use-after-free conditions in large binary code bases using static analysis and graph theory.
Chapter 2
Static Analysis
2.1 General knowledge
Static analysis is the automated process of extracting semantic information about a program without executing it. Not needing to execute the binary, static analysis offers a number of potential advantages over its dynamic counterpart:
• Architecture independence: the analysis, even if it is specific to an instruction set or language, can be implemented on top of any framework and run on any machine;
• Running time: the time complexity of a static analysis algorithm may range from linear to exponential, but it will process the entire binary in finite time;
• Code coverage and depth of analysis: not relying on input to exercise code paths, static analysis can achieve total code coverage. This fact makes it particularly useful in analyzing large applications with complex path-triggering conditions.
Although the static analysis problem has been proven to be theoretically undecidable, it can be formally reformulated as an over-approximation of the original problem, which can be proven to halt in finite time. In fact, the term "static analysis" is over-broad: it includes various implementation techniques that make static analysis feasible. Model checking was, chronologically, the first technique to appear. It arose in an attempt to solve the so-called concurrent program verification problem. A model checker checks the correctness of a formula expressed in temporal logic, and is able to effectively uncover hard-to-find concurrency errors. As stated in Chapter 1, use-after-free bugs are temporal memory errors and, as such, can be expressed in terms of temporal logic: a program point in which a pointer gets dereferenced is reached after a program point in which the same pointer gets freed. Model checking turns out to be very precise in unveiling temporal memory errors, but it has non-negligible faults. The major defect is the need for source code. The model checker simply tries to resolve the system of pre- and post-conditions of each function. In order to do that, the pre- and post-conditions have to be previously specified by annotating the source code of the program to analyze. The annotation process is also extremely time consuming. Moreover, in the general case, the complexity of model checking algorithms is exponential in time. Another very important method for static analysis is so-called abstract interpretation. When running an algorithm based on abstract interpretation,
out_b = trans_b(in_b)
in_b = join_{p ∈ pred(b)} (out_p)
Figure 2.1: Data-flow equations
the program semantics is over-approximated as a set of monotonic functions that the algorithm uses to transform ordered sets, which hold the results of the analysis. It can be viewed as a partial execution of the program that tracks only part of the information about its semantics, without performing all the calculations. The constraints of monotonicity and order ensure that the analysis halts in finite time. This is a well-known technique, mainly used in compilers for optimization tasks, and in debugging. Every analysis that takes advantage of the abstract interpretation pattern to gather information about the possible set of values (of registers, variables, memory locations, etc.) computed at a given point in a program belongs to the "data-flow" family of analyses. In particular, a data-flow analysis algorithm usually walks the control flow graph (CFG) of a program. At each program point (instruction, assignment, basic block and so forth), the algorithm applies data-flow equations (see Figure 2.1) to the state (ordered set) associated with it. The analysis is repeated on every node until the sets stabilize. As previously stated, the data-flow equations must be monotonic and the sets must carry an order relation, at least a partial one. If these conditions hold, the repeated analysis will reach the so-called fixpoint and the algorithm can stop.
2.2 Pointer Analysis
We will now focus on data-flow analysis of pointer values, precisely named pointer analysis. As one may immediately notice, pointer analysis fits our need of tracking the values assigned to pointers so as to later check for use-after-free conditions. There are several dimensions that affect the cost/precision trade-offs of a pointer analysis, and how an analysis addresses each of these dimensions helps to categorize it. The dimensions we are going to consider are:
• Scope: a static analysis algorithm can either be engineered to perform the analysis within a single function only, or offer the possibility of extending the analysis to cover multiple functions. The former goes under the name of intraprocedural pointer analysis, the latter is interprocedural analysis;
• Flow-sensitivity: a flow-sensitive analysis takes into account the order of statements in a program, and therefore can compute a solution for each program point, whereas a flow-insensitive analysis computes a single solution for either the whole program or each procedure. The immediate consequence is a higher degree of precision for flow-sensitive approaches. On the other hand, flow-insensitivity shows much higher scalability in terms of both time and space, thus proving a better choice for analyzing very large programs;
• Context-sensitivity: in context-sensitive analyses the calling context of a function is considered. This means that parameters passed on different calls of the same function can be distinguished and properly returned to the actual caller. Context-sensitivity offers a higher degree of precision and, if properly implemented, only mildly impacts speed;
• Heap modeling: an accurate pointer analysis should rely on a representation of the entire heap space. This is a non-trivial issue, and constitutes a static analysis branch by itself, going under the name of shape analysis. Even though shape analysis is undergoing heavy research, it shows extremely limited scalability when analyzing real-life programs;
• Aggregate modeling: a very important factor affecting the precision of pointer analysis is how the elements of aggregates are distinguished. An extremely precise modeling, in which every single object can be distinguished, could be achieved by running a full-blown shape analysis; as just stated, though, shape analysis is not a feasible solution for our purposes. A very fast but imprecise model could collapse the elements of an aggregate into one object. This would introduce excessive noise in the analysis and would lead to a situation in which no heap object is discernible from another;
• Alias representation: indicates whether alias pairs or points-to pairs are maintained during the analysis. Alias pairs represent alias relations explicitly, whereas points-to data is a more compact representation.
A lot of research has been done on pointer analysis in the last twenty-five years. Nowadays moderately precise intraprocedural analyses are commonly implemented in almost every compiler, whereas interprocedural algorithms are still at the research stage. Figure 2.2 shows a summary of the interprocedural analyses proposed so far.
Figure 2.2: Pointer analyses
Each having its own pros and cons, they all share a few limitations that make them unsuitable for our purposes. Their major fault is the need for source code, which, from an analyst's point of view, is usually practically impossible to retrieve. Moreover, they are often built to analyze source code translated into a sub-language of the original language. The latter limitation also affects the few interprocedural analyses proposed to work at the assembly level, like the one by Naeem et al. [3]. The need to reduce a real assembly language, with hundreds of instructions, to a really narrow sub-language proves to be practically impossible in real scenarios.
2.3 Conclusions and contributions
Intraprocedural analysis, in terms of efficiency and scalability, is reliable enough to be implemented with minor modifications that make it able to deal with more expressive assembly languages. On the other hand, interprocedural analysis at the assembly level is still in an alpha stage of development. Therefore we propose a new tree-based, context-sensitive interprocedural analysis targeting assembly languages.
Chapter 3
Preprocessing stage
3.1 The REIL intermediate language
The Reverse Engineering Intermediate Language (REIL) [6] is a platform-independent intermediate language that aims to simplify static code analysis algorithms. It allows various specific assembly languages to be abstracted, so as to facilitate cross-platform analysis of disassembled binary code. REIL performs a simple one-to-many mapping of native CPU instructions to sequences of simple atomic instructions. Memory access is explicit. Every instruction has exactly one effect on the program state. This contrasts sharply with native assembly instruction sets, where the exact behaviour of instructions is often influenced by CPU flags or other pre-conditions.
All instructions use a three-operand format. For instructions where some of the three operands are not used, place-holder operands of a special type
called ε are used where necessary. Each of the 17 different REIL instructions has exactly one mnemonic that specifies the effects of the instruction on the program state.
The REIL VM
To define the runtime semantics of the REIL language it is necessary to specify a virtual machine (REIL VM) that determines how REIL instructions behave when interacting with memory or registers.
REIL register names follow the convention t-number, like t0, t1, t2. The actual size of these registers is specified upon use rather than defined a priori (in practice only register sizes between 1 byte and 16 bytes have been used). Registers of the original CPU can be used interchangeably with REIL registers.
The REIL VM uses a flat memory model without alignment constraints.
The endianness of REIL memory accesses equals the endianness of memory
accesses of the source platform.
REIL instructions
REIL instructions can loosely be grouped into five different categories according to the type of the instruction (see Figure 3.1).
Arithmetic and bitwise instructions take two input operands and one output operand. Input operands are either integer literals or registers; the output operand is a register. None of the operands have any size restrictions. However, arithmetic and bitwise operations can impose a minimum output operand size
Arithmetic instructions    Operation
ADD x1, x2, y              y = x1 + x2
SUB x1, x2, y              y = x1 - x2
MUL x1, x2, y              y = x1 · x2
DIV x1, x2, y              y = ⌊x1 / x2⌋
MOD x1, x2, y              y = x1 mod x2
BSH x1, x2, y              y = x1 · 2^x2 if x2 ≥ 0, y = ⌊x1 / 2^(-x2)⌋ if x2 < 0

Bitwise instructions       Operation
AND x1, x2, y              y = x1 & x2
OR x1, x2, y               y = x1 | x2
XOR x1, x2, y              y = x1 ⊕ x2

Logical instructions       Operation
BISZ x1, ε, y              y = 1 if x1 = 0, y = 0 if x1 ≠ 0
JCC x1, ε, y               transfer control flow to y iff x1 ≠ 0

Data transfer instructions Operation
LDM x1, ε, y               y = mem[x1]
STM x1, ε, y               mem[y] = x1
STR x1, ε, y               y = x1

Other instructions         Operation
NOP ε, ε, ε                no operation
UNDEF ε, ε, y              undefined instruction
UNKN ε, ε, ε               unknown instruction

Figure 3.1: List of REIL instructions
or a maximum output operand size relative to the sizes of the input operands.
Note that certain native instructions, such as FPU instructions and multimedia instruction set extensions, cannot be translated to REIL code yet. Another limitation is that some instructions which are close to the underlying hardware, such as privileged instructions, cannot be translated to REIL; similarly, exceptions are not handled. All of these cases require an explicit and accurate modelling of the respective hardware features.
An example of a function translated from x86 assembly language to REIL is shown in Figure 3.2.
Figure 3.2: REIL translation of a function
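To illustrate the one-to-many mapping and the single-effect property, here is a toy interpreter for a handful of REIL-like instructions. The instruction encoding and the translation of x86 `inc eax` below are illustrative guesses for exposition, not the output of the actual REIL translator:

```python
# A toy interpreter for a few REIL instructions (a hypothetical sketch, not
# the actual REIL VM). Each instruction has exactly one effect on the state,
# and an x86 `inc eax` is shown as a one-to-many REIL sequence that makes the
# size truncation and the zero-flag update explicit.

def step(state, insn):
    op, x1, x2, y = insn
    regs, mem = state
    val = lambda x: regs[x] if isinstance(x, str) else x   # register or literal
    if op == "ADD":
        regs[y] = val(x1) + val(x2)
    elif op == "AND":
        regs[y] = val(x1) & val(x2)
    elif op == "STR":
        regs[y] = val(x1)
    elif op == "BISZ":
        regs[y] = 1 if val(x1) == 0 else 0
    elif op == "STM":
        mem[val(y)] = val(x1)
    elif op == "LDM":
        regs[y] = mem[val(x1)]
    return state

# x86 `inc eax`, sketched as a REIL sequence (illustrative translation).
program = [
    ("ADD", "eax", 1, "t0"),
    ("AND", "t0", 0xFFFFFFFF, "t1"),   # truncate the oversized result
    ("STR", "t1", None, "eax"),
    ("BISZ", "t1", None, "ZF"),        # set the zero flag explicitly
]

state = ({"eax": 0xFFFFFFFF}, {})
for insn in program:
    step(state, insn)
print(state[0]["eax"], state[0]["ZF"])
```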
3.2 Static Single Assignment (SSA) Form
3.2.1 Graph theory overview
The algorithm for building SSA form relies on the dominator tree and dominance frontiers in order to identify merge points.
The following notions are required to understand the algorithms for SSA translation and how bug detection works:
• Dominance Relation: in a control flow graph, a node D dominates a node N if every path from the start node to N must pass through D. Notationally, this is written D dom N. By definition every node dominates itself;
• Strict Dominance Relation: a node D strictly dominates a node N if D dominates N and D does not equal N;
• Immediate Dominator: the immediate dominator or idom of a node N is the unique node that strictly dominates N but does not strictly dominate any other node that strictly dominates N. Not all nodes have immediate dominators;
• Dominator Tree: the dominator tree of a graph is a tree in which each node's children are the nodes it immediately dominates;
• Dominance Frontier: the dominance frontier of a node S is the set of all nodes N such that S dominates a predecessor of N but does not strictly dominate N. More intuitively, it is the set of nodes where S's dominance stops;
• Iterated Dominance Frontier: let DF(S) = ∪_{x ∈ S} DF(x) be the dominance frontier of a set of nodes S. The iterated dominance frontier DF⁺(S) is the limit of the increasing sequence DF_1 = DF(S), DF_{i+1} = DF(S ∪ DF_i).
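The definitions above translate directly into code. The sketch below (not the thesis implementation) computes dominator sets by iterative intersection and then dominance frontiers straight from the definition, on a hypothetical diamond-shaped CFG with a loop back-edge:

```python
# A small sketch that computes dominators by iterative set intersection, then
# dominance frontiers from the definition: N is in DF(S) if S dominates a
# predecessor of N but does not strictly dominate N itself.

def dominators(cfg, entry):
    nodes = set(cfg)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    preds = {n: {p for p in nodes if n in cfg[p]} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def dominance_frontier(cfg, dom):
    df = {n: set() for n in cfg}
    for n in cfg:
        for p in cfg:
            if n in cfg[p]:                           # p is a predecessor of n
                for s in dom[p]:                      # s dominates p ...
                    if not (s in dom[n] and s != n):  # ... but not strictly n
                        df[s].add(n)
    return df

# Diamond with a back-edge: A->B, A->C, B->D, C->D, D->B
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["B"]}
dom = dominators(cfg, "A")
print(dominance_frontier(cfg, dom)["B"])
```

On this graph D's frontier contains B because of the back-edge: D dominates a predecessor of B (itself) without dominating B.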
3.2.2 Computing SSA Form
Static Single Assignment (SSA) form is an intermediate representation of a function graph that is very frequently used in compiler optimization. SSA form imposes a naming convention on the function variables such that each variable name corresponds to the value produced at a single definition point. Another advantage of SSA form is the ability to identify merge points inside a function flow graph and mark them with so-called φ-functions.
In our prototype all the function flow graphs inside a binary are translated into SSA form before proceeding with the analysis. There exist three known variants of SSA form translation, differing in the efficiency of the algorithm and in the number of φ-functions present in the resulting graph. We chose to implement a "semi-pruned" SSA form as a good trade-off between precision and performance.
In order to reduce the number of φ-functions inside a flow graph, the pruned SSA form employs liveness analysis to determine which variables are still alive
Figure 3.3: Non-local variable
at a given merge point. To improve performance, the semi-pruned SSA form replaces liveness analysis with the concept of non-locals. A non-local is a variable that is used inside a basic block but defined elsewhere, that is, a variable that first appeared in a different basic block (see Figure 3.3). It must be noted that the concept of non-locals is an under-approximation of a full-blown liveness analysis; thus the semi-pruned form is still subject to the presence of φ-functions that are not strictly needed. The algorithm proposed by Briggs et al. [2] is the following:
non-locals ← ∅
for each block B do
  killed ← ∅
  for each instruction z ← x op y in B do
    if x ∉ killed then
      non-locals ← non-locals ∪ {x}
    end if
    if y ∉ killed then
      non-locals ← non-locals ∪ {y}
    end if
    killed ← killed ∪ {z}
  end for
end for
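A direct transcription of the scan above (with a hypothetical instruction encoding) makes the intent clear: a variable used in a block before any local definition of it must have been defined in another block, so it is recorded as a non-local:

```python
# A simplified transcription of Briggs' non-locals scan. Each instruction is
# encoded as a tuple (dst, src1, src2, ...): dst is defined, the srcs are used.

def non_locals(blocks):
    result = set()
    for block in blocks:
        killed = set()                 # variables defined so far in this block
        for dst, *srcs in block:
            for s in srcs:
                if s not in killed:    # used before any local definition
                    result.add(s)
            killed.add(dst)
    return result

# Two blocks: "a" is never defined locally before use, "b" crosses a block
# boundary, and "d" is used in its block before its local definition.
blocks = [
    [("b", "a")],                      # b <- a
    [("c", "b", "d"), ("d", "c")],     # c <- b op d ; d <- c
]
print(sorted(non_locals(blocks)))      # prints ['a', 'b', 'd']
```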
In our implementation the algorithm maintains three pre-computed data structures: a list of addresses where the φ-functions are to be inserted, and two hashmaps that keep track, respectively, of all the previously created variables and of the next variable name to be assigned. The first data structure is created by calculating the iterated dominance frontier of every live variable in the flow graph. The rest of the algorithm works by recursively walking the dominator tree and renaming variables in the original graph, so that when a new assignment or a φ-function is found, a variable with a new name is created and the results are propagated to the children in the tree. The pseudo-code as adapted in our implementation is the following:
for each variable v do
  Let A(v) be the set of blocks containing an assignment to v
  Place a φ-function for v in the iterated dominance frontier of A(v)
end for
for each variable v do
  Counters[v] ← 0
  Stack[v] ← ∅
end for
Let start be the root node of the dominator tree, RENAME(start)

RENAME(block):
for each φ-function v ← φ(...) in block do
  i ← Counters[v]
  Replace v with v_i in the new graph
  Stack[v].push(i)
  Counters[v] ← i + 1
end for
for each instruction v ← x op y in block do
  i ← Stack[x].first(), c ← Stack[y].first()
  Replace x with x_i and y with y_c in the new graph
  i ← Counters[v]
  Replace v with v_i in the new graph
  Stack[v].push(i)
  Counters[v] ← i + 1
end for
for each successor s of block do
  j ← the position of block in the predecessors array of s (this is just a convention)
  for each φ-function p in s do
    v ← j-th operand of p
    Replace v with v_i where i ← Stack[v].first()
  end for
end for
for each child c of block in the dominator tree do
  RENAME(c)
end for
for each instruction v ← x op y or v ← φ(...) in block do
  Stack[v].pop()
end for
An example of REIL code translated into SSA form is shown in Figure 3.4.
Chapter 4
Analysis stage
4.1 Pointer analysis
In our work we implemented both intraprocedural and interprocedural pointer analysis in order to track object aliases and thus be able to reason about possible dangling pointer conditions. The intraprocedural analysis is performed on top of MonoREIL [6], an abstract interpretation framework based on REIL. In the following two subsections we briefly describe the main features of our analysis and of MonoREIL. Later on we will focus on the intraprocedural pointer analysis algorithm. In the third section the interprocedural algorithm will be explained.
4.1.1 Analysis features
Dataflow analysis and abstract interpretation algorithms have a number of properties that characterize them; among those, the three most relevant are flow, context and path sensitivity. An algorithm is said to be path-sensitive if it computes different pieces of analysis information depending on the predicates at conditional branch instructions. The intraprocedural algorithm used in our work merges the results of the analysis at the function merge-points, which effectively results in a path-insensitive algorithm: we are not able to discern the code paths that lead to the presence of a given alias.
Moreover, our algorithm is flow-insensitive, since during the analysis we do not track code locations. That is, the analysis will not be able to say after which statement a given variable became an alias of another one.
The main problem deriving from the path and flow insensitivity of our algorithm is the increased number of false positives that can appear in the analysis, since we are not able to gauge whether a specific path leading to a stale pointer condition is feasible. Nonetheless, the performance gains obtained by this implementation of the algorithm significantly outweigh the increase in the number of false positives.
Moreover, a number of empirical studies [9] [10] [11] [12] have shown that the improvement offered by flow-sensitivity is minimal in terms of precision.
Our interprocedural algorithm works by merging trees generated in each function; therefore the flow-insensitivity of the intraprocedural analysis and the nature of the merging we perform make it both flow and path insensitive. The same considerations made for the intraprocedural analysis on performance gain and precision loss apply to the interprocedural part of our analysis.
The algorithm performs the analysis on the procedural call graph (PCG) of the
binary. The PCG makes it possible to discern function parameters and calling locations, that is, every edge in the PCG is marked with the parameters passed to a given function. This property guarantees that our analysis is context-sensitive.
Context-sensitivity is crucial: the ability to discern the function parameters of each call prevents ambiguity and imprecision in tracking aliases between functions.
Another problem of pointer analysis is dealing with data structures, which can make it difficult to track aliases. In order to deal with this nuance we resorted to two strategies.
The first one consists of tracking the size of objects whenever possible, that is, when there is no need to perform range analysis; this way we are able to recognize whether a given heap location belongs to a specific object, and therefore we are able to properly track aliases for it.
The second strategy is to model widely used data structures, such as linked lists, vectors and similar ones, in order to be able to track the objects stored in them. It must be noted that not all data structures are covered, so some aliases may be missed by our analysis.
These two strategies allow us to completely avoid heap modeling, thus greatly simplifying the analysis.
4.1.2 MonoREIL
MonoREIL is an abstract interpretation framework that performs fixed-point iteration until a final state is reached. MonoREIL operates on the control flow graph of a function, which can be walked arbitrarily depending on the analysis to be performed. The definitions of a lattice, its elements and a formula that combines the elements are necessary for the framework to work. Every analysis is supposed to start with an initial state, which can be arbitrary. Finally, the effects of REIL instructions on the lattice need to be modelled.
To guarantee the termination of the analysis, the lattice has to satisfy the ascending chain condition, that is, the lattice has to be a noetherian lattice. If the condition is violated it is not possible to guarantee that there exist two states in the analysis, p_{n-1} and p_n, such that p_{n-1} = p_n.
In the following section we show that our analysis satisfies this requirement and is therefore always guaranteed to terminate.
4.2 Intraprocedural analysis
Alias set analysis is a well-known variation of pointer analysis that grants a higher degree of precision while at the same time avoiding performance bottlenecks. Intuitively, an alias set is the set of all local pointer variables that point to a given object. The strength of the analysis lies in the fact that whenever there is some degree of uncertainty about whether a given variable x points to a concrete object, instead of creating a may- or must-point-to set, it creates two alias sets, only one of which contains x.
Our analysis computes the alias sets for each function in the binary so that they can later be combined in order to reason about the existence of dangling pointers
by propagating live aliases between functions along the binary call graph.
We have adapted the algorithm proposed in [3] to fit our purposes and scope. It can be proved that our analysis reaches the fixed point because our transfer functions are distributive: the fixed point computed for the alias set dataflow algorithm corresponds to the merge-over-all-paths dataflow value of our algorithm [7].
In order to analyze the functions we further simplify our intermediate language so that it can be expressed by means of a very simple grammar:
s ::= v1 ← v2 | v ← h | h ← v | v ← null | v ← new
where h represents any heap location, null represents a null pointer and new represents a newly created object.
To simplify REIL code so that it can be expressed with the above grammar, we created transformation functions for every REIL instruction in our MonoREIL algorithm. Table 4.1 shows the transformations we apply to REIL instructions.
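As a sketch of how transfer functions over this reduced grammar might look (hypothetical encoding, not our MonoREIL implementation), three of the productions can be modelled as follows; note how the unknown heap load splits the state into one set with the destination variable and one without, as described earlier:

```python
# Hedged sketch of alias-set transfer functions for the reduced grammar above.
# Each alias set collects the variables pointing to one abstract object:
# `v <- new` opens a fresh set, `v1 <- v2` moves v1 into v2's set, and a load
# from an unknown heap location keeps both possibilities (with and without v).

def transfer(alias_sets, stmt):
    kind, dst, src = stmt
    # any assignment first kills dst's old membership
    sets = [s - {dst} for s in alias_sets]
    if kind == "new":                    # v <- new
        sets.append({dst})
    elif kind == "copy":                 # v1 <- v2
        for s in sets:
            if src in s:
                s.add(dst)
    elif kind == "load":                 # v <- h, h an unknown heap location
        sets = sets + [s | {dst} for s in sets]
    return [s for s in sets if s]        # drop empty sets

state = []
state = transfer(state, ("new", "p", None))    # p <- new
state = transfer(state, ("copy", "q", "p"))    # q <- p
print(state)
```

After the two statements, p and q sit in the same alias set, so freeing the object through either variable makes the other stale.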
It must be noted that we consider an object to be newly created only when it is the return value of either a constructor or an allocation function. Both constructors and allocation functions, although partially recognized in an automated fashion by our software, need to be manually indicated by the user.
We treat code blocks differently depending on whether we are dealing with a simple assignment or a φ-function. In the former case we first merge all the influencing states for a given node, performing a union on the sets, and then we apply the equations shown in Figure 4.2.
We create a new alias set for every newly allocated object; we then store in the appropriate alias set all the variables that alias one of the object's aliases; finally, whenever a heap location not previously known is found, we create two alias sets, one with the location and another one without it.
In the latter case we can instead rely on the fact that φ-functions are found at merge points in the control flow graph of the function, that is, where we need to combine one or more incoming states in our lattice. In our analysis the lattice is the set of all alias sets. Figure 4.3 shows the combine function for merge points.
Each state is first pruned, that is, we remove all the aliases that do not exist in the set of variables of the node's strict dominators. Once the alias set has been pruned it is then updated by adding all destination variables whose values are being assigned from variables already in the alias set.
We defined the elements of the lattice so that each element is a set of linked lists. To each lattice element corresponds an object to which the variables in the linked list alias.
The reason for choosing sets of linked lists over other data structures is the performance gain: it can be proved that the analysis, carried out on an SSA-form graph, makes it possible to operate only on the head of each list, thus saving look-up time. Nonetheless, as a further optimization, when the analysis for a given function is complete we transform each alias set into a tree-like structure that makes it easier to perform the interprocedural analysis discussed in the following section.
A sample run of the algorithm can be seen in Figure 4.4.
4.3 Interprocedural analysis
At the end of the intraprocedural alias analysis, the resulting alias lists of each
function are used to construct points-to tree structures that make the alias
relationships between variables explicit. In such a points-to tree, each node
represents a distinct variable and its children the variables pointing to it, so
that siblings are equivalent aliases.
For each function we extract its parameters and its return value. Given that
information, the interprocedural analysis algorithm walks down the procedure
call graph, updating a set of points-to trees for the object that needs to be
tracked, until the final state of the analysis is reached. We propose an
implementation of our algorithm on top of BinNavi.
The interprocedural analysis, as opposed to the intraprocedural one, is run on
a Procedure Call Graph (PCG). A PCG is identical to a call graph, except that
it has one edge per call site, and every edge is labelled with the variables of
the source node that act as parameters in the target node (see Figure 4.5).
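The PCG can be sketched as an edge-labelled multigraph. The encoding below is our own, not BinNavi's, and the function and variable names are made up:

```python
# One edge per call site, labelled with the caller variables that act as
# parameters in the callee.
pcg = [
    ("f1", "f2", ("a",)),   # f1 calls f2(a)
    ("f1", "f3", ("b",)),   # f1 calls f3(b)
    ("f2", "f4", ("x",)),   # first call site from f2 to f4
    ("f2", "f4", ("y",)),   # second call site: a separate, distinct edge
]
edges_to_f4 = [e for e in pcg if e[1] == "f4"]
```

Unlike a plain call graph, the two f2-to-f4 call sites stay distinct, so the analysis can propagate a different parameter binding along each edge.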
At each iteration, the algorithm connects the points-to trees containing the
incoming parameters to the points-to trees of the previous iteration on the
graph. These are updated by connecting the trees containing the formal
parameters of the current function to the nodes corresponding to the incoming
parameters, as indicated in the edge label. If the node corresponding to the
formal parameter is the root node of a points-to tree, then the tree is appended
to the node corresponding to the incoming parameter and the result is added
to the newly generated set of trees. Alternatively, if it is not a root node, the
points-to tree containing it is added to the new state set, with the node
replaced by the incoming parameter. The points-to tree containing the
incoming parameter is also copied into the new set of trees, and the sub-tree
whose root is the node containing the incoming parameter is detached from
that copy of the tree (see Figures 4.6, 4.7, 4.8 and 4.9). When a merge point
is met, the resulting sets of trees are computed separately, one for each
incoming edge, and a set union is performed, so that duplicate trees are
removed. Moreover, sub-trees of trees already contained in the set are
removed, and additional trees can be removed from the set to further reduce
space requirements (see Figure 4.10).
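The root-node case above can be sketched with trees encoded as parent maps. The encoding is our own, chosen only to make the re-parenting step concrete:

```python
# Trees as {variable: parent}; the root of a tree has parent None.

def append_tree(caller, callee, attach_at):
    merged = dict(caller)
    for var, parent in callee.items():
        # the callee root (the formal parameter) is re-parented under the
        # caller node corresponding to the incoming parameter
        merged[var] = attach_at if parent is None else parent
    return merged

caller = {"a": None, "p": "a"}      # p points to a
callee = {"f": None, "q": "f"}      # q points to the formal parameter f
merged = append_tree(caller, callee, "a")
# merged now hangs f (and, below it, q) under a
```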
Since the algorithm will not walk more than once over a node not involved in
a cycle, it is safe to remove, from that node's set of points-to trees, the trees
that contain no aliases of the tracked object. Similarly, the points-to trees of
the previous step are extended with the points-to trees containing the
returned variable.
Once the fixed point of the iteration has been reached, the resulting set is an
interprocedural set of points-to trees containing all of the aliases of the object
being tracked.
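Abstractly, the fixed-point iteration follows the usual monotone-framework scheme (cf. MonoREIL): keep applying the transfer until the state stops changing. A toy sketch, with a made-up transfer on a finite set:

```python
def fixpoint(state, step):
    nxt = step(state)
    while nxt != state:
        state, nxt = nxt, step(nxt)
    return state

# toy monotone transfer on a finite lattice: grow the set until it saturates
result = fixpoint(frozenset({0}),
                  lambda s: frozenset(s | {x + 1 for x in s if x < 3}))
# result == frozenset({0, 1, 2, 3})
```

Termination follows because the transfer is monotone and the lattice is finite, the same argument that applies to the points-to tree sets above.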
4.4 C++ peculiarities
In dealing with C++ we had to take into account a number of characteristics
specific to the language. One recurring trait of complex applications written
in C++ is the use of smart pointer interfaces, that is, C++ classes that
provide memory safety by managing object lifetimes. In order to deal with
smart pointers we require the user to specify which functions shall be
considered the constructor and destructor of the object to be analyzed, and
whether there are multiple constructors or destructors for an object. To
improve the precision of our analysis we used the well-known techniques
explained in [8] to identify constructors and destructors of objects in the
binary whenever possible. These requirements are necessary to keep the
analysis as application independent as possible, without constraining our
work to one specific kind of smart pointer or memory management
architecture.
User interaction is also needed to handle custom allocators: the user is asked
to specify whether or not the allocation and deallocation functions identified
by our tool are the correct ones.
Arithmetic instructions     Operation
ADD x1, x2, y               y is added to the alias set of x1 + x2
SUB x1, x2, y               y is added to the alias set of x1 − x2
MUL x1, x2, y               y is added to the alias set of x1 · x2
DIV x1, x2, y               y is added to the alias set of ⌊x1 / x2⌋
MOD x1, x2, y               y is added to the alias set of x1 mod x2
BSH x1, x2, y               y is added to the alias set of x1 · 2^x2 if x2 ≥ 0,
                            of ⌊x1 / 2^(−x2)⌋ if x2 < 0

Bitwise instructions        Operation
AND x1, x2, y               y is added to the alias set of x1 & x2
OR x1, x2, y                y is added to the alias set of x1 | x2
XOR x1, x2, y               y is added to the alias set of x1 ⊕ x2

Logical instructions        Operation
BISZ x1, ∅, y               y is removed from all alias sets
JCC x1, ∅, y                does not affect alias sets

Data transfer instructions  Operation
LDM x1, ∅, y                y is added to the alias set of mem[x1]
STM x1, ∅, y                mem[y] is added to the alias set of x1
STR x1, ∅, y                y is added to the alias set of x1

Other instructions          Operation
NOP ∅, ∅, ∅                 does not affect alias sets
UNDEF ∅, ∅, y               y is removed from all alias sets
UNKN ∅, ∅, ∅                does not affect alias sets

Figure 4.1: REIL instruction transformations
[[s]]_gen ≜ {{v}}   if s = (v ← new)
            ∅       otherwise

[[s]]_a(a) ≜ {a ∪ {v1}}    if s = (v1 ← v2) ∧ v2 ∈ a
             {a, a ∪ {v}}  if s = (v ← v_h)
             {a}           otherwise

[[s]]_l(l) ≜ [[s]]_gen ∪ ⋃_{a ∈ l} [[s]]_a(a)

Figure 4.2: Transfer functions for common instructions

[[φ]]_a(a, pred) ≜ {(a ∩ vars(sdom(φ))) ∪ {y_i : (y_i ← x_i) ∈ livevars(φ, pred) ∧ x_i ∈ a}}

[[φ]]_l(l, pred) ≜ ⋃_{a ∈ l} [[φ]]_a(a, pred)

Figure 4.3: Transfer functions for φ-node instructions
Figure 4.4: Intraprocedural analysis example
Figure 4.5: PCG used in our algorithm
Figure 4.6: Computing f1() to f2() alias trees
Figure 4.7: Computing f1() to f3() alias trees
Figure 4.8: Computing f2() to f4() alias trees through the leftmost edge
Figure 4.9: Computing f2() to f4() alias trees through the rightmost edge
Figure 4.10: Effects of combine() on function alias trees
Chapter 5
Stale pointers detection
Detecting use-after-free conditions means verifying whether there is any code
path in which an object alias is used after the object itself has been freed. In
order to reason about this condition we first prune the call graph of the binary
(see Figure 5.1), so that only functions that use aliases of the object we are
interested in, or that are linked to a function using an alias, are preserved.
This can be done trivially by walking the call graph and eliminating all the
functions that are neither successors nor predecessors of the procedures
where an object alias appears (see Figure 5.2 and Figure 5.3).
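The pruning step can be sketched as two reachability passes over the call graph, forward and backward from the functions that use an alias. This is an illustrative model, not the BinNavi implementation:

```python
def prune(edges, uses_alias):
    succ, pred = {}, {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
        pred.setdefault(b, set()).add(a)
    keep = set(uses_alias)
    for adj in (succ, pred):            # successors, then predecessors
        work, seen = list(uses_alias), set(uses_alias)
        while work:
            f = work.pop()
            for g in adj.get(f, ()):
                if g not in seen:
                    seen.add(g)
                    work.append(g)
        keep |= seen
    return keep

edges = [("main", "f"), ("f", "g"), ("main", "h")]
kept = prune(edges, {"g"})   # h is neither predecessor nor successor of g
```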
Figure 5.1: Example of callgraph
Figure 5.2: Callgraph with relevant functions in red
Figure 5.3: Pruned callgraph
Finally, we mark calls to the destructors on the pruned callgraph (see
Figure 5.4).
The rest of the algorithm walks the cross references to the object destructors
backwards, that is, it computes all functions that at some program point
invalidate the concrete object. We call these functions destructor aliases.
Figure 5.4: Pruned callgraph ready for bug detection step
For each function that calls a destructor alias we verify whether the concrete
object itself, or one of its aliases, is used. To do so we build the dominator
tree of the function flow graph and verify the conditions shown in Figure 5.5.
We assume the following notation: B is a generic basic block, F is the basic
block that either calls the destructor or destroys the concrete object, and v is
an object alias. dom(B) denotes the basic blocks dominated by node B,
v ∈ B is the relation representing the use of variable v in basic block B, and
succ(B) denotes the successors of node B.
Type of warning                                 Condition
v is a stale pointer                            v ∈ B ∧ B ∈ dom(F)
v may be a stale pointer                        v ∈ B ∧ B ∉ dom(F) ∧ B ∈ succ(F)
v might be a memory leak                        v ∈ B ∧ F ∉ dom(B) ∧ F ∈ succ(B)
v is a memory leak                              v ∈ B ∧ F ∉ dom(B) ∧ F ∉ succ(B)
v is neither a stale pointer nor a memory leak  otherwise

Figure 5.5: Alias verification equations
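The conditions of Figure 5.5 can be sketched as a direct check over precomputed dominance and successor sets. This is an illustrative model of our own (dom(X) is the set of blocks dominated by X, succ(X) its successors), not the tool's code:

```python
def classify(B, F, dom, succ):
    """Classify the use of alias v in block B against the free site F."""
    if B in dom[F]:
        return "stale pointer"
    if B in succ[F]:
        return "may be a stale pointer"
    if F not in dom[B] and F in succ[B]:
        return "might be a memory leak"
    if F not in dom[B] and F not in succ[B]:
        return "memory leak"
    return "neither"

dom = {"F": {"B1"}, "B1": set(), "B2": set()}
succ = {"F": {"B1"}, "B1": set(), "B2": set()}
use_after = classify("B1", "F", dom, succ)    # use dominated by the free
never_freed = classify("B2", "F", dom, succ)  # free never follows the use
```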
Chapter 6
Results and future work
In this paper we have targeted a widely known cause of security flaws. We
have shown that it is feasible to collect enough data, in terms of alias sets, on
a C++ binary to discover stale pointer bugs at the interprocedural level.
We have implemented our work on top of BinNavi, using REIL as the
intermediate language and MonoREIL as the monotone solver framework for
our algorithms. Our approach of verifying only one type of object at a time
allowed us to drastically reduce the execution time and the number of false
positives to analyze; nonetheless, we realize that this approach is suboptimal
for scenarios in which a developer has to fix bugs in his own software, because
in that case the analysis would need to be run multiple times.
From our test results, on a set of samples we built, it is clear that the prime
cause of false positives is the lack of flow sensitivity in our analysis. One of
the primary goals of future work in this direction is to use an SMT solver to
verify path feasibility.
The principal source of false negatives in our analysis is the heavy presence
of function pointers and complex data structures in C++ code; in those cases
we were not able to obtain enough information either on the alias sets or on
the relationships between functions. Some techniques exist to deal with these
problems; we did not implement them because their results are far from
satisfying and they could dramatically increase the number of false positives
in our analysis.
Finally, we plan on augmenting our analysis by increasing the number of data
structures handled by our algorithm and by performing range analysis in
order to trace a higher number of aliases.
Bibliography
[1] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and
F. Kenneth Zadeck: "Efficiently computing static single assignment form
and the control dependence graph." ACM Transactions on Programming
Languages and Systems, 13(4):451-490, Oct 1991.
[2] Preston Briggs, Keith D. Cooper, Timothy J. Harvey, and L. Taylor
Simpson: "Practical improvements to the construction and destruction
of static single assignment form." Software-Practice and Experience,
28(8):859-881, Jul 1998.
[3] Nomair A. Naeem, and Ondrej Lhotak: "Efficient Alias Set Analysis Using
SSA Form." International Symposium on Memory Management - ISMM,
pp. 79-88, 2009.
[4] Xiaodong Ma, Ji Wang, and Wei Dong: "Computing Must and May Alias
to Detect Null Pointer Dereference." Leveraging Applications of Formal
Methods - ISOLA, pp. 252-261, 2008.
[5] Sean Heelan: "Finding use-after-free bugs with static analysis."
[6] Thomas Dullien, and Sebastian Porst: "REIL: A platform-independent
intermediate representation of disassembled code for static code analysis."
CanSecWest 2009.
[7] J. B. Kam and J. D. Ullman: "Monotone data flow analysis frameworks."
Acta Informatica, 1977.
[8] Paul Vincent Sabanal, and Mark Vincent Yason: "Reversing C++." Black
Hat DC 2007.
[9] Michael Hind: "Pointer Analysis: Haven't We Solved The Problem Yet?"
ACM Transactions on Programming Languages and Systems, June 2001.
[10] M. Hind, M. Burke, P. Carini and J.-D. Choi: "Interprocedural pointer
alias analysis." ACM Transactions on Programming Languages and
Systems, Apr. 1993.
[11] M. Hind and A. Pioli: "Which Pointer Analysis Should I Use?"
International Symposium on Software Testing and Analysis, Aug. 2000.
[12] M. Hind and A. Pioli: "Evaluating The Effectiveness of Pointer Alias
Analysis." Science of Computer Programming, Jan. 2001.