Stale pointers are the new black - white paper
POLITECNICO DI MILANO
Faculty of Information Engineering
Degree Course in Computer Engineering
Department of Electronics and Information

Detecting aliased stale pointers via static analysis: An architecture-independent practical application of pointer analysis and graph theory to find bugs in binary code

Advisor: Prof. Stefano Zanero
Co-advisor: Ing. Federico Maggi

A thesis by:
Giovanni Gola, student ID 717847
Vincenzo Iozzo, student ID 713583

Academic Year 2009-2010
Acknowledgements
The authors would like to thank Thomas Dullien, Julien Vanegue, Ralf-Philipp Weinmann and Tim Kornau for their suggestions and help while researching the topic.
The authors would also like to thank their thesis advisors at Politecnico di Milano, Stefano Zanero and Federico Maggi.
Finally, we want to thank all the people who reviewed this work.
Chapter 1
Introduction
In the era of cloud computing and Internet-connected devices, attacks on such devices are becoming increasingly profitable. Nowadays web clients, such as PCs, mobile phones, UMPCs, tablets and netbooks, are a precious source of information. Although built on diverse architectures, including but not limited to the x86, x86-64, ARM and PowerPC families, they necessarily share a base set of network-oriented software, first of all web browsers. Applications like Firefox, Safari, Internet Explorer and Google Chrome, along with complementary software including JavaScript engines, Adobe Flash, Adobe AIR, Adobe Reader and the Java Runtime Environment, come pre-installed in almost every OS. Given the size of these applications, developers are bound to adopt custom, complex memory management systems built on top of the native memory allocators. Notable examples are NS_Alloc()/NS_Free() in the XPCOM library, PR_NEW/PR_DELETE in the NSPR library, JS_malloc()/JS_free() in the SpiderMonkey JavaScript engine, and TCMalloc in the WebKit and V8 JavaScript engines.
The complexity and heterogeneity of memory management in large code bases result in numerous memory corruption vulnerabilities, such as uninitialized memory, double frees and dangling pointers. The latter are an insidious type of flaw, also known as "use-after-free" or "stale pointers", which occurs when pointer variables reference freed memory areas. A quick search of the Common Vulnerabilities and Exposures (CVE) list yields 162 reports for web browsers and more than 470 results among the three most popular browser add-ons, i.e., Adobe Flash, Adobe Reader and the Java Runtime Environment (between 2006 and 2010).
The astonishing number of use-after-free bugs reported in recent years makes them one of the most appealing attack vectors on client systems. Use-after-free bugs are scattered throughout the code base, and finding them and other memory corruption bugs is arguably more difficult than finding overflows: they are temporal memory errors and, as such, detecting them effectively means understanding how a custom memory management system works internally, in order to identify logical pitfalls. The only effective solution consists in manual code-review efforts, in an attempt to recognize exceptions to the rules that developers were supposed to follow while writing code. This process is clearly cumbersome. A vast variety of approaches has been proposed to automatically spot memory flaws. In particular, an analyst can face the problem of finding dangling pointers with dynamic analysis techniques, such as fuzzing, or by means of static analysis. Even though fuzzing has proved to be an effective method for finding such bugs, it suffers from several intrinsic limitations. For example, relying on input to exercise code paths has the disadvantage of scarce coverage of the application code and limits the depth of exercised code paths, leaving part of the code unexplored. Moreover, the random nature of the generated input makes the running time virtually infinite, and totally unrelated to the code coverage the analysis reaches over time. We propose a practical approach to automatically find use-after-free conditions in large binary code bases using static analysis and graph theory.
Chapter 2
Static Analysis
2.1 General knowledge
Static analysis is the automated process of extracting semantic information about a program without executing it. Not needing to execute the binary, static analysis offers a number of potential advantages over its dynamic counterpart:
• Architecture independence: the analysis, even if it is specific to an instruction set or language, can be implemented on top of any framework and run on any machine;
• Running time: the time complexity of a static analysis algorithm may range from linear to exponential, but it will process the entire binary in finite time;
• Code coverage and depth of analysis: not relying on input to exercise code paths, static analysis can achieve total code coverage. This fact makes it particularly useful in analyzing large applications with complex path-triggering conditions.
Although the static analysis problem has been proven to be theoretically undecidable, it can be formally reformulated as an over-approximation of the original problem, which can be proven to halt in finite time. In fact, the term "static analysis" is over-broad: it includes various implementation techniques that make static analysis feasible. Model checking was, chronologically, the first technique to appear. It arose in an attempt to solve the so-called concurrent program verification problem. A model checker checks the correctness of a formula expressed in temporal logic, and is able to effectively uncover hard-to-find concurrency errors. As stated in Chapter 1, use-after-free bugs are temporal memory errors and, as such, can be expressed in terms of temporal logic: a program point in which a pointer gets dereferenced is reached after a program point in which the same pointer gets freed. Model checking turns out to be very precise in unveiling temporal memory errors, but it has non-negligible faults. The major defect is the need for source code. The model checker simply tries to resolve the system of pre- and post-conditions of each function. In order to do that, the pre- and post-conditions have to be previously specified by annotating the source code of the program to analyze. The annotation process is also extremely time consuming. Moreover, in the general case, the complexity of model checking algorithms is exponential in time. Another very important method for static analysis is so-called abstract interpretation. When running an algorithm based on abstract interpretation,
out_b = trans_b(in_b)
in_b = join_{p ∈ pred(b)} (out_p)
Figure 2.1: Data-flow equations
the program semantics is over-approximated as a set of monotonic functions that the algorithm uses to transform ordered sets, which hold the results of the analysis. It can be viewed as a partial execution of the program that tracks only part of the information about its semantics, without performing all the calculations. The constraints of monotonicity and order ensure that the analysis halts in finite time. This is a well-known technique, mainly used in compilers for optimization tasks, and in debugging. Every analysis that takes advantage of the abstract interpretation pattern to gather information about the possible set of values (of registers, variables, memory locations, etc.) computed at a given point in a program belongs to the "data-flow" family of analyses. In particular, a data-flow analysis algorithm usually walks the control flow graph (CFG) of a program. At each program point (instruction, assignment, basic block and so forth), the algorithm applies data-flow equations (see Figure 2.1) to the state (ordered set) associated with it. The analysis is repeated on every node until the sets stabilize. As previously stated, the data-flow equations must be monotonic and the sets must carry an order relation, at least a partial one. If these conditions hold, the repeated analysis will reach the so-called fixpoint and the algorithm can stop.
2.2 Pointer Analysis
We will now focus on data-flow analysis of pointer values, precisely named pointer analysis. As one may immediately notice, pointer analysis fits our need of tracking the values assigned to pointers so as to later check for use-after-free conditions. There are several dimensions that affect the cost/precision trade-offs of a pointer analysis, and how an analysis addresses each of these dimensions helps to categorize it. The dimensions we are going to consider are:
• Scope: a static analysis algorithm can either be engineered to perform the analysis within a single function only, or offer the possibility of extending the analysis to cover multiple functions. The former goes under the name of intraprocedural pointer analysis, the latter is interprocedural analysis;
• Flow-sensitivity: a flow-sensitive analysis takes into account the order of statements in a program, and therefore can compute a solution for each program point, whereas a flow-insensitive analysis computes a single solution for either the whole program or each procedure. The immediate consequence is a higher degree of precision for flow-sensitive approaches. On the other hand, flow-insensitivity shows much higher scalability in terms of both time and space, thus proving a better choice for analyzing very large programs;
• Context-sensitivity: in context-sensitive analyses the calling context of a function is considered. This means that parameters passed on different calls of the same function can be distinguished and properly returned to the actual caller. Context-sensitivity offers a higher degree of precision and, if properly implemented, only mildly impacts speed;
• Heap modeling: an accurate pointer analysis should rely on a representation of the entire heap space. This is a non-trivial issue, and constitutes a static analysis branch by itself, going under the name of shape analysis. Even though shape analysis is undergoing heavy research, it shows extremely limited scalability when analyzing real-life programs;
• Aggregate modeling: a very important factor affecting the precision of pointer analysis is how the elements of aggregates are distinguished. An extremely precise modeling, in which every single object can be distinguished, could be achieved by running a full-blown shape analysis; as just stated, though, shape analysis is not a feasible solution for our purposes. A very fast but imprecise model could collapse the elements of an aggregate into one object. This would introduce excessive noise in the analysis and would lead to a situation in which no heap object is discernible from another;
• Alias representation: indicates whether alias pairs or points-to pairs are maintained during the analysis. Alias pairs represent alias relations explicitly, whereas points-to data is a more compact representation.
A lot of research has been done on pointer analysis in the last twenty-five years. Nowadays moderately precise intraprocedural analyses are commonly implemented in almost every compiler, whereas interprocedural algorithms are still at the research stage. Figure 2.2 shows a summary of the interprocedural analyses proposed so far.
Figure 2.2: Pointer analyses
Each having its own pros and cons, they all share a few limitations that make them unsuitable for our purposes. Their major fault is the need for source code, which, from an analyst's point of view, is usually practically impossible to retrieve. Moreover, they are often built to analyze source code translated into a sub-language of the original language. The latter limitation also affects the few interprocedural analyses proposed to work at the assembly level, like the one by Naeem et al. [3]. The need to reduce a real assembly language, with hundreds of instructions, to a really narrow sub-language proves to be practically impossible in real scenarios.
2.3 Conclusions and contributions
Intraprocedural analysis, in terms of efficiency and scalability, is reliable enough to be implemented with minor modifications that make it able to deal with more expressive assembly languages. On the other hand, interprocedural analysis at the assembly level is still in an alpha stage of development. Therefore we propose a new tree-based, context-sensitive interprocedural analysis targeting assembly languages.
Chapter 3
Preprocessing stage
3.1 The REIL intermediate language
The Reverse Engineering Intermediate Language (REIL) [6] is a platform-independent intermediate language that aims to simplify static code analysis algorithms. It allows various specific assembly languages to be abstracted, so as to facilitate cross-platform analysis of disassembled binary code. REIL performs a simple one-to-many mapping of native CPU instructions to sequences of simple atomic instructions. Memory access is explicit. Every instruction has exactly one effect on the program state. This contrasts sharply with native assembly instruction sets, where the exact behaviour of instructions is often influenced by CPU flags or other pre-conditions.
All instructions use a three-operand format. For instructions where some of the three operands are not used, place-holder operands of a special type
called ε are used where necessary. Each of the 17 different REIL instructions has exactly one mnemonic that specifies the effects of the instruction on the program state.
The REIL VM
To define the runtime semantics of the REIL language it is necessary to specify a virtual machine (REIL VM) that determines how REIL instructions behave when interacting with memory or registers.
REIL register names follow the convention t-number, like t0, t1, t2. The actual size of these registers is specified upon use rather than defined a priori (in practice only register sizes between 1 byte and 16 bytes have been used). Registers of the original CPU can be used interchangeably with REIL registers.
The REIL VM uses a flat memory model without alignment constraints.
The endianness of REIL memory accesses equals the endianness of memory
accesses of the source platform.
REIL instructions
REIL instructions can loosely be grouped into five different categories according to the type of the instruction (see Figure 3.1).
Arithmetic and bitwise instructions take two input operands and one output operand. Input operands are either integer literals or registers; the output operand is a register. None of the operands have any size restrictions. However, arithmetic and bitwise operations can impose a minimum output operand size
Arithmetic instructions    Operation
ADD x1, x2, y              y = x1 + x2
SUB x1, x2, y              y = x1 - x2
MUL x1, x2, y              y = x1 · x2
DIV x1, x2, y              y = ⌊x1 / x2⌋
MOD x1, x2, y              y = x1 mod x2
BSH x1, x2, y              y = x1 · 2^x2 if x2 ≥ 0, y = ⌊x1 / 2^(-x2)⌋ if x2 < 0

Bitwise instructions       Operation
AND x1, x2, y              y = x1 & x2
OR x1, x2, y               y = x1 | x2
XOR x1, x2, y              y = x1 ⊕ x2

Logical instructions       Operation
BISZ x1, ε, y              y = 1 if x1 = 0, y = 0 if x1 ≠ 0
JCC x1, ε, y               transfer control flow to y iff x1 ≠ 0

Data transfer instructions Operation
LDM x1, ε, y               y = mem[x1]
STM x1, ε, y               mem[y] = x1
STR x1, ε, y               y = x1

Other instructions         Operation
NOP ε, ε, ε                no operation
UNDEF ε, ε, y              undefined instruction
UNKN ε, ε, ε               unknown instruction

Figure 3.1: List of REIL instructions
or a maximum output operand size relative to the sizes of the input operands.
Note that certain native instructions, such as FPU instructions and multimedia instruction set extensions, cannot be translated to REIL code yet. Another limitation is that some instructions which are close to the underlying hardware, such as privileged instructions, cannot be translated to REIL; similarly, exceptions are not handled. All of these cases require an explicit and accurate modelling of the respective hardware features.
An example of a function translated from x86 assembly language to REIL is shown in Figure 3.2.
Figure 3.2: REIL translation of a function
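To illustrate the one-to-many mapping and the single-effect property, here is a toy interpreter for a handful of REIL-like instructions. The instruction encoding and the translation of x86 `inc eax` below are illustrative guesses for exposition, not the output of the actual REIL translator:

```python
# A toy interpreter for a few REIL instructions (a hypothetical sketch, not
# the actual REIL VM). Each instruction has exactly one effect on the state,
# and an x86 `inc eax` is shown as a one-to-many REIL sequence that makes the
# size truncation and the zero-flag update explicit.

def step(state, insn):
    op, x1, x2, y = insn
    regs, mem = state
    val = lambda x: regs[x] if isinstance(x, str) else x   # register or literal
    if op == "ADD":
        regs[y] = val(x1) + val(x2)
    elif op == "AND":
        regs[y] = val(x1) & val(x2)
    elif op == "STR":
        regs[y] = val(x1)
    elif op == "BISZ":
        regs[y] = 1 if val(x1) == 0 else 0
    elif op == "STM":
        mem[val(y)] = val(x1)
    elif op == "LDM":
        regs[y] = mem[val(x1)]
    return state

# x86 `inc eax`, sketched as a REIL sequence (illustrative translation).
program = [
    ("ADD", "eax", 1, "t0"),
    ("AND", "t0", 0xFFFFFFFF, "t1"),   # truncate the oversized result
    ("STR", "t1", None, "eax"),
    ("BISZ", "t1", None, "ZF"),        # set the zero flag explicitly
]

state = ({"eax": 0xFFFFFFFF}, {})
for insn in program:
    step(state, insn)
print(state[0]["eax"], state[0]["ZF"])
```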
3.2 Static Single Assignment (SSA) Form
3.2.1 Graph theory overview
The algorithm for building SSA form relies on the dominator tree and dominance frontiers in order to identify merge points.
The following notions are required to understand the algorithms for SSA translation and how bug detection works:
• Dominance Relation: in a control flow graph, a node D dominates a node N if every path from the start node to N must pass through D. Notationally, this is written D dom N. By definition every node dominates itself;
• Strict Dominance Relation: a node D strictly dominates a node N if D dominates N and D does not equal N;
• Immediate Dominator: the immediate dominator or idom of a node N is the unique node that strictly dominates N but does not strictly dominate any other node that strictly dominates N. Not all nodes have immediate dominators;
• Dominator Tree: the dominator tree of a graph is a tree in which each node's children are the nodes it immediately dominates;
• Dominance Frontier: the dominance frontier of a node S is the set of all nodes N such that S dominates a predecessor of N but does not strictly dominate N. More intuitively, it is the set of nodes where S's dominance stops;
• Iterated Dominance Frontier: let DF(S) = ∪_{x ∈ S} DF(x) be the dominance frontier of a set of nodes S. The iterated dominance frontier DF⁺(S) is the limit of the increasing sequence DF_1 = DF(S), DF_{i+1} = DF(S ∪ DF_i).
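The definitions above translate directly into code. The sketch below (not the thesis implementation) computes dominator sets by iterative intersection and then dominance frontiers straight from the definition, on a hypothetical diamond-shaped CFG with a loop back-edge:

```python
# A small sketch that computes dominators by iterative set intersection, then
# dominance frontiers from the definition: N is in DF(S) if S dominates a
# predecessor of N but does not strictly dominate N itself.

def dominators(cfg, entry):
    nodes = set(cfg)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    preds = {n: {p for p in nodes if n in cfg[p]} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def dominance_frontier(cfg, dom):
    df = {n: set() for n in cfg}
    for n in cfg:
        for p in cfg:
            if n in cfg[p]:                           # p is a predecessor of n
                for s in dom[p]:                      # s dominates p ...
                    if not (s in dom[n] and s != n):  # ... but not strictly n
                        df[s].add(n)
    return df

# Diamond with a back-edge: A->B, A->C, B->D, C->D, D->B
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["B"]}
dom = dominators(cfg, "A")
print(dominance_frontier(cfg, dom)["B"])
```

On this graph D's frontier contains B because of the back-edge: D dominates a predecessor of B (itself) without dominating B.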
3.2.2 Computing SSA Form
Static Single Assignment (SSA) form is an intermediate representation of a function graph that is very frequently used in compiler optimization. SSA form imposes a naming convention on the function variables such that each variable name corresponds to the value produced at a single definition point. Another advantage of SSA form is the ability to identify merge points inside a function flow graph and mark them with so-called φ-functions.
In our prototype all the function flow graphs inside a binary are translated into SSA form before proceeding with the analysis. There exist three known variants of SSA form translation, differing in the efficiency of the algorithm and in the number of φ-functions present in the resulting graph. We chose to implement a "semi-pruned" SSA form as a good trade-off between precision and performance.
In order to reduce the number of φ-functions inside a flow graph, the pruned SSA form employs liveness analysis to determine which variables are still alive
Figure 3.3: Non-local variable
at a given merge point. To improve performance, the semi-pruned SSA form replaces liveness analysis with the concept of non-locals. A non-local is a variable that is used inside a basic block but defined elsewhere, that is, a variable that first appeared in a different basic block (see Figure 3.3). It must be noted that the concept of non-locals is an under-approximation of a full-blown liveness analysis; thus the semi-pruned form is still subject to the presence of φ-functions that are not strictly needed. The algorithm proposed by Briggs et al. [2] is the following:
non-locals ← ∅
for each block B do
  killed ← ∅
  for each instruction z ← x op y in B do
    if x ∉ killed then
      non-locals ← non-locals ∪ {x}
    end if
    if y ∉ killed then
      non-locals ← non-locals ∪ {y}
    end if
    killed ← killed ∪ {z}
  end for
end for
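A direct transcription of the scan above (with a hypothetical instruction encoding) makes the intent clear: a variable used in a block before any local definition of it must have been defined in another block, so it is recorded as a non-local:

```python
# A simplified transcription of Briggs' non-locals scan. Each instruction is
# encoded as a tuple (dst, src1, src2, ...): dst is defined, the srcs are used.

def non_locals(blocks):
    result = set()
    for block in blocks:
        killed = set()                 # variables defined so far in this block
        for dst, *srcs in block:
            for s in srcs:
                if s not in killed:    # used before any local definition
                    result.add(s)
            killed.add(dst)
    return result

# Two blocks: "a" is never defined locally before use, "b" crosses a block
# boundary, and "d" is used in its block before its local definition.
blocks = [
    [("b", "a")],                      # b <- a
    [("c", "b", "d"), ("d", "c")],     # c <- b op d ; d <- c
]
print(sorted(non_locals(blocks)))      # prints ['a', 'b', 'd']
```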
In our implementation the algorithm maintains three pre-computed data structures: a list of addresses where the φ-functions are to be inserted, and two hashmaps that keep track, respectively, of all the previously created variables and of the next variable name to be assigned. The first data structure is created by calculating the iterated dominance frontier of every live variable in the flow graph. The rest of the algorithm works by recursively walking the dominator tree and renaming variables in the original graph, so that when a new assignment or a φ-function is found, a variable with a new name is created and the results are propagated to the children in the tree. The pseudo-code as adapted in our implementation is the following:
for each variable v do
  Let A(v) be the set of blocks containing an assignment to v
  Place a φ-function for v in the iterated dominance frontier of A(v)
end for
for each variable v do
  Counters[v] ← 0
  Stack[v] ← ∅
end for
Let start be the root node of the dominator tree, RENAME(start)

RENAME(block):
for each φ-function v ← φ(...) in block do
  i ← Counters[v]
  Replace v with v_i in the new graph
  Stack[v].push(i)
  Counters[v] ← i + 1
end for
for each instruction v ← x op y in block do
  i ← Stack[x].first(), c ← Stack[y].first()
  Replace x with x_i and y with y_c in the new graph
  i ← Counters[v]
  Replace v with v_i in the new graph
  Stack[v].push(i)
  Counters[v] ← i + 1
end for
for each successor s of block do
  j ← the position of block in the predecessors array of s (this is just a convention)
  for each φ-function p in s do
    v ← j-th operand of p
    Replace v with v_i where i ← Stack[v].first()
  end for
end for
for each child c of block in the dominator tree do
  RENAME(c)
end for
for each instruction v ← x op y or v ← φ(...) in block do
  Stack[v].pop()
end for
An example of REIL code translated into SSA form is shown in Figure 3.4.
Chapter 4
Analysis stage
4.1 Pointer analysis
In our work we implemented both intraprocedural and interprocedural pointer analysis in order to track object aliases and thus be able to reason about possible dangling pointer conditions. The intraprocedural analysis is performed on top of MonoREIL [6], an abstract interpretation framework based on REIL. In the following two subsections we briefly describe the main features of our analysis and of MonoREIL. Later on we will focus on the intraprocedural pointer analysis algorithm. In the third section the interprocedural algorithm will be explained.
4.1.1 Analysis features
Dataflow analysis and abstract interpretation algorithms have a number of properties that characterize them; among those, the three most relevant are flow, context and path sensitivity. An algorithm is said to be path-sensitive if it computes different pieces of analysis information depending on the predicates at conditional branch instructions. The intraprocedural algorithm used in our work merges the results of the analysis at the function merge-points, which effectively results in a path-insensitive algorithm: we are not able to discern the code paths that lead to the presence of a given alias.
Moreover, our algorithm is flow-insensitive, since during the analysis we do not track code locations. That is, the analysis will not be able to say after which statement a given variable became an alias of another one.
The main problem deriving from the path and flow insensitivity of our algorithm is the increased number of false positives that can appear in the analysis, since we are not able to gauge whether a specific path leading to a stale pointer condition is feasible. Nonetheless, the performance gains obtained by this implementation of the algorithm significantly outweigh the increase in the number of false positives.
Moreover, a number of empirical studies [9] [10] [11] [12] have shown that the improvement offered by flow-sensitivity is minimal in terms of precision.
Our interprocedural algorithm works by merging trees generated in each function; therefore the flow-insensitivity of the intraprocedural analysis and the nature of the merging we perform make it both flow and path insensitive. The same considerations made for the intraprocedural analysis on performance gain and precision loss apply to the interprocedural part of our analysis.
The algorithm performs the analysis on the procedural call graph (PCG) of the
binary. The PCG makes it possible to discern function parameters and calling locations, that is, every edge in the PCG is marked with the parameters passed to a given function. This property guarantees that our analysis is context-sensitive.
Context-sensitivity is crucial: the ability to discern the function parameters of each call prevents ambiguity and imprecision in tracking aliases between functions.
Another problem of pointer analysis is dealing with data structures, which can make it difficult to track aliases. In order to deal with this nuance we resorted to two strategies.
The first one consists of tracking the size of objects whenever possible, that is, when there is no need to perform range analysis; this way we are able to recognize whether a given heap location belongs to a specific object, and therefore we are able to properly track aliases for it.
The second strategy is to model widely used data structures, such as linked lists, vectors and similar ones, in order to be able to track the objects stored in them. It must be noted that not all data structures are covered, so some aliases may be missed by our analysis.
These two strategies allow us to completely avoid heap modeling, thus greatly simplifying the analysis.
4.1.2 MonoREIL
MonoREIL is an abstract interpretation framework that performs fixed-point iteration until a final state is reached. MonoREIL operates on the control flow graph of a function, which can be walked arbitrarily depending on the analysis to be performed. The definitions of a lattice, its elements and a formula that combines the elements are necessary for the framework to work. Every analysis is supposed to start with an initial state, which can be arbitrary. Finally, the effects of REIL instructions on the lattice need to be modelled.
To guarantee the termination of the analysis, the lattice has to satisfy the ascending chain condition, that is, the lattice has to be a noetherian lattice. If the condition is violated it is not possible to guarantee that there exist two states in the analysis, p_{n-1} and p_n, such that p_{n-1} = p_n.
In the following section we show that our analysis satisfies this requirement and is therefore always guaranteed to terminate.
4.2 Intraprocedural analysis
Alias set analysis is a well-known variation of pointer analysis that grants a higher degree of precision while at the same time avoiding performance bottlenecks. Intuitively, an alias set is the set of all local pointer variables that point to a given object. The strength of the analysis lies in the fact that whenever there is some degree of uncertainty about whether a given variable x points to a concrete object, instead of creating a may- or must-point-to set, it creates two alias sets, only one of which contains x.
Our analysis computes the alias sets for each function in the binary so that they can later be combined in order to reason about the existence of dangling pointers
by propagating live aliases between functions along the binary call graph.
We have adapted the algorithm proposed in [3] to fit our purposes and scope. It can be proved that our analysis reaches the fixed point because our transfer functions are distributive: the fixed point computed for the alias set dataflow algorithm corresponds to the merge-over-all-paths dataflow value of our algorithm [7].
In order to analyze the functions we further simplify our intermediate language so that it can be expressed by means of a very simple grammar:
s ::= v1 ← v2 | v ← h | h ← v | v ← null | v ← new
where h represents any heap location, null represents a null pointer and new represents a newly created object.
To simplify REIL code so that it can be expressed with the above grammar, we created transformation functions for every REIL instruction in our MonoREIL algorithm. Table 4.1 shows the transformations we apply to REIL instructions.
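As a sketch of how transfer functions over this reduced grammar might look (hypothetical encoding, not our MonoREIL implementation), three of the productions can be modelled as follows; note how the unknown heap load splits the state into one set with the destination variable and one without, as described earlier:

```python
# Hedged sketch of alias-set transfer functions for the reduced grammar above.
# Each alias set collects the variables pointing to one abstract object:
# `v <- new` opens a fresh set, `v1 <- v2` moves v1 into v2's set, and a load
# from an unknown heap location keeps both possibilities (with and without v).

def transfer(alias_sets, stmt):
    kind, dst, src = stmt
    # any assignment first kills dst's old membership
    sets = [s - {dst} for s in alias_sets]
    if kind == "new":                    # v <- new
        sets.append({dst})
    elif kind == "copy":                 # v1 <- v2
        for s in sets:
            if src in s:
                s.add(dst)
    elif kind == "load":                 # v <- h, h an unknown heap location
        sets = sets + [s | {dst} for s in sets]
    return [s for s in sets if s]        # drop empty sets

state = []
state = transfer(state, ("new", "p", None))    # p <- new
state = transfer(state, ("copy", "q", "p"))    # q <- p
print(state)
```

After the two statements, p and q sit in the same alias set, so freeing the object through either variable makes the other stale.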
It must be noted that we consider an object to be newly created only when it is the return value of either a constructor or an allocation function. Both constructors and allocation functions, although partially recognized in an automated fashion by our software, need to be manually indicated by the user.
We treat code blocks differently depending on whether we are dealing with a simple assignment or a φ-function. In the former case we first merge all the influencing states for a given node, performing a union on the sets, and then we apply the equations shown in Figure 4.2.
We create a new alias set for every newly allocated object; we then store in the appropriate alias set all the variables that alias one of the object's aliases; finally, whenever a heap location not previously known is found, we create two alias sets, one with the location and another one without it.
In the latter case we can instead rely on the fact that φ-functions are found at merge points in the control flow graph of the function, that is, where we need to combine one or more incoming states in our lattice. In our analysis the lattice is the set of all alias sets. Figure 4.3 shows the combine function for merge points.
Each state is first pruned, that is, we remove all the aliases that do not exist in the set of variables of the node's strict dominators. Once the alias set has been pruned it is then updated by adding all destination variables whose values are being assigned from variables already in the alias set.
We defined the elements of the lattice so that each element is a set of linked lists. To each lattice element corresponds an object to which the variables in the linked list alias.
The reason for choosing sets of linked lists over other data structures is the performance gain: it can be proved that the analysis, carried out on an SSA-form graph, makes it possible to operate only on the head of each list, thus saving look-up time. Nonetheless, as a further optimization, when the analysis for a given function is complete we transform each alias set into a tree-like structure that makes it easier to perform the interprocedural analysis discussed in the following section.
A sample run of the algorithm can be seen in Figure 4.4.
4.3 Interprocedural analysis
At the end of the intraprocedural alias analysis, the resulting alias lists of each
function are used to construct points-to tree structures that make the alias
relationships between variables explicit. In such a points-to tree, each node
represents a distinct variable and its children the variables pointing to it, so
that siblings are equivalent aliases.
For each function we extract its parameters and its return value. Given that
information, the interprocedural analysis algorithm walks down the procedure
call graph, updating a set of points-to trees for the object that needs to be
tracked, until the final state of the analysis is reached. We propose an
implementation of our algorithm on top of BinNavi.
The interprocedural analysis, as opposed to the intraprocedural one, is run on
a Procedure Call Graph (PCG). A PCG is identical to a call graph, except that
it has one edge per call site, and every edge is labelled with the variables of
the source node that act as parameters in the target node (see Figure 4.5).
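The PCG can be sketched as an edge-labelled multigraph. The encoding below is our own, not BinNavi's, and the function and variable names are made up:

```python
# One edge per call site, labelled with the caller variables that act as
# parameters in the callee.
pcg = [
    ("f1", "f2", ("a",)),   # f1 calls f2(a)
    ("f1", "f3", ("b",)),   # f1 calls f3(b)
    ("f2", "f4", ("x",)),   # first call site from f2 to f4
    ("f2", "f4", ("y",)),   # second call site: a separate, distinct edge
]
edges_to_f4 = [e for e in pcg if e[1] == "f4"]
```

Unlike a plain call graph, the two f2-to-f4 call sites stay distinct, so the analysis can propagate a different parameter binding along each edge.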
At each iteration, the algorithm connects the points-to trees containing the
incoming parameters to the points-to trees of the previous iteration on the
graph. These are updated by connecting the trees containing the formal
parameters of the current function to the nodes corresponding to the incoming
parameters, as indicated in the edge label. If the node corresponding to the
formal parameter is the root node of a points-to tree, then the tree is appended
to the node corresponding to the incoming parameter and the result is added
to the newly generated set of trees. Alternatively, if it is not a root node, the
points-to tree containing it is added to the new state set, with the node
replaced by the incoming parameter. The points-to tree containing the
incoming parameter is also copied into the new set of trees, and the sub-tree
whose root is the node containing the incoming parameter is detached from
that copy of the tree (see Figures 4.6, 4.7, 4.8 and 4.9). When a merge point
is met, the resulting sets of trees are computed separately, one for each
incoming edge, and a set union is performed, so that duplicate trees are
removed. Moreover, sub-trees of trees already contained in the set are
removed, and additional trees can be removed from the set to further reduce
space requirements (see Figure 4.10).
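The root-node case above can be sketched with trees encoded as parent maps. The encoding is our own, chosen only to make the re-parenting step concrete:

```python
# Trees as {variable: parent}; the root of a tree has parent None.

def append_tree(caller, callee, attach_at):
    merged = dict(caller)
    for var, parent in callee.items():
        # the callee root (the formal parameter) is re-parented under the
        # caller node corresponding to the incoming parameter
        merged[var] = attach_at if parent is None else parent
    return merged

caller = {"a": None, "p": "a"}      # p points to a
callee = {"f": None, "q": "f"}      # q points to the formal parameter f
merged = append_tree(caller, callee, "a")
# merged now hangs f (and, below it, q) under a
```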
Since the algorithm will not walk more than once over a node not involved in
a cycle, it is safe to remove, from that node's set of points-to trees, the trees
that contain no aliases of the tracked object. Similarly, the points-to trees of
the previous step are extended with the points-to trees containing the
returned variable.
Once the fixed point of the iteration has been reached, the resulting set is an
interprocedural set of points-to trees containing all of the aliases of the object
being tracked.
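Abstractly, the fixed-point iteration follows the usual monotone-framework scheme (cf. MonoREIL): keep applying the transfer until the state stops changing. A toy sketch, with a made-up transfer on a finite set:

```python
def fixpoint(state, step):
    nxt = step(state)
    while nxt != state:
        state, nxt = nxt, step(nxt)
    return state

# toy monotone transfer on a finite lattice: grow the set until it saturates
result = fixpoint(frozenset({0}),
                  lambda s: frozenset(s | {x + 1 for x in s if x < 3}))
# result == frozenset({0, 1, 2, 3})
```

Termination follows because the transfer is monotone and the lattice is finite, the same argument that applies to the points-to tree sets above.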
4.4 C++ peculiarities
In dealing with C++ we had to take into account a number of characteristics
specific to the language. One recurring trait of complex applications written
in C++ is the use of smart pointer interfaces, that is, C++ classes that
provide memory safety by managing object lifetimes. In order to deal with
smart pointers we require the user to specify which functions shall be
considered the constructor and destructor of the object to be analyzed, and
whether there are multiple constructors or destructors for an object. To
improve the precision of our analysis we used the well-known techniques
explained in [8] to identify constructors and destructors of objects in the
binary whenever possible. These requirements are necessary to keep the
analysis as application independent as possible, without constraining our
work to one specific kind of smart pointer or memory management
architecture.
User interaction is also needed to handle custom allocators: the user is asked
to specify whether or not the allocation and deallocation functions identified
by our tool are the correct ones.
Arithmetic instructions     Operation
ADD x1, x2, y               y is added to the alias set of x1 + x2
SUB x1, x2, y               y is added to the alias set of x1 − x2
MUL x1, x2, y               y is added to the alias set of x1 · x2
DIV x1, x2, y               y is added to the alias set of ⌊x1 / x2⌋
MOD x1, x2, y               y is added to the alias set of x1 mod x2
BSH x1, x2, y               y is added to the alias set of x1 · 2^x2 if x2 ≥ 0,
                            of ⌊x1 / 2^(−x2)⌋ if x2 < 0

Bitwise instructions        Operation
AND x1, x2, y               y is added to the alias set of x1 & x2
OR x1, x2, y                y is added to the alias set of x1 | x2
XOR x1, x2, y               y is added to the alias set of x1 ⊕ x2

Logical instructions        Operation
BISZ x1, ∅, y               y is removed from all alias sets
JCC x1, ∅, y                does not affect alias sets

Data transfer instructions  Operation
LDM x1, ∅, y                y is added to the alias set of mem[x1]
STM x1, ∅, y                mem[y] is added to the alias set of x1
STR x1, ∅, y                y is added to the alias set of x1

Other instructions          Operation
NOP ∅, ∅, ∅                 does not affect alias sets
UNDEF ∅, ∅, y               y is removed from all alias sets
UNKN ∅, ∅, ∅                does not affect alias sets

Figure 4.1: REIL instruction transformations
[[s]]_gen ≜ {{v}}   if s = (v ← new)
            ∅       otherwise

[[s]]_a(a) ≜ {a ∪ {v1}}    if s = (v1 ← v2) ∧ v2 ∈ a
             {a, a ∪ {v}}  if s = (v ← v_h)
             {a}           otherwise

[[s]]_l(l) ≜ [[s]]_gen ∪ ⋃_{a ∈ l} [[s]]_a(a)

Figure 4.2: Transfer functions for common instructions

[[φ]]_a(a, pred) ≜ {(a ∩ vars(sdom(φ))) ∪ {y_i : (y_i ← x_i) ∈ livevars(φ, pred) ∧ x_i ∈ a}}

[[φ]]_l(l, pred) ≜ ⋃_{a ∈ l} [[φ]]_a(a, pred)

Figure 4.3: Transfer functions for φ-node instructions
Figure 4.4: Intraprocedural analysis example
Figure 4.5: PCG used in our algorithm
Figure 4.6: Computing f1() to f2() alias trees
Figure 4.7: Computing f1() to f3() alias trees
Figure 4.8: Computing f2() to f4() alias trees through the leftmost edge
Figure 4.9: Computing f2() to f4() alias trees through the rightmost edge
Figure 4.10: Effects of combine() on function alias trees
Chapter 5
Stale pointers detection
Detecting use-after-free conditions means verifying whether there is any code
path in which an object alias is used after the object itself has been freed. In
order to reason about this condition we first prune the call graph of the binary
(see Figure 5.1), so that only functions that use aliases of the object we are
interested in, or that are linked to a function using an alias, are preserved.
This can be done trivially by walking the call graph and eliminating all the
functions that are neither successors nor predecessors of the procedures
where an object alias appears (see Figure 5.2 and Figure 5.3).
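The pruning step can be sketched as two reachability passes over the call graph, forward and backward from the functions that use an alias. This is an illustrative model, not the BinNavi implementation:

```python
def prune(edges, uses_alias):
    succ, pred = {}, {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
        pred.setdefault(b, set()).add(a)
    keep = set(uses_alias)
    for adj in (succ, pred):            # successors, then predecessors
        work, seen = list(uses_alias), set(uses_alias)
        while work:
            f = work.pop()
            for g in adj.get(f, ()):
                if g not in seen:
                    seen.add(g)
                    work.append(g)
        keep |= seen
    return keep

edges = [("main", "f"), ("f", "g"), ("main", "h")]
kept = prune(edges, {"g"})   # h is neither predecessor nor successor of g
```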
Figure 5.1: Example of callgraph
Figure 5.2: Callgraph with relevant functions in red
Figure 5.3: Pruned callgraph
Finally, we mark calls to the destructors on the pruned callgraph (see
Figure 5.4).
The rest of the algorithm walks the cross references to the object destructors
backwards, that is, it computes all functions that at some program point
invalidate the concrete object. We call these functions destructor aliases.
Figure 5.4: Pruned callgraph ready for bug detection step
For each function that calls a destructor alias we verify whether the concrete
object itself, or one of its aliases, is used. To do so we build the dominator
tree of the function flow graph and verify the conditions shown in Figure 5.5.
We assume the following notation: B is a generic basic block, F is the basic
block that either calls the destructor or destroys the concrete object, and v is
an object alias. dom(B) denotes the basic blocks dominated by node B,
v ∈ B is the relation representing the use of variable v in basic block B, and
succ(B) denotes the successors of node B.
Type of warning                                 Condition
v is a stale pointer                            v ∈ B ∧ B ∈ dom(F)
v may be a stale pointer                        v ∈ B ∧ B ∉ dom(F) ∧ B ∈ succ(F)
v might be a memory leak                        v ∈ B ∧ F ∉ dom(B) ∧ F ∈ succ(B)
v is a memory leak                              v ∈ B ∧ F ∉ dom(B) ∧ F ∉ succ(B)
v is neither a stale pointer nor a memory leak  otherwise

Figure 5.5: Alias verification equations
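The conditions of Figure 5.5 can be sketched as a direct check over precomputed dominance and successor sets. This is an illustrative model of our own (dom(X) is the set of blocks dominated by X, succ(X) its successors), not the tool's code:

```python
def classify(B, F, dom, succ):
    """Classify the use of alias v in block B against the free site F."""
    if B in dom[F]:
        return "stale pointer"
    if B in succ[F]:
        return "may be a stale pointer"
    if F not in dom[B] and F in succ[B]:
        return "might be a memory leak"
    if F not in dom[B] and F not in succ[B]:
        return "memory leak"
    return "neither"

dom = {"F": {"B1"}, "B1": set(), "B2": set()}
succ = {"F": {"B1"}, "B1": set(), "B2": set()}
use_after = classify("B1", "F", dom, succ)    # use dominated by the free
never_freed = classify("B2", "F", dom, succ)  # free never follows the use
```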
Chapter 6
Results and future work
In this paper we have targeted a widely known cause of security flaws. We
have shown that it is feasible to collect enough data, in terms of alias sets, on
a C++ binary to discover stale pointer bugs at the interprocedural level.
We have implemented our work on top of BinNavi, using REIL as the
intermediate language and MonoREIL as the monotone solver framework for
our algorithms. Our approach of verifying only one type of object at a time
allowed us to drastically reduce the execution time and the number of false
positives to analyze; nonetheless, we realize that this approach is suboptimal
for scenarios in which a developer has to fix bugs in his own software, because
in that case the analysis would need to be run multiple times.
From our test results, on a set of samples we built, it is clear that the prime
cause of false positives is the lack of flow sensitivity in our analysis. One of
the primary goals of future work in this direction is to use an SMT solver to
verify path feasibility.
The principal source of false negatives in our analysis is the heavy presence
of function pointers and complex data structures in C++ code; in those cases
we were not able to obtain enough information either on the alias sets or on
the relationships between functions. Some techniques exist to deal with these
problems; we did not implement them because their results are far from
satisfying and they could dramatically increase the number of false positives
in our analysis.
Finally, we plan on augmenting our analysis by increasing the number of data
structures handled by our algorithm and by performing range analysis in
order to trace a higher number of aliases.
Bibliography
[1] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and
F. Kenneth Zadeck: "Efficiently computing static single assignment form
and the control dependence graph." ACM Transactions on Programming
Languages and Systems, 13(4):451-490, Oct 1991.
[2] Preston Briggs, Keith D. Cooper, Timothy J. Harvey, and L. Taylor
Simpson: "Practical improvements to the construction and destruction
of static single assignment form." Software-Practice and Experience,
28(8):859-881, Jul 1998.
[3] Nomair A. Naeem, and Ondrej Lhotak: "Efficient Alias Set Analysis Using
SSA Form." International Symposium on Memory Management - ISMM,
pp. 79-88, 2009.
[4] Xiaodong Ma, Ji Wang, and Wei Dong: "Computing Must and May Alias
to Detect Null Pointer Dereference." Leveraging Applications of Formal
Methods - ISOLA, pp. 252-261, 2008.
[5] Sean Heelan: "Finding use-after-free bugs with static analysis."
[6] Thomas Dullien, and Sebastian Porst: "REIL: A platform-independent
intermediate representation of disassembled code for static code analysis."
CanSecWest 2009.
[7] J. B. Kam and J. D. Ullman: "Monotone data flow analysis frameworks."
Acta Informatica, 1977.
[8] Paul Vincent Sabanal, and Mark Vincent Yason: "Reversing C++." Black
Hat DC 2007.
[9] Michael Hind: "Pointer Analysis: Haven't We Solved The Problem Yet?"
ACM Transactions on Programming Languages and Systems, June 2001.
[10] M. Hind, M. Burke, P. Carini and J.-D. Choi: "Interprocedural pointer
alias analysis." ACM Transactions on Programming Languages and
Systems, Apr. 1993.
[11] M. Hind and A. Pioli: "Which Pointer Analysis Should I Use?"
International Symposium on Software Testing and Analysis, Aug. 2000.
[12] M. Hind and A. Pioli: "Evaluating The Effectiveness of Pointer Alias
Analysis." Science of Computer Programming, Jan. 2001.