SlideShare une entreprise Scribd logo
1  sur  195
Télécharger pour lire hors ligne
TechnischeUniversitätMünchen
Fakultät für Informatik
SSPREW 2017 Training
Breaking Obfuscated Programs
with Symbolic Execution
Sebastian Banescu
Technical University of Munich, Germany
Chair of Software Engineering
Prof. Alexander Pretschner
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 2
Assumptions and Tools for Hands-on Tutorial
We assume you have Docker installed and have basic user knowledge
Docker installation instructions
https://docs.docker.com/engine/installation/
Docker image based on Ubuntu contains: Tigress, KLEE, STP, Z3,
SatGraf, etc.
$ docker pull banescusebi/obfuscation-symex
$ docker run -it banescusebi/obfuscation-symex
For instructions on how to start the Docker image read description
at https:
//hub.docker.com/r/banescusebi/obfuscation-symex/
Instructions for running GUI apps available for Ubuntu and Mac OS
(not mandatory, only needed for SatGraf)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 3
Expected Outcome
After this training you will have a better understanding of the following:
The theory and practice of obfuscation and software diversity
Practical static & dynamic obfuscation transformations
Tools for applying symbolic execution on obfuscated programs
Which obfuscation transformations help against symbolic execution
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 4
Attackers in Computer Security
1. Man-in-the-middle (MITM)
attacks communication
channels
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
Attackers in Computer Security
1. Man-in-the-middle (MITM)
attacks communication
channels
2. Remote attacker exploits
vulnerabilities (e.g. buffer
overflows)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
Attackers in Computer Security
1. Man-in-the-middle (MITM)
attacks communication
channels
2. Remote attacker exploits
vulnerabilities (e.g. buffer
overflows)
3. Man-at-the-end (MATE)
reverse engineers software
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
Attackers in Computer Security
1. Man-in-the-middle (MITM)
attacks communication
channels
2. Remote attacker exploits
vulnerabilities (e.g. buffer
overflows)
3. Man-at-the-end (MATE)
reverse engineers software
We focus on MATE during this
tutorial.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
Introduction
Informal Definition of Obfuscation:
To obfuscate a program P means to transform it into a executable
program P from which it is harder to extract information than from P.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 6
Introduction
Informal Definition of Obfuscation:
To obfuscate a program P means to transform it into a executable
program P from which it is harder to extract information than from P.
Motivation:
Obfuscation is last layer of software defense against attackers
(e.g. after attacker bypasses OS authentication)
Obfuscation raises the bar for reverse engineering
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 6
Introduction
Informal Definition of Obfuscation:
To obfuscate a program P means to transform it into a executable
program P from which it is harder to extract information than from P.
Motivation:
Obfuscation is last layer of software defense against attackers
(e.g. after attacker bypasses OS authentication)
Obfuscation raises the bar for reverse engineering
Popular questions:
Wasn’t obfuscation proved to be impossible back in 2001?
Isn’t obfuscation the same as security by obscurity?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 6
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 7
Black Box Obfuscation
Formal Definition of Black-Box Obfuscation:
A probabilistic algorithm O is an obfuscator if the following conditions
hold:
For every program P, the obfuscated program O(P) has the same
functionality (e.g. input-output behavior)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 8
Black Box Obfuscation
Formal Definition of Black-Box Obfuscation:
A probabilistic algorithm O is an obfuscator if the following conditions
hold:
For every program P, the obfuscated program O(P) has the same
functionality (e.g. input-output behavior)
The memory size increase and execution slowdown of O(P) w.r.t. P
are less than polynomial
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 8
Black Box Obfuscation
Formal Definition of Black-Box Obfuscation:
A probabilistic algorithm O is an obfuscator if the following conditions
hold:
For every program P, the obfuscated program O(P) has the same
functionality (e.g. input-output behavior)
The memory size increase and execution slowdown of O(P) w.r.t. P
are less than polynomial
Any probabilistic polynomial time attacker only has a negligible
probability of guessing any bit of information about P, given O(P)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 8
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
Proof performed by giving a counter-example using 2 programs:
Cα,β(x) =
β if x = α
0k
otherwise
Dα,β(C) =
1 if C(α) = β
0 otherwise
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
Proof performed by giving a counter-example using 2 programs:
Cα,β(x) =
β if x = α
0k
otherwise
Dα,β(C) =
1 if C(α) = β
0 otherwise
Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the
attacker is successful
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
Proof performed by giving a counter-example using 2 programs:
Cα,β(x) =
β if x = α
0k
otherwise
Dα,β(C) =
1 if C(α) = β
0 otherwise
Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the
attacker is successful
If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1
with 100% probability
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
Proof performed by giving a counter-example using 2 programs:
Cα,β(x) =
β if x = α
0k
otherwise
Dα,β(C) =
1 if C(α) = β
0 otherwise
Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the
attacker is successful
If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1
with 100% probability
If the attacker only has black box access to C and D, then the
probability of a successful attack is negligible
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
Proof performed by giving a counter-example using 2 programs:
Cα,β(x) =
β if x = α
0k
otherwise
Dα,β(C) =
1 if C(α) = β
0 otherwise
Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the
attacker is successful
If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1
with 100% probability
If the attacker only has black box access to C and D, then the
probability of a successful attack is negligible
This proof does not imply that every obfuscator fails on a subset of
programs
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Black Box Obfuscation
In 2001 Barak et al. [3]:
Proved there exists no general obfuscator applicable to all programs
Proof performed by giving a counter-example using 2 programs:
Cα,β(x) =
β if x = α
0k
otherwise
Dα,β(C) =
1 if C(α) = β
0 otherwise
Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the
attacker is successful
If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1
with 100% probability
If the attacker only has black box access to C and D, then the
probability of a successful attack is negligible
This proof does not imply that every obfuscator fails on a subset of
programs
There may exist non-black-box obfuscators for some programs that
leak bits of information, but are “good enough”
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
Indistinguishability Obfuscation
In 2013 Garg et al. [8] proposed indistinguishability obfuscation:
The obfuscated versions of 2 semantically equivalent programs
cannot be distinguished
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
Indistinguishability Obfuscation
In 2013 Garg et al. [8] proposed indistinguishability obfuscation:
The obfuscated versions of 2 semantically equivalent programs
cannot be distinguished
[9] proved this obfuscation is the best-possible obfuscation
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
Indistinguishability Obfuscation
In 2013 Garg et al. [8] proposed indistinguishability obfuscation:
The obfuscated versions of 2 semantically equivalent programs
cannot be distinguished
[9] proved this obfuscation is the best-possible obfuscation
Implementations still far from being practical: [1, 2]
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
Indistinguishability Obfuscation
In 2013 Garg et al. [8] proposed indistinguishability obfuscation:
The obfuscated versions of 2 semantically equivalent programs
cannot be distinguished
[9] proved this obfuscation is the best-possible obfuscation
Implementations still far from being practical: [1, 2]
Cool stuff, but let’s look at obfuscation we can use in practice.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 11
Categorization of Obfuscation
What types of obfuscation transformations exist?
Static Vs. Dynamic
Static obfuscation:
Obfuscated programs remain fixed at runtime
Raises the bar against static analysis
Can be attacked through dynamic techniques (debugging, emulation,
tracing)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
Categorization of Obfuscation
What types of obfuscation transformations exist?
Static Vs. Dynamic
Static obfuscation:
Obfuscated programs remain fixed at runtime
Raises the bar against static analysis
Can be attacked through dynamic techniques (debugging, emulation,
tracing)
Dynamic obfuscation:
Programs change during load-/run-time
Raises the bar against dynamic analysis
Different executions of the same program with the same input may
produce different execution traces
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
Categorization of Obfuscation
What types of obfuscation transformations exist?
Static Vs. Dynamic
Static obfuscation:
Obfuscated programs remain fixed at runtime
Raises the bar against static analysis
Can be attacked through dynamic techniques (debugging, emulation,
tracing)
Dynamic obfuscation:
Programs change during load-/run-time
Raises the bar against dynamic analysis
Different executions of the same program with the same input may
produce different execution traces
Point of insertion:
Source code
Intermediate representation
Machine code
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
Categorization of Obfuscation
What types of obfuscation transformations exist?
Static Vs. Dynamic
Static obfuscation:
Obfuscated programs remain fixed at runtime
Raises the bar against static analysis
Can be attacked through dynamic techniques (debugging, emulation,
tracing)
Dynamic obfuscation:
Programs change during load-/run-time
Raises the bar against dynamic analysis
Different executions of the same program with the same input may
produce different execution traces
Point of insertion:
Source code
Intermediate representation
Machine code
Transformation targets:
Layout → scramble identifiers and code layout
Data → obfuscate data (structures) embedded in code
Control flow → obfuscate secret algorithms
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 13
Compiler Optimizations
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 14
Compiler Optimizations
Compiler Optimizations
In-lining function bodies
Loop unrolling
Loop-invariant code motion
Common sub-expression elimination
Constant folding and propagation
Dead code elimination
Strength reduction
more @ http://en.wikipedia.org/wiki/Optimizing_compiler
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 14
In-lining function bodies
Replace function call by function body
Before
1 int foo(int a, int b) {
2 return a + b;
3 }
4 ...
5 c = foo(a, b+1);
After
1 ...
2 c = a + b + 1;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 15
Loop unrolling
Remove “end-of-loop” test overhead
Before
1 ...
2 for (i = 0; i < 2; i++)
3 {
4 a[i] = 0;
5 }
After
1 ...
2 a[0] = 0;
3 a[1] = 0;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 16
Loop invariant code-motion
Extract operations whose results are independent of loop execution
Before
1 ...
2 for (i = 0; i < 2; i++)
3 {
4 a[i] = p + q;
5 }
After
1 ...
2 temp = p + q;
3 for (i = 0; i < 2; i++)
4 {
5 a[i] = temp;
6 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 17
Common subexpression elimination
Replace re-occurring identical (sub-)expressions by a single variable
holding the result
Before
1 ...
2 a = b + (z + 1);
3 p = q + (z + 1);
After
1 ...
2 temp = z + 1;
3 a = b + temp;
4 p = q + temp;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 18
Constant folding and propagation
Simplify constant expressions and substitute the values of known
constants in expressions
Before
1 ...
2 a = 3 + 5;
3 b = a + 1;
4 func(a, b);
After
1 ...
2 func(8, 9);
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 19
Dead code elimination
Remove code which does not affect program results: unreachable code
and code that affects variables that are irrelevant for the program
Before
1 ...
2 a = 1;
3 if (a < 0)
4 {
5 printf("This should never be printed!");
6 }
After
1 ...
2 a = 1;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 20
Strength reduction
Replace expensive operations with equivalent cheap ones
Before
1 ...
2 y = x / 8;
3 p = q * 15;
After
1 ...
2 y = x >> 3;
3 p = (q << 4) - x;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 21
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 22
Automated Code Obfuscation
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 23
Automated Code Obfuscation
Obfuscation Techniques:
Scramble identifiers
Instruction substitution
Garbage code insertion
Merging and splitting functions
Encode Literals
Encode Arithmetic
Opaque predicates
Control-flow flattening
Virtualization obfuscation
White-box cryptography
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 23
Scramble identifiers
Replace identifier names with random strings
Before
1 ...
2 sum = 0;
3 for (i = 0; i < arr_len; i++)
4 sum += arr[i];
5 average /= arr_len;
After
1 ...
2 za82b547bcb = 0;
3 for (z1c0ab7cf0c = 0; z1c0ab7cf0c < za862d19cbc;
z1c0ab7cf0c ++)
4 za82b547bcb += zc1c28ca67f[ z1c0ab7cf0c ];
5 z8c8f7c7867 /= za862d19cbc
This layout obfuscation has high potency, but low resilience
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 24
Instruction substitution
Replace binary operation (e.g. +, -, AND, OR, XOR, etc.) by
functionally equivalent, but more complicated computations
Before
1 ...
2 a = b + c
After
1 ...
2 r = rand ();
3 a = b + r;
4 a = a + c;
5 a = a - r;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 25
Garbage code insertion
Insert code that executes, but does not affect the IO behavior
Before
1 ...
2 sum = 0;
3 for (i = 0; i < arr_len; i++)
4 sum += arr[i];
5 average = sum / arr_len;
After
1 ...
2 sum = 0; prod = 1;
3 for (i = 0; i < arr_len; i++) {
4 sum += arr[i];
5 prod *= arr[i];
6 }
7 average = sqrt(prod);
8 average = sum / arr_len;
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 26
Merging and splitting functions
Merging implies combining the code of two or more functions into a
single function
Splitting implies dividing the code one function into two or more
functions
Split
1 func1(int a, int b) {
2 x = 4;
3 if (a < 3)
4 x = x + 6;
5 x *= b;
6 }
7
8 func2(int a, int c) {
9 y = a + 12;
10 y = y/c;
11 }
Merged
1 func3(int a, int b, int c) {
2 if (c % 2 == 0) {
3 x = 4;
4 if (a < 3)
5 x = x + 6;
6 x *= b;
7 } else {
8 y = a + 12;
9 y = y/b;
10 }
11 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 27
Encode Literals
Literals are constant (hard-coded) strings and numeric values
Literals can be encoded in numerous ways using encoder functions
At runtime they are decoded back to their original value
Before
1 main(int ac , char* av[]) {
2 printf("hellon");
3 return 0;
4 }
After
1 gen_str(char str []) {
2 int i = 0;
3 str[i++] = ’h’;
4 str[i++] = ’e’;
5 str[i++] = ’l’;
6 str[i++] = ’l’;
7 str[i++] = ’o’;
8 str[i++] = ’n’;
9 str[i] = ’000 ’;
10 }
11
12 main(int ac , char* av[]) {
13 char str [7];
14 gen_str(str);
15 printf(str);
16 return 0;
17 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 28
Encode Arithmetic (Mixed-Boolean Arithmetic)
Replace arithmetic or Boolean expressions with more complex ones
Complex expressions contain both arithmetic and boolean operators
Transformation can be applied recursively to increase strength
Before
1 func(int x, int y) {
2 return x + y;
3 }
After (Version 1)
1 func(int x, int y) {
2 return 2*(x | y) - (x ^ y);
3 }
After (Version 2)
1 func(int x, int y) {
2 return (x | y) + (x & y);
3 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 29
Opaque predicates and opaque expressions [6]
Informal Definition:
An expression whose value is known to the defender (at obfuscation
time), but which is difficult for an attacker to figure out statically.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 30
Opaque predicates and opaque expressions [6]
Informal Definition:
An expression whose value is known to the defender (at obfuscation
time), but which is difficult for an attacker to figure out statically.
Notation:
PT
for an opaquely true predicate
PF
for an opaquely false predicate
P?
for an opaquely intermediate predicate (range divider)
E=v
for an opaque expression of value v
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 30
Opaque predicates and opaque expressions [6]
Informal Definition:
An expression whose value is known to the defender (at obfuscation
time), but which is difficult for an attacker to figure out statically.
Notation:
PT
for an opaquely true predicate
PF
for an opaquely false predicate
P?
for an opaquely intermediate predicate (range divider)
E=v
for an opaque expression of value v
true false true false true false
P?
PT PF
true false true false true false2|(x2
+ x) x mod2=0x2
< 0
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 30
Bogus control-flow via opaque predicates
Opaque predicates facilitate insertion of bogus control-flow:
Does not affect I/O behavior
Attacker does not know which code is bogus
Increases attack / analysis time
Resilience of bogus control-flow reduced to resilience of opaque
predicates
true false true false true false
P?
PT PF
true false true false true false2|(x2
+ x) x mod2=0x2
< 0
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 31
Bogus control-flow via opaque predicates
E.g. add an opaquely true predicate (PT
) to a while loop condition (P)
Before
1 i = 1;
2 while (i < 100) {
3 dostuff(i);
4 i++;
5 }
After
1 i = 1; j = 100;
2 while ((i < 100) &&
3 (j*j*(j+1) *(j+1)%4 == 0)) {
4 dostuff(i);
5 i++;
6 j = j*i+3;
7 }
true
false
false
false truetrue
P PTP
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 32
Breaking opaque predicates via abstract interp.
Dalla Preda et al. [7] used abstract interpretation to break opaque
predicates:
Opaque predicates are confined in a single basic block
Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where
p(x) is a polynomial in x and n ∈ N
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
Breaking opaque predicates via abstract interp.
Dalla Preda et al. [7] used abstract interpretation to break opaque
predicates:
Opaque predicates are confined in a single basic block
Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where
p(x) is a polynomial in x and n ∈ N
E.g. opaquely true predicate: 2|(x2
+ x), ∀x ∈ Z
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
Breaking opaque predicates via abstract interp.
Dalla Preda et al. [7] used abstract interpretation to break opaque
predicates:
Opaque predicates are confined in a single basic block
Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where
p(x) is a polynomial in x and n ∈ N
E.g. opaquely true predicate: 2|(x2
+ x), ∀x ∈ Z
x is odd
1 x = ...; // any odd number
2 y = x * x; // odd
3 y = y + x; // even
4 if ( y % 2 == 0) // always
5 ... // true
6 else // dead branch
7 ...
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
Breaking opaque predicates via abstract interp.
Dalla Preda et al. [7] used abstract interpretation to break opaque
predicates:
Opaque predicates are confined in a single basic block
Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where
p(x) is a polynomial in x and n ∈ N
E.g. opaquely true predicate: 2|(x2
+ x), ∀x ∈ Z
x is odd
1 x = ...; // any odd number
2 y = x * x; // odd
3 y = y + x; // even
4 if ( y % 2 == 0) // always
5 ... // true
6 else // dead branch
7 ...
x is even
1 x = ...; // any even number
2 y = x * x; // even
3 y = y + x; // even
4 if ( y % 2 == 0) // always
5 ... // true
6 else // dead branch
7 ...
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
Breaking opaque predicates via abstract interp.
Dalla Preda et al. [7] used abstract interpretation to break opaque
predicates:
Opaque predicates are confined in a single basic block
Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where
p(x) is a polynomial in x and n ∈ N
E.g. opaquely true predicate: 2|(x2
+ x), ∀x ∈ Z
x is odd
1 x = ...; // any odd number
2 y = x * x; // odd
3 y = y + x; // even
4 if ( y % 2 == 0) // always
5 ... // true
6 else // dead branch
7 ...
x is even
1 x = ...; // any even number
2 y = x * x; // even
3 y = y + x; // even
4 if ( y % 2 == 0) // always
5 ... // true
6 else // dead branch
7 ...
Abstract interpretation is able to infer that regardless of x’s value,
the IF condition is always true
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
Opaquely intermediate predicate (Range divider)
Range dividers can lead to different paths in the code
No dead code
All branches have the same behavior but different syntax
Before
1 int main(int ac , char* av []){
2 char *str = av [1];
3 int hash = 0;
4 for(int i = 0;
5 i < strlen(str);
6 str++, i++) {
7 hash = (hash <<7) ^(* str);
8 }
9 if (hash == 809267)
10 printf("You win!");
11 return 0;
12 }
After
1 int main(int ac , char* av []){
2 char *str = av [1];
3 int hash = 0;
4 for(int i = 0;
5 i < strlen(str);
6 str++, i++) {
7 char chr = *str;
8 if (chr > 42) {
9 hash = (hash << 7) ^ chr;
10 } else {
11 hash = (hash * 128) ^ chr;
12 }
13 }
14 if (hash == 809267)
15 printf("You win!");
16 return 0;
17 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 34
Control-flow flattening
Remove the control-flow structure of functions:
1. Put each basic block as a case inside a switch statement
2. Wrap the switch inside an infinite loop
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 35
Control-flow flattening
Remove the control-flow structure of functions:
1. Put each basic block as a case inside a switch statement
2. Wrap the switch inside an infinite loop
Let’s take one function, e.g. GCD
1 int gcd(int a, int b){
2 while (a != b)
3 if (a > b)
4 a = a - b;
5 else
6 b = b - a;
7 return a;
8 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 35
Control-flow flattening GCD example
Before
1 int gcd(int a, int b){
2 while (a != b)
3 if (a > b)
4 a = a - b;
5 else
6 b = b - a;
7 return a;
8 }
After
1 int gcd(int a, int b){
2 int next = 0;
3 while (1) {
4 switch(next) {
5 case 0:
6 next = (int)(a != b) + 1;
7 break;
8 case 1: return a;
9 break;
10 case 2:
11 next = (a > b) + 3;
12 break;
13 case 3: b = b - a;
14 next = 0;
15 break;
16 case 4: a = a - b;
17 next = 0;
18 break;
19 default:
20 break;
21 }}}
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 36
Control-flow flattening GCD example
The CFG of the resulting code:
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 37
Control-flow flattening discussion
Performance penalty:
For 3 SPEC programs: 4× slowdown, 2× size
Reasons:
The wrapper loop condition check, plus jump
The switch next value check, plus indirect jump
How to optimize:
Keep tight loops as one switch entry (don’t split)
Use gcc’s labels-as-values → allows jumping to next basic block
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 38
Control-flow flattening discussion
Performance penalty:
For 3 SPEC programs: 4× slowdown, 2× size
Reasons:
The wrapper loop condition check, plus jump
The switch next value check, plus indirect jump
How to optimize:
Keep tight loops as one switch entry (don’t split)
Use gcc’s labels-as-values → allows jumping to next basic block
Attack on control-flow flattening:
1. Find next block of every basic block
2. Rebuild original CFG
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 38
Control-flow flattening discussion
Performance penalty:
For 3 SPEC programs: 4× slowdown, 2× size
Reasons:
The wrapper loop condition check, plus jump
The switch next value check, plus indirect jump
How to optimize:
Keep tight loops as one switch entry (don’t split)
Use gcc’s labels-as-values → allows jumping to next basic block
Attack on control-flow flattening:
1. Find next block of every basic block
2. Rebuild original CFG
Mitigation: assign opaque expressions (E=v
) to next
Question: How do we build such opaque expressions?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 38
Opaque expressions from array aliasing
1. A statically initialized array with seemingly random values:
g = 10 5 13 3 27 5 24 38 0 73 115 3 66 60 17 31
2. The values are generated such that some invariants hold, e.g.:
Every 3rd cell starting from cell 0, contains a value v ≡ 3 mod 7
Every 3rd cell starting from cell 1, contains a value v ≡ 5 mod 11
Cells 2, 5, 8, 11 and 14 contain values 13, 5, 0, 3, respectively 17
3. Update array cells with values that respect invariants
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 39
Opaque expressions from array aliasing
1. A statically initialized array with seemingly random values:
g = 10 5 13 3 27 5 24 38 0 73 115 3 66 60 17 31
2. The values are generated such that some invariants hold, e.g.:
Every 3rd cell starting from cell 0, contains a value v ≡ 3 mod 7
Every 3rd cell starting from cell 1, contains a value v ≡ 5 mod 11
Cells 2, 5, 8, 11 and 14 contain values 13, 5, 0, 3, respectively 17
3. Update array cells with values that respect invariants
Let’s replace right-hand values 0, 1, 2, 3 and 4 of assignments to next
with E=0
, E=1
, E=2
, E=3
, respectively E=4
:
next = 0 → next = g[3] % g[11] - g[8]
next = 1 → next = 3 * g[11] - 4 * (g[4] % g[5])
next = 2 → next = g[5] - g[3]
next = 3 → next = g[2] % g[1]
next = 4 → next = g[15] - g[4]
Next slide shows how this looks in the flattened GCD example
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 39
Control-flow flattening + opaque expressions
1 int gcd(int a, int b){
2 int g[] = {10, 5, 13, 3, 27, 5, 24, 38, 0, 73, 115,
3, 66, 60, 17, 31};
3 int next = g[3] % g[11] - g[11]; // 0
4 while (1) {
5 switch(next) {
6 case 0:
7 if(a != b)
8 next = 3 * g[11] - 4 * (g[4] % g[5]); // 1
9 else next = g[5] - g[3]; // 2
10 break;
11 case 1:
12 if (a > b) next = g[2] % g[1]; // 3
13 else next = g[15] - g[4]; // 4
14 break;
15 case 2: return a;
16 break;
17 case 3: a = a - b;
18 next = g[3] % g[11] - g[8]; // 0
19 break;
20 case 4: b = b - a;
21 next = g[3] % g[11] - g[8]; // 0
22 break;
23 default:
24 break;
25 }}}
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 40
Opaque expressions from pointer aliasing
Assumption: pointer aliasing is a computationally hard static analysis
problem
1. Construct one or more linked-lists
2. Set pointers into those linked-lists
3. Create opaque predicates by checking properties you know to be
true / false, e.g.:
→ q1 points to a node with value v > 53
→ q2 points to a node with value v ≤ 53
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 41
Opaque expressions from pointer aliasing
Assumption: pointer aliasing is a computationally hard static analysis
problem
1. Construct one or more linked-lists
2. Set pointers into those linked-lists
3. Create opaque predicates by checking properties you know to be
true / false, e.g.:
→ q1 points to a node with value v > 53
→ q2 points to a node with value v ≤ 53
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 41
Opaque expressions from pointer aliasing
One opaquely true predicate: (∗q1 > ∗q2)T
After performing several operations to confuse alias analysis another
opaque predicate is: (q1 = q2)T
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 42
Virtualization Obfuscation
Goal: Hide secret algorithm of program P
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
Virtualization Obfuscation
Goal: Hide secret algorithm of program P
Obfuscation Procedure:
1. Generate random bytecode ISA (L) covering all instructions of P
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
Virtualization Obfuscation
Goal: Hide secret algorithm of program P
Obfuscation Procedure:
1. Generate random bytecode ISA (L) covering all instructions of P
2. Translate P to L bytecode program
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
Virtualization Obfuscation
Goal: Hide secret algorithm of program P
Obfuscation Procedure:
1. Generate random bytecode ISA (L) covering all instructions of P
2. Translate P to L bytecode program
3. Generate emulator to interpret L bytecode on x86 machine
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
Virtualization Obfuscation
Goal: Hide secret algorithm of program P
Obfuscation Procedure:
1. Generate random bytecode ISA (L) covering all instructions of P
2. Translate P to L bytecode program
3. Generate emulator to interpret L bytecode on x86 machine
Obfuscated program (P ) consists of bytecode program and
emulator
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
Virtualization Obfuscation Example [11]
We apply virtualization obfuscation to function foo, written in C
1 void foo(int x){
2 int y = 10;
3 y++;
4 y++;
5 if (x > 0){
6 y++;
7 }
8 else {}
9 printf("%dn",y);
10 }
Figure : CFG of foo
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 44
Virtualization Obfuscation Example
Step 1: Generate random bytecode ISA covering all instructions of foo
1 void foo(int x){
2 int y = 10;
3 y++;
4 y++;
5 if (x > 0){
6 y++;
7 }
8 else {}
9 printf("%dn",y);
10 }
Random ISA:
1. Integer assignment (line 2)
encode
−−−−→ 52, LH op, RH op
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
Virtualization Obfuscation Example
Step 1: Generate random bytecode ISA covering all instructions of foo
1 void foo(int x){
2 int y = 10;
3 y++;
4 y++;
5 if (x > 0){
6 y++;
7 }
8 else {}
9 printf("%dn",y);
10 }
Random ISA:
1. Integer assignment (line 2)
encode
−−−−→ 52, LH op, RH op
2. Integer increment (lines 3,4,6)
encode
−−−−→ 03, op
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
Virtualization Obfuscation Example
Step 1: Generate random bytecode ISA covering all instructions of foo
1 void foo(int x){
2 int y = 10;
3 y++;
4 y++;
5 if (x > 0){
6 y++;
7 }
8 else {}
9 printf("%dn",y);
10 }
Random ISA:
1. Integer assignment (line 2)
encode
−−−−→ 52, LH op, RH op
2. Integer increment (lines 3,4,6)
encode
−−−−→ 03, op
3. Branch if > 0 (line 5)
encode
−−−−→ 08, tst op, jmp offset
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
Virtualization Obfuscation Example
Step 1: Generate random bytecode ISA covering all instructions of foo
1 void foo(int x){
2 int y = 10;
3 y++;
4 y++;
5 if (x > 0){
6 y++;
7 }
8 else {}
9 printf("%dn",y);
10 }
Random ISA:
1. Integer assignment (line 2)
encode
−−−−→ 52, LH op, RH op
2. Integer increment (lines 3,4,6)
encode
−−−−→ 03, op
3. Branch if > 0 (line 5)
encode
−−−−→ 08, tst op, jmp offset
4. Call to printf (line 9)
encode
−−−−→ 18, op
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
Virtualization Obfuscation Example
Variables and constants of bytecode program stored in array: data
data[0] represents variable x
data[1] represents variable y
data[2] represents constant for initialization to 10 (line 2 of foo)
data[3] represents constant jump offset of conditional branch
1 void foo(int x){
2 int y = 10;
3 y++;
4 y++;
5 if (x > 0){
6 y++;
7 }
8 else {}
9 printf("%dn",y);
10 }
Random ISA:
1. Integer assignment (line 2)
encode
−−−−→ 52, LH op, RH op
2. Integer increment (lines 3,4,6)
encode
−−−−→ 03, op
3. Branch if > 0 (line 5)
encode
−−−−→ 08, tst op, jmp offset
4. Call to printf (line 9)
encode
−−−−→ 18, op
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 46
Virtualization Obfuscation Example
Step 2: Translate foo to bytecode program
data = {00,00,10,05} // {x, y, init const, jmp offset}
Bytecode: {52, 01, 02, 03, 01, 03, 01, 08, 00, 03, 03,
01, 18, 01, 00}
1 void foo(int x){// bytecode
2 int y = 10; // 52, 01, 02
3 y++; // 03, 01
4 y++; // 03, 01
5 if (x > 0){ // 08, 00, 03
6 y++; // 03, 01
7 }
8 else {}
9 printf("%d",y); // 18, 01
10 } // 00
Random ISA:
1. Integer assignment (line 2)
encode
−−−−→ 52, LH op, RH op
2. Integer increment (lines 3,4,6)
encode
−−−−→ 03, op
3. Branch if > 0 (line 5)
encode
−−−−→ 08, tst op, jmp offset
4. Call to printf (line 9)
encode
−−−−→ 18, op
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 47
Virtualization Obfuscation Example
Step 3: Generate emulator to
interpret bytecode on x86 machine
int data = {00,00,10,05};
{x, y, init const, jmp off}
int code = {
52, 01, 02,
03, 01,
03, 01,
08, 00, 03,
03, 01,
18, 01,
00};
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 48
Virtualization Obfuscation Example
Step 3: Generate emulator to
interpret bytecode on x86 machine
int data = {00,00,10,05};
{x, y, init const, jmp off}
int code = {
52, 01, 02,
03, 01,
03, 01,
08, 00, 03,
03, 01,
18, 01,
00};
1 int vpc = 0, op1 , op2;
2 while (true) {
3 switch(code[vpc]) {
4 case 03: // increment
5 op1 = code[vpc + 1];
6 data[op1 ]++;
7 vpc += 2; break;
8 case 08: // conditional jump
9 op1 = code[vpc + 1];
10 op2 = code[vpc + 2];
11 if (data[op1] > 0)
12 vpc += 3;
13 else
14 vpc += data[op2];
15 break;
16 case 18: // call printf
17 op1 = code[vpc + 1];
18 printf("%dn", data[op1]);
19 vpc += 2; break;
20 case 52: // assignment
21 op1 = code[vpc + 1];
22 op2 = code[vpc + 2];
23 data[op1] = data[op2 ];
24 vpc += 3; break;
25 default: return; // halt
26 } // end switch
27 } // end while
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 48
Virtualization Obfuscation Example
Figure : Control flow graph of obfuscated foo
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 49
Virtualization Obfuscation
Goal: obfuscate secret algorithm (not secret data)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
Virtualization Obfuscation
Goal: obfuscate secret algorithm (not secret data)
Strengths:
Random ISA, different for each instance
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
Virtualization Obfuscation
Goal: obfuscate secret algorithm (not secret data)
Strengths:
Random ISA, different for each instance
Control-flow flattening → Different locations in original program
mapped to same case branch
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
Virtualization Obfuscation
Goal: obfuscate secret algorithm (not secret data)
Strengths:
Random ISA, different for each instance
Control-flow flattening → Different locations in original program
mapped to same case branch
Many other obfuscation techniques can be added on top of:
Each case branch
The switch statement
The data and code arrays
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
Virtualization Obfuscation
Goal: obfuscate secret algorithm (not secret data)
Strengths:
Random ISA, different for each instance
Control-flow flattening → Different locations in original program
mapped to same case branch
Many other obfuscation techniques can be added on top of:
Each case branch
The switch statement
The data and code arrays
Tools:
CodeVirtualizer (59 e - 119 e )
VMProtect (69 e - 599 e )
Tigress, Diablo (Free research tools)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
White-Box Cryptography (WBC)
Idea:
Embed the key inside the cipher (e.g. in S-boxes)
Convert cipher computation into large network of look-up tables
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 51
White-Box Cryptography (WBC)
Idea:
Embed the key inside the cipher (e.g. in S-boxes)
Convert cipher computation into large network of look-up tables
Source: http://www.whiteboxcrypto.com/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 51
White-Box Cryptography (WBC)
Idea:
Embed the key inside the cipher (e.g. in S-boxes)
Convert cipher computation into large network of look-up tables
Attack: WB-AES and WB-DES broken (222
operations)
Source: http://www.whiteboxcrypto.com/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 51
Some Details on WBC
Before
1 char xor(char inputs)
2 {
3 char a = inputs & 0x0F;
4 char b = inputs >> 4;
5 return a ^ b;
6 }
After
1 char lut [256] =
2 {
3 0x00 , 0x01 , 0x02 , ..., 0x0F ,
4 0x01 , 0x00 , 0x03 , ..., 0x0E ,
5 0x02 , 0x03 , 0x00 , ..., 0x0D ,
6 | |
7 0x0F , 0x0E , 0x0D , ..., 0x00
8 };
9
10 char xor(char inputs)
11 {
12 return lut[inputs ];
13 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 52
Some Details on WBC
Before
1 char xor(char inputs)
2 {
3 char a = inputs & 0x0F;
4 char b = inputs >> 4;
5 return a ^ b;
6 }
After
1 char lut [256] =
2 {
3 0x00 , 0x01 , 0x02 , ..., 0x0F ,
4 0x01 , 0x00 , 0x03 , ..., 0x0E ,
5 0x02 , 0x03 , 0x00 , ..., 0x0D ,
6 | |
7 0x0F , 0x0E , 0x0D , ..., 0x00
8 };
9
10 char xor(char inputs)
11 {
12 return lut[inputs ];
13 }
Source: PhD
presentation
of B. Wyseur
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 52
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 53
Software Diversity
1. Software developer
distributes software X to all
end-users
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
Software Diversity
1. Software developer
distributes software X to all
end-users
2. Some end-users are MATE
attackers
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
Software Diversity
1. Software developer
distributes software X to all
end-users
2. Some end-users are MATE
attackers
3. MATE reverse engineers X
and builds a hijacker of X
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
Software Diversity
1. Software developer
distributes software X to all
end-users
2. Some end-users are MATE
attackers
3. MATE reverse engineers X
and builds a hijacker of X
4. MATE distributes hijacker
to other end-users of X
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
Software Diversity
What can we do to protect
victims?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
Software Diversity
What can we do to protect
victims?
Give everyone a different
version
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
Software Diversity
What can we do to protect
victims?
Give everyone a different
version
Generate 1000s of different
versions using obfuscation
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
Software Diversity
What can we do to protect
victims?
Give everyone a different
version
Generate 1000s of different
versions using obfuscation
Assumption: Same attack
will not work on different
binaries
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
Software Diversity
What can we do to protect
victims?
Give everyone a different
version
Generate 1000s of different
versions using obfuscation
Assumption: Same attack
will not work on different
binaries
Issues:
Analyzing crash-dumps
Incremental updates
Digitally signing all
versions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
Pre- vs. Post-Distribution Software Diversity
Pre-Distribution Software Diversity
Obfuscation transformations
involve randomly generated
code & data
Many different binaries
generated by software developer
Different binaries distributed to
different end-users
All previously presented
techniques can be used for
pre-distribution diversification
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 56
Pre- vs. Post-Distribution Software Diversity
Pre-Distribution Software Diversity
Obfuscation transformations
involve randomly generated
code & data
Many different binaries
generated by software developer
Different binaries distributed to
different end-users
All previously presented
techniques can be used for
pre-distribution diversification
Post-Distribution Software Diversity
Software developer embeds
self-modifying code in application
All end-users may get the same
binary from software developer
Code may change differently for
different users depending on inputs
Next we present dynamic
obfuscation techniques which can
be used for post-distribution
diversification
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 56
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 57
Dynamic Obfuscation
A dynamic obfuscator runs in two phases:
1. At compile-time:
Transform the program to an initial configuration
Add a runtime code-transformer
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 58
Dynamic Obfuscation
A dynamic obfuscator runs in two phases:
1. At compile-time:
Transform the program to an initial configuration
Add a runtime code-transformer
2. At runtime:
Interleave execution of the program with calls to the transformer T
T changes the code segment content at runtime
Ideally a non-repeating series of configurations, in practice they
repeat
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 58
Dynamic Obfuscation Techniques
General Techniques:
Build-and-execute: generate code for a routine at runtime, and
jump to it
Self-modification: modify the executable code
Encryption: the self-modification is decrypting the encrypted code
before executing it
Move code: every time the code executes, it is in a different
location
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 59
Dynamic Obfuscation Techniques
General Techniques:
Build-and-execute: generate code for a routine at runtime, and
jump to it
Self-modification: modify the executable code
Encryption: the self-modification is decrypting the encrypted code
before executing it
Move code: every time the code executes, it is in a different
location
These techniques can be applied at the following levels of granularity:
File level
Function level
Basic block level
Instruction level
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 59
Dynamic Obfuscation Techniques
General Techniques:
Build-and-execute: generate code for a routine at runtime, and
jump to it
Self-modification: modify the executable code
Encryption: the self-modification is decrypting the encrypted code
before executing it
Move code: every time the code executes, it is in a different
location
These techniques can be applied at the following levels of granularity:
File level
Function level
Basic block level
Instruction level
The attacker’s goals can be to:
Recover the original code
Modify the original code
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 59
Replacing instructions (Kanzaki [10])
Motivation: prevent code recovery via memory snapshot
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 60
Replacing instructions (Kanzaki [10])
Motivation: prevent code recovery via memory snapshot
Algorithm idea:
1. Replace real instructions by bogus instructions
2. Just before execution, replace bogus instruction with real instruction
3. After execution, replace real instruction with bogus instruction
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 60
Replacing instructions (Kanzaki [10])
Motivation: prevent code recovery via memory snapshot
Algorithm idea:
1. Replace real instructions by bogus instructions
2. Just before execution, replace bogus instruction with real instruction
3. After execution, replace real instruction with bogus instruction
Implementation by choosing 3 points A, B and C in the CFG:
1. All paths to B must flow through A and all paths from B must flow
through C
2. A replaces bogus instruction at B with real instruction
3. C replaces real instruction at B with bogus instruction
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 60
Dynamic Code Merging (Madou [12])
Motivation: keep code in constant flux
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 61
Dynamic Code Merging (Madou [12])
Motivation: keep code in constant flux
Algorithm idea:
1. Have two or more functions share the same location in memory
2. Create templates for functions that share same location
3. Before a function is called, patch memory using edit script to load it
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 61
Dynamic Code Merging (Madou [12])
Motivation: keep code in constant flux
Algorithm idea:
1. Have two or more functions share the same location in memory
2. Create templates for functions that share same location
3. Before a function is called, patch memory using edit script to load it
Implementation example for program with 2 functions f1 and f2:
1. Create template T with the same size as the largest function, i.e. f1
2. T contains values at memory offsets which are common for f1 and f2,
i.e. offsets 1 and 2
3. T contains wildcard values at all other memory offsets
f1 f2
memory offset: 0 1 2 3 4 0 1 2 3
memory value: b7 48 a0 53 fa e9 48 a0 33
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 61
Dynamic Code Merging (Madou [12])
Implementation example for program with 2 functions f1 and f2:
1. Create template T with the same size as the largest function, i.e. f1
2. T contains values at memory offsets which are common for f1 and f2,
i.e. offsets 1 and 2
3. T contains wildcard values at all other memory offsets
f1 f2
memory offset: 0 1 2 3 4 0 1 2 3
memory value: b7 48 a0 53 fa e9 48 a0 33
T Edit Scripts
memory offset: 0 1 2 3 4 e1 = [0 → b7, 3 → 53, 4 → fa]
memory value: ? 48 a0 ? ? e2 = [0 → e9, 3 → 33]
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 62
Dynamic Code Merging (Madou [12])
Implementation example for program with 2 functions f1 and f2:
1. Create template T with the same size as the largest function, i.e. f1
2. T contains values at memory offsets which are common for f1 and f2,
i.e. offsets 1 and 2
3. T contains wildcard values at all other memory offsets
4. Create edit scripts e1 and e2 which replace wildcards of T to load
the functions f1, respectively f2
f1 f2
memory offset: 0 1 2 3 4 0 1 2 3
memory value: b7 48 a0 53 fa e9 48 a0 33
T Edit Scripts
memory offset: 0 1 2 3 4 e1 = [0 → b7, 3 → 53, 4 → fa]
memory value: ? 48 a0 ? ? e2 = [0 → e9, 3 → 33]
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 62
Dynamic Decryption and Re-encryption
1. Execute current basic block
2. At some point in the current
basic block decrypt the next
basic block
3. Decryption key could be hash
of some other basic block
4. Jump to decrypted basic block
5. Encrypt the previously executed
basic block
6. Go to step 1
Source: http:
//tigress.cs.arizona.edu/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
Dynamic Decryption and Re-encryption
1. Execute current basic block
2. At some point in the current
basic block decrypt the next
basic block
3. Decryption key could be hash
of some other basic block
4. Jump to decrypted basic block
5. Encrypt the previously executed
basic block
6. Go to step 1
Source: http:
//tigress.cs.arizona.edu/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
Dynamic Decryption and Re-encryption
1. Execute current basic block
2. At some point in the current
basic block decrypt the next
basic block
3. Decryption key could be hash
of some other basic block
4. Jump to decrypted basic block
5. Encrypt the previously executed
basic block
6. Go to step 1
Source: http:
//tigress.cs.arizona.edu/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
Dynamic Decryption and Re-encryption
1. Execute current basic block
2. At some point in the current
basic block decrypt the next
basic block
3. Decryption key could be hash
of some other basic block
4. Jump to decrypted basic block
5. Encrypt the previously executed
basic block
6. Go to step 1
Source: http:
//tigress.cs.arizona.edu/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
Dynamic Decryption and Re-encryption
1. Execute current basic block
2. At some point in the current
basic block decrypt the next
basic block
3. Decryption key could be hash
of some other basic block
4. Jump to decrypted basic block
5. Encrypt the previously executed
basic block
6. Go to step 1
Source: http:
//tigress.cs.arizona.edu/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
Dynamic Decryption and Re-encryption
1. Execute current basic block
2. At some point in the current
basic block decrypt the next
basic block
3. Decryption key could be hash
of some other basic block
4. Jump to decrypted basic block
5. Encrypt the previously executed
basic block
6. Go to step 1
Source: http:
//tigress.cs.arizona.edu/
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 64
Classical Obfuscation Strength Dimensions
According to Collberg et al. [5] we should keep the following 4 factors in
mind when choosing obfuscation transformations:
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
Classical Obfuscation Strength Dimensions
According to Collberg et al. [5] we should keep the following 4 factors in
mind when choosing obfuscation transformations:
Potency → comprehensibility of code by humans
Time and skills needed to understand how the code works
Directly proportional with software complexity and size
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
Classical Obfuscation Strength Dimensions
According to Collberg et al. [5] we should keep the following 4 factors in
mind when choosing obfuscation transformations:
Potency → comprehensibility of code by humans
Time and skills needed to understand how the code works
Directly proportional with software complexity and size
Stealth → identifiability of obfuscated code
Time and computation needed to identify the parts of the code
which were obfuscated
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
Classical Obfuscation Strength Dimensions
According to Collberg et al. [5] we should keep the following 4 factors in
mind when choosing obfuscation transformations:
Potency → comprehensibility of code by humans
Time and skills needed to understand how the code works
Directly proportional with software complexity and size
Stealth → identifiability of obfuscated code
Time and computation needed to identify the parts of the code
which were obfuscated
Resilience → resistance against automatic deobfuscation
Time and computation needed to deobfuscate the code with a
software tool
Plus resources needed to build such a tool
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
Classical Obfuscation Strength Dimensions
According to Collberg et al. [5] we should keep the following 4 factors in
mind when choosing obfuscation transformations:
Potency → comprehensibility of code by humans
Time and skills needed to understand how the code works
Directly proportional with software complexity and size
Stealth → identifiability of obfuscated code
Time and computation needed to identify the parts of the code
which were obfuscated
Resilience → resistance against automatic deobfuscation
Time and computation needed to deobfuscate the code with a
software tool
Plus resources needed to build such a tool
Cost → performance and resource overhead of obfuscation
Time overhead added to program execution after obfuscation
Size of the obfuscated binary relative to original binary
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
What is economically feasible for the attacker? (e.g. static attacks,
dynamic attacks)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
What is economically feasible for the attacker? (e.g. static attacks,
dynamic attacks)
What is the impact of protection on performance?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
What is economically feasible for the attacker? (e.g. static attacks,
dynamic attacks)
What is the impact of protection on performance?
Example Scenario: Protecting a hard-coded password / license key
Software: offline game / word processor, costs 50 USD
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
What is economically feasible for the attacker? (e.g. static attacks,
dynamic attacks)
What is the impact of protection on performance?
Example Scenario: Protecting a hard-coded password / license key
Software: offline game / word processor, costs 50 USD
→ cost (obfuscation overhead) < 10% performance impact
→ stealth less important for this scenario
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
What is economically feasible for the attacker? (e.g. static attacks,
dynamic attacks)
What is the impact of protection on performance?
Example Scenario: Protecting a hard-coded password / license key
Software: offline game / word processor, costs 50 USD
→ cost (obfuscation overhead) < 10% performance impact
→ stealth less important for this scenario
Each user needs a different password / license key
→ potency > 50 USD (exclude: professional org., nation state)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
The Strength of Obfuscation
Question: Which obfuscation transformation is better?
Answer: It depends ...
What do you want to protect?:
intellectual property
keys
integrity of software
What kind of attacker are you facing? (e.g. amateur hacker,
professional organization, nation state)
What is economically feasible for the attacker? (e.g. static attacks,
dynamic attacks)
What is the impact of protection on performance?
Example Scenario: Protecting a hard-coded password / license key
Software: offline game / word processor, costs 50 USD
→ cost (obfuscation overhead) < 10% performance impact
→ stealth less important for this scenario
Each user needs a different password / license key
→ potency > 50 USD (exclude: professional org., nation state)
Attacker develops automated crack based on symbolic execution
→ resilience > 50 USD × Nr. of users not willing to pay
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
Symbolic Execution in a Nutshell
Interpret program using symbolic values instead of concrete ones
int main(int ac , char* av []) {
int a = atoi(av [1]);
int b = atoi(av [2]);
int c = atoi(av [3]);
if (a > b)
a = a - b
if (b < 1)
c = a + b
b = 1;
return 0;
}
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
Symbolic Execution in a Nutshell
Interpret program using symbolic values instead of concrete ones
Fork execution on each branch dependent on symbolic values
int main(int ac , char* av []) {
int a = atoi(av [1]);
int b = atoi(av [2]);
int c = atoi(av [3]);
if (a > b)
a = a - b
if (b < 1)
c = a + b
b = 1;
return 0;
}
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
Symbolic Execution in a Nutshell
Interpret program using symbolic values instead of concrete ones
Fork execution on each branch dependent on symbolic values
Collect path constraints for each execution path
int main(int ac , char* av []) {
int a = atoi(av [1]);
int b = atoi(av [2]);
int c = atoi(av [3]);
if (a > b)
a = a - b
if (b < 1)
c = a + b
b = 1;
return 0;
}
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
Symbolic Execution in a Nutshell
Interpret program using symbolic values instead of concrete ones
Fork execution on each branch dependent on symbolic values
Collect path constraints for each execution path
Get concrete input values from path conditions using SMT solver
int main(int ac , char* av []) {
int a = atoi(av [1]);
int b = atoi(av [2]);
int c = atoi(av [3]);
if (a > b)
a = a - b
if (b < 1)
c = a + b
b = 1;
return 0;
}
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
Bypassing License Checks via Symbolic Execution
1. Make license input symbolic
2. Indicate distinct statement executed when license key is correct
3. Explore paths until desired instruction (sequence) is found
4. Solve path constraints on paths that lead to desired instruction via
SMT solver
5. Find satisfiable path constraints → concrete inputs to bypass check
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 68
Are other Attacks Possible via Symbolic Execution?
Yes! Symbolic execution is a prerequisite for several attacks:
Simplify control-flow graph
Identify & disable tamper-proofing checks
Bypass authentication checks / trigger conditions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 69
Simplifying the CFG (Yadegari et al. [14])
1. Explore paths such that all code is covered
2. Simplify traces using compiler optimization tricks
3. Reconstruct CFG from traces
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 70
Identify & Disable Checks (Qiu et al. [13])
1. Taint code segment
2. Explore paths until enough self-checks
disabled (cyclic checks → explore all code)
3. Disable self-checking instructions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 71
A Common Sub-Problem of Deobfuscation Attacks
Common sub-problem: path exploration
How do we explore paths of a given program?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 72
A Common Sub-Problem of Deobfuscation Attacks
Common sub-problem: path exploration
How do we explore paths of a given program?
Generate test cases:
Black-box test generation: Fuzzing, Random testing
White-box test generation: Symbolic/Concolic execution
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 72
Measuring Obfuscation Strength
Strength of obfuscation: increase in test case generation time
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
Measuring Obfuscation Strength
Strength of obfuscation: increase in test case generation time
Observation: Generally, obfuscation does not change input-output
behavior
→ No increase in black-box test case generation time
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
Measuring Obfuscation Strength
Strength of obfuscation: increase in test case generation time
Observation: Generally, obfuscation does not change input-output
behavior
→ No increase in black-box test case generation time
Example:
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
Measuring Obfuscation Strength
Strength of obfuscation: increase in test case generation time
Observation: Generally, obfuscation does not change input-output
behavior
→ No increase in black-box test case generation time
Example:
Observation: Could be faster to use black-box test generator than
white-box
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
Measuring Obfuscation Strength
Strength of obfuscation: increase in test case generation time
Observation: Generally, obfuscation does not change input-output
behavior
→ No increase in black-box test case generation time
Example:
Observation: Could be faster to use black-box test generator than
white-box
Conclusion: Apply obfuscation transformations until white-box
slower than black-box test case generation
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
Code Tampering Attacks
Question: Why do we need code obfuscation? Just use
cryptographic hash
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
Code Tampering Attacks
Question: Why do we need code obfuscation? Just use
cryptographic hash
Example:
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
Code Tampering Attacks
Question: Why do we need code obfuscation? Just use
cryptographic hash
Example:
Hard for symbolic execution (SMT solver) to break crypto hash
functions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
Code Tampering Attacks
Question: Why do we need code obfuscation? Just use
cryptographic hash
Example:
Hard for symbolic execution (SMT solver) to break crypto hash
functions
Answer:
Test case generation is non-invasive attack, i.e. code is read, not
changed
Obfuscation aims to defend against MATE attacker (can tamper
with code)
Easy to find and patch-out crypto hash functions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
Metrics for Measuring Obfuscation Strength
Number of lines of code
McCabe cyclomatic complexity
Nesting complexity
Data flow complexity
Object oriented design metrics
Data structure complexity
...
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 75
Metrics for Measuring Obfuscation Strength
Number of lines of code
McCabe cyclomatic complexity
Nesting complexity
Data flow complexity
Object oriented design metrics
Data structure complexity
...
SAT metrics: Graph metrics on a
SAT formula represented as a graph
(x +y +z) · (!x+!y +z) · (x+!y+!z)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 75
SAT Instance Before & After Obfuscation
Figure : Before Obfuscation (7.5 sec)
1 unsigned int SDBMHash(char* str , unsigned int len)
2 {
3 unsigned int hash = 0;
4 unsigned int i = 0;
5 for(i = 0; i < len; str++, i++)
6 hash = (*str) + (hash << 6)
7 + (hash << 16) - hash;
8 return hash;
9 }
10
11 int main(int argc , char* argv []) {
12 unsigned char *str = argv [1];
13 unsigned int hash = SDBMHash(str , strlen(str));
14
15 if (hash == 0x89dcd66e)
16 printf("You win!n");
17 return 0;
18 }
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 76
SAT Instance Before & After Obfuscation
Figure : Before Obfuscation (7.5 sec) Figure : After Obfuscation (438 sec)
Strong obfuscation transformations destroy community structures
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 76
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 77
Hands-on Tutorial Overview
Example of bypassing password check in C programs
1. Start with plain unobfuscated program (break with static analysis)
2. Discuss the option of using hash functions (break with patching)
3. Add one obfuscation layer (break with symbolic execution)
4. Add more obfuscation layers (harder to break)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 78
Hands-on Tutorial Overview
Example of bypassing password check in C programs
1. Start with plain unobfuscated program (break with static analysis)
2. Discuss the option of using hash functions (break with patching)
3. Add one obfuscation layer (break with symbolic execution)
4. Add more obfuscation layers (harder to break)
Tools we are going to use:
We assume you have Docker installed and have basic user knowledge
Docker image based on Ubuntu contains: Tigress, KLEE, STP, Z3,
SatGraf, etc.
$ docker pull banescusebi/obfuscation-symex
$ docker run -it banescusebi/obfuscation-symex
For instructions on how to start the Docker image read description at
https://hub.docker.com/r/banescusebi/obfuscation-symex/
Instructions for running GUI apps available for Ubuntu and Mac OS
(not mandatory, only needed for SatGraf)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 78
Bypassing Password Check in Unobfuscated Program
Example C Program with Hard-Coded Password
1 #include <stdlib.h>
2 #include <stdio.h>
3 #include <string.h>
4
5 int main(int argc , char* argv []) {
6 if (strcmp(argv [1],
7 " my_license_key ") == 0)
8 printf("You win!n");
9 return 0;
10 }
Hard-coded password can be found via static analysis
$ ./nohash my license key
You win!
$ strings nohash | grep ’’license’’
my license key
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 79
Bypassing Password Check even when Using Hash
Example C Program where Hard-Coded Password Replaced with Hash
1 unsigned int BPHash(char* str , unsigned int len)
2 {
3 unsigned int hash = 0;
4 unsigned int i = 0;
5 for(i = 0; i < len; str++, i++) {
6 hash = hash << 7 ^ (*str);
7 }
8 return hash;
9 }
10
11 int main(int argc , char* argv []) {
12 unsigned char *str = argv [1];
13 unsigned int hash = BPHash(str , strlen(str));
14 if (hash == 0x5bfaf2f9)
15 printf("You win!n");
16 return 0;
17 }
Check can still be disabled using binary patching
$ strings bphash | grep ’’license’’
$ objdump -D bphash
Use vi in hex editor mode (:%!xxd) to patch check (jne)
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 80
Bypassing Password Check after Obfuscation
Add one layer of obfuscation:
$ tigress --Transform=EncodeArithmetic
--Functions=DEKHash --out=dekhash-obf/dekhash-encA.c
dekhash.c
$ tigress --Transform=Flatten --Functions=DEKHash
--out=dekhash-obf/dekhash-flat.c dekhash.c
$ tigress --Transform=Virtualize --Functions=DEKHash
--out=dekhash-obf/dekhash-virt.c dekhash.c
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 81
Bypassing Password Check after Obfuscation
Add one layer of obfuscation:
$ tigress --Transform=EncodeArithmetic
--Functions=DEKHash --out=dekhash-obf/dekhash-encA.c
dekhash.c
$ tigress --Transform=Flatten --Functions=DEKHash
--out=dekhash-obf/dekhash-flat.c dekhash.c
$ tigress --Transform=Virtualize --Functions=DEKHash
--out=dekhash-obf/dekhash-virt.c dekhash.c
Add another layer of obfuscation:
$ tigress --Transform=Flatten --Functions=DEKHash
--Transform=EncodeArithmetic --Functions=DEKHash
--out=dekhash-obf/dekhash-flat-encA.c dekhash.c
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 81
Bypassing Password Check after Obfuscation
Add one layer of obfuscation:
$ tigress --Transform=EncodeArithmetic
--Functions=DEKHash --out=dekhash-obf/dekhash-encA.c
dekhash.c
$ tigress --Transform=Flatten --Functions=DEKHash
--out=dekhash-obf/dekhash-flat.c dekhash.c
$ tigress --Transform=Virtualize --Functions=DEKHash
--out=dekhash-obf/dekhash-virt.c dekhash.c
Add another layer of obfuscation:
$ tigress --Transform=Flatten --Functions=DEKHash
--Transform=EncodeArithmetic --Functions=DEKHash
--out=dekhash-obf/dekhash-flat-encA.c dekhash.c
Apply symbolic execution to each of these programs (see KLEE
command on next slide)
$ ∼/scripts/step05-klee-symex-until-win.sh . exe 15
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 81
The KLEE Symbolic Execution Tool
1 klee --optimize
2 // Optimize before execution
3 --emit -all -errors
4 // Don’t stop on first error
5 --libc=uclibc
6 // Choose libc version
7 --posix -runtime
8 // Link with POSIX runtime
9 --only -output -states -covering -new
10 // Only tests covering new code
11 --max -time =3600
12 // Halt after given nr. of sec.
13 --write -smt2s
14 // Write smt2 file per test
15 --output -dir=klee -out -${file_name}
16 // Output directory for tests
17 ./${file_name }.bc
18 // Bitcode file under test
19 --sym -arg 5
20 // Length of symbolic arg.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 82
Viewing Results
See number of nanoseconds needed for KLEE to analyze a
bitcode program
$ cat klee-time-to-win.txt
Check differences between 1 and 2 layers of obfuscation
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 83
Viewing Results
See number of nanoseconds needed for KLEE to analyze a
bitcode program
$ cat klee-time-to-win.txt
Check differences between 1 and 2 layers of obfuscation
Convert the SMT2 instances generated by KLEE into CNF
instances
$ ∼/scripts/step07-convert-smt-to-cnf.sh .
obfuscated-cnf-instances.txt
$ ll ./obfuscated-cnf-instances
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 83
Viewing Results
See number of nanoseconds needed for KLEE to analyze a
bitcode program
$ cat klee-time-to-win.txt
Check differences between 1 and 2 layers of obfuscation
Convert the SMT2 instances generated by KLEE into CNF
instances
$ ∼/scripts/step07-convert-smt-to-cnf.sh .
obfuscated-cnf-instances.txt
$ ll ./obfuscated-cnf-instances
View CNF instances in SatGraf:
$ java -jar ∼/satgraf/dist/SatGraf.jar com -f
obfuscated-cnf-instances/dekhash-virt.cnf
Check differences between 1 and 2 layers of obfuscation
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 83
Outline
1 Introduction
2 Obfuscation in Theory
3 Obfuscation in Practice
Static Obfuscation
Compiler Optimizations
Automated Code Obfuscation
Software Diversity
Dynamic Obfuscation
4 The Strength of Obfuscation
5 Hands-on Tutorial
6 Conclusions
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 84
Conclusions
Theoretical obfuscation offers security guarantees, but is unpractical
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
Conclusions
Theoretical obfuscation offers security guarantees, but is unpractical
Practical obfuscation raises the bar against MATE attackers
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
Conclusions
Theoretical obfuscation offers security guarantees, but is unpractical
Practical obfuscation raises the bar against MATE attackers
Static obfuscation transformations:
Applied on source code, IR or machine code
Obfuscated program fixed (doesn’t change at runtime)
Makes static analysis harder
Many transformations available: compiler optimizations, scramble
identifiers, instruction substitution, merging & splitting functions,
opaque predicates & expressions, control flow flattening,
virtualization obfuscation, white-box cryptography, etc.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
Conclusions
Theoretical obfuscation offers security guarantees, but is unpractical
Practical obfuscation raises the bar against MATE attackers
Static obfuscation transformations:
Applied on source code, IR or machine code
Obfuscated program fixed (doesn’t change at runtime)
Makes static analysis harder
Many transformations available: compiler optimizations, scramble
identifiers, instruction substitution, merging & splitting functions,
opaque predicates & expressions, control flow flattening,
virtualization obfuscation, white-box cryptography, etc.
Software diversity improves potency and resilience against reverse
engineering attacks
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
Conclusions
Theoretical obfuscation offers security guarantees, but is unpractical
Practical obfuscation raises the bar against MATE attackers
Static obfuscation transformations:
Applied on source code, IR or machine code
Obfuscated program fixed (doesn’t change at runtime)
Makes static analysis harder
Many transformations available: compiler optimizations, scramble
identifiers, instruction substitution, merging & splitting functions,
opaque predicates & expressions, control flow flattening,
virtualization obfuscation, white-box cryptography, etc.
Software diversity improves potency and resilience against reverse
engineering attacks
Dynamic obfuscation transformations:
Program changes during execution
Makes dynamic analysis harder
Several transformations available: replacing instructions, dynamic
code merging, dynamic decryption and re-encryption.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
Conclusions
Theoretical obfuscation offers security guarantees, but is unpractical
Practical obfuscation raises the bar against MATE attackers
Static obfuscation transformations:
Applied on source code, IR or machine code
Obfuscated program fixed (doesn’t change at runtime)
Makes static analysis harder
Many transformations available: compiler optimizations, scramble
identifiers, instruction substitution, merging & splitting functions,
opaque predicates & expressions, control flow flattening,
virtualization obfuscation, white-box cryptography, etc.
Software diversity improves potency and resilience against reverse
engineering attacks
Dynamic obfuscation transformations:
Program changes during execution
Makes dynamic analysis harder
Several transformations available: replacing instructions, dynamic
code merging, dynamic decryption and re-encryption.
Multiple transformations should be combined to improve strength
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
Thank you for your attention
Questions ?
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 86
References I
S. Banescu, M. Ochoa, A. Pretschner, and N. Kunze.
Benchmarking indistinguishability obfuscation - a candidate
implementation.
In Proc. of 7th International Symposium on ESSoS, number 8978 in
LNCS, 2015.
Boaz Barak.
Hopes, fears, and software obfuscation.
Communications of the ACM, 59(3):88–96, 2016.
Boaz Barak, Oded Goldreich, Rusell Impagliazzo, Steven Rudich,
Amit Sahai, Salil Vadhan, and Ke Yang.
On the (im) possibility of obfuscating programs.
In Advances in Cryptology CRYPTO 2001, pages 1–18. Springer,
2001.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 87
References II
Christian Collberg and Jasvir Nagra.
Surreptitious Software: Obfuscation, Watermarking, and
Tamperproofing for Software Protection.
Addison-Wesley, 2009.
Christian Collberg, Clark Thomborson, and Douglas Low.
A taxonomy of obfuscating transformations.
Technical report, Department of Computer Science, The University
of Auckland, New Zealand, 1997.
Christian Collberg, Clark Thomborson, and Douglas Low.
Manufacturing cheap, resilient, and stealthy opaque constructs.
In Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on
Principles of programming languages, POPL ’98, pages 184–196,
New York, NY, USA, 1998. ACM.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 88
References III
M. Dalla Preda, M. Madou, K. De Bosschere, and R. Giacobazzi.
Opaque predicates detection by abstract interpretation.
In Michael Johnson and Varmo Vene, editors, Algebraic Methodology
and Software Technology, volume 4019 of Lecture Notes in
Computer Science, pages 81–95. Springer Berlin Heidelberg, 2006.
S. Garg, C. Gentry, S. Halevi, M. Raykova, A Sahai, and B. Waters.
Candidate indistinguishability obfuscation and functional encryption
for all circuits.
In Proc. of the 54th Annual Symp. on Foundations of Computer
Science, pages 40–49, 2013.
S. Goldwasser and G. N. Rothblum.
On best-possible obfuscation.
In Theory of Cryptography, pages 194–213. Springer, 2007.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 89
References IV
Yuichiro Kanzaki, Akito Monden, Masahide Nakamura, and Ken-ichi
Matsumoto.
Exploiting self-modification mechanism for program protection.
In Computer Software and Applications Conference, 2003.
COMPSAC 2003. Proceedings. 27th Annual International, pages
170–179, 2003.
J. Kinder.
Towards static analysis of virtualization-obfuscated binaries.
In 19th Working Conference on Reverse Engineering (WCRE), pages
61–70, Oct 2012.
Matias Madou, Bertrand Anckaert, Patrick Moseley, Saumya Debray,
Bjorn De Sutter, and Koen De Bosschere.
Software protection through dynamic code mutation.
In Information Security Applications, pages 194–206. Springer, 2006.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 90
References V
Jing Qiu, Babak Yadegari, Brian Johannesmeyer, Saumya Debray,
and Xiaohong Su.
Identifying and understanding self-checksumming defenses in
software.
In Proceedings of the 5th ACM Conference on Data and Application
Security and Privacy, pages 207–218. ACM, 2015.
Babak Yadegari, Brian Johannesmeyer, Ben Whitely, and Saumya
Debray.
A generic approach to automatic deobfuscation of executable code.
In Security and Privacy (SP), 2015 IEEE Symposium on, pages
674–691. IEEE, 2015.
4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 91

Contenu connexe

Tendances

Elliptic Curve Cryptography and Zero Knowledge Proof
Elliptic Curve Cryptography and Zero Knowledge ProofElliptic Curve Cryptography and Zero Knowledge Proof
Elliptic Curve Cryptography and Zero Knowledge ProofArunanand Ta
 
Debug Information And Where They Come From
Debug Information And Where They Come FromDebug Information And Where They Come From
Debug Information And Where They Come FromMin-Yih Hsu
 
Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Vladimir Kochetkov
 
逆向工程入門
逆向工程入門逆向工程入門
逆向工程入門耀德 蔡
 
Trace memory leak with gdb (GDB로 메모리 누수 찾기)
Trace memory  leak with gdb (GDB로 메모리 누수 찾기)Trace memory  leak with gdb (GDB로 메모리 누수 찾기)
Trace memory leak with gdb (GDB로 메모리 누수 찾기)Jun Hong Kim
 
A TLS Story
A TLS StoryA TLS Story
A TLS Storyereddick
 
A Brief History of Cryptography
A Brief History of CryptographyA Brief History of Cryptography
A Brief History of Cryptographyguest9006ab
 
Elgamal Şifreleme Algoritması
Elgamal Şifreleme AlgoritmasıElgamal Şifreleme Algoritması
Elgamal Şifreleme AlgoritmasıYaseminD1
 
[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practical[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practicalMoabi.com
 
Computer Security Lecture 2: Classical Encryption Techniques 1
Computer Security Lecture 2: Classical Encryption Techniques 1Computer Security Lecture 2: Classical Encryption Techniques 1
Computer Security Lecture 2: Classical Encryption Techniques 1Mohamed Loey
 
Elliptic Curve Cryptography
Elliptic Curve CryptographyElliptic Curve Cryptography
Elliptic Curve CryptographyKelly Bresnahan
 
Ch03 block-cipher-and-data-encryption-standard
Ch03 block-cipher-and-data-encryption-standardCh03 block-cipher-and-data-encryption-standard
Ch03 block-cipher-and-data-encryption-standardtarekiceiuk
 
CNIT 141: 9. Elliptic Curve Cryptosystems
CNIT 141: 9. Elliptic Curve CryptosystemsCNIT 141: 9. Elliptic Curve Cryptosystems
CNIT 141: 9. Elliptic Curve CryptosystemsSam Bowne
 
Node.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterNode.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterSimen Li
 
OpenCLに触れてみよう
OpenCLに触れてみようOpenCLに触れてみよう
OpenCLに触れてみようYou&I
 
Cryptography - RSA and ECDSA
Cryptography - RSA and ECDSACryptography - RSA and ECDSA
Cryptography - RSA and ECDSAAPNIC
 
BPF / XDP 8월 세미나 KossLab
BPF / XDP 8월 세미나 KossLabBPF / XDP 8월 세미나 KossLab
BPF / XDP 8월 세미나 KossLabTaeung Song
 
Cryptography and network security Nit701
Cryptography and network security Nit701Cryptography and network security Nit701
Cryptography and network security Nit701Amit Pathak
 
The rsa algorithm
The rsa algorithmThe rsa algorithm
The rsa algorithmKomal Singh
 

Tendances (20)

Elliptic Curve Cryptography and Zero Knowledge Proof
Elliptic Curve Cryptography and Zero Knowledge ProofElliptic Curve Cryptography and Zero Knowledge Proof
Elliptic Curve Cryptography and Zero Knowledge Proof
 
Debug Information And Where They Come From
Debug Information And Where They Come FromDebug Information And Where They Come From
Debug Information And Where They Come From
 
Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible!
 
逆向工程入門
逆向工程入門逆向工程入門
逆向工程入門
 
Trace memory leak with gdb (GDB로 메모리 누수 찾기)
Trace memory  leak with gdb (GDB로 메모리 누수 찾기)Trace memory  leak with gdb (GDB로 메모리 누수 찾기)
Trace memory leak with gdb (GDB로 메모리 누수 찾기)
 
A TLS Story
A TLS StoryA TLS Story
A TLS Story
 
A Brief History of Cryptography
A Brief History of CryptographyA Brief History of Cryptography
A Brief History of Cryptography
 
Elgamal Şifreleme Algoritması
Elgamal Şifreleme AlgoritmasıElgamal Şifreleme Algoritması
Elgamal Şifreleme Algoritması
 
[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practical[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practical
 
Computer Security Lecture 2: Classical Encryption Techniques 1
Computer Security Lecture 2: Classical Encryption Techniques 1Computer Security Lecture 2: Classical Encryption Techniques 1
Computer Security Lecture 2: Classical Encryption Techniques 1
 
Elliptic Curve Cryptography
Elliptic Curve CryptographyElliptic Curve Cryptography
Elliptic Curve Cryptography
 
Ch03 block-cipher-and-data-encryption-standard
Ch03 block-cipher-and-data-encryption-standardCh03 block-cipher-and-data-encryption-standard
Ch03 block-cipher-and-data-encryption-standard
 
CNIT 141: 9. Elliptic Curve Cryptosystems
CNIT 141: 9. Elliptic Curve CryptosystemsCNIT 141: 9. Elliptic Curve Cryptosystems
CNIT 141: 9. Elliptic Curve Cryptosystems
 
RSA
RSARSA
RSA
 
Node.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterNode.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitter
 
OpenCLに触れてみよう
OpenCLに触れてみようOpenCLに触れてみよう
OpenCLに触れてみよう
 
Cryptography - RSA and ECDSA
Cryptography - RSA and ECDSACryptography - RSA and ECDSA
Cryptography - RSA and ECDSA
 
BPF / XDP 8월 세미나 KossLab
BPF / XDP 8월 세미나 KossLabBPF / XDP 8월 세미나 KossLab
BPF / XDP 8월 세미나 KossLab
 
Cryptography and network security Nit701
Cryptography and network security Nit701Cryptography and network security Nit701
Cryptography and network security Nit701
 
The rsa algorithm
The rsa algorithmThe rsa algorithm
The rsa algorithm
 

Similaire à Breaking Obfuscated Programs with Symbolic Execution

Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022Marcus Botacin
 
A study of detecting computer viruses in real infected files in the n-gram re...
A study of detecting computer viruses in real infected files in the n-gram re...A study of detecting computer viruses in real infected files in the n-gram re...
A study of detecting computer viruses in real infected files in the n-gram re...UltraUploader
 
How do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideHow do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideMarcus Botacin
 
Distributed Cache, bridging C++ to new technologies (Hazelcast)
Distributed Cache, bridging C++ to new technologies (Hazelcast)Distributed Cache, bridging C++ to new technologies (Hazelcast)
Distributed Cache, bridging C++ to new technologies (Hazelcast)Ovidiu Farauanu
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksNguyen Quang
 
PyConPL 2017 - with python: security
PyConPL 2017 - with python: securityPyConPL 2017 - with python: security
PyConPL 2017 - with python: securityPiotr Dyba
 
Research Careers in Applied Computer Science
Research Careers in Applied Computer ScienceResearch Careers in Applied Computer Science
Research Careers in Applied Computer ScienceChristoph Lange
 
Lessons learned from useR! 2015
Lessons learned from useR! 2015Lessons learned from useR! 2015
Lessons learned from useR! 2015salankia
 
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...Akram El-Korashy
 
Striving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational ModellingStriving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational ModellingMarco Wirthlin
 
Deep learning: challenges and applications
Deep learning: challenges and  applicationsDeep learning: challenges and  applications
Deep learning: challenges and applicationsAboul Ella Hassanien
 
Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...Roberto Di Cosmo
 
ScilabTEC 2015 - Irill
ScilabTEC 2015 - IrillScilabTEC 2015 - Irill
ScilabTEC 2015 - IrillScilab
 
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...Paris Open Source Summit
 
Ontologies Ontop Databases
Ontologies Ontop DatabasesOntologies Ontop Databases
Ontologies Ontop DatabasesMartín Rezk
 
Programming Actor-based Collective Adaptive Systems
Programming Actor-based Collective Adaptive SystemsProgramming Actor-based Collective Adaptive Systems
Programming Actor-based Collective Adaptive SystemsRoberto Casadei
 
Artificial Level Language and JAVS-tranScripter working
Artificial Level Language and JAVS-tranScripter workingArtificial Level Language and JAVS-tranScripter working
Artificial Level Language and JAVS-tranScripter workingVishnu Suresh
 
Open & reproducible research - What can we do in practice?
Open & reproducible research - What can we do in practice?Open & reproducible research - What can we do in practice?
Open & reproducible research - What can we do in practice?Felix Z. Hoffmann
 
Application of bpcs steganography to wavelet compressed video (synopsis)
Application of bpcs steganography to wavelet compressed video (synopsis)Application of bpcs steganography to wavelet compressed video (synopsis)
Application of bpcs steganography to wavelet compressed video (synopsis)Mumbai Academisc
 
Universal programmability how ai can help
Universal programmability how ai can helpUniversal programmability how ai can help
Universal programmability how ai can helpCS, NcState
 

Similaire à Breaking Obfuscated Programs with Symbolic Execution (20)

Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
 
A study of detecting computer viruses in real infected files in the n-gram re...
A study of detecting computer viruses in real infected files in the n-gram re...A study of detecting computer viruses in real infected files in the n-gram re...
A study of detecting computer viruses in real infected files in the n-gram re...
 
How do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guideHow do we detect malware? A step-by-step guide
How do we detect malware? A step-by-step guide
 
Distributed Cache, bridging C++ to new technologies (Hazelcast)
Distributed Cache, bridging C++ to new technologies (Hazelcast)Distributed Cache, bridging C++ to new technologies (Hazelcast)
Distributed Cache, bridging C++ to new technologies (Hazelcast)
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural Networks
 
PyConPL 2017 - with python: security
PyConPL 2017 - with python: securityPyConPL 2017 - with python: security
PyConPL 2017 - with python: security
 
Research Careers in Applied Computer Science
Research Careers in Applied Computer ScienceResearch Careers in Applied Computer Science
Research Careers in Applied Computer Science
 
Lessons learned from useR! 2015
Lessons learned from useR! 2015Lessons learned from useR! 2015
Lessons learned from useR! 2015
 
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Tu...
 
Striving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational ModellingStriving to Demystify Bayesian Computational Modelling
Striving to Demystify Bayesian Computational Modelling
 
Deep learning: challenges and applications
Deep learning: challenges and  applicationsDeep learning: challenges and  applications
Deep learning: challenges and applications
 
Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...Software Preservation: challenges and opportunities for reproductibility (Sci...
Software Preservation: challenges and opportunities for reproductibility (Sci...
 
ScilabTEC 2015 - Irill
ScilabTEC 2015 - IrillScilabTEC 2015 - Irill
ScilabTEC 2015 - Irill
 
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
#OSSPARIS17 - Logiciel libre pour une science reproductible, par ROBERTO DI C...
 
Ontologies Ontop Databases
Ontologies Ontop DatabasesOntologies Ontop Databases
Ontologies Ontop Databases
 
Programming Actor-based Collective Adaptive Systems
Programming Actor-based Collective Adaptive SystemsProgramming Actor-based Collective Adaptive Systems
Programming Actor-based Collective Adaptive Systems
 
Artificial Level Language and JAVS-tranScripter working
Artificial Level Language and JAVS-tranScripter workingArtificial Level Language and JAVS-tranScripter working
Artificial Level Language and JAVS-tranScripter working
 
Open & reproducible research - What can we do in practice?
Open & reproducible research - What can we do in practice?Open & reproducible research - What can we do in practice?
Open & reproducible research - What can we do in practice?
 
Application of bpcs steganography to wavelet compressed video (synopsis)
Application of bpcs steganography to wavelet compressed video (synopsis)Application of bpcs steganography to wavelet compressed video (synopsis)
Application of bpcs steganography to wavelet compressed video (synopsis)
 
Universal programmability how ai can help
Universal programmability how ai can helpUniversal programmability how ai can help
Universal programmability how ai can help
 

Dernier

5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...kalichargn70th171
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...software pro Development
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 

Dernier (20)

5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 

Breaking Obfuscated Programs with Symbolic Execution

  • 1. TechnischeUniversitätMünchen Fakultät für Informatik SSPREW 2017 Training Breaking Obfuscated Programs with Symbolic Execution Sebastian Banescu Technical University of Munich, Germany Chair of Software Engineering Prof. Alexander Pretschner
  • 2. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 2
  • 3. Assumptions and Tools for Hands-on Tutorial We assume you have Docker installed and have basic user knowledge Docker installation instructions https://docs.docker.com/engine/installation/ Docker image based on Ubuntu contains: Tigress, KLEE, STP, Z3, SatGraf, etc. $ docker pull banescusebi/obfuscation-symex $ docker run -it banescusebi/obfuscation-symex For instructions on how to start the Docker image read description at https: //hub.docker.com/r/banescusebi/obfuscation-symex/ Instructions for running GUI apps available for Ubuntu and Mac OS (not mandatory, only needed for SatGraf) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 3
  • 4. Expected Outcome After this training you will have a better understanding of the following: The theory and practice of obfuscation and software diversity Practical static & dynamic obfuscation transformations Tools for applying symbolic execution on obfuscated programs Which obfuscation transformations help against symbolic execution 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 4
  • 5. Attackers in Computer Security 1. Man-in-the-middle (MITM) attacks communication channels 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
  • 6. Attackers in Computer Security 1. Man-in-the-middle (MITM) attacks communication channels 2. Remote attacker exploits vulnerabilities (e.g. buffer overflows) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
  • 7. Attackers in Computer Security 1. Man-in-the-middle (MITM) attacks communication channels 2. Remote attacker exploits vulnerabilities (e.g. buffer overflows) 3. Man-at-the-end (MATE) reverse engineers software 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
  • 8. Attackers in Computer Security 1. Man-in-the-middle (MITM) attacks communication channels 2. Remote attacker exploits vulnerabilities (e.g. buffer overflows) 3. Man-at-the-end (MATE) reverse engineers software We focus on MATE during this tutorial. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 5
  • 9. Introduction Informal Definition of Obfuscation: To obfuscate a program P means to transform it into a executable program P from which it is harder to extract information than from P. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 6
  • 10. Introduction Informal Definition of Obfuscation: To obfuscate a program P means to transform it into a executable program P from which it is harder to extract information than from P. Motivation: Obfuscation is last layer of software defense against attackers (e.g. after attacker bypasses OS authentication) Obfuscation raises the bar for reverse engineering 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 6
  • 11. Introduction Informal Definition of Obfuscation: To obfuscate a program P means to transform it into a executable program P from which it is harder to extract information than from P. Motivation: Obfuscation is last layer of software defense against attackers (e.g. after attacker bypasses OS authentication) Obfuscation raises the bar for reverse engineering Popular questions: Wasn’t obfuscation proved to be impossible back in 2001? Isn’t obfuscation the same as security by obscurity? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 6
  • 12. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 7
  • 13. Black Box Obfuscation Formal Definition of Black-Box Obfuscation: A probabilistic algorithm O is an obfuscator if the following conditions hold: For every program P, the obfuscated program O(P) has the same functionality (e.g. input-output behavior) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 8
  • 14. Black Box Obfuscation Formal Definition of Black-Box Obfuscation: A probabilistic algorithm O is an obfuscator if the following conditions hold: For every program P, the obfuscated program O(P) has the same functionality (e.g. input-output behavior) The memory size increase and execution slowdown of O(P) w.r.t. P are less than polynomial 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 8
  • 15. Black Box Obfuscation Formal Definition of Black-Box Obfuscation: A probabilistic algorithm O is an obfuscator if the following conditions hold: For every program P, the obfuscated program O(P) has the same functionality (e.g. input-output behavior) The memory size increase and execution slowdown of O(P) w.r.t. P are less than polynomial Any probabilistic polynomial time attacker only has a negligible probability of guessing any bit of information about P, given O(P) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 8
  • 16. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 17. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs Proof performed by giving a counter-example using 2 programs: Cα,β(x) = β if x = α 0k otherwise Dα,β(C) = 1 if C(α) = β 0 otherwise 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 18. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs Proof performed by giving a counter-example using 2 programs: Cα,β(x) = β if x = α 0k otherwise Dα,β(C) = 1 if C(α) = β 0 otherwise Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the attacker is successful 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 19. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs Proof performed by giving a counter-example using 2 programs: Cα,β(x) = β if x = α 0k otherwise Dα,β(C) = 1 if C(α) = β 0 otherwise Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the attacker is successful If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1 with 100% probability 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 20. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs Proof performed by giving a counter-example using 2 programs: Cα,β(x) = β if x = α 0k otherwise Dα,β(C) = 1 if C(α) = β 0 otherwise Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the attacker is successful If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1 with 100% probability If the attacker only has black box access to C and D, then the probability of a successful attack is negligible 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 21. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs Proof performed by giving a counter-example using 2 programs: Cα,β(x) = β if x = α 0k otherwise Dα,β(C) = 1 if C(α) = β 0 otherwise Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the attacker is successful If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1 with 100% probability If the attacker only has black box access to C and D, then the probability of a successful attack is negligible This proof does not imply that every obfuscator fails on a subset of programs 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 22. Black Box Obfuscation In 2001 Barak et al. [3]: Proved there exists no general obfuscator applicable to all programs Proof performed by giving a counter-example using 2 programs: Cα,β(x) = β if x = α 0k otherwise Dα,β(C) = 1 if C(α) = β 0 otherwise Attacker is defined as A(C, D) = D(C). If A(C, D) = 1 then the attacker is successful If the attacker is given O(C) and O(D) then A(O(C), O(D)) = 1 with 100% probability If the attacker only has black box access to C and D, then the probability of a successful attack is negligible This proof does not imply that every obfuscator fails on a subset of programs There may exist non-black-box obfuscators for some programs that leak bits of information, but are “good enough” 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 9
  • 23. Indistinguishability Obfuscation In 2013 Garg et al. [8] proposed indistinguishability obfuscation: The obfuscated versions of 2 semantically equivalent programs cannot be distinguished 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
  • 24. Indistinguishability Obfuscation In 2013 Garg et al. [8] proposed indistinguishability obfuscation: The obfuscated versions of 2 semantically equivalent programs cannot be distinguished [9] proved this obfuscation is the best-possible obfuscation 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
  • 25. Indistinguishability Obfuscation In 2013 Garg et al. [8] proposed indistinguishability obfuscation: The obfuscated versions of 2 semantically equivalent programs cannot be distinguished [9] proved this obfuscation is the best-possible obfuscation Implementations still far from being practical: [1, 2] 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
  • 26. Indistinguishability Obfuscation In 2013 Garg et al. [8] proposed indistinguishability obfuscation: The obfuscated versions of 2 semantically equivalent programs cannot be distinguished [9] proved this obfuscation is the best-possible obfuscation Implementations still far from being practical: [1, 2] Cool stuff, but let’s look at obfuscation we can use in practice. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 10
  • 27. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 11
  • 28. Categorization of Obfuscation What types of obfuscation transformations exist? Static Vs. Dynamic Static obfuscation: Obfuscated programs remain fixed at runtime Raises the bar against static analysis Can be attacked through dynamic techniques (debugging, emulation, tracing) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
  • 29. Categorization of Obfuscation What types of obfuscation transformations exist? Static Vs. Dynamic Static obfuscation: Obfuscated programs remain fixed at runtime Raises the bar against static analysis Can be attacked through dynamic techniques (debugging, emulation, tracing) Dynamic obfuscation: Programs change during load-/run-time Raises the bar against dynamic analysis Different executions of the same program with the same input may produce different execution traces 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
  • 30. Categorization of Obfuscation What types of obfuscation transformations exist? Static Vs. Dynamic Static obfuscation: Obfuscated programs remain fixed at runtime Raises the bar against static analysis Can be attacked through dynamic techniques (debugging, emulation, tracing) Dynamic obfuscation: Programs change during load-/run-time Raises the bar against dynamic analysis Different executions of the same program with the same input may produce different execution traces Point of insertion: Source code Intermediate representation Machine code 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
  • 31. Categorization of Obfuscation What types of obfuscation transformations exist? Static Vs. Dynamic Static obfuscation: Obfuscated programs remain fixed at runtime Raises the bar against static analysis Can be attacked through dynamic techniques (debugging, emulation, tracing) Dynamic obfuscation: Programs change during load-/run-time Raises the bar against dynamic analysis Different executions of the same program with the same input may produce different execution traces Point of insertion: Source code Intermediate representation Machine code Transformation targets: Layout → scramble identifiers and code layout Data → obfuscate data (structures) embedded in code Control flow → obfuscate secret algorithms 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 12
  • 32. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 13
  • 33. Compiler Optimizations 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 14
  • 34. Compiler Optimizations Compiler Optimizations In-lining function bodies Loop unrolling Loop-invariant code motion Common sub-expression elimination Constant folding and propagation Dead code elimination Strength reduction more @ http://en.wikipedia.org/wiki/Optimizing_compiler 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 14
  • 35. In-lining function bodies Replace function call by function body Before 1 int foo(int a, int b) { 2 return a + b; 3 } 4 ... 5 c = foo(a, b+1); After 1 ... 2 c = a + b + 1; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 15
  • 36. Loop unrolling Remove “end-of-loop” test overhead Before 1 ... 2 for (i = 0; i < 2; i++) 3 { 4 a[i] = 0; 5 } After 1 ... 2 a[0] = 0; 3 a[1] = 0; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 16
  • 37. Loop invariant code-motion Extract operations whose results are independent of loop execution Before 1 ... 2 for (i = 0; i < 2; i++) 3 { 4 a[i] = p + q; 5 } After 1 ... 2 temp = p + q; 3 for (i = 0; i < 2; i++) 4 { 5 a[i] = temp; 6 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 17
  • 38. Common subexpression elimination Replace re-occurring identical (sub-)expressions by a single variable holding the result Before 1 ... 2 a = b + (z + 1); 3 p = q + (z + 1); After 1 ... 2 temp = z + 1; 3 a = b + temp; 4 p = q + temp; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 18
  • 39. Constant folding and propagation Simplify constant expressions and substitute the values of known constants in expressions Before 1 ... 2 a = 3 + 5; 3 b = a + 1; 4 func(a, b); After 1 ... 2 func(8, 9); 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 19
  • 40. Dead code elimination Remove code which does not affect program results: unreachable code and code that affects variables that are irrelevant for the program Before 1 ... 2 a = 1; 3 if (a < 0) 4 { 5 printf("This should never be printed!"); 6 } After 1 ... 2 a = 1; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 20
  • 41. Strength reduction Replace expensive operations with equivalent cheap ones Before 1 ... 2 y = x / 8; 3 p = q * 15; After 1 ... 2 y = x >> 3; 3 p = (q << 4) - x; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 21
  • 42. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 22
  • 43. Automated Code Obfuscation 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 23
  • 44. Automated Code Obfuscation Obfuscation Techniques: Scramble identifiers Instruction substitution Garbage code insertion Merging and splitting functions Encode Literals Encode Arithmetic Opaque predicates Control-flow flattening Virtualization obfuscation White-box cryptography 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 23
  • 45. Scramble identifiers Replace identifier names with random strings Before 1 ... 2 sum = 0; 3 for (i = 0; i < arr_len; i++) 4 sum += arr[i]; 5 average /= arr_len; After 1 ... 2 za82b547bcb = 0; 3 for (z1c0ab7cf0c = 0; z1c0ab7cf0c < za862d19cbc; z1c0ab7cf0c ++) 4 za82b547bcb += zc1c28ca67f[ z1c0ab7cf0c ]; 5 z8c8f7c7867 /= za862d19cbc This layout obfuscation has high potency, but low resilience 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 24
  • 46. Instruction substitution Replace binary operation (e.g. +, -, AND, OR, XOR, etc.) by functionally equivalent, but more complicated computations Before 1 ... 2 a = b + c After 1 ... 2 r = rand (); 3 a = b + r; 4 a = a + c; 5 a = a - r; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 25
  • 47. Garbage code insertion Insert code that executes, but does not affect the IO behavior Before 1 ... 2 sum = 0; 3 for (i = 0; i < arr_len; i++) 4 sum += arr[i]; 5 average = sum / arr_len; After 1 ... 2 sum = 0; prod = 1; 3 for (i = 0; i < arr_len; i++) { 4 sum += arr[i]; 5 prod *= arr[i]; 6 } 7 average = sqrt(prod); 8 average = sum / arr_len; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 26
  • 48. Merging and splitting functions Merging implies combining the code of two or more functions into a single function Splitting implies dividing the code one function into two or more functions Split 1 func1(int a, int b) { 2 x = 4; 3 if (a < 3) 4 x = x + 6; 5 x *= b; 6 } 7 8 func2(int a, int c) { 9 y = a + 12; 10 y = y/c; 11 } Merged 1 func3(int a, int b, int c) { 2 if (c % 2 == 0) { 3 x = 4; 4 if (a < 3) 5 x = x + 6; 6 x *= b; 7 } else { 8 y = a + 12; 9 y = y/b; 10 } 11 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 27
  • 49. Encode Literals Literals are constant (hard-coded) strings and numeric values Literals can be encoded in numerous ways using encoder functions At runtime they are decoded back to their original value Before 1 main(int ac , char* av[]) { 2 printf("hellon"); 3 return 0; 4 } After 1 gen_str(char str []) { 2 int i = 0; 3 str[i++] = ’h’; 4 str[i++] = ’e’; 5 str[i++] = ’l’; 6 str[i++] = ’l’; 7 str[i++] = ’o’; 8 str[i++] = ’n’; 9 str[i] = ’000 ’; 10 } 11 12 main(int ac , char* av[]) { 13 char str [7]; 14 gen_str(str); 15 printf(str); 16 return 0; 17 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 28
  • 50. Encode Arithmetic (Mixed-Boolean Arithmetic) Replace arithmetic or Boolean expressions with more complex ones Complex expressions contain both arithmetic and boolean operators Transformation can be applied recursively to increase strength Before 1 func(int x, int y) { 2 return x + y; 3 } After (Version 1) 1 func(int x, int y) { 2 return 2*(x | y) - (x ^ y); 3 } After (Version 2) 1 func(int x, int y) { 2 return (x | y) + (x & y); 3 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 29
  • 51. Opaque predicates and opaque expressions [6] Informal Definition: An expression whose value is known to the defender (at obfuscation time), but which is difficult for an attacker to figure out statically. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 30
  • 52. Opaque predicates and opaque expressions [6] Informal Definition: An expression whose value is known to the defender (at obfuscation time), but which is difficult for an attacker to figure out statically. Notation: PT for an opaquely true predicate PF for an opaquely false predicate P? for an opaquely intermediate predicate (range divider) E=v for an opaque expression of value v 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 30
  • 53. Opaque predicates and opaque expressions [6] Informal Definition: An expression whose value is known to the defender (at obfuscation time), but which is difficult for an attacker to figure out statically. Notation: PT for an opaquely true predicate PF for an opaquely false predicate P? for an opaquely intermediate predicate (range divider) E=v for an opaque expression of value v true false true false true false P? PT PF true false true false true false2|(x2 + x) x mod2=0x2 < 0 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 30
  • 54. Bogus control-flow via opaque predicates Opaque predicates facilitate insertion of bogus control-flow: Does not affect I/O behavior Attacker does not know which code is bogus Increases attack / analysis time Resilience of bogus control-flow reduced to resilience of opaque predicates true false true false true false P? PT PF true false true false true false2|(x2 + x) x mod2=0x2 < 0 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 31
  • 55. Bogus control-flow via opaque predicates E.g. add an opaquely true predicate (PT ) to a while loop condition (P) Before 1 i = 1; 2 while (i < 100) { 3 dostuff(i); 4 i++; 5 } After 1 i = 1; j = 100; 2 while ((i < 100) && 3 (j*j*(j+1) *(j+1)%4 == 0)) { 4 dostuff(i); 5 i++; 6 j = j*i+3; 7 } true false false false truetrue P PTP 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 32
  • 56. Breaking opaque predicates via abstract interp. Dalla Preda et al. [7] used abstract interpretation to break opaque predicates: Opaque predicates are confined in a single basic block Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where p(x) is a polynomial in x and n ∈ N 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
  • 57. Breaking opaque predicates via abstract interp. Dalla Preda et al. [7] used abstract interpretation to break opaque predicates: Opaque predicates are confined in a single basic block Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where p(x) is a polynomial in x and n ∈ N E.g. opaquely true predicate: 2|(x2 + x), ∀x ∈ Z 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
  • 58. Breaking opaque predicates via abstract interp. Dalla Preda et al. [7] used abstract interpretation to break opaque predicates: Opaque predicates are confined in a single basic block Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where p(x) is a polynomial in x and n ∈ N E.g. opaquely true predicate: 2|(x2 + x), ∀x ∈ Z x is odd 1 x = ...; // any odd number 2 y = x * x; // odd 3 y = y + x; // even 4 if ( y % 2 == 0) // always 5 ... // true 6 else // dead branch 7 ... 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
  • 59. Breaking opaque predicates via abstract interp. Dalla Preda et al. [7] used abstract interpretation to break opaque predicates: Opaque predicates are confined in a single basic block Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where p(x) is a polynomial in x and n ∈ N E.g. opaquely true predicate: 2|(x2 + x), ∀x ∈ Z x is odd 1 x = ...; // any odd number 2 y = x * x; // odd 3 y = y + x; // even 4 if ( y % 2 == 0) // always 5 ... // true 6 else // dead branch 7 ... x is even 1 x = ...; // any even number 2 y = x * x; // even 3 y = y + x; // even 4 if ( y % 2 == 0) // always 5 ... // true 6 else // dead branch 7 ... 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
  • 60. Breaking opaque predicates via abstract interp. Dalla Preda et al. [7] used abstract interpretation to break opaque predicates: Opaque predicates are confined in a single basic block Only opaque predicate of the following form: n|p(x), ∀x ∈ Z, where p(x) is a polynomial in x and n ∈ N E.g. opaquely true predicate: 2|(x2 + x), ∀x ∈ Z x is odd 1 x = ...; // any odd number 2 y = x * x; // odd 3 y = y + x; // even 4 if ( y % 2 == 0) // always 5 ... // true 6 else // dead branch 7 ... x is even 1 x = ...; // any even number 2 y = x * x; // even 3 y = y + x; // even 4 if ( y % 2 == 0) // always 5 ... // true 6 else // dead branch 7 ... Abstract interpretation is able to infer that regardless of x’s value, the IF condition is always true 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 33
  • 61. Opaquely intermediate predicate (Range divider) Range dividers can lead to different paths in the code No dead code All branches have the same behavior but different syntax Before 1 int main(int ac , char* av []){ 2 char *str = av [1]; 3 int hash = 0; 4 for(int i = 0; 5 i < strlen(str); 6 str++, i++) { 7 hash = (hash <<7) ^(* str); 8 } 9 if (hash == 809267) 10 printf("You win!"); 11 return 0; 12 } After 1 int main(int ac , char* av []){ 2 char *str = av [1]; 3 int hash = 0; 4 for(int i = 0; 5 i < strlen(str); 6 str++, i++) { 7 char chr = *str; 8 if (chr > 42) { 9 hash = (hash << 7) ^ chr; 10 } else { 11 hash = (hash * 128) ^ chr; 12 } 13 } 14 if (hash == 809267) 15 printf("You win!"); 16 return 0; 17 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 34
  • 62. Control-flow flattening Remove the control-flow structure of functions: 1. Put each basic block as a case inside a switch statement 2. Wrap the switch inside an infinite loop 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 35
  • 63. Control-flow flattening Remove the control-flow structure of functions: 1. Put each basic block as a case inside a switch statement 2. Wrap the switch inside an infinite loop Let’s take one function, e.g. GCD 1 int gcd(int a, int b){ 2 while (a != b) 3 if (a > b) 4 a = a - b; 5 else 6 b = b - a; 7 return a; 8 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 35
  • 64. Control-flow flattening GCD example Before 1 int gcd(int a, int b){ 2 while (a != b) 3 if (a > b) 4 a = a - b; 5 else 6 b = b - a; 7 return a; 8 } After 1 int gcd(int a, int b){ 2 int next = 0; 3 while (1) { 4 switch(next) { 5 case 0: 6 next = (int)(a != b) + 1; 7 break; 8 case 1: return a; 9 break; 10 case 2: 11 next = (a > b) + 3; 12 break; 13 case 3: b = b - a; 14 next = 0; 15 break; 16 case 4: a = a - b; 17 next = 0; 18 break; 19 default: 20 break; 21 }}} 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 36
  • 65. Control-flow flattening GCD example The CFG of the resulting code: 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 37
  • 66. Control-flow flattening discussion Performance penalty: For 3 SPEC programs: 4× slowdown, 2× size Reasons: The wrapper loop condition check, plus jump The switch next value check, plus indirect jump How to optimize: Keep tight loops as one switch entry (don’t split) Use gcc’s labels-as-values → allows jumping to next basic block 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 38
  • 67. Control-flow flattening discussion Performance penalty: For 3 SPEC programs: 4× slowdown, 2× size Reasons: The wrapper loop condition check, plus jump The switch next value check, plus indirect jump How to optimize: Keep tight loops as one switch entry (don’t split) Use gcc’s labels-as-values → allows jumping to next basic block Attack on control-flow flattening: 1. Find next block of every basic block 2. Rebuild original CFG 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 38
  • 68. Control-flow flattening discussion Performance penalty: For 3 SPEC programs: 4× slowdown, 2× size Reasons: The wrapper loop condition check, plus jump The switch next value check, plus indirect jump How to optimize: Keep tight loops as one switch entry (don’t split) Use gcc’s labels-as-values → allows jumping to next basic block Attack on control-flow flattening: 1. Find next block of every basic block 2. Rebuild original CFG Mitigation: assign opaque expressions (E=v ) to next Question: How do we build such opaque expressions? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 38
  • 69. Opaque expressions from array aliasing 1. A statically initialized array with seemingly random values: g = 10 5 13 3 27 5 24 38 0 73 115 3 66 60 17 31 2. The values are generated such that some invariants hold, e.g.: Every 3rd cell starting from cell 0, contains a value v ≡ 3 mod 7 Every 3rd cell starting from cell 1, contains a value v ≡ 5 mod 11 Cells 2, 5, 8, 11 and 14 contain values 13, 5, 0, 3, respectively 17 3. Update array cells with values that respect invariants 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 39
  • 70. Opaque expressions from array aliasing 1. A statically initialized array with seemingly random values: g = 10 5 13 3 27 5 24 38 0 73 115 3 66 60 17 31 2. The values are generated such that some invariants hold, e.g.: Every 3rd cell starting from cell 0, contains a value v ≡ 3 mod 7 Every 3rd cell starting from cell 1, contains a value v ≡ 5 mod 11 Cells 2, 5, 8, 11 and 14 contain values 13, 5, 0, 3, respectively 17 3. Update array cells with values that respect invariants Let’s replace right-hand values 0, 1, 2, 3 and 4 of assignments to next with E=0 , E=1 , E=2 , E=3 , respectively E=4 : next = 0 → next = g[3] % g[11] - g[8] next = 1 → next = 3 * g[11] - 4 * (g[4] % g[5]) next = 2 → next = g[5] - g[3] next = 3 → next = g[2] % g[1] next = 4 → next = g[15] - g[4] Next slide shows how this looks in the flattened GCD example 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 39
  • 71. Control-flow flattening + opaque expressions 1 int gcd(int a, int b){ 2 int g[] = {10, 5, 13, 3, 27, 5, 24, 38, 0, 73, 115, 3, 66, 60, 17, 31}; 3 int next = g[3] % g[11] - g[11]; // 0 4 while (1) { 5 switch(next) { 6 case 0: 7 if(a != b) 8 next = 3 * g[11] - 4 * (g[4] % g[5]); // 1 9 else next = g[5] - g[3]; // 2 10 break; 11 case 1: 12 if (a > b) next = g[2] % g[1]; // 3 13 else next = g[15] - g[4]; // 4 14 break; 15 case 2: return a; 16 break; 17 case 3: a = a - b; 18 next = g[3] % g[11] - g[8]; // 0 19 break; 20 case 4: b = b - a; 21 next = g[3] % g[11] - g[8]; // 0 22 break; 23 default: 24 break; 25 }}} 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 40
  • 72. Opaque expressions from pointer aliasing Assumption: pointer aliasing is a computationally hard static analysis problem 1. Construct one or more linked-lists 2. Set pointers into those linked-lists 3. Create opaque predicates by checking properties you know to be true / false, e.g.: → q1 points to a node with value v > 53 → q2 points to a node with value v ≤ 53 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 41
  • 73. Opaque expressions from pointer aliasing Assumption: pointer aliasing is a computationally hard static analysis problem 1. Construct one or more linked-lists 2. Set pointers into those linked-lists 3. Create opaque predicates by checking properties you know to be true / false, e.g.: → q1 points to a node with value v > 53 → q2 points to a node with value v ≤ 53 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 41
  • 74. Opaque expressions from pointer aliasing One opaquely true predicate: (∗q1 > ∗q2)T After performing several operations to confuse alias analysis another opaque predicate is: (q1 = q2)T 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 42
  • 75. Virtualization Obfuscation Goal: Hide secret algorithm of program P 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
  • 76. Virtualization Obfuscation Goal: Hide secret algorithm of program P Obfuscation Procedure: 1. Generate random bytecode ISA (L) covering all instructions of P 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
  • 77. Virtualization Obfuscation Goal: Hide secret algorithm of program P Obfuscation Procedure: 1. Generate random bytecode ISA (L) covering all instructions of P 2. Translate P to L bytecode program 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
  • 78. Virtualization Obfuscation Goal: Hide secret algorithm of program P Obfuscation Procedure: 1. Generate random bytecode ISA (L) covering all instructions of P 2. Translate P to L bytecode program 3. Generate emulator to interpret L bytecode on x86 machine 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
  • 79. Virtualization Obfuscation Goal: Hide secret algorithm of program P Obfuscation Procedure: 1. Generate random bytecode ISA (L) covering all instructions of P 2. Translate P to L bytecode program 3. Generate emulator to interpret L bytecode on x86 machine Obfuscated program (P ) consists of bytecode program and emulator 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 43
  • 80. Virtualization Obfuscation Example [11] We apply virtualization obfuscation to function foo, written in C 1 void foo(int x){ 2 int y = 10; 3 y++; 4 y++; 5 if (x > 0){ 6 y++; 7 } 8 else {} 9 printf("%dn",y); 10 } Figure : CFG of foo 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 44
  • 81. Virtualization Obfuscation Example Step 1: Generate random bytecode ISA covering all instructions of foo 1 void foo(int x){ 2 int y = 10; 3 y++; 4 y++; 5 if (x > 0){ 6 y++; 7 } 8 else {} 9 printf("%dn",y); 10 } Random ISA: 1. Integer assignment (line 2) encode −−−−→ 52, LH op, RH op 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
  • 82. Virtualization Obfuscation Example Step 1: Generate random bytecode ISA covering all instructions of foo 1 void foo(int x){ 2 int y = 10; 3 y++; 4 y++; 5 if (x > 0){ 6 y++; 7 } 8 else {} 9 printf("%dn",y); 10 } Random ISA: 1. Integer assignment (line 2) encode −−−−→ 52, LH op, RH op 2. Integer increment (lines 3,4,6) encode −−−−→ 03, op 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
  • 83. Virtualization Obfuscation Example Step 1: Generate random bytecode ISA covering all instructions of foo 1 void foo(int x){ 2 int y = 10; 3 y++; 4 y++; 5 if (x > 0){ 6 y++; 7 } 8 else {} 9 printf("%dn",y); 10 } Random ISA: 1. Integer assignment (line 2) encode −−−−→ 52, LH op, RH op 2. Integer increment (lines 3,4,6) encode −−−−→ 03, op 3. Branch if > 0 (line 5) encode −−−−→ 08, tst op, jmp offset 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
  • 84. Virtualization Obfuscation Example Step 1: Generate random bytecode ISA covering all instructions of foo 1 void foo(int x){ 2 int y = 10; 3 y++; 4 y++; 5 if (x > 0){ 6 y++; 7 } 8 else {} 9 printf("%dn",y); 10 } Random ISA: 1. Integer assignment (line 2) encode −−−−→ 52, LH op, RH op 2. Integer increment (lines 3,4,6) encode −−−−→ 03, op 3. Branch if > 0 (line 5) encode −−−−→ 08, tst op, jmp offset 4. Call to printf (line 9) encode −−−−→ 18, op 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 45
  • 85. Virtualization Obfuscation Example Variables and constants of bytecode program stored in array: data data[0] represents variable x data[1] represents variable y data[2] represents constant for initialization to 10 (line 2 of foo) data[3] represents constant jump offset of conditional branch 1 void foo(int x){ 2 int y = 10; 3 y++; 4 y++; 5 if (x > 0){ 6 y++; 7 } 8 else {} 9 printf("%dn",y); 10 } Random ISA: 1. Integer assignment (line 2) encode −−−−→ 52, LH op, RH op 2. Integer increment (lines 3,4,6) encode −−−−→ 03, op 3. Branch if > 0 (line 5) encode −−−−→ 08, tst op, jmp offset 4. Call to printf (line 9) encode −−−−→ 18, op 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 46
  • 86. Virtualization Obfuscation Example Step 2: Translate foo to bytecode program data = {00,00,10,05} // {x, y, init const, jmp offset} Bytecode: {52, 01, 02, 03, 01, 03, 01, 08, 00, 03, 03, 01, 18, 01, 00} 1 void foo(int x){// bytecode 2 int y = 10; // 52, 01, 02 3 y++; // 03, 01 4 y++; // 03, 01 5 if (x > 0){ // 08, 00, 03 6 y++; // 03, 01 7 } 8 else {} 9 printf("%d",y); // 18, 01 10 } // 00 Random ISA: 1. Integer assignment (line 2) encode −−−−→ 52, LH op, RH op 2. Integer increment (lines 3,4,6) encode −−−−→ 03, op 3. Branch if > 0 (line 5) encode −−−−→ 08, tst op, jmp offset 4. Call to printf (line 9) encode −−−−→ 18, op 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 47
  • 87. Virtualization Obfuscation Example Step 3: Generate emulator to interpret bytecode on x86 machine int data = {00,00,10,05}; {x, y, init const, jmp off} int code = { 52, 01, 02, 03, 01, 03, 01, 08, 00, 03, 03, 01, 18, 01, 00}; 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 48
  • 88. Virtualization Obfuscation Example Step 3: Generate emulator to interpret bytecode on x86 machine int data = {00,00,10,05}; {x, y, init const, jmp off} int code = { 52, 01, 02, 03, 01, 03, 01, 08, 00, 03, 03, 01, 18, 01, 00}; 1 int vpc = 0, op1 , op2; 2 while (true) { 3 switch(code[vpc]) { 4 case 03: // increment 5 op1 = code[vpc + 1]; 6 data[op1 ]++; 7 vpc += 2; break; 8 case 08: // conditional jump 9 op1 = code[vpc + 1]; 10 op2 = code[vpc + 2]; 11 if (data[op1] > 0) 12 vpc += 3; 13 else 14 vpc += data[op2]; 15 break; 16 case 18: // call printf 17 op1 = code[vpc + 1]; 18 printf("%dn", data[op1]); 19 vpc += 2; break; 20 case 52: // assignment 21 op1 = code[vpc + 1]; 22 op2 = code[vpc + 2]; 23 data[op1] = data[op2 ]; 24 vpc += 3; break; 25 default: return; // halt 26 } // end switch 27 } // end while 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 48
  • 89. Virtualization Obfuscation Example Figure : Control flow graph of obfuscated foo 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 49
  • 90. Virtualization Obfuscation Goal: obfuscate secret algorithm (not secret data) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
  • 91. Virtualization Obfuscation Goal: obfuscate secret algorithm (not secret data) Strengths: Random ISA, different for each instance 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
  • 92. Virtualization Obfuscation Goal: obfuscate secret algorithm (not secret data) Strengths: Random ISA, different for each instance Control-flow flattening → Different locations in original program mapped to same case branch 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
  • 93. Virtualization Obfuscation Goal: obfuscate secret algorithm (not secret data) Strengths: Random ISA, different for each instance Control-flow flattening → Different locations in original program mapped to same case branch Many other obfuscation techniques can be added on top of: Each case branch The switch statement The data and code arrays 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
  • 94. Virtualization Obfuscation Goal: obfuscate secret algorithm (not secret data) Strengths: Random ISA, different for each instance Control-flow flattening → Different locations in original program mapped to same case branch Many other obfuscation techniques can be added on top of: Each case branch The switch statement The data and code arrays Tools: CodeVirtualizer (59 e - 119 e ) VMProtect (69 e - 599 e ) Tigress, Diablo (Free research tools) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 50
  • 95. White-Box Cryptography (WBC) Idea: Embed the key inside the cipher (e.g. in S-boxes) Convert cipher computation into large network of look-up tables 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 51
  • 96. White-Box Cryptography (WBC) Idea: Embed the key inside the cipher (e.g. in S-boxes) Convert cipher computation into large network of look-up tables Source: http://www.whiteboxcrypto.com/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 51
  • 97. White-Box Cryptography (WBC) Idea: Embed the key inside the cipher (e.g. in S-boxes) Convert cipher computation into large network of look-up tables Attack: WB-AES and WB-DES broken (222 operations) Source: http://www.whiteboxcrypto.com/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 51
  • 98. Some Details on WBC Before 1 char xor(char inputs) 2 { 3 char a = inputs & 0x0F; 4 char b = inputs >> 4; 5 return a ^ b; 6 } After 1 char lut [256] = 2 { 3 0x00 , 0x01 , 0x02 , ..., 0x0F , 4 0x01 , 0x00 , 0x03 , ..., 0x0E , 5 0x02 , 0x03 , 0x00 , ..., 0x0D , 6 | | 7 0x0F , 0x0E , 0x0D , ..., 0x00 8 }; 9 10 char xor(char inputs) 11 { 12 return lut[inputs ]; 13 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 52
  • 99. Some Details on WBC Before 1 char xor(char inputs) 2 { 3 char a = inputs & 0x0F; 4 char b = inputs >> 4; 5 return a ^ b; 6 } After 1 char lut [256] = 2 { 3 0x00 , 0x01 , 0x02 , ..., 0x0F , 4 0x01 , 0x00 , 0x03 , ..., 0x0E , 5 0x02 , 0x03 , 0x00 , ..., 0x0D , 6 | | 7 0x0F , 0x0E , 0x0D , ..., 0x00 8 }; 9 10 char xor(char inputs) 11 { 12 return lut[inputs ]; 13 } Source: PhD presentation of B. Wyseur 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 52
  • 100. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 53
  • 101. Software Diversity 1. Software developer distributes software X to all end-users 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
  • 102. Software Diversity 1. Software developer distributes software X to all end-users 2. Some end-users are MATE attackers 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
  • 103. Software Diversity 1. Software developer distributes software X to all end-users 2. Some end-users are MATE attackers 3. MATE reverse engineers X and builds a hijacker of X 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
  • 104. Software Diversity 1. Software developer distributes software X to all end-users 2. Some end-users are MATE attackers 3. MATE reverse engineers X and builds a hijacker of X 4. MATE distributes hijacker to other end-users of X 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 54
  • 105. Software Diversity What can we do to protect victims? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
  • 106. Software Diversity What can we do to protect victims? Give everyone a different version 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
  • 107. Software Diversity What can we do to protect victims? Give everyone a different version Generate 1000s of different versions using obfuscation 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
  • 108. Software Diversity What can we do to protect victims? Give everyone a different version Generate 1000s of different versions using obfuscation Assumption: Same attack will not work on different binaries 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
  • 109. Software Diversity What can we do to protect victims? Give everyone a different version Generate 1000s of different versions using obfuscation Assumption: Same attack will not work on different binaries Issues: Analyzing crash-dumps Incremental updates Digitally signing all versions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 55
  • 110. Pre- vs. Post-Distribution Software Diversity Pre-Distribution Software Diversity Obfuscation transformations involve randomly generated code & data Many different binaries generated by software developer Different binaries distributed to different end-users All previously presented techniques can be used for pre-distribution diversification 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 56
  • 111. Pre- vs. Post-Distribution Software Diversity Pre-Distribution Software Diversity Obfuscation transformations involve randomly generated code & data Many different binaries generated by software developer Different binaries distributed to different end-users All previously presented techniques can be used for pre-distribution diversification Post-Distribution Software Diversity Software developer embeds self-modifying code in application All end-users may get the same binary from software developer Code may change differently for different users depending on inputs Next we present dynamic obfuscation techniques which can be used for post-distribution diversification 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 56
  • 112. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 57
  • 113. Dynamic Obfuscation A dynamic obfuscator runs in two phases: 1. At compile-time: Transform the program to an initial configuration Add a runtime code-transformer 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 58
  • 114. Dynamic Obfuscation A dynamic obfuscator runs in two phases: 1. At compile-time: Transform the program to an initial configuration Add a runtime code-transformer 2. At runtime: Interleave execution of the program with calls to the transformer T T changes the code segment content at runtime Ideally a non-repeating series of configurations, in practice they repeat 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 58
  • 115. Dynamic Obfuscation Techniques General Techniques: Build-and-execute: generate code for a routine at runtime, and jump to it Self-modification: modify the executable code Encryption: the self-modification is decrypting the encrypted code before executing it Move code: every time the code executes, it is in a different location 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 59
  • 116. Dynamic Obfuscation Techniques General Techniques: Build-and-execute: generate code for a routine at runtime, and jump to it Self-modification: modify the executable code Encryption: the self-modification is decrypting the encrypted code before executing it Move code: every time the code executes, it is in a different location These techniques can be applied at the following levels of granularity: File level Function level Basic block level Instruction level 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 59
  • 117. Dynamic Obfuscation Techniques General Techniques: Build-and-execute: generate code for a routine at runtime, and jump to it Self-modification: modify the executable code Encryption: the self-modification is decrypting the encrypted code before executing it Move code: every time the code executes, it is in a different location These techniques can be applied at the following levels of granularity: File level Function level Basic block level Instruction level The attacker’s goals can be to: Recover the original code Modify the original code 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 59
  • 118. Replacing instructions (Kanzaki [10]) Motivation: prevent code recovery via memory snapshot 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 60
  • 119. Replacing instructions (Kanzaki [10]) Motivation: prevent code recovery via memory snapshot Algorithm idea: 1. Replace real instructions by bogus instructions 2. Just before execution, replace bogus instruction with real instruction 3. After execution, replace real instruction with bogus instruction 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 60
  • 120. Replacing instructions (Kanzaki [10]) Motivation: prevent code recovery via memory snapshot Algorithm idea: 1. Replace real instructions by bogus instructions 2. Just before execution, replace bogus instruction with real instruction 3. After execution, replace real instruction with bogus instruction Implementation by choosing 3 points A, B and C in the CFG: 1. All paths to B must flow through A and all paths from B must flow through C 2. A replaces bogus instruction at B with real instruction 3. C replaces real instruction at B with bogus instruction 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 60
  • 121. Dynamic Code Merging (Madou [12]) Motivation: keep code in constant flux 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 61
  • 122. Dynamic Code Merging (Madou [12]) Motivation: keep code in constant flux Algorithm idea: 1. Have two or more functions share the same location in memory 2. Create templates for functions that share same location 3. Before a function is called, patch memory using edit script to load it 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 61
  • 123. Dynamic Code Merging (Madou [12]) Motivation: keep code in constant flux Algorithm idea: 1. Have two or more functions share the same location in memory 2. Create templates for functions that share same location 3. Before a function is called, patch memory using edit script to load it Implementation example for program with 2 functions f1 and f2: 1. Create template T with the same size as the largest function, i.e. f1 2. T contains values at memory offsets which are common for f1 and f2, i.e. offsets 1 and 2 3. T contains wildcard values at all other memory offsets f1 f2 memory offset: 0 1 2 3 4 0 1 2 3 memory value: b7 48 a0 53 fa e9 48 a0 33 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 61
  • 124. Dynamic Code Merging (Madou [12]) Implementation example for program with 2 functions f1 and f2: 1. Create template T with the same size as the largest function, i.e. f1 2. T contains values at memory offsets which are common for f1 and f2, i.e. offsets 1 and 2 3. T contains wildcard values at all other memory offsets f1 f2 memory offset: 0 1 2 3 4 0 1 2 3 memory value: b7 48 a0 53 fa e9 48 a0 33 T Edit Scripts memory offset: 0 1 2 3 4 e1 = [0 → b7, 3 → 53, 4 → fa] memory value: ? 48 a0 ? ? e2 = [0 → e9, 3 → 33] 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 62
  • 125. Dynamic Code Merging (Madou [12]) Implementation example for program with 2 functions f1 and f2: 1. Create template T with the same size as the largest function, i.e. f1 2. T contains values at memory offsets which are common for f1 and f2, i.e. offsets 1 and 2 3. T contains wildcard values at all other memory offsets 4. Create edit scripts e1 and e2 which replace wildcards of T to load the functions f1, respectively f2 f1 f2 memory offset: 0 1 2 3 4 0 1 2 3 memory value: b7 48 a0 53 fa e9 48 a0 33 T Edit Scripts memory offset: 0 1 2 3 4 e1 = [0 → b7, 3 → 53, 4 → fa] memory value: ? 48 a0 ? ? e2 = [0 → e9, 3 → 33] 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 62
  • 126. Dynamic Decryption and Re-encryption 1. Execute current basic block 2. At some point in the current basic block decrypt the next basic block 3. Decryption key could be hash of some other basic block 4. Jump to decrypted basic block 5. Encrypt the previously executed basic block 6. Go to step 1 Source: http: //tigress.cs.arizona.edu/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
  • 127. Dynamic Decryption and Re-encryption 1. Execute current basic block 2. At some point in the current basic block decrypt the next basic block 3. Decryption key could be hash of some other basic block 4. Jump to decrypted basic block 5. Encrypt the previously executed basic block 6. Go to step 1 Source: http: //tigress.cs.arizona.edu/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
  • 128. Dynamic Decryption and Re-encryption 1. Execute current basic block 2. At some point in the current basic block decrypt the next basic block 3. Decryption key could be hash of some other basic block 4. Jump to decrypted basic block 5. Encrypt the previously executed basic block 6. Go to step 1 Source: http: //tigress.cs.arizona.edu/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
  • 129. Dynamic Decryption and Re-encryption 1. Execute current basic block 2. At some point in the current basic block decrypt the next basic block 3. Decryption key could be hash of some other basic block 4. Jump to decrypted basic block 5. Encrypt the previously executed basic block 6. Go to step 1 Source: http: //tigress.cs.arizona.edu/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
  • 130. Dynamic Decryption and Re-encryption 1. Execute current basic block 2. At some point in the current basic block decrypt the next basic block 3. Decryption key could be hash of some other basic block 4. Jump to decrypted basic block 5. Encrypt the previously executed basic block 6. Go to step 1 Source: http: //tigress.cs.arizona.edu/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
  • 131. Dynamic Decryption and Re-encryption 1. Execute current basic block 2. At some point in the current basic block decrypt the next basic block 3. Decryption key could be hash of some other basic block 4. Jump to decrypted basic block 5. Encrypt the previously executed basic block 6. Go to step 1 Source: http: //tigress.cs.arizona.edu/ 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 63
  • 132. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 64
  • 133. Classical Obfuscation Strength Dimensions According to Collberg et al. [5] we should keep the following 4 factors in mind when choosing obfuscation transformations: 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
  • 134. Classical Obfuscation Strength Dimensions According to Collberg et al. [5] we should keep the following 4 factors in mind when choosing obfuscation transformations: Potency → comprehensibility of code by humans Time and skills needed to understand how the code works Directly proportional with software complexity and size 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
  • 135. Classical Obfuscation Strength Dimensions According to Collberg et al. [5] we should keep the following 4 factors in mind when choosing obfuscation transformations: Potency → comprehensibility of code by humans Time and skills needed to understand how the code works Directly proportional with software complexity and size Stealth → identifiability of obfuscated code Time and computation needed to identify the parts of the code which were obfuscated 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
  • 136. Classical Obfuscation Strength Dimensions According to Collberg et al. [5] we should keep the following 4 factors in mind when choosing obfuscation transformations: Potency → comprehensibility of code by humans Time and skills needed to understand how the code works Directly proportional with software complexity and size Stealth → identifiability of obfuscated code Time and computation needed to identify the parts of the code which were obfuscated Resilience → resistance against automatic deobfuscation Time and computation needed to deobfuscate the code with a software tool Plus resources needed to build such a tool 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
  • 137. Classical Obfuscation Strength Dimensions According to Collberg et al. [5] we should keep the following 4 factors in mind when choosing obfuscation transformations: Potency → comprehensibility of code by humans Time and skills needed to understand how the code works Directly proportional with software complexity and size Stealth → identifiability of obfuscated code Time and computation needed to identify the parts of the code which were obfuscated Resilience → resistance against automatic deobfuscation Time and computation needed to deobfuscate the code with a software tool Plus resources needed to build such a tool Cost → performance and resource overhead of obfuscation Time overhead added to program execution after obfuscation Size of the obfuscated binary relative to original binary 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 65
  • 138. The Strength of Obfuscation Question: Which obfuscation transformation is better? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 139. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 140. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 141. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 142. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) What is economically feasible for the attacker? (e.g. static attacks, dynamic attacks) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 143. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) What is economically feasible for the attacker? (e.g. static attacks, dynamic attacks) What is the impact of protection on performance? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 144. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) What is economically feasible for the attacker? (e.g. static attacks, dynamic attacks) What is the impact of protection on performance? Example Scenario: Protecting a hard-coded password / license key Software: offline game / word processor, costs 50 USD 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 145. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) What is economically feasible for the attacker? (e.g. static attacks, dynamic attacks) What is the impact of protection on performance? Example Scenario: Protecting a hard-coded password / license key Software: offline game / word processor, costs 50 USD → cost (obfuscation overhead) < 10% performance impact → stealth less important for this scenario 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 146. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) What is economically feasible for the attacker? (e.g. static attacks, dynamic attacks) What is the impact of protection on performance? Example Scenario: Protecting a hard-coded password / license key Software: offline game / word processor, costs 50 USD → cost (obfuscation overhead) < 10% performance impact → stealth less important for this scenario Each user needs a different password / license key → potency > 50 USD (exclude: professional org., nation state) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 147. The Strength of Obfuscation Question: Which obfuscation transformation is better? Answer: It depends ... What do you want to protect?: intellectual property keys integrity of software What kind of attacker are you facing? (e.g. amateur hacker, professional organization, nation state) What is economically feasible for the attacker? (e.g. static attacks, dynamic attacks) What is the impact of protection on performance? Example Scenario: Protecting a hard-coded password / license key Software: offline game / word processor, costs 50 USD → cost (obfuscation overhead) < 10% performance impact → stealth less important for this scenario Each user needs a different password / license key → potency > 50 USD (exclude: professional org., nation state) Attacker develops automated crack based on symbolic execution → resilience > 50 USD × Nr. of users not willing to pay 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 66
  • 148. Symbolic Execution in a Nutshell Interpret program using symbolic values instead of concrete ones int main(int ac , char* av []) { int a = atoi(av [1]); int b = atoi(av [2]); int c = atoi(av [3]); if (a > b) a = a - b if (b < 1) c = a + b b = 1; return 0; } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
  • 149. Symbolic Execution in a Nutshell Interpret program using symbolic values instead of concrete ones Fork execution on each branch dependent on symbolic values int main(int ac , char* av []) { int a = atoi(av [1]); int b = atoi(av [2]); int c = atoi(av [3]); if (a > b) a = a - b if (b < 1) c = a + b b = 1; return 0; } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
  • 150. Symbolic Execution in a Nutshell Interpret program using symbolic values instead of concrete ones Fork execution on each branch dependent on symbolic values Collect path constraints for each execution path int main(int ac , char* av []) { int a = atoi(av [1]); int b = atoi(av [2]); int c = atoi(av [3]); if (a > b) a = a - b if (b < 1) c = a + b b = 1; return 0; } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
  • 151. Symbolic Execution in a Nutshell Interpret program using symbolic values instead of concrete ones Fork execution on each branch dependent on symbolic values Collect path constraints for each execution path Get concrete input values from path conditions using SMT solver int main(int ac , char* av []) { int a = atoi(av [1]); int b = atoi(av [2]); int c = atoi(av [3]); if (a > b) a = a - b if (b < 1) c = a + b b = 1; return 0; } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 67
  • 152. Bypassing License Checks via Symbolic Execution 1. Make license input symbolic 2. Indicate distinct statement executed when license key is correct 3. Explore paths until desired instruction (sequence) is found 4. Solve path constraints on paths that lead to desired instruction via SMT solver 5. Find satisfiable path constraints → concrete inputs to bypass check 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 68
  • 153. Are other Attacks Possible via Symbolic Execution? Yes! Symbolic execution is a prerequisite for several attacks: Simplify control-flow graph Identify & disable tamper-proofing checks Bypass authentication checks / trigger conditions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 69
  • 154. Simplifying the CFG (Yadegari et al. [14]) 1. Explore paths such that all code is covered 2. Simplify traces using compiler optimization tricks 3. Reconstruct CFG from traces 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 70
  • 155. Identify & Disable Checks (Qiu et al. [13]) 1. Taint code segment 2. Explore paths until enough self-checks disabled (cyclic checks → explore all code) 3. Disable self-checking instructions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 71
  • 156. A Common Sub-Problem of Deobfuscation Attacks Common sub-problem: path exploration How do we explore paths of a given program? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 72
  • 157. A Common Sub-Problem of Deobfuscation Attacks Common sub-problem: path exploration How do we explore paths of a given program? Generate test cases: Black-box test generation: Fuzzing, Random testing White-box test generation: Symbolic/Concolic execution 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 72
  • 158. Measuring Obfuscation Strength Strength of obfuscation: increase in test case generation time 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
  • 159. Measuring Obfuscation Strength Strength of obfuscation: increase in test case generation time Observation: Generally, obfuscation does not change input-output behavior → No increase in black-box test case generation time 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
  • 160. Measuring Obfuscation Strength Strength of obfuscation: increase in test case generation time Observation: Generally, obfuscation does not change input-output behavior → No increase in black-box test case generation time Example: 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
  • 161. Measuring Obfuscation Strength Strength of obfuscation: increase in test case generation time Observation: Generally, obfuscation does not change input-output behavior → No increase in black-box test case generation time Example: Observation: Could be faster to use black-box test generator than white-box 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
  • 162. Measuring Obfuscation Strength Strength of obfuscation: increase in test case generation time Observation: Generally, obfuscation does not change input-output behavior → No increase in black-box test case generation time Example: Observation: Could be faster to use black-box test generator than white-box Conclusion: Apply obfuscation transformations until white-box slower than black-box test case generation 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 73
  • 163. Code Tampering Attacks Question: Why do we need code obfuscation? Just use cryptographic hash 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
  • 164. Code Tampering Attacks Question: Why do we need code obfuscation? Just use cryptographic hash Example: 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
  • 165. Code Tampering Attacks Question: Why do we need code obfuscation? Just use cryptographic hash Example: Hard for symbolic execution (SMT solver) to break crypto hash functions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
  • 166. Code Tampering Attacks Question: Why do we need code obfuscation? Just use cryptographic hash Example: Hard for symbolic execution (SMT solver) to break crypto hash functions Answer: Test case generation is non-invasive attack, i.e. code is read, not changed Obfuscation aims to defend against MATE attacker (can tamper with code) Easy to find and patch-out crypto hash functions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 74
  • 167. Metrics for Measuring Obfuscation Strength Number of lines of code McCabe cyclomatic complexity Nesting complexity Data flow complexity Object oriented design metrics Data structure complexity ... 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 75
  • 168. Metrics for Measuring Obfuscation Strength Number of lines of code McCabe cyclomatic complexity Nesting complexity Data flow complexity Object oriented design metrics Data structure complexity ... SAT metrics: Graph metrics on a SAT formula represented as a graph (x +y +z) · (!x+!y +z) · (x+!y+!z) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 75
  • 169. SAT Instance Before & After Obfuscation Figure : Before Obfuscation (7.5 sec) 1 unsigned int SDBMHash(char* str , unsigned int len) 2 { 3 unsigned int hash = 0; 4 unsigned int i = 0; 5 for(i = 0; i < len; str++, i++) 6 hash = (*str) + (hash << 6) 7 + (hash << 16) - hash; 8 return hash; 9 } 10 11 int main(int argc , char* argv []) { 12 unsigned char *str = argv [1]; 13 unsigned int hash = SDBMHash(str , strlen(str)); 14 15 if (hash == 0x89dcd66e) 16 printf("You win!n"); 17 return 0; 18 } 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 76
  • 170. SAT Instance Before & After Obfuscation Figure : Before Obfuscation (7.5 sec) Figure : After Obfuscation (438 sec) Strong obfuscation transformations destroy community structures 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 76
  • 171. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 77
  • 172. Hands-on Tutorial Overview Example of bypassing password check in C programs 1. Start with plain unobfuscated program (break with static analysis) 2. Discuss the option of using hash functions (break with patching) 3. Add one obfuscation layer (break with symbolic execution) 4. Add more obfuscation layers (harder to break) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 78
  • 173. Hands-on Tutorial Overview Example of bypassing password check in C programs 1. Start with plain unobfuscated program (break with static analysis) 2. Discuss the option of using hash functions (break with patching) 3. Add one obfuscation layer (break with symbolic execution) 4. Add more obfuscation layers (harder to break) Tools we are going to use: We assume you have Docker installed and have basic user knowledge Docker image based on Ubuntu contains: Tigress, KLEE, STP, Z3, SatGraf, etc. $ docker pull banescusebi/obfuscation-symex $ docker run -it banescusebi/obfuscation-symex For instructions on how to start the Docker image read description at https://hub.docker.com/r/banescusebi/obfuscation-symex/ Instructions for running GUI apps available for Ubuntu and Mac OS (not mandatory, only needed for SatGraf) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 78
  • 174. Bypassing Password Check in Unobfuscated Program Example C Program with Hard-Coded Password 1 #include <stdlib.h> 2 #include <stdio.h> 3 #include <string.h> 4 5 int main(int argc , char* argv []) { 6 if (strcmp(argv [1], 7 " my_license_key ") == 0) 8 printf("You win!n"); 9 return 0; 10 } Hard-coded password can be found via static analysis $ ./nohash my license key You win! $ strings nohash | grep ’’license’’ my license key 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 79
  • 175. Bypassing Password Check even when Using Hash Example C Program where Hard-Coded Password Replaced with Hash 1 unsigned int BPHash(char* str , unsigned int len) 2 { 3 unsigned int hash = 0; 4 unsigned int i = 0; 5 for(i = 0; i < len; str++, i++) { 6 hash = hash << 7 ^ (*str); 7 } 8 return hash; 9 } 10 11 int main(int argc , char* argv []) { 12 unsigned char *str = argv [1]; 13 unsigned int hash = BPHash(str , strlen(str)); 14 if (hash == 0x5bfaf2f9) 15 printf("You win!n"); 16 return 0; 17 } Check can still be disabled using binary patching $ strings bphash | grep ’’license’’ $ objdump -D bphash Use vi in hex editor mode (:%!xxd) to patch check (jne) 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 80
  • 176. Bypassing Password Check after Obfuscation Add one layer of obfuscation: $ tigress --Transform=EncodeArithmetic --Functions=DEKHash --out=dekhash-obf/dekhash-encA.c dekhash.c $ tigress --Transform=Flatten --Functions=DEKHash --out=dekhash-obf/dekhash-flat.c dekhash.c $ tigress --Transform=Virtualize --Functions=DEKHash --out=dekhash-obf/dekhash-virt.c dekhash.c 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 81
  • 177. Bypassing Password Check after Obfuscation Add one layer of obfuscation: $ tigress --Transform=EncodeArithmetic --Functions=DEKHash --out=dekhash-obf/dekhash-encA.c dekhash.c $ tigress --Transform=Flatten --Functions=DEKHash --out=dekhash-obf/dekhash-flat.c dekhash.c $ tigress --Transform=Virtualize --Functions=DEKHash --out=dekhash-obf/dekhash-virt.c dekhash.c Add another layer of obfuscation: $ tigress --Transform=Flatten --Functions=DEKHash --Transform=EncodeArithmetic --Functions=DEKHash --out=dekhash-obf/dekhash-flat-encA.c dekhash.c 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 81
  • 178. Bypassing Password Check after Obfuscation Add one layer of obfuscation: $ tigress --Transform=EncodeArithmetic --Functions=DEKHash --out=dekhash-obf/dekhash-encA.c dekhash.c $ tigress --Transform=Flatten --Functions=DEKHash --out=dekhash-obf/dekhash-flat.c dekhash.c $ tigress --Transform=Virtualize --Functions=DEKHash --out=dekhash-obf/dekhash-virt.c dekhash.c Add another layer of obfuscation: $ tigress --Transform=Flatten --Functions=DEKHash --Transform=EncodeArithmetic --Functions=DEKHash --out=dekhash-obf/dekhash-flat-encA.c dekhash.c Apply symbolic execution to each of these programs (see KLEE command on next slide) $ ∼/scripts/step05-klee-symex-until-win.sh . exe 15 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 81
  • 179. The KLEE Symbolic Execution Tool 1 klee --optimize 2 // Optimize before execution 3 --emit -all -errors 4 // Don’t stop on first error 5 --libc=uclibc 6 // Choose libc version 7 --posix -runtime 8 // Link with POSIX runtime 9 --only -output -states -covering -new 10 // Only tests covering new code 11 --max -time =3600 12 // Halt after given nr. of sec. 13 --write -smt2s 14 // Write smt2 file per test 15 --output -dir=klee -out -${file_name} 16 // Output directory for tests 17 ./${file_name }.bc 18 // Bitcode file under test 19 --sym -arg 5 20 // Length of symbolic arg. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 82
  • 180. Viewing Results See number of nanoseconds needed for KLEE to analyze a bitcode program $ cat klee-time-to-win.txt Check differences between 1 and 2 layers of obfuscation 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 83
  • 181. Viewing Results See number of nanoseconds needed for KLEE to analyze a bitcode program $ cat klee-time-to-win.txt Check differences between 1 and 2 layers of obfuscation Convert the SMT2 instances generated by KLEE into CNF instances $ ∼/scripts/step07-convert-smt-to-cnf.sh . obfuscated-cnf-instances.txt $ ll ./obfuscated-cnf-instances 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 83
  • 182. Viewing Results See number of nanoseconds needed for KLEE to analyze a bitcode program $ cat klee-time-to-win.txt Check differences between 1 and 2 layers of obfuscation Convert the SMT2 instances generated by KLEE into CNF instances $ ∼/scripts/step07-convert-smt-to-cnf.sh . obfuscated-cnf-instances.txt $ ll ./obfuscated-cnf-instances View CNF instances in SatGraf: $ java -jar ∼/satgraf/dist/SatGraf.jar com -f obfuscated-cnf-instances/dekhash-virt.cnf Check differences between 1 and 2 layers of obfuscation 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 83
  • 183. Outline 1 Introduction 2 Obfuscation in Theory 3 Obfuscation in Practice Static Obfuscation Compiler Optimizations Automated Code Obfuscation Software Diversity Dynamic Obfuscation 4 The Strength of Obfuscation 5 Hands-on Tutorial 6 Conclusions 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 84
  • 184. Conclusions Theoretical obfuscation offers security guarantees, but is unpractical 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
  • 185. Conclusions Theoretical obfuscation offers security guarantees, but is unpractical Practical obfuscation raises the bar against MATE attackers 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
  • 186. Conclusions Theoretical obfuscation offers security guarantees, but is unpractical Practical obfuscation raises the bar against MATE attackers Static obfuscation transformations: Applied on source code, IR or machine code Obfuscated program fixed (doesn’t change at runtime) Makes static analysis harder Many transformations available: compiler optimizations, scramble identifiers, instruction substitution, merging & splitting functions, opaque predicates & expressions, control flow flattening, virtualization obfuscation, white-box cryptography, etc. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
  • 187. Conclusions Theoretical obfuscation offers security guarantees, but is unpractical Practical obfuscation raises the bar against MATE attackers Static obfuscation transformations: Applied on source code, IR or machine code Obfuscated program fixed (doesn’t change at runtime) Makes static analysis harder Many transformations available: compiler optimizations, scramble identifiers, instruction substitution, merging & splitting functions, opaque predicates & expressions, control flow flattening, virtualization obfuscation, white-box cryptography, etc. Software diversity improves potency and resilience against reverse engineering attacks 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
  • 188. Conclusions Theoretical obfuscation offers security guarantees, but is unpractical Practical obfuscation raises the bar against MATE attackers Static obfuscation transformations: Applied on source code, IR or machine code Obfuscated program fixed (doesn’t change at runtime) Makes static analysis harder Many transformations available: compiler optimizations, scramble identifiers, instruction substitution, merging & splitting functions, opaque predicates & expressions, control flow flattening, virtualization obfuscation, white-box cryptography, etc. Software diversity improves potency and resilience against reverse engineering attacks Dynamic obfuscation transformations: Program changes during execution Makes dynamic analysis harder Several transformations available: replacing instructions, dynamic code merging, dynamic decryption and re-encryption. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
  • 189. Conclusions Theoretical obfuscation offers security guarantees, but is unpractical Practical obfuscation raises the bar against MATE attackers Static obfuscation transformations: Applied on source code, IR or machine code Obfuscated program fixed (doesn’t change at runtime) Makes static analysis harder Many transformations available: compiler optimizations, scramble identifiers, instruction substitution, merging & splitting functions, opaque predicates & expressions, control flow flattening, virtualization obfuscation, white-box cryptography, etc. Software diversity improves potency and resilience against reverse engineering attacks Dynamic obfuscation transformations: Program changes during execution Makes dynamic analysis harder Several transformations available: replacing instructions, dynamic code merging, dynamic decryption and re-encryption. Multiple transformations should be combined to improve strength 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 85
  • 190. Thank you for your attention Questions ? 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 86
  • 191. References I S. Banescu, M. Ochoa, A. Pretschner, and N. Kunze. Benchmarking indistinguishability obfuscation - a candidate implementation. In Proc. of 7th International Symposium on ESSoS, number 8978 in LNCS, 2015. Boaz Barak. Hopes, fears, and software obfuscation. Communications of the ACM, 59(3):88–96, 2016. Boaz Barak, Oded Goldreich, Rusell Impagliazzo, Steven Rudich, Amit Sahai, Salil Vadhan, and Ke Yang. On the (im) possibility of obfuscating programs. In Advances in Cryptology CRYPTO 2001, pages 1–18. Springer, 2001. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 87
  • 192. References II Christian Collberg and Jasvir Nagra. Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection. Addison-Wesley, 2009. Christian Collberg, Clark Thomborson, and Douglas Low. A taxonomy of obfuscating transformations. Technical report, Department of Computer Science, The University of Auckland, New Zealand, 1997. Christian Collberg, Clark Thomborson, and Douglas Low. Manufacturing cheap, resilient, and stealthy opaque constructs. In Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL ’98, pages 184–196, New York, NY, USA, 1998. ACM. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 88
  • 193. References III M. Dalla Preda, M. Madou, K. De Bosschere, and R. Giacobazzi. Opaque predicates detection by abstract interpretation. In Michael Johnson and Varmo Vene, editors, Algebraic Methodology and Software Technology, volume 4019 of Lecture Notes in Computer Science, pages 81–95. Springer Berlin Heidelberg, 2006. S. Garg, C. Gentry, S. Halevi, M. Raykova, A Sahai, and B. Waters. Candidate indistinguishability obfuscation and functional encryption for all circuits. In Proc. of the 54th Annual Symp. on Foundations of Computer Science, pages 40–49, 2013. S. Goldwasser and G. N. Rothblum. On best-possible obfuscation. In Theory of Cryptography, pages 194–213. Springer, 2007. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 89
  • 194. References IV Yuichiro Kanzaki, Akito Monden, Masahide Nakamura, and Ken-ichi Matsumoto. Exploiting self-modification mechanism for program protection. In Computer Software and Applications Conference, 2003. COMPSAC 2003. Proceedings. 27th Annual International, pages 170–179, 2003. J. Kinder. Towards static analysis of virtualization-obfuscated binaries. In 19th Working Conference on Reverse Engineering (WCRE), pages 61–70, Oct 2012. Matias Madou, Bertrand Anckaert, Patrick Moseley, Saumya Debray, Bjorn De Sutter, and Koen De Bosschere. Software protection through dynamic code mutation. In Information Security Applications, pages 194–206. Springer, 2006. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 90
  • 195. References V Jing Qiu, Babak Yadegari, Brian Johannesmeyer, Saumya Debray, and Xiaohong Su. Identifying and understanding self-checksumming defenses in software. In Proceedings of the 5th ACM Conference on Data and Application Security and Privacy, pages 207–218. ACM, 2015. Babak Yadegari, Brian Johannesmeyer, Ben Whitely, and Saumya Debray. A generic approach to automatic deobfuscation of executable code. In Security and Privacy (SP), 2015 IEEE Symposium on, pages 674–691. IEEE, 2015. 4 December 2017 Copyright c 2017 Sebastian Banescu, Technische Universit¨at M¨unchen 91