Binary Analysis for Vulnerability Detection
Abhik Roychoudhury
National University of Singapore
http://www.comp.nus.edu.sg/~abhik
Visit to University of Luxembourg S&T center, January 2017.
Research project with DSO National Labs, 2013-16.
“TSUNAMi: Trustworthy systems from un-trusted
component amalgamations”
National Research Foundation (NRF), 2015-2020.
Singapore
274 sq. mi., population 5 million, about a 12-hour flight from Luxembourg.
NUS
Founded 1905.
9,000 graduate and 23,000 undergraduate students from 88 countries.
Cybersecurity research
The National Cybersecurity R&D Programme seeks to develop R&D expertise and capabilities in cybersecurity for Singapore. It aims to improve the trustworthiness of cyber infrastructures with an emphasis on security, reliability, resiliency and usability. Funding of S$130 million over five years will be available to support research into both technological and human-science aspects of cybersecurity in the following outcome-based R&D themes. The themes are designed to provide an element of operational context, while not restricting “game-changing” ideas from the community.
Cybersecurity research spans six themes:
• Scalable Trustworthy Systems
• Resilient Systems
• Effective Situation Awareness and Attack Attribution
• Combatting Insider Threats
• Threats Detection, Analysis and Defence
• Efficient and Effective Digital Forensics
https://www.nrf.gov.sg/programmes/national-cybersecurity-r-d-programme
Outline
• NCR project – Trustworthy systems from Un-trusted Components
• Technical contributions in Binary Analysis
• Technology showcase
• Initiatives – Consortium
COTS-integrated Platforms
[Diagram labels: Trustworthy System; Outsourced and Shared Data; Vulnerability; Malicious Behavior; Flaws; Data Breach]
Binary analysis is of paramount importance for software acquisition or assembly.
Vulnerability Discovery, Binary Hardening, Verification, Data Protection
• Agency Collaboration – DSTA, …
• Industry Collaboration – ST, Symantec, NEC, …
• Education – NUS (New module)
• Research Outputs – Publications, Tools, Academic Collaboration, Exchanges, Seminars, Workshops
Enhancing local capabilities
Use of research in NRF project
• Binary Analysis
o Useful to government agencies for procuring software.
o Deep binary analysis on evaluation version prior to procurement.
• Binary hardening
o Useful to government agencies on procured software.
• Point technologies from individual work-packages.
Contributions
• Binary analysis for
o Fuzz testing
o Comprehension
o Debugging
o Patching (Latest work)
• -> Research Program at NUS since 2008, with DRTech, DSO, …
Video
• https://youtu.be/C1hl_ujw6B0
• (1 Minute)
• https://youtu.be/EHBjMSQvIpg
• (1 Minute)
Who cares?
A team of hackers won $2 million by building a machine that could hack better than they could
Read more at http://www.businessinsider.sg/forallsecure-mayhem-darpa-cyber-grand-challenge-2016-8/#ZuIF7Dmq3aaCAdaq.99
DARPA Cyber Grand Challenge
-> Automation of Security [detecting and fixing vulnerabilities in binaries automatically]
Fuzz Testing
Pioneered by Barton Miller at Univ. of Wisconsin in 1988
And now, in 2016 …
• Springfield Project - Fuzzing as a service
• OSS-Fuzz - Continuous fuzzing for open-source projects
A true story – why fuzz?
• May 4, 2015
o Abhik was preparing lecture notes on fuzzing.
o 11:00 AM – finished deciding on the structure; trying to choose a motivating example for fuzzing to interest the students, since there are so many to choose from.
o 11:11 AM – an email update arrives about the latest incident: an integer overflow at Boeing, a classic case where an automated method for sending out malformed or boundary inputs can reveal errors.
(Model-Based) Black-box Fuzzing
Presented by Thuan Pham
[Diagram: model-based blackbox fuzzing. An input model (Peach, Spike, …) and a seed input yield mutated inputs; some pass all checks, others satisfy only some checks.]
(Coverage-based) Grey-box Fuzzing
[Diagram: AFLFast. Seed inputs are enqueued in an input queue; inputs are dequeued and mutated, and “interesting” mutated inputs are put back in the queue.]
White-box Fuzzing
Problem Statement
• How to direct the exploration to reach certain locations or targets, or enhance coverage
o in large-scale program binaries
o with highly-structured inputs (e.g., multi-media files)
o given inadequate test suite or seeds.
Directed Search in White-box Fuzzing
Apply to Crash Reproduction Problem
Crash reproduction supports
- In-house debugging and fixing
- Vulnerability checking
Overview
[Diagram: Hercules takes the program binary, benign input files, and crash information (crash instruction, loaded modules, call stack, register values) and produces crash input files.]
Toolset
1. Directed Search Algorithm
2. Guided Selective Symbolic Execution
Control Flow Graph Construction
Resolve indirect jumps/calls
[Diagram: program binaries are passed to IDA Pro, which provides assembly code and direct jumps/calls to the CFG generator; indirect jumps/calls are resolved through jump table extraction and edge profiling; the output is the CFG.]
First-cut Analyzer
• Output of Stage 1: flow structures and input file(s) that can reach the crash module
• Output of Stage 2: refined CFG, MDG and hybrid symbolic file
• Output of Stage 3: crash input(s) and crash explanation (based on UNSAT core)
UNSAT-core
[Diagram: branch instructions b1, b2, b3 and b4 lead to the crash instruction; each branch bx has two outgoing edges labelled bcx and ¬bcx.]
First attempt:
PC = bc1 ∧ ¬bc3 ∧ bc4
PC ∧ CC == UNSAT
bc1 contradicts CC
1) Backtrack to b1
2) Take another branch
Second attempt:
PC’ = ¬bc1 ∧ bc2 ∧ bc4
PC’ ∧ CC == SAT
Notations:
bx: branch instruction
bcx: branch condition at bx
PC: path condition
CC: crash condition
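The backtracking step on this slide can be sketched in a few lines of C. This is only an illustration under loud assumptions: the branch conditions, the crash condition and the one-byte input domain are all hypothetical, and a brute-force loop stands in for the SMT solver and UNSAT-core extraction that a real tool would use. The names `solve` and `first_conflict` are made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* One conjunct of a path condition: branch bx taken (bcx) or not (!bcx). */
struct literal { int branch; bool taken; };

/* Hypothetical branch conditions over a one-byte input x. */
static bool branch_cond(int branch, int x)
{
    switch (branch) {
    case 1:  return x >= 128;    /* bc1 */
    case 2:  return x % 2 == 0;  /* bc2 */
    case 3:  return x < 200;     /* bc3 */
    default: return x != 0;      /* bc4 */
    }
}

/* Hypothetical crash condition CC at the crash instruction. */
static bool crash_cond(int x) { return x < 100; }

static bool holds(const struct literal *l, int x)
{
    return branch_cond(l->branch, x) == l->taken;
}

/* Brute-force stand-in for an SMT query: is PC && CC satisfiable? */
bool solve(const struct literal *pc, size_t n, int *witness)
{
    for (int x = 0; x < 256; x++) {
        bool ok = crash_cond(x);
        for (size_t i = 0; ok && i < n; i++)
            ok = holds(&pc[i], x);
        if (ok) {
            *witness = x;
            return true;
        }
    }
    return false;
}

/* Index of the earliest conjunct that contradicts CC on its own, or -1.
 * A real UNSAT core may involve several conjuncts at once. */
int first_conflict(const struct literal *pc, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int unused;
        if (!solve(&pc[i], 1, &unused))
            return (int)i;
    }
    return -1;
}
```

With the first path condition {bc1, ¬bc3, bc4}, `solve` fails and `first_conflict` points at bc1, so the search backtracks to b1; the second path condition {¬bc1, bc2, bc4} is satisfiable together with CC and `solve` returns a concrete input.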
Evaluation
Tools compared: Hercules, Peach, S2E
Program   Advisory ID     #Seed files
WMP 9.0   CVE-2014-2671   10
WMP 9.0   CVE-2010-0718   10
AR 9.2    CVE-2010-2204   10
RP 1.0    CVE-2010-3000   10
MP 0.35   CVE-2011-0502   10
OV 1.04   CVE-2010-0688   10
Time bound: 24hrs
Vulnerabilities in file-processing programs
[Bar chart: #CVE-assigned vulnerabilities in file processing programs by year, 2007–2016 (US National Vulnerability Database; 2016 counted up to 30/8). Bar labels as extracted: 315, 399, 328, 352, 304, 310, 199, 203, 343, 169.]
Combining Black-box and White-box Fuzzing
Peach Fuzzer: production-quality MoBF
Augmented MoBF (MoBF + Transplantation)
• Handles missing data chunks by data chunk transplantation
• Enforces integrity checks
Hercules (ICSE’15): scales to WMP, Adobe Reader
Selective and Targeted Whitebox Fuzzing
• Guides data chunk transplantation
• Explores deep paths
• Generates specific values causing program crashes
Combination
Crucial IF
[Diagram: crucial IFs are drawn from the test suites; an input file with the necessary part is contrasted with an input file with a missing part.]
Experimental Results
Tools compared: Hercules++, Peach, Hercules
Program    Advisory ID     Input Model   #Seed files
VLC 2.0.7  OSVDB-95632     PNG           0 – 10
VLC 2.0.3  CVE-2012-5470   PNG           0 – 10
LTP 1.5.4  CVE-2011-3328   PNG           0 – 10
XNV 1.98   Unknown-1       PNG           0 – 10
XNV 1.98   Unknown-2       PNG           0 – 10
XNV 1.98   Unknown-3       PNG           0 – 10
WMP 9.0    Unknown-4       WAV           10
WMP 9.0    CVE-2014-2671   WAV           10
WMP 9.0    CVE-2010-0718   MIDI          0 – 10
AR 9.2     CVE-2010-2204   PDF           10
RP 1.0     CVE-2010-3000   FLV           10
MP 0.35    CVE-2011-0502   MIDI          0 – 10
OV 1.04    CVE-2010-0688   ORB           0 – 10
Coverage-based Grey-box Fuzzing
AFL, LibFuzzer …
[Diagram: files from the test suite are enqueued in an input queue; dequeued files pass through mutators, and the mutated files are fed back to the queue.]
Exposing paths in Grey-Box Fuzzing
Key change
Input: Seed Inputs S
T✗ = ∅
T = S
if T = ∅ then
    add empty file to T
end if
repeat
    t = chooseNext(T)
    p = assignEnergy(t)
    for i from 1 to p do
        t' = mutate_input(t)
        if t' crashes then
            add t' to T✗
        else if isInteresting(t') then
            add t' to T
        end if
    end for
until timeout reached or abort-signal
Output: Crashing Inputs T✗
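The loop above can be tried out on a toy target. The following C sketch is not AFL's implementation; it makes several simplifying assumptions, each marked in the comments: the target is a hypothetical four-byte magic check that returns a path id instead of aborting, `mutate_input` is a deterministic single-byte sweep rather than a random mutator, `assignEnergy` is a constant, and the loop stops when every queued input has been fuzzed once instead of on a timeout.

```c
#include <stdbool.h>
#include <string.h>

#define INPUT_LEN 4
#define MAX_QUEUE 16
#define NUM_PATHS (INPUT_LEN + 1)   /* path id = number of matched bytes */
#define ENERGY    (INPUT_LEN * 256) /* assumption: constant energy */

/* Hypothetical target: path id INPUT_LEN stands for the crash. */
static int run_target(const unsigned char *s)
{
    static const unsigned char magic[INPUT_LEN] = { 'b', 'a', 'd', '!' };
    int depth = 0;
    while (depth < INPUT_LEN && s[depth] == magic[depth])
        depth++;
    return depth;
}

/* Assumption: deterministic stand-in for random mutation. The i-th
 * mutant overwrites byte (i / 256) with value (i % 256). */
static void mutate_input(const unsigned char *t, int i, unsigned char *out)
{
    memcpy(out, t, INPUT_LEN);
    out[(i / 256) % INPUT_LEN] = (unsigned char)(i % 256);
}

/* Returns true and fills `crash` if a crashing input is found. */
bool fuzz(const unsigned char *seed, unsigned char *crash, long *executions)
{
    unsigned char queue[MAX_QUEUE][INPUT_LEN];   /* T */
    bool seen[NUM_PATHS] = { false };
    int queue_len = 0;

    *executions = 1;
    if (run_target(seed) == INPUT_LEN) {
        memcpy(crash, seed, INPUT_LEN);
        return true;
    }
    seen[run_target(seed)] = true;
    memcpy(queue[queue_len++], seed, INPUT_LEN);

    for (int next = 0; next < queue_len; next++) {      /* chooseNext */
        for (int i = 0; i < ENERGY; i++) {              /* p = ENERGY */
            unsigned char mutant[INPUT_LEN];
            mutate_input(queue[next], i, mutant);
            int path = run_target(mutant);
            (*executions)++;
            if (path == INPUT_LEN) {                    /* t' crashes */
                memcpy(crash, mutant, INPUT_LEN);
                return true;
            }
            if (!seen[path] && queue_len < MAX_QUEUE) { /* isInteresting */
                seen[path] = true;
                memcpy(queue[queue_len++], mutant, INPUT_LEN);
            }
        }
    }
    return false;
}
```

Starting from the seed `aaaa`, each newly covered path puts one more input into the queue, so the magic bytes are discovered one at a time with a few thousand executions, whereas guessing all four bytes blindly means searching a space of 256^4 inputs.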
Power Schedules
• Constant: $p(i) = \alpha(i)$
o AFL uses this schedule (fuzzing ~1 minute)
o $\alpha(i)$: how AFL judges fuzzing time for the test exercising path $i$
• Cut-off Exponential:
$$p(i) = \begin{cases} 0 & \text{if } f(i) > \mu \\ \min\left(\frac{\alpha(i)}{\beta} \cdot 2^{s(i)},\; M\right) & \text{otherwise} \end{cases}$$
$\beta$: a constant
$s(i)$: #times the input exercising path $i$ has been chosen for fuzzing
$f(i)$: #fuzz exercising path $i$ (path-frequency)
$\mu$: mean #fuzz exercising a discovered path (avg. path-frequency)
$M$: maximum energy expendable on a state
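As a minimal sketch, the cut-off exponential schedule can be written as a single C function. The function and parameter names are hypothetical, α(i) is taken to be the baseline energy AFL would assign, and the division α(i)/β is done in integer arithmetic, which is an assumption of this sketch rather than something the slide specifies.

```c
#include <stdint.h>

/* Cut-off exponential schedule:
 *   p(i) = 0                                 if f(i) > mu
 *   p(i) = min(alpha(i)/beta * 2^s(i), M)    otherwise
 * alpha: baseline energy, beta: constant divisor,
 * s: times the input was chosen, f: path frequency,
 * mu: mean path frequency, max_energy: M. */
uint64_t coe_energy(uint64_t alpha, uint64_t beta, unsigned s,
                    uint64_t f, double mu, uint64_t max_energy)
{
    if ((double)f > mu)
        return 0;                 /* path fuzzed more often than average */
    if (beta == 0)
        return max_energy;        /* assumption: treat beta = 0 as "no damping" */

    uint64_t base = alpha / beta; /* assumption: integer division */
    if (base == 0)
        return 0;
    /* Saturate at M instead of overflowing the shift. */
    if (s >= 64 || base > (max_energy >> s))
        return max_energy;
    return base << s;
}
```

For example, with α = 100, β = 10 and M = 1000, an input on a low-frequency path gets energy 10 the first time, 80 after being chosen three times, and is capped at 1000 once the exponential term exceeds M; an input whose path frequency is above the mean gets no energy at all.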
Prioritize low probability paths [CCS16]
Coverage-based Greybox Fuzzing as Markov Chain
• Use grey-box fuzzer which keeps track of path id for a test.
• Find probabilities that fuzzing a test t which exercises π leads to an input which exercises π'
• Higher weightage to low probability paths discovered, to gravitate to those -> discover new states in Markov Chain with minimal effort.
[Diagram: Markov chain transition from path π to path π' with probability p]
#include <stdlib.h>

void crashme(char *s) {
    if (s[0] == 'b')
        if (s[1] == 'a')
            if (s[2] == 'd')
                if (s[3] == '!')
                    abort();
}
• 8 CVEs in Binutils (3 new over GB fuzzing)
• Finds crashes 7x faster, as compared to plain GB fuzzing.
• Independent evaluation found crashes 19x faster on DARPA Cyber Grand Challenge (CGC) binaries.
From Hackernews
Other works – Crash Bucketing
[Upcoming work FASE17]
[Diagram: execution tree from path p1 through branches b1 to b5; failing paths f1 to f4 are marked with x, and the culprit constraint is highlighted.]
• Identify culprit constraint
• Use culprit constraint as “reason” of failure
• Group failing paths having same “reason” together
Approaches compared:
• Point-of-Failure based Approach
• Call-stack based Approach
• Symbolic analysis based Approach
Program Repository Size(kLOC) #FailingTests #Clusters(Point-of-Failure) #Clusters(Stack hash) #Clusters(Symbolic Analysis)
mkfifo Coreutils 38 2 1 1 1
mkdir Coreutils 40 2 1 1 1
mknod Coreutils 39 2 1 1 1
md5sum Coreutils 43 48 1 1 1
pr Coreutils 54 6 2 2 4
ptx Coreutils 62 3095 16 1 3
seq Coreutils 39 72 1 1 18
paste Coreutils 38 4510 10 1 3
touch Coreutils 18 406 2 3 14
du Coreutils 41 100 2 2 8
cut Coreutils 43 5 1 1 1
grep SIR 61 7122 1 1 11
gzip SIR 44 265 1 1 1
seq SIR 57 31 1 1 1
polymorph BugBench 25 67 1 1 2
xmail Exploit-db 30 129 1 1 1
exim Exploit-db 253 16 1 1 6
gpg Exploit-db 218 2 1 1 1
Recall CGC
DARPA Cyber Grand Challenge
-> Automation of Security [detecting and fixing vulnerabilities in binaries automatically]
Auto-Patching
Automated Patching
[Diagram: Buggy Program and Failing / Passing Tests go into the Patching Tool, which outputs the Patched Program.]
• Automated patching – source code and binaries
o Vulnerability localization [where to fix]
    • Hypothesize the error causes – suspect
o Symbolic execution [what values should be returned: angelic values]
    • Specification of the suspicious fragment
    • Input-output requirements from each test
    • Repair constraint
o Program synthesis [which code can return these values]
    • Decide operators which can appear in the fix
    • Generate a fix by solving repair constraint.
Example
int is_upward(int inhibit, int up_sep, int down_sep) {
    int bias;
    if (inhibit)
        bias = down_sep;   // bug; correct: bias = up_sep + 100
    else
        bias = up_sep;
    if (bias > down_sep)
        return 1;
    else
        return 0;
}

inhibit  up_sep  down_sep  Observed output  Expected output  Result
1        0       100       0                0                pass
1        11      110       0                1                fail
0        100     50        1                1                pass
1        -20     60        0                1                fail
0        0       10        0                0                pass
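The table can be reproduced with a small test harness. This is a sketch only: the names `is_upward_buggy`, `is_upward_fixed`, `TESTS` and `count_failures` are hypothetical, and the fixed variant simply applies the expression given in the comment on the buggy line.

```c
#include <stddef.h>

struct test_case { int inhibit, up_sep, down_sep, expected; };

/* The five tests from the table: inputs and expected output. */
static const struct test_case TESTS[] = {
    {1,   0, 100, 0},
    {1,  11, 110, 1},
    {0, 100,  50, 1},
    {1, -20,  60, 1},
    {0,   0,  10, 0},
};

typedef int (*is_upward_fn)(int inhibit, int up_sep, int down_sep);

/* Buggy version, as on the slide. */
int is_upward_buggy(int inhibit, int up_sep, int down_sep)
{
    int bias;
    if (inhibit)
        bias = down_sep;          /* buggy assignment */
    else
        bias = up_sep;
    return bias > down_sep ? 1 : 0;
}

/* Version with the intended assignment bias = up_sep + 100. */
int is_upward_fixed(int inhibit, int up_sep, int down_sep)
{
    int bias = inhibit ? up_sep + 100 : up_sep;
    return bias > down_sep ? 1 : 0;
}

/* Number of tests whose observed output differs from the expected one. */
size_t count_failures(is_upward_fn impl)
{
    size_t failures = 0;
    for (size_t i = 0; i < sizeof TESTS / sizeof TESTS[0]; i++) {
        const struct test_case *t = &TESTS[i];
        if (impl(t->inhibit, t->up_sep, t->down_sep) != t->expected)
            failures++;
    }
    return failures;
}
```

Calling `count_failures(is_upward_buggy)` reports the two failing rows of the table, while `count_failures(is_upward_fixed)` reports none.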
Repair Constraint
int is_upward(int inhibit, int up_sep, int down_sep) {
    int bias;
    if (inhibit)
        bias = f(inhibit, up_sep, down_sep);
    else
        bias = up_sep;
    if (bias > down_sep)
        return 1;
    else
        return 0;
}
Symbolic execution with inhibit == 1, up_sep == 11, down_sep == 110 yields:
f(1,11,110) > 110
Conjure up a function
Repair Constraint:
$f(1,11,110) > 110 \wedge f(1,0,100) \le 100 \wedge f(1,-20,60) > 60$
• Instead of solving the repair constraint directly
• Select primitive components to be used by the synthesized program based on complexity
• Look for a program that uses only these primitive components and satisfies the repair constraint
o Done via another constraint solving problem – pgm. synthesis
• Solving the repair constraint is the key, not how it is solved
• Enumerate expressions over a given set of components / operators
o Enforce axioms of the operators
o If candidate repair contains a constant, solve using SMT
Patching Tool Released
SEMFIX: ICSE 2013, Angelix: ICSE 2016
http://angelix.io
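To make the enumeration idea concrete, here is a hedged C sketch that searches for an expression satisfying the repair constraint of the is_upward example, f(1,11,110) > 110, f(1,0,100) ≤ 100 and f(1,-20,60) > 60. It is not how SEMFIX or Angelix are implemented: the three candidate templates, the constant range of -200 to 200 and all names are assumptions made for illustration, and a brute-force loop replaces the SMT solver.

```c
#include <stdbool.h>

/* Hypothetical candidate shapes for f(inhibit, up_sep, down_sep). */
enum template_kind { TPL_CONST, TPL_UP_PLUS_C, TPL_DOWN_PLUS_C, TPL_COUNT };

static int eval_candidate(enum template_kind k, int c, int up_sep, int down_sep)
{
    switch (k) {
    case TPL_UP_PLUS_C:   return up_sep + c;
    case TPL_DOWN_PLUS_C: return down_sep + c;
    default:              return c;
    }
}

/* Repair constraint for is_upward; inhibit is 1 in all three conjuncts,
 * so the candidates do not take it as an argument. */
static bool satisfies_repair_constraint(enum template_kind k, int c)
{
    return eval_candidate(k, c, 11, 110) > 110
        && eval_candidate(k, c, 0, 100) <= 100
        && eval_candidate(k, c, -20, 60) > 60;
}

/* Tries simpler templates first; brute force stands in for SMT solving. */
bool synthesize(enum template_kind *out_kind, int *out_c)
{
    for (int k = 0; k < TPL_COUNT; k++) {
        for (int c = -200; c <= 200; c++) {
            if (satisfies_repair_constraint((enum template_kind)k, c)) {
                *out_kind = (enum template_kind)k;
                *out_c = c;
                return true;
            }
        }
    }
    return false;
}
```

Under these assumptions no plain constant and no `down_sep + c` satisfies all three conjuncts, and the only solution in range is `up_sep + 100`, which matches the intended fix for the example.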
Repair-ed
[Bar chart comparing Angelix, SPR and GenProg on wireshark, php, gzip, gmp, libtiff and overall.]
Tool      #Fixes  Del  Del (%)
Angelix   28      5    18%
SPR       31      13   42%

Subject    LoC
wireshark  2814K
php        1046K
gzip       491K
gmp        145K
libtiff    77K
Over-fitting problem in Program Repair
• Searches for arbitrary modifications could lead to undesirable program modifications like deletion of functionality
Example of automatically generated patches
static void BadPPM(char *file) {
    fprintf(stderr, "%s: Not a PPM file.\n", file);
    exit(-2);
}
Goal of repair tools: make all tests pass
Test: Pass if non-zero exit status
Trivial patch: Delete exit(-2)
➢ Should disallow such modifications
➢ Derived rules that disallow patches that cause significant changes to the control flow or data-flow of the program
➢ Benefits of anti-patterns:
○ Can be easily integrated with any automated repair tool
○ Localizes better
○ Generates fixes faster
“Latest”
Results
if (hbtype == TLS1_HB_REQUEST) {
    ...
    memcpy(bp, pl, payload);
    ...
}
(a) The buggy part of the Heartbleed-vulnerable OpenSSL

if (hbtype == TLS1_HB_REQUEST
    && payload + 18 < s->s3->rrec.length) {
    ...
}
(b) A fix generated automatically

if (1 + 2 + payload + 16 > s->s3->rrec.length)
    return 0;
...
if (hbtype == TLS1_HB_REQUEST) {
    ...
}
else if (hbtype == TLS1_HB_RESPONSE) {
    ...
}
return 0;
(c) The developer-provided repair
The Heartbleed Bug is a serious vulnerability in the popular
OpenSSL cryptographic software library. This weakness allows
stealing the information protected, under normal conditions, by the
SSL/TLS encryption used to secure the Internet. SSL/TLS provides
communication security and privacy over the Internet for
applications such as web, email, instant messaging (IM) and some
virtual private networks (VPNs).
--- Source: heartbleed.com
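The length guards in the generated fix (b) and the developer repair (c) can be compared directly. The sketch below treats them as pure arithmetic predicates on two unsigned integers; the function names and the use of `unsigned long` are assumptions of this sketch, and it says nothing about where in the function each check is placed. The payload range of 0 to 0xFFFF reflects the 16-bit payload length field of the heartbeat message.

```c
#include <stdbool.h>

/* Developer repair (c): the message is rejected when
 * 1 + 2 + payload + 16 > record_length, so it is accepted otherwise. */
bool developer_accepts(unsigned long payload, unsigned long record_length)
{
    return !(1 + 2 + payload + 16 > record_length);
}

/* Generated fix (b): the request is handled only when
 * payload + 18 < record_length. */
bool generated_accepts(unsigned long payload, unsigned long record_length)
{
    return payload + 18 < record_length;
}

/* Compares both guards for every 16-bit payload value and every
 * record length from 0 to max_record. */
bool guards_agree(unsigned long max_record)
{
    for (unsigned long payload = 0; payload <= 0xFFFF; payload++)
        for (unsigned long len = 0; len <= max_record; len++)
            if (developer_accepts(payload, len) != generated_accepts(payload, len))
                return false;
    return true;
}
```

On integers, `payload + 18 < length` holds exactly when `payload + 19 <= length`, so the two guards accept the same payload and record length pairs; `guards_agree` checks this exhaustively over a bounded range.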
• Scalable white-box analysis on binaries
How                              Why                For whom
Cluster paths online             Guide search       SW Acquisition
Control Symbolic Variables       Extract semantics  Developers with 3rd party code
Hybrid symbolic file                                COTS system assembly
Inject path sensitivity into GB
Collaborators: Marcel Boehme, Satish Chandra (Facebook), Sergey Mechtaev, Van
Thuan Pham, Mukul Prasad (Fujitsu), Shin Hwei Tan, Jooyong Yi, Hiroaki Yoshida
(Fujitsu).
Relevant papers: http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/

Binary Analysis - Luxembourg

  • 1. Binary Analysis for Vulnerability Detection National University of Singapore http://www.comp.nus.edu.sg/~abhik Visit to University of Luxembourg S&T center, January 2017. 1 Research project with DSO National Labs, 2013-16. “TSUNAMi: Trustworthy systems from un-trusted component amalgamations” National Research Foundation (NRF), 2015-2020.
  • 2. Singapore 2 274 sq. mi., 5 million population, about 12 hours flight from Luxembourg.
  • 3. NUS 3 Founded 1905. 9000 grad. & 23000 undergrad. from 88 countries.
  • 4. Cybersecurity research 4 The National Cybersecurity R&D Programme seeks to develop R&D expertise and capabilities in cybersecurity for Singapore. It aims to improve the trustworthiness of cyber infrastructures with an emphasis on security, reliability, resiliency and usability. A 5-year S$130 million funding will be available to support research efforts into both technological and human-science aspects of cybersecurity in the following outcome- based R&D themes. The themes are designed to provide an element of operational context, while not restricting “game-changing” ideas from the community. Cybersecurity research spans six themes: Scalable Trustworthy Systems: Resilient Systems: Effective Situation Awareness and Attack Attribution: Combatting Insider Threats: Threats Detection, Analysis and Defence: Efficient and Effective Digital Forensics: https://www.nrf.gov.sg/programmes/national-cybersecurity-r-d- programme
  • 5. Outline • NCR project – Trustworthy systems from Un-trusted Components • Technical contributions in Binary Analysis • Technology showcase • Initiatives – Consortium 5
  • 6. COTS-integrated Platforms 6 Trustworthy System Outsourced and Shared Data Vulnerability Malicious Behavior Flaws Data Breach Binary analysis of paramount need for software acquisition or assembly.
  • 7. Vulnerability Discovery Binary Hardening Verification Data Protection 7 Agency Collaboration – DSTA, … Industry Collaboration ST, Symantec, NEC, … Education – NUS (New module) Research Outputs – Publications, Tools, Academic Collaboration, Exchanges, Seminars, Workshops Enhancing local capabilities
  • 8. Use of research in NRF project • Binary Analysis o Useful to government agencies for procuring software. o Deep binary analysis on evaluation version prior to procurement. • Binary hardening o Useful to government agencies on procured software. • Point technologies from individual work-packages. 8
  • 9. Contributions
      • Binary analysis for
          o Fuzz testing
          o Comprehension
          o Debugging
          o Patching (latest work)
      • -> Research program at NUS since 2008, with DRTech, DSO, …
  • 10. Video
      • https://youtu.be/C1hl_ujw6B0 (1 minute)
      • https://youtu.be/EHBjMSQvIpg (1 minute)
  • 11. Who cares? “A team of hackers won $2 million by building a machine that could hack better than they could.” Read more at http://www.businessinsider.sg/forallsecure-mayhem-darpa-cyber-grand-challenge-2016-8/#ZuIF7Dmq3aaCAdaq.99
      DARPA Cyber Grand Challenge -> automation of security (detecting and fixing vulnerabilities in binaries automatically)
  • 12. Fuzz Testing
      • Pioneered by Barton Miller at the University of Wisconsin in 1988.
      • And now, in 2016 …
          o Springfield Project – fuzzing as a service
          o OSS-Fuzz – continuous fuzzing for open-source projects
  • 13. A true story – why fuzz?
      • May 4, 2015
          o Abhik was preparing lecture notes on fuzzing.
          o 11:00 AM – finished deciding on the structure; trying to pick a motivating example for fuzzing to interest the students (there are so many of them).
          o 11:11 AM – an email update arrives about the latest incident: an integer overflow at Boeing, a classic case where an automated method for sending out malformed or boundary inputs can reveal errors.
  • 14. (Model-Based) Black-box Fuzzing (presented by Thuan Pham)
      [Diagram: a seed input and an input model feed a model-based black-box fuzzer (Peach, Spike, …); the mutated inputs either pass all checks or satisfy some checks.]
  • 15. (Coverage-based) Grey-box Fuzzing: AFLFast (presented by Thuan Pham)
      [Diagram: seed inputs enter an input queue; inputs are dequeued and mutated; “interesting” mutated inputs are put back in the queue (enqueue).]
  • 17. Problem Statement
      • How to direct the exploration to reach certain locations or targets, or enhance coverage
          o in large-scale program binaries
          o with highly-structured inputs (e.g., multi-media files)
          o given an inadequate test suite or seeds.
  • 18. Directed Search in White-box Fuzzing: applied to the crash reproduction problem
      • Crash reproduction supports
          o In-house debugging and fixing
          o Vulnerability checking
  • 19. Overview
      [Diagram: the Hercules toolset takes a program binary, benign input files, and crash information (crash instruction, loaded modules, call stack, register values), and produces crash input files.]
      • Hercules Toolset
          1. Directed Search Algorithm
          2. Guided Selective Symbolic Execution
  • 20. Control Flow Graph Construction: resolve indirect jumps/calls
      [Diagram: program binaries -> IDA Pro (assembly code, direct jumps/calls) -> CFG Generator -> CFG; indirect jumps/calls are resolved via Jump Table Extraction and Edge Profiling.]
  • 21. First-cut Analyzer
      • Output of Stage 1: flow structures and input file(s) that can reach the crash module
      • Output of Stage 2: refined CFG, MDG and hybrid symbolic file
      • Output of Stage 3: crash input(s) and crash explanation (based on UNSAT core)
  • 22. UNSAT-core
      [Diagram: branch instructions b1, b2, b3, b4 on the way to the crash instruction; each branch has outgoing edges labelled bcx and ¬bcx.]
      • Notations: bx: branch instruction; bcx: branch condition at bx; PC: path condition; CC: crash condition
      • First attempt: PC = bc1 ∧ ¬bc3 ∧ bc4. PC ∧ CC is UNSAT; bc1 contradicts CC.
          1) Backtrack to b1
          2) Take another branch
      • Second attempt: PC′ = ¬bc1 ∧ bc2 ∧ bc4. PC′ ∧ CC is SAT.
  • 23. Evaluation (time bound: 24 hrs)
      Tools compared: Hercules, Peach, S2E (the per-tool result marks are not preserved in this transcript).
      Program   Advisory ID     #Seed files
      WMP 9.0   CVE-2014-2671   10
      WMP 9.0   CVE-2010-0718   10
      AR 9.2    CVE-2010-2204   10
      RP 1.0    CVE-2010-3000   10
      MP 0.35   CVE-2011-0502   10
      OV 1.04   CVE-2010-0688   10
  • 24. Vulnerabilities in file-processing programs
      [Bar chart: #CVE-assigned vulnerabilities by year for file-processing programs (US National Vulnerability Database); 2016 counted up to 30/8.]
      Year    2007  2008  2009  2010  2011  2012  2013  2014  2015  2016
      #CVEs   315   399   328   352   304   310   199   203   343   169
  • 25. Combining Black-box and White-box Fuzzing
      • Augmented MoBF (MoBF + Transplantation)
          o Handles missing data chunks by data chunk transplantation
          o Enforces integrity checks
      • Selective and Targeted Whitebox Fuzzing
          o Guides data chunk transplantation
          o Explores deep paths
          o Generates specific values causing program crashes
      • Built on
          o Peach Fuzzer: production-quality MoBF
          o Hercules (ICSE’15): scales to WMP, Adobe Reader
  • 27. Crucial IF
      [Diagram labels: test suites; input file with the necessary part; input file with a missing part; crucial IFs.]
  • 28. Experimental Results
      Tools compared: Hercules++, Peach, Hercules (the per-tool result marks are not preserved in this transcript).
      Program    Advisory ID     Input Model   #Seed files
      VLC 2.0.7  OSVDB-95632     PNG           0 – 10
      VLC 2.0.3  CVE-2012-5470   PNG           0 – 10
      LTP 1.5.4  CVE-2011-3328   PNG           0 – 10
      XNV 1.98   Unknown-1       PNG           0 – 10
      XNV 1.98   Unknown-2       PNG           0 – 10
      XNV 1.98   Unknown-3       PNG           0 – 10
      WMP 9.0    Unknown-4       WAV           10
      WMP 9.0    CVE-2014-2671   WAV           10
      WMP 9.0    CVE-2010-0718   MIDI          0 – 10
      AR 9.2     CVE-2010-2204   PDF           10
      RP 1.0     CVE-2010-3000   FLV           10
      MP 0.35    CVE-2011-0502   MIDI          0 – 10
      OV 1.04    CVE-2010-0688   ORB           0 – 10
  • 29. Coverage-based Grey-box Fuzzing (AFL, LibFuzzer, …)
      [Diagram: the test suite fills an input queue; inputs are dequeued and passed to the mutators; mutated files are enqueued back into the input queue.]
  • 30. Exposing paths in Grey-box Fuzzing
  • 31. Key change
        Input: seed inputs S
        T✗ = ∅
        T = S
        if T = ∅ then
            add empty file to T
        end if
        repeat
            t = chooseNext(T)
            p = assignEnergy(t)
            for i from 1 to p do
                t′ = mutate_input(t)
                if t′ crashes then
                    add t′ to T✗
                else if isInteresting(t′) then
                    add t′ to T
                end if
            end for
        until timeout reached or abort-signal
        Output: crashing inputs T✗
  • 32. Power Schedules
      • Constant: p(i) = α(i)
          o AFL uses this schedule (fuzzing ~1 minute).
          o α(i): how AFL judges the fuzzing time for the test exercising path i.
      • Cut-off Exponential:
          p(i) = 0                             if f(i) > µ
          p(i) = min(α(i)/β · 2^(s(i)), M)     otherwise
        where
          β     a constant
          s(i)  number of times the input exercising path i has been chosen for fuzzing
          f(i)  number of fuzz (generated inputs) exercising path i (path frequency)
          µ     mean number of fuzz exercising a discovered path (average path frequency)
          M     maximum energy expendable on a state
  • 33. Prioritize low-probability paths [CCS16]
      • Use a grey-box fuzzer that keeps track of the path id for each test.
      • Find the probability that fuzzing a test t which exercises path π leads to an input which exercises path π′.
      • Give higher weightage to the low-probability paths discovered, to gravitate towards them -> discover new states in the Markov chain with minimal effort.
      • Example:
            void crashme(char* s) {
              if (s[0] == 'b')
                if (s[1] == 'a')
                  if (s[2] == 'd')
                    if (s[3] == '!')
                      abort();
            }
      • Results: 8 CVEs in Binutils (3 new over GB fuzzing). Finds crashes 7x faster than plain GB fuzzing. An independent evaluation found crashes 19x faster on DARPA Cyber Grand Challenge (CGC) binaries.
  • 34. Coverage-based Greybox Fuzzing as Markov Chain (from Hacker News)
  • 35. Other works – Crash Bucketing
      • Three approaches: point-of-failure based, call-stack based, symbolic-analysis based
      • Culprit constraint [upcoming work, FASE17]
          o Identify the culprit constraint
          o Use the culprit constraint as the “reason” of failure
          o Group failing paths having the same “reason” together
      [Diagram: a passing path p1 and failing paths f1 to f4 through branches b1 to b5, with the culprit constraint marked.]
  • 36. Number of clusters produced by each crash-bucketing approach
      Program    Repository  Size (kLOC)  #Failing tests  #Clusters (point-of-failure)  #Clusters (stack hash)  #Clusters (symbolic analysis)
      mkfifo     Coreutils   38           2               1                             1                       1
      mkdir      Coreutils   40           2               1                             1                       1
      mknod      Coreutils   39           2               1                             1                       1
      md5sum     Coreutils   43           48              1                             1                       1
      pr         Coreutils   54           6               2                             2                       4
      ptx        Coreutils   62           3095            16                            1                       3
      seq        Coreutils   39           72              1                             1                       18
      paste      Coreutils   38           4510            10                            1                       3
      touch      Coreutils   18           406             2                             3                       14
      du         Coreutils   41           100             2                             2                       8
      cut        Coreutils   43           5               1                             1                       1
      grep       SIR         61           7122            1                             1                       11
      gzip       SIR         44           265             1                             1                       1
      seq        SIR         57           31              1                             1                       1
      polymorph  BugBench    25           67              1                             1                       2
      xmail      Exploit-db  30           129             1                             1                       1
      exim       Exploit-db  253          16              1                             1                       6
      gpg        Exploit-db  218          2               1                             1                       1
  • 37. Recall CGC. “A team of hackers won $2 million by building a machine that could hack better than they could.” Read more at http://www.businessinsider.sg/forallsecure-mayhem-darpa-cyber-grand-challenge-2016-8/#ZuIF7Dmq3aaCAdaq.99
      DARPA Cyber Grand Challenge -> automation of security (detecting and fixing vulnerabilities in binaries automatically)
  • 39. Automated Patching
      • Automated patching – source code and binaries
          o Vulnerability localization [where to fix]
              • Hypothesize the error causes – suspect
          o Symbolic execution [what values should be returned: angelic values]
              • Specification of the suspicious fragment
              • Input-output requirements from each test
              • Repair constraint
          o Program synthesis [which code can return these values]
              • Decide operators which can appear in the fix
              • Generate a fix by solving the repair constraint.
      [Diagram: Buggy Program + Failing / Passing Tests -> Patching Tool -> Patched Program]
  • 40. Example
        int is_upward(int inhibit, int up_sep, int down_sep) {
          int bias;
          if (inhibit)
            bias = down_sep;  // bias = up_sep + 100
          else bias = up_sep;
          if (bias > down_sep)
            return 1;
          else return 0;
        }

      inhibit  up_sep  down_sep  Observed output  Expected output  Result
      1        0       100       0                0                pass
      1        11      110       0                1                fail
      0        100     50        1                1                pass
      1        -20     60        0                1                fail
      0        0       10        0                0                pass
  • 41. Repair Constraint
        int is_upward(int inhibit, int up_sep, int down_sep) {
          int bias;
          if (inhibit)
            bias = f(inhibit, up_sep, down_sep);
          else bias = up_sep;
          if (bias > down_sep)
            return 1;
          else return 0;
        }

      Symbolic execution with inhibit == 1, up_sep == 11, down_sep == 110 yields the constraint f(1,11,110) > 110.
  • 42. Conjure up a function
      • Repair constraint: f(1,11,110) > 110 ∧ f(1,0,100) ≤ 100 ∧ f(1,-20,60) > 60
      • Instead of solving the repair constraint directly:
          o Select the primitive components to be used by the synthesized program, based on complexity.
          o Look for a program that uses only these primitive components and satisfies the repair constraint.
              • Done via another constraint-solving problem: program synthesis.
      • Solving the repair constraint is the key, not how it is solved.
      • Enumerate expressions over a given set of components / operators.
          o Enforce the axioms of the operators.
          o If a candidate repair contains a constant, solve for it using SMT.
  • 43. Patching Tool Released. SemFix: ICSE 2013; Angelix: ICSE 2016. http://angelix.io
  • 44. Repair-ed
      [Bar chart: Angelix, SPR and GenProg compared on wireshark, php, gzip, gmp, libtiff and overall; axis from 0 to 40.]

      Tool     #Fixes  Del  Del (%)
      Angelix  28      5    18%
      SPR      31      13   42%

      Subject    LoC
      wireshark  2814K
      php        1046K
      gzip       491K
      gmp        145K
      libtiff    77K
  • 45. Over-fitting problem in Program Repair
      • Searching for arbitrary modifications can lead to undesirable program modifications, such as deletion of functionality.
      • Example of an automatically generated patch:
            static void BadPPM(char* file) {
              fprintf(stderr, "%s: Not a PPM file.\n", file);
              exit(-2);
            }
          o Goal of repair tools: make all tests pass
          o Test: pass if non-zero exit status
          o Trivial patch: delete exit(-2)
          o Such modifications should be disallowed.
      • Anti-patterns: derived rules that disallow patches that cause significant changes to the control flow or data flow of the program.
      • Benefits of anti-patterns:
          o Can be easily integrated with any automated repair tool
          o Localize better
          o Generate fixes faster
  • 46. “Latest” Results
      (a) The buggy part of the Heartbleed-vulnerable OpenSSL:
            if (hbtype == TLS1_HB_REQUEST) {
              ...
              memcpy(bp, pl, payload);
              ...
            }
      (b) A fix generated automatically:
            if (hbtype == TLS1_HB_REQUEST
                && payload + 18 < s->s3->rrec.length) {
              ...
            }
      (c) The developer-provided repair:
            if (1 + 2 + payload + 16 > s->s3->rrec.length)
              return 0;
            ...
            if (hbtype == TLS1_HB_REQUEST) {
              ...
            }
            else if (hbtype == TLS1_HB_RESPONSE) {
              ...
            }
            return 0;
      “The Heartbleed Bug is a serious vulnerability in the popular OpenSSL cryptographic software library. This weakness allows stealing the information protected, under normal conditions, by the SSL/TLS encryption used to secure the Internet. SSL/TLS provides communication security and privacy over the Internet for applications such as web, email, instant messaging (IM) and some virtual private networks (VPNs).” Source: heartbleed.com
  • 47. Scalable white-box analysis on binaries
      • How: cluster paths online; control symbolic variables; hybrid symbolic file; inject path sensitivity into GB
      • Why: guide search; extract semantics
      • For whom: SW acquisition; developers with 3rd-party code; COTS system assembly
      Collaborators: Marcel Boehme, Satish Chandra (Facebook), Sergey Mechtaev, Van Thuan Pham, Mukul Prasad (Fujitsu), Shin Hwei Tan, Jooyong Yi, Hiroaki Yoshida (Fujitsu).
      Relevant papers:
      http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
      http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/
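
The repair-constraint idea from slides 40 to 42 can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the SemFix or Angelix implementation: the three tests are the inhibit = 1 rows of the slide-40 table, while the component set (the three variables, +, -, and one integer constant) and the names `REPAIR_TESTS`, `candidates` and `synthesize` are hypothetical. The constant is found by brute-force search over a small range, where the slides say an SMT solver would be used.

```python
from itertools import product

# (inhibit, up_sep, down_sep, expected output) for the tests that reach the
# suspicious statement (inhibit == 1), copied from the slide-40 test table.
REPAIR_TESTS = [
    (1, 0, 100, 0),
    (1, 11, 110, 1),
    (1, -20, 60, 1),
]


def satisfies_repair_constraint(f):
    """True if 'bias = f(...)' makes every test return its expected output."""
    for inhibit, up_sep, down_sep, expected in REPAIR_TESTS:
        bias = f(inhibit, up_sep, down_sep)
        observed = 1 if bias > down_sep else 0
        if observed != expected:
            return False
    return True


def candidates(constants):
    """Enumerate candidate expressions, simplest first (assumed components)."""
    variables = {
        "inhibit": lambda i, u, d: i,
        "up_sep": lambda i, u, d: u,
        "down_sep": lambda i, u, d: d,
    }
    # Level 1: a single variable.
    for name, get in variables.items():
        yield name, get
    # Level 2: variable plus or minus a constant.
    for (name, get), c in product(variables.items(), constants):
        yield f"{name} + {c}", (lambda i, u, d, g=get, c=c: g(i, u, d) + c)
        yield f"{name} - {c}", (lambda i, u, d, g=get, c=c: g(i, u, d) - c)


def synthesize(constants=range(1, 201)):
    """Return the first enumerated expression that satisfies all tests."""
    for text, f in candidates(constants):
        if satisfies_repair_constraint(f):
            return text
    return None


if __name__ == "__main__":
    print(synthesize())  # prints: up_sep + 100
```

Under these assumptions the search arrives at the same expression as the comment on slide 40. With only three tests the constraint pins the constant to exactly 100, which also shows how strongly the synthesized patch depends on the test suite.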

Editor's notes

  1. This is where model-based black-box fuzzing comes in. The technique has been implemented in well-known tools such as Peach Fuzzer and Spike. The idea is to use an input model (sometimes called an input grammar) that specifies the file format, such as the data chunk types and data fields. With the support of the input model, the fuzzing tool can generate more valid and semi-valid inputs. As a result, these inputs reach deeper program paths and have a better chance of exposing vulnerabilities.
  2. The first and most common technique is black-box fuzzing. It treats the program under test (PUT) as a black box and has no information about it. Given a seed input, the tool randomly mutates parts of the seed file to generate a massive number of new files, feeds them to the program under test, and monitors the program to detect abnormal behaviour such as crashes. However, since the seed file is mutated randomly, a large portion of the mutated files is very likely to be rejected by the parser code because these files are invalid with respect to the file format.
  3. File-processing programs are everywhere. Even though these programs are carefully tested, according to the data we collected from the US National Vulnerability Database (NVD), in the 10 years since 2007 the NVD has assigned CVE IDs to more than 3000 vulnerabilities found in these programs. The real number could be much bigger, because we do not know how many vulnerabilities have been discovered but not reported to the NVD. Some of them may be sold on the black market, so that attackers can use them to exploit the affected programs and attack our systems. In fact, a large portion of these vulnerabilities has been exposed by crafted files in common media and document formats that we use very often in daily life, such as MIDI, FLV, PDF and PNG. There is therefore a pressing need to design a better testing technique that discovers such vulnerabilities effectively and efficiently before attackers do.
  4. Data chunk transplantation is the key idea in our new white-box fuzzing approach. We call it Model-based Whitebox Fuzzing because it combines, with substantial modifications, Model-based Blackbox Fuzzing and ordinary Whitebox Fuzzing. The Model-based Blackbox Fuzzing side handles the missing data chunk problem by implementing data chunk transplantation. Moreover, because it has the input model, it also enforces the integrity constraints of the generated test cases. On the right-hand side, Whitebox Fuzzing supports data chunk transplantation by providing guidance; I will explain how in detail in the next few slides. Whitebox Fuzzing also performs concolic exploration to reach potential crash locations and generates specific values that can cause the program to crash. In terms of implementation, we build our system on top of Peach Fuzzer, a production-quality fuzzer, and Hercules, a selective and targeted white-box fuzzer. Now let me explain in detail how our system is designed and implemented. First, let me explain how the input model is written and how the original version of Peach Fuzzer works. Both are important for fully understanding our approach.
  5. More satisfying to me as a security researcher than any academic award.
  6. Suppose f1 is a failing path. To identify the culprit constraint of f1, our technique explores paths using a DFS search strategy until it finds the closest passing path p. During the exploration, some new failing paths (f2, f3, f4) and some infeasible paths are traversed or detected. The branch condition at the branch from which the passing path p deviates is identified as the culprit constraint.
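
Note 2 above describes mutation-based fuzzing, and slides 31 and 32 add an input queue and a power schedule on top of it. The two can be sketched together in a few lines of Python. This is a minimal illustration, not AFL or AFLFast code: the target mirrors the `crashme` example on slide 33, the "path id" is simply the number of leading bytes that match "bad!", and the mutator, the uniform `chooseNext`, the default seed and the values used for alpha, beta and the maximum energy are all assumptions made for the sketch.

```python
import random

MAGIC = b"bad!"


def path_id(data):
    """Stand-in for coverage feedback: number of nested checks passed."""
    n = 0
    while n < len(MAGIC) and n < len(data) and data[n] == MAGIC[n]:
        n += 1
    return n


def crashes(data):
    """All four checks passed, i.e. crashme would reach abort()."""
    return path_id(data) == len(MAGIC)


def assign_energy(alpha, beta, s, f, mu, max_energy):
    """Cut-off exponential schedule: 0 if f > mu, else min(alpha/beta * 2**s, M).

    Integer arithmetic is used so that a large s cannot overflow a float.
    """
    if f > mu:
        return 0
    return min(alpha * 2 ** s // beta, max_energy)


def mutate(data, rng):
    """Assumed mutator: overwrite one random byte with a printable character."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(32, 127)
    return bytes(buf)


def fuzz(seeds, rounds=2000, alpha=8, beta=4, max_energy=64, seed=0):
    rng = random.Random(seed)
    # The slide starts from an empty file; a fixed-length dummy seed is
    # assumed here because the mutator above cannot grow an input.
    queue = list(seeds) or [b"\0" * len(MAGIC)]
    crashing = []
    chosen = {}  # s(i): times an input exercising path i was picked
    freq = {}    # f(i): generated inputs that exercised path i
    for t in queue:
        freq[path_id(t)] = freq.get(path_id(t), 0) + 1
    for _ in range(rounds):
        t = rng.choice(queue)  # chooseNext: uniform choice (assumption)
        i = path_id(t)
        mu = sum(freq.values()) / len(freq)
        energy = assign_energy(alpha, beta, chosen.get(i, 0), freq[i], mu, max_energy)
        chosen[i] = chosen.get(i, 0) + 1
        for _ in range(energy):
            t2 = mutate(t, rng)
            j = path_id(t2)
            is_new = j not in freq
            freq[j] = freq.get(j, 0) + 1
            if crashes(t2):
                crashing.append(t2)
            elif is_new:  # "interesting": exercises a path not seen before
                queue.append(t2)
    return queue, crashing


if __name__ == "__main__":
    queue, crashing = fuzz([b"aaaa"])
    print(len(queue), "queued inputs;", len(crashing), "crashing inputs")
```

The design point the sketch tries to show is the cut-off: paths that have already received more than the average number of generated inputs get zero energy, so effort drifts towards rarely exercised paths, which is the behaviour slide 33 describes.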