2. Introduction
• Simseer.com is a set of web services that analyse
malware using program structure as a signature.
Why?
• AV string signatures are not very robust.
• They can’t detect ‘approximate’ matches.
• It is hard to generate a signature for an entire family.
• Program structure improves signature-based
methods.
3. Who am I?
• Ph.D. Student at Deakin University.
• Presented at Ruxcon, Black Hat, AusCERT, etc.
• Published in academia.
• Book author.
• Recently relocated to Canberra.
5. Signatures
• Covered in detail in my other presentations.
• The signature is based on a ‘set of control flow graphs’.
6. Signature Extraction
• Transform ‘set of control flow graphs’ into a
‘feature vector’
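• Decompilation + N-Grams
[Figure: a control flow graph (nodes L_0 to L_7, edges labelled ‘true’) is decompiled into the structured code below, written as the string W|IEH}R, and split into the 4-grams W|IE, |IEH, IEH}, EH}R.]
proc(){
  L_0:
  while (v1 || v2) {
    L_1:
    if (v3) {
      L_2:
    } else {
      L_4:
    }
    L_5:
  }
  L_7:
  return;
}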
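The n-gram step on this slide can be sketched as follows. This is a minimal illustration, not Simseer's actual code: the function names `ngrams` and `feature_vector` are hypothetical, and it assumes each procedure has already been decompiled into a token string such as W|IEH}R and that the feature vector is a simple count of 4-grams.

```python
from collections import Counter


def ngrams(s, n=4):
    """Return every contiguous substring of length n, in order."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def feature_vector(procedure_strings, n=4):
    """Count n-grams over all procedure strings of one sample.

    Assumption: the vector is a sparse mapping n-gram -> count.
    """
    counts = Counter()
    for s in procedure_strings:
        counts.update(ngrams(s, n))
    return counts


if __name__ == "__main__":
    print(ngrams("W|IEH}R"))  # ['W|IE', '|IEH', 'IEH}', 'EH}R']
    print(feature_vector(["W|IEH}R", "W|IE"]))
```

A sparse counter keeps the sketch short; a fixed-length vector would need an agreed ordering of n-grams across all samples.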
7. Simseer
• Begin demo...
• A revamp of my existing
http://www.FooCodeChu.com service.
• Submit an archive of malware samples.
• Results
▫ A similarity matrix comparing samples.
▫ An evolutionary tree showing relationships.
10. Simseer
• Demo complete...
• Use ‘distance between vectors’ to show
similarity.
• Visualize using phylogenetics software.
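A pairwise distance matrix of the kind a phylogenetics tool could take as input might be built as below. This is a sketch under assumptions: the slide does not say which distance Simseer uses here, so Euclidean distance is chosen for illustration, and the sample vectors and the names `euclidean` and `distance_matrix` are made up.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two sparse vectors (dict: feature -> count)."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0) - v.get(k, 0)) ** 2 for k in keys))


def distance_matrix(vectors):
    """Square matrix m where m[i][j] is the distance between samples i and j."""
    return [[euclidean(a, b) for b in vectors] for a in vectors]


if __name__ == "__main__":
    # Hypothetical feature vectors for three samples.
    samples = [
        {"W|IE": 2, "|IEH": 1},
        {"W|IE": 2, "|IEH": 1},
        {"EH}R": 3},
    ]
    for row in distance_matrix(samples):
        print(["%.3f" % d for d in row])
```

Identical samples get distance 0, so close variants end up next to each other in the resulting tree.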
11. SimseerCluster
• Begin demo...
• A new service.
• Submit an archive of malware samples.
• Define the number of clusters.
• Results
▫ Samples grouped into clusters.
▫ Cross checking samples with AV.
▫ Identification of families.
14. SimseerCluster
• Demo complete...
• Use ‘similarity matrix’ and ‘cosine similarity’.
• Pass to ‘cluster analysis software’ – The Weka
Machine Learning Toolkit.
• Use hierarchical clustering.
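The clustering step can be illustrated without Weka. The sketch below is a pure-Python stand-in, not the Weka implementation the service uses: it assumes average linkage over cosine similarity and stops merging when the requested number of clusters is reached. The names `cosine_similarity` and `agglomerative` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors (dict: feature -> weight)."""
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def agglomerative(vectors, k):
    """Merge the two most similar clusters until only k clusters remain.

    Assumption: average linkage. Returns lists of sample indices.
    """
    k = max(k, 1)
    sim = [[cosine_similarity(a, b) for b in vectors] for a in vectors]
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean pairwise similarity between members of the two clusters.
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = max(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


if __name__ == "__main__":
    vecs = [{"a": 1, "b": 1}, {"a": 2, "b": 2}, {"c": 1}, {"c": 3, "d": 0.1}]
    print(agglomerative(vecs, 2))
```

The quadratic search for the best pair is fine for a demo-sized archive; a real service would rely on an optimised library.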
15. SimseerSearch
• Begin demo...
• A new service.
• Submit a malware sample.
• Specify threshold of similarity.
• Results
▫ All samples in database similar to query.
▫ An AV report.
▫ Heuristics to detect obfuscations (packing).
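18. SimseerSearch
[Figure: similarity search around a query q with radius r, showing the distance d(p,q) to a sample p; labels: Query, Query Benign, Query Malicious, Malware.]
• Demo complete...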
• Use ‘nearest neighbour similarity search’ based
on ‘Euclidean distance’.
• Packer detection based on entropy analysis.
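Both ideas on this slide can be sketched briefly. This is an illustration under assumptions, not the service's implementation: the search is a linear scan returning samples within a Euclidean radius r of the query, and the packing heuristic flags data whose byte entropy exceeds a threshold. The 7.0 bits-per-byte cut-off and all function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(u, v):
    """Euclidean distance between two sparse vectors (dict: feature -> count)."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0) - v.get(k, 0)) ** 2 for k in keys))


def range_search(query, database, r):
    """Return (name, distance) for every sample within distance r, nearest first."""
    hits = ((name, euclidean(query, vec)) for name, vec in database.items())
    return sorted((h for h in hits if h[1] <= r), key=lambda h: h[1])


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def looks_packed(data, threshold=7.0):
    """Heuristic: high entropy suggests compressed or encrypted content.

    The threshold is an assumed value chosen for illustration.
    """
    return shannon_entropy(data) >= threshold


if __name__ == "__main__":
    db = {"sample_a": {"W|IE": 1}, "sample_b": {"W|IE": 5}}
    print(range_search({"W|IE": 1}, db, r=1.0))
    print(shannon_entropy(bytes(range(256))))
```

A linear scan is enough to show the idea; a large database would need an index built for metric similarity search.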