In this presentation, I introduce the concepts of malware analysis, threat intelligence and reverse engineering. No prior experience or knowledge is required.
Feel free to send me feedback via Twitter (@bartblaze) or email.
Blog post: https://bartblaze.blogspot.com/2018/02/malware-analysis-threat-intelligence.html
Labs: https://github.com/bartblaze/MaTiRe
Mind the disclaimer.
2. Introduction
● Career of 8+ years in information security
● Last 4 years even more involved in malware research & analysis
● Maintain a personal blog (https://bartblaze.blogspot.com)
● Twitter: @bartblaze
● Email: bartblaze@gmail.com
● Please do reach out!
3. What we will see today
● Short introductions for each section
malware analysis, threat intelligence, reverse engineering
● Combining all three together while taking a deep(er) dive
● Hands-on exercises of course!
And also…
● Feel free to interrupt me at any point during the course
● Contact me online or offline at any given point
4. Preparation: for this course
Verify that…
● You have already downloaded the Virtual Machine provided
● You have not installed VirtualBox guest additions
● The VM is in a clean state - it cannot be already infected with malware
● Take a snapshot of the clean state
No VM available? Ask if you can partner up.
In case of any other issues or questions… Shout :-)
5. Preparation: for your own lab at home
● Always isolate your virtual environment
○ Use NAT
○ Use a VPN on the host, if you allow communication
● Snapshot capability
○ VirtualBox (Free)
○ VMware Workstation (Not free)
● Never install VirtualBox guest additions or VMware tools
○ Why? They leave traces in your VM, making it easier for malware to identify
● If possible, have a non-Windows machine as host
● Keep your host machine, antivirus and virtualization software updated!
7. Malware Analysis: The basics...
Malware is any form of malicious software. (Plural: malware)
Which types of malware do you know? Can you give me an example?
● Virus: needs user interaction & infects other applications; for example a file infector
● Worm: self-replicates; for example to shares or removable media
● Trojan or trojan horse: disguised as an innocuous application
● Backdoor: allows for persistent access; for example RATs
● Rootkit: any application that allows for privileged and persistent access
● Spyware: any application that spies on the user; for example a keylogger
● Ransomware: holds the user's files, browser or whole machine hostage in return for a ransom
● PUP/PUA/Adware: modifies browser settings and/or installs unwanted applications
8. A word on ransomware
Source: https://blogs.technet.microsoft.com/mmpc/2017/09/06/ransomware-1h-2017-review-global-outbreaks-reinforce-the-value-of-security-hygiene/
(2017, Microsoft)
9. How did it start?
1989: AIDS trojan, first case of ransomware
2005: GPcode (PGPCoder)
2009/2010: WinLock
2012: ACCDFISA, Urausy, Reveton
2013: CryptoLocker
2014: CTB-Locker (Critroni), TorrentLocker, CryptoWall
2015: Mobile ransomware (on Android), such as Fusob
2016: Locky, ‘open-source’ ransomware such as Eda2 and Hidden Tear
2017: WannaCry (May), NotPetya (June), BadRabbit (October)
10. CryptoLocker
● Heralded the end of rogueware (fake antivirus)
● Innovative …
● Inspirational …
● … And very annoying :-)
Many assumed that any form of cryptographic ransomware (“cryptoware”) was CryptoLocker; however, it was only one ransomware variant. It has been dead since 2014.
11. How does one get malware? The bad way
● Phishing or spear-phishing
● Exploit kits
● Drive-by download
● USB drive or other removable media
● Network (shares, SMB)
● Manual installation (RDP, VNC, TeamViewer, …)
● Watering hole (Strategic Web Compromise)
● Other malware that downloads and/or installs ‘companions’
12. How does one get malware? For analysis purposes
● Malware Sample Sources for Researchers
https://zeltser.com/malware-sample-sources/
● List of Malware Sources
http://www.kernelmode.info/forum/viewtopic.php?f=16&t=308
● Get a job in this field that applies most to you. Keywords for jobs:
Malware analyst/researcher, threat intelligence (analyst), reverser/reverse
engineer - and any of these but add ‘cyber’ in front - yes really
● Ask other researchers :-)
13. Analysing malware: static vs dynamic
● Static: do not run the malware, look at static properties
○ Can you think of tools, or what could be considered static
properties?
● Dynamic: run the malware, and examine what happens from there on
○ Can you think of tools, or what could only be discovered by
running the malware?
Why not both?
14. Static malware analysis: primer
First off, consider the type of a file. Is it a(n)…
● Executable? EXE, COM, SCR, PIF, DLL
○ Strings, compile time, imports, sections, …
● Image? PNG, BMP, JPG, GIF
○ Steganography, hidden content, creator/creation date, ...
● Office file? DOC/DOCX, XLS/XLSX, RTF
○ Creator/creation date, embedded content, filename, …
● Adobe file? PDF, EPS, SWF/FWS
○ Creator/creation date, embedded content, filename, …
● Archive? ZIP, RAR, 7z, ISO
○ Creation date, contents, …
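A file extension can be changed freely, so a first check of the file type is often done on the leading bytes ("magic bytes") instead. The sketch below is a minimal illustration, assuming a small hand-picked table of well-known signatures; the function name identify_file_type is hypothetical, and COM and PIF files are not covered because they lack such a signature.

```python
# Well-known leading byte signatures ("magic bytes") and the file type they suggest.
# The table is deliberately small; it is an illustration, not a complete list.
MAGIC_SIGNATURES = [
    (b"MZ", "Executable (EXE/DLL/SCR)"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPG image"),
    (b"GIF8", "GIF image"),
    (b"BM", "BMP image"),
    (b"%PDF", "PDF document"),
    (b"{\\rtf", "RTF document"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (DOC/XLS)"),
    (b"PK\x03\x04", "ZIP archive (also DOCX/XLSX)"),
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"7z\xbc\xaf\x27\x1c", "7z archive"),
]


def identify_file_type(data: bytes) -> str:
    """Return a description based on the first bytes of the file content."""
    for signature, description in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return description
    return "Unknown"


print(identify_file_type(b"MZ\x90\x00"))
print(identify_file_type(b"%PDF-1.7"))
```

To use it on a real sample, read the first few bytes of the file in binary mode and pass them in. A match only suggests a type; the tools on the next slide give a far more reliable verdict.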
15. Static malware analysis: tools
It is important to have a proper toolbox, or toolset
● Executable?
○ ExeinfoPE, Detect it Easy (DIE), PEViewer (RogueKillerPE)
● Office document?
○ Oletools, oledump, OfficeMalScanner, QuickSand
● Adobe document?
○ Pdfid, pdf-parser, PDF Stream Dumper
Additionally: strings2, FLOSS, and… calculate the hash! (MD5, SHA1, SHA256)
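Calculating the hashes can be done with any tool; as a minimal sketch, this is how it could look with Python's standard hashlib module. The function names hash_bytes and hash_file are hypothetical helpers, not part of the lab tooling.

```python
import hashlib


def hash_bytes(data: bytes) -> dict:
    """Return the MD5, SHA1 and SHA256 hex digests of a byte string."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def hash_file(path: str) -> dict:
    """Read a file in binary mode and hash its content (never execute it)."""
    with open(path, "rb") as handle:
        return hash_bytes(handle.read())


for name, digest in hash_bytes(b"example content").items():
    print(name, digest)
```

The resulting hashes can be used to look up a sample (for example on VirusTotal) without uploading the file itself.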
16. Lab 1: static analysis
● On the desktop, you can find a folder named ‘Labs’
● Examine the files inside the LAB1 folder
Instructions
● Use HxD, ExeinfoPE and PEViewer to look at the files
● Go over at least these tabs in PEViewer:
○ Dashboard, Indicators, Hex/Strings, PE Sections/Overlay, PE
Imports/Exports/TLS, PE Debug, PE Resources, Version Info/Digital
Signature
● Run FLOSS and/or strings2 over the files, and identify any strings of interest
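To get a feel for what a strings tool does under the hood, here is a minimal sketch that pulls printable ASCII and UTF-16LE ("wide") strings out of raw bytes. It is a simplification and not how FLOSS or strings2 are implemented; the function name extract_strings and the sample bytes are made up for illustration.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return printable ASCII strings followed by UTF-16LE strings."""
    # Runs of printable ASCII characters (space up to tilde).
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    # Runs of printable characters each followed by a null byte (UTF-16LE).
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found


# Made-up sample: binary noise around one ASCII and one wide string.
sample = b"\x00\x01http://example.invalid\x00\xff" + "cmd.exe".encode("utf-16-le")
print(extract_strings(sample))
```

Strings of interest are typically URLs, IP addresses, file paths, registry keys and commands.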
17. Lab 1: addendum
PE - what’s in a file?
● Short for PECOFF - Portable Executable and Common Object File Format Specification
● Windows only! x86 and x64
● Executables, object code, DLLs
● For extensive reading: “PE Format” (Microsoft)
https://msdn.microsoft.com/en-us/library/windows/desktop/ms680547%28v=vs.85%29.aspx
DOS header - MZ - 0x4D5A
PE header - PE 0x5045
Sections - code, data, imports & original
entrypoint (OEP)
Resources - icons
Overlay - appended data
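The two signatures in the layout above can be checked by hand in HxD, or programmatically. The sketch below assumes the standard layout in which the 4-byte field at offset 0x3C of the DOS header holds the offset of the PE header; the function name looks_like_pe and the synthetic buffer are illustrative only.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check for the MZ signature and a PE signature at the recorded offset."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # The field at 0x3C (e_lfanew) stores where the PE header starts.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Synthetic buffer that only contains the two signatures, not a real program.
sample = bytearray(0x80)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x40)
sample[0x40:0x44] = b"PE\x00\x00"
print(looks_like_pe(bytes(sample)))
```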
18. Dynamic malware analysis: primer
You have two different ways of doing dynamic analysis:
● Do it yourself: run the malware in a VM
○ Manual dynamic analysis
● Use a sandbox: let a sandbox take care of the malware
○ Automatic dynamic analysis
What are some of the pros and cons of running the malware yourself on the one hand, and letting a machine take care of it on the other?
19. VirusTotal
https://www.virustotal.com
● Scans with over 60 engines (antivirus/machine learning)
● Supports, in theory, all file types
● Extensive file details
● Limited VT sandbox, Tencent’s sandbox
● Useful for a second opinion
● All uploads are public
21. Online sandboxes - part II
● https://malware.sekoia.fr/new
(documents only, no executables)
● https://iris-h.malwageddon.com/
(documents only, no executables)
● https://manalyzer.org/
(executables only)
22. Quick concept - packed malware
● Traditionally used for…
○ shrinking the file size
● Now also used for ‘obfuscating’ the file and its strings, imports, …
DOS header - MZ - 0x4D5A
PE header - PE 0x5045
Unpacker code & entrypoint (stub)
Packed sections
23. Lab 2: dynamic analysis
● On the desktop, you can find a folder named ‘Labs’
● Examine the files inside the LAB2 folder
Instructions
● Execute the file. Check out some of the buttons. What functionality does this file
have, at least?
● Open Process Hacker and examine the strings of the process. Click the process
Properties > Memory tab > Strings button
● Can you find any peculiar string(s), that static analysis or strings does not reveal?
24. Additional tools
Fakenet
● Create a “fake network”
● Tricks the malware into thinking there’s a network connection
● Serves back files correspondingly
CaptureBAT
● x86 only (32-bit Windows)
● Create a log file to analyse registry, file changes and more
● Creates a copy of deleted files
25. Lab 3: static + dynamic analysis
● On the desktop, you can find a folder named ‘Labs’
● Examine the files inside the LAB3 folder
Instructions
● Statically analyse the file. What can you discover already?
● Start Fakenet by double-clicking its icon. Start CaptureBAT by opening a command prompt, navigating to its directory, and running the following command:
○ CaptureBAT.exe -c -l lab3.log (-c will capture events, -l will write to a log file)
● Open Process Hacker, and execute the file. Let it run for a minute, and check
process memory strings. What can you discover with CaptureBAT and Fakenet?
26. Recap: malware analysis
● Malware can assume many forms
● It does not discriminate, as you have malware for most modern
operating systems
● Some malware can exist cross-platform (think of a malicious
macro in Word, for example)
● Static vs dynamic analysis, and combined
● Know which tools are at your disposal, but also…
● Know how to perform analysis manually!
28. Threat Intelligence: The basics...
● “ [...] the process of understanding the threats to an organization
based on available data points.”
(Threat Intelligence: What It Is, and How to Use It Effectively, SANS 2016
https://www.sans.org/reading-room/whitepapers/analyst/threat-intelligence-is-effectively-37282)
● It entails several parts:
○ Tactical
○ Strategic
○ Operational
● Be able to see the bigger picture!
29. But first… how did we get here?
● Mandiant’s APT1 report, released in February 2013
https://www.fireeye.com/content/dam/fireeye-www/services/pdfs/mandiant-apt1-report.pdf (PDF)
● Directly implicated a PLA unit
● Significant impact on both attackers and defenders
Cyber-espionage is very real and can occur at any point, anytime and
anywhere.
30. Cyber Kill Chain vs Diamond Model
Cyber Kill Chain:
● Published in 2009 by Lockheed Martin
● Based on a series of events, from an attacker’s perspective, but defender-centric
● Disrupt or deny a link in the chain and you may gain the upper hand - resulting in minimal financial losses or compromise
Diamond Model:
● Published in 2013 by Active Response
● More attacker-centric
● Tactics, techniques and procedures, also known as TTPs
● As a defender, it may enable a better overall response to a threat - resulting in minimal financial losses or compromise
33. What else is there?
Apply or map Mitre’s ATT&CK matrix to an attacker:
Source: https://attack.mitre.org/wiki/ATT%26CK_Matrix
34. Pyramid of pain in Threat Intelligence (TI)
Source: http://detect-respond.blogspot.co.uk/2013/03/the-pyramid-of-pain.html (2013, David Bianco)
35. Exercise: investigate a real case
CrunchyRoll hack delivers malware:
https://bartblaze.blogspot.com/2017/11/crunchyroll-hack-delivers-malware.html
Together, we will go through the stages of this attack, and apply or
map this to (part of) the diamond model, and/or the (cyber) kill chain.
36. Threat Intelligence: tactical
What does an organisation need to defend itself?
● Applied to real-time events
● More temporal. Why?
○ Threat actors can ‘burn’ TTPs
○ They can re-tool
● Think of the pyramid of pain!
37. Threat Intelligence: strategic
How does an organisation defend itself?
● Be able to respond correctly in case of an incident
● What is needed to protect the organisation
● Forms more of an overall picture, more so for management
○ But again: pyramid of pain
● Often analyses long-term trends
38. Threat Intelligence: operational
Operational and technical are usually glued together
● Handles the details of an attack or intrusion
● Provides guidance and technically-focused intelligence
● Often, indicators are also provided
○ What’s that pyramid again?
39. Finding intelligence
● Use threat intelligence feeds
○ For example: AlienVault OTX
○ And act on them
● Enable automated alerts
○ For example: Google Alerts
● Twitter - a rich data source, sometimes
● Security blogs
○ Vendors or individuals
● Resource: “Awesome Threat Intelligence”
○ https://github.com/hslatman/awesome-threat-intelligence
41. What’s in our toolbox? Internal intelligence
● Logs from your software and hardware
○ Antivirus, firewall, anti spam, event logs, …
○ SIEM: Security Information and Event Management
■ Includes log management, compliance, analysis, specific
correlation and aggregation of data, … In a dashboard
● Previous attacks, successful or not
○ Attacker methodology (think of the matrices!)
● People: experience, insight - create, structure & maintain a team
42. How to leverage Indicators of Compromise
● As seen before, indicators come in many different forms
○ Can you name some?
● STIX, TAXII, CybOX: helpful for standardising data, and transforming it into intelligence
○ Read: How STIX, TAXII and CybOX Can Help With Standardizing Threat Information (SecurityIntelligence, 2015)
https://securityintelligence.com/how-stix-taxii-and-cybox-can-help-with-standardizing-threat-information/
● Leverage rules or rulesets, most commonly:
○ OpenIOC
○ Yara
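At its simplest, leveraging indicators means comparing what you observe against a list of known-bad values. The sketch below is only an illustration of that idea: the indicator sets, the domain c2.example.invalid and the function name match_indicators are all hypothetical, and real rulesets (OpenIOC, Yara) are far more expressive.

```python
import hashlib

# Hypothetical indicator lists; in practice these would come from a feed or report.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"hypothetical malicious sample").hexdigest()}
KNOWN_BAD_DOMAINS = {"c2.example.invalid"}


def match_indicators(file_bytes: bytes, contacted_domains: list) -> list:
    """Return (indicator_type, value) pairs for every indicator that matched."""
    hits = []
    sha256 = hashlib.sha256(file_bytes).hexdigest()
    if sha256 in KNOWN_BAD_SHA256:
        hits.append(("sha256", sha256))
    for domain in contacted_domains:
        if domain.lower() in KNOWN_BAD_DOMAINS:
            hits.append(("domain", domain.lower()))
    return hits


print(match_indicators(b"hypothetical malicious sample", ["C2.example.invalid"]))
```

Keep the pyramid of pain in mind: hashes and domains like these are the indicators an attacker can change most easily.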
43. OpenIOC
● Created by Mandiant
● We will, however, not leverage OpenIOC in this course
● Its IOC Editor is the most commonly used component
● To learn more: OpenIOC Series: Investigating with Indicators of Compromise (IOCs) (Mandiant, 2013)
https://www.fireeye.com/blog/threat-research/2013/12/openioc-series-investigating-indicators-compromise-iocs.html
https://www.fireeye.com/services/freeware.html
44. Yara - part I
● YARA: Another Recursive Acronym, or Yet Another Ridiculous Acronym
● “The pattern matching Swiss knife”
● Can identify and classify files, not only malware
● Repository on Github
https://github.com/VirusTotal/yara
● Public list of Yara rules
https://github.com/InQuest/awesome-yara
46. Yara - part III
● You can scan a file, folder or process with Yara
● The commandline is as follows:
○ For a file: yara32.exe rules_file file
○ For a folder, recursively: yara32.exe rules_file folder -r
Exercise: Let’s write some Yara rules together!
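As a starting point for the exercise, the sketch below keeps a small example rule in a Python string and writes it to disk, so it can be passed to Yara as the rules_file. Everything in the rule is a placeholder: the rule name, the description and the string c2.example.invalid are made up and are not indicators from the labs. The condition checks that the file starts with the MZ signature and contains the string.

```python
from pathlib import Path

# Hypothetical rule: the rule name and the domain string are placeholders
# for illustration, not indicators taken from the lab files.
RULE_TEXT = r'''
rule Lab_Example_Suspicious_PE
{
    meta:
        description = "Example: PE file containing a placeholder C2 string"
    strings:
        $c2 = "c2.example.invalid" ascii wide nocase
    condition:
        uint16(0) == 0x5A4D and $c2
}
'''


def write_rule(path: str) -> Path:
    """Write the rule text to disk so it can be used as the Yara rules_file."""
    rule_file = Path(path)
    rule_file.write_text(RULE_TEXT.strip() + "\n", encoding="utf-8")
    return rule_file


print(RULE_TEXT)
```

After calling write_rule with a file name of your choice, scan a file with the command line shown above. Replace the placeholder string with something specific you found during static or dynamic analysis.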
47. Threat Intelligence: pitfalls
Threat Intelligence is not…
● A silver bullet! It often takes more than just Indicators of
Compromise (IOCs)
● Easy. You need to cover all angles, vectors, possibilities, …
● Difficult. Sometimes, it’s easier… TTPs & sharing go a long way
What about attribution?
48. Roll the attribution dice
It’s North Korea…
Theoretically: any country with capability
49. Attribution: scenario
Imagine a scenario, where a threat actor or cyber criminal attacks an
individual, an organisation, or even a country.
● What do we establish first? What is or isn’t actionable?
● What are some of the possible issues?
● What are some identifiables? (Think TTP!)
● What’s next?
50. Lab 4: Yara
Write efficient Yara rules for the file and its droppers, if any, in Lab 4.
These can be either or both for the file on disk, or in memory.
Instructions
● Run FLOSS or Strings2 on the file. See if you can find something
specific, interesting and/or suspicious. Write a Yara rule, and test!
● Run the file, and observe the behaviour. Fakenet, CaptureBAT &
Process Hacker are your friends!
51. Addendum: time
Time is a very important aspect in threat intelligence
● Five minutes can make a lot of difference
○ To the attacker: execute the operation
○ To the defender: identify an operation (intrusion)
● Time is in everything: do you have a plan ready, that can kick out
an attacker as fast as possible?
● What about attackers that stay on the network?
● What about attackers that perform rapid lateral movement?
52. Recap: threat intelligence
● Ideally, you leverage the full threat intelligence cycle
● Know what you don’t know
● Trust, but validate/verify
● There are no silver bullets
● Attribution: use, but don’t over-use
● Always maintain a healthy sense of paranoia
● Be aware, and wary, of time
Figure: Strategic, Tactical, Operational
54. Reverse Engineering: The basics...
● “Reverse engineering is the process of analyzing a subject
system to create representations of the system at a higher level
of abstraction.” Reverse engineering and design recovery: A taxonomy
http://win.ua.ac.be/~lore/Research/Chikofsky1990-Taxonomy.pdf (1990, Chikofsky & Cross)
● You can compare it with a bicycle… It is assembled at the factory,
but, for whatever reason, you may wish to disassemble it
● Basically, you attempt to understand the software or hardware
● Re-assembling, re-engineering parts may be prohibited.
Read the license agreement (EULA)
55. Different layers
● Each of these layers has multiple languages
● Can you think of any examples for each?
● The processor (CPU) can only understand machine language
● We will be focusing on assembly, however. More specifically: x86
56. Registers
General EAX EBX ECX EDX
Segment CS DS ES/FS/GS SS
Pointer ESP EBP EDI ESI
What else is there?
● EIP: instruction pointer - points to the next instruction
● EFLAGS: flags - these hold the state of the CPU, where each bit is a … flag: 0 or 1
Examples are: CF, SF, ZF, PF, TF (and many more… )
32-bit registers!
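Since each flag is a single bit of EFLAGS, a flag can be read by shifting and masking. The sketch below uses the architectural bit positions for the flags named above (CF is bit 0, PF bit 2, ZF bit 6, SF bit 7, TF bit 8); the function name decode_flags and the example value 0x246 are chosen for illustration.

```python
# Bit position of each flag inside the 32-bit EFLAGS register.
FLAG_BITS = {"CF": 0, "PF": 2, "ZF": 6, "SF": 7, "TF": 8}


def decode_flags(eflags: int) -> dict:
    """Return the value (0 or 1) of each known flag in an EFLAGS value."""
    return {name: (eflags >> bit) & 1 for name, bit in FLAG_BITS.items()}


# 0x246 has bits 1, 2, 6 and 9 set, so PF and ZF are 1 here.
print(decode_flags(0x246))
```

A debugger such as x64dbg shows the same flags individually, so you rarely decode them by hand; the point is only that one register holds them all.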
57. Instructions
Moving data MOV MOVZX MOVSX LEA
Arithmetic (math) ADD SUB INC DEC
Logic OR/AND/XOR SHR/SHL SAL/SAR ROR/ROL
Control Flow JMP CMP/TEST CALL/RET JZ/JNZ/JB/JG
(...some more)
What else is there?
● Manipulating the “stack”: PUSH, POP (pushad, popad)
● NOP: No OPeration - represented with 0x90, and does nothing
58. Examples
mov eax, ebx Copy the value of ebx into eax
inc eax Increment the value of eax by 1
xor eax, eax Clear the eax register
instruction destination, source
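The three examples can be mimicked in a few lines of Python to see their effect on the registers. This is a toy model, not an emulator: registers are entries in a dictionary, values are masked to 32 bits, and the starting value of ebx is arbitrary.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def mov(regs: dict, dst: str, src: str) -> None:
    """mov dst, src: copy the value of src into dst."""
    regs[dst] = regs[src]


def inc(regs: dict, dst: str) -> None:
    """inc dst: add 1, wrapping around at 32 bits."""
    regs[dst] = (regs[dst] + 1) & MASK32


def xor(regs: dict, dst: str, src: str) -> None:
    """xor dst, src: bitwise exclusive OR, result stored in dst."""
    regs[dst] = regs[dst] ^ regs[src]


registers = {"eax": 0, "ebx": 0x41}
mov(registers, "eax", "ebx")   # eax now holds the value of ebx
inc(registers, "eax")          # eax is one higher than ebx
xor(registers, "eax", "eax")   # any value XORed with itself is 0
print(registers)
```

This also shows why xor eax, eax clears the register: every bit is compared with itself, and identical bits always give 0.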
59. Stack, heap and memory: visualised
0x00000000
~error catching
Stack (grows UP)
Heap (grows DOWN)
0x00400000 - Program image (base image of binary)
DLLs - loaded libraries/images
TEB - Thread Environment Block
PEB - Process Environment Block
~shared user page
0x7fffffff
Kernel land (no user access)
0xffffffff
60. Stack
● Stores function parameters, local variables and return addresses, which are used for program control flow. A stack usually has a limited size and a fixed starting location in memory
● Last In First Out or LIFO
○ The order in which elements come off a stack
● Push and Pop
○ PUSH: Adds an element to the stack
○ POP: Removes the most recently added element (sometimes referred to as pull)
● The stack grows upwards to lower addresses
● Used for short-term storage only
● Stack overflow: occurs when the stack is full and does not have enough space to accept the next element - the stack pointer thus exceeds its boundary
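The LIFO behaviour and the overflow condition can be illustrated with a small bounded stack. This is a conceptual sketch in Python, not how the CPU stack is implemented; the class name BoundedStack and its capacity are made up for the example.

```python
class BoundedStack:
    """A fixed-capacity Last In First Out (LIFO) structure."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = []

    def push(self, value) -> None:
        """PUSH: add an element on top, or fail when the stack is full."""
        if len(self.items) >= self.capacity:
            raise OverflowError("stack overflow: no space for the next element")
        self.items.append(value)

    def pop(self):
        """POP: remove and return the most recently added element."""
        return self.items.pop()


stack = BoundedStack(capacity=3)
for value in ("first", "second", "third"):
    stack.push(value)
print(stack.pop())  # the element that was pushed last comes off first
```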
61. Heap
● Used for dynamic memory during program execution
● The heap grows downwards to higher addresses
● Has typically more memory available to it than the stack
● The heap requires pointers to access it
● Programmer has to define or explicitly allocate and deallocate (free) the
memory
○ If this is not done properly, you may experience a memory leak
● Memory is typically allocated with a function such as malloc
62. Exercises
● Name a typical return instruction
● Do we know of any 8-bit, 16-bit, 32-bit or 64-bit registers for
general purposes? Name two (2) in total
● Name a typical return register
● Which register/s is/are typically used for loops?
63. Debuggers, disassemblers and decompilers
● Debuggers: debug a binary, this means running the sample!
○ Examples: OllyDBG, Immunity, x64dbg
● Disassemblers: disassemble or tear apart a binary. Static!
○ IDA Pro Free, Radare
○ Some disassemblers can also be used as a debugger
● Decompilers: decompile a binary, this means reverting back to
its source code (more or less)
○ ILSpy, dnSpy
○ Some decompilers can also be used as a debugger
64. Compression and obfuscation
● Compression is, as mentioned, for reducing the file size, but also
for ‘obfuscating’ data. Examples include but are not limited to:
○ UPX, Themida, ASPack, MPRESS, VMProtect, …
○ But also: WinZip, WinRAR, 7z, …
● Obfuscation means disguising sections of code (and text).
There is a plethora of methods, including, but not limited to:
○ Fake C2 servers planted as strings
○ Base64 (remember?), XOR, RC4, AES, …
○ Garbage instructions or junk code
○ But also: obfuscators! For example, for .NET: SmartAssembly, Confuser, ...
65. Lab 5: IDA, x64dbg and dnSpy
We will examine the files in LAB5 together.
Instructions
● Open LAB5-1 in IDA, visit the strings tab, and find the ‘Hello World’ string, and its
accompanying code block. What does it do?
● Open LAB5-2 in x64dbg (x32dbg), and step through. The file is packed with UPX, and
we will need to unpack it manually!
● Open LAB5-3 in dnSpy, and go through the code, and figure out what it is doing
from only reading the code.
66. Lab 5 addendum: UPX
● In LAB5, one of the files we investigated, turned out to be
packed with (default) UPX
● We can also leverage the same packer, UPX, to unpack the file
● Try it yourself! UPX is located in the Tools folder > upx394w
○ upx -d file -o unpacked_file
67. XOR encryption
● A form of encryption (sometimes referred to as encoding)
● Exclusive OR
● Will return true only if exactly one of the operands is true
Operand   Operand   Result
0         0         False (0)
0         1         True (1)
1         0         True (1)
1         1         False (0)
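Applied to whole bytes, the same truth table gives a simple reversible transformation: XOR with a key once to encode, and XOR with the same key again to decode. A minimal sketch with a single-byte key follows; the key 0x5A and the message are arbitrary choices for illustration.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(byte ^ key for byte in data)


# The truth table from the slide, using Python's ^ operator.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, a ^ b)

encoded = xor_bytes(b"secret", 0x5A)
decoded = xor_bytes(encoded, 0x5A)  # applying the same key again reverses it
print(encoded, decoded)
```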
68. Lab 6: XOR
We will examine the file in LAB6.
Instructions
● Open the command prompt, and navigate to the Tools folder >
XOR
● Use either unxor, xorsearch or xorstrings to find the secret
message in the file
● All tools provide a help file/manual with the “-h” parameter
○ Example: unxor.py -h
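Because a single-byte XOR key has only 255 useful values, all of them can simply be tried. The sketch below shows the general idea of searching for a known keyword under every key; it is not the implementation of unxor, xorsearch or xorstrings, and the function name, key and message are hypothetical.

```python
def brute_force_xor(data: bytes, keyword: bytes) -> list:
    """Try every single-byte key and keep results that contain the keyword."""
    hits = []
    for key in range(1, 256):
        decoded = bytes(byte ^ key for byte in data)
        if keyword in decoded:
            hits.append((key, decoded))
    return hits


# Made-up example: encode a message with key 0x2A, then recover it.
encoded = bytes(byte ^ 0x2A for byte in b"the secret message")
for key, decoded in brute_force_xor(encoded, b"secret"):
    print(hex(key), decoded)
```

This only works when you can guess part of the plaintext, such as "http" or "This program"; multi-byte keys need a different approach.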
69. Addendum: CyberChef
● “The Cyber Swiss Army Knife - a web app for encryption,
encoding, compression and data analysis”
● Can perform a ton of operations, such as encoding/decoding,
extracting, brute-forcing, …
○ For example: XOR decoding + bruteforcing!
● Utilise online: https://gchq.github.io/CyberChef/
● Send your ideas, collaborate:
https://github.com/gchq/CyberChef
70. Lab 7: reversing time
Investigate, analyse (and write Yara rules for) the files in LAB7.
Hints for the first LAB7 file (LAB7-1):
● Are you iMPRESSed with my packer?
● I smell a RAT…
Hints for the second LAB7 file (LAB7-2):
● Windows? Never heard of it.
● Who will become the next xorrior?
● Mirror mirror on the wall, who has the largest botnet of them all?
71. Recap: reverse engineering
● Reverse engineering, or reversing, takes time to learn - this is
perfectly normal
● A good method is to write a small piece of software, or file, and
consequently analyse it (for example, hello world)
● ‘Crackmes’ often provide a great experience
● Every engineer, including a reverse engineer, uses Google (or
MSDN) from time to time
● Familiarise yourself with the language, and languages
73. LAB 8
● Lab 8 is for the fearless… So all of you!
● It includes a real campaign from an APT actor
○ What’s an APT again?
● Analyse the full chain, this means:
○ Context
○ Purpose
○ Full analysis (pre, during, post)
○ Write a (short) report on what the malware does
74. LAB 8 - hints
● Examine the file in LAB8. What kind of file is this?
○ It appears to be an email, so rename the file to LAB8.msg
● Investigate where it is sent from (or by whom) and where to
○ Additionally check the date, subject & content of the email
● Save the attachment, and unzip it. What kind of file do you get?
○ It appears to be a Word document! Use static and dynamic analysis to investigate what
happens
○ Don’t forget to get and set your tools ready before starting dynamic analysis!
● What files does it drop? Any outgoing connections? Can we write a Yara rule to detect any part(s)
of the attack? How is the attack performed?
○ The attack is mostly done in PowerShell, and appears to use shellcode
● Can we make an effort to do attribution at the end?
○ Qatar has been having political issues, and is in a diplomatic crisis. Possibly, one of its
neighbouring countries?
76. Where and how to learn more
Books:
● Malware Analyst's Cookbook and DVD
● Practical Malware Analysis
● Practical Reverse Engineering
Online:
● Reverse Engineering for Beginners
https://beginners.re/
● Github: awesome X
○ https://github.com/rshipp/awesome-malware-analysis
○ https://github.com/fdivrp/awesome-reversing
○ https://github.com/hslatman/awesome-threat-intelligence
And also: Psychology of Intelligence Analysis (CIA)
https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/