This document discusses principles of threat attribution through analyzing artifacts left behind by threat actors. It explains that attribution is difficult as there are usually multiple possible explanations but outlines approaches like examining metadata, anomalies, targeting patterns, and mistakes over time. Specific case studies are presented on a hacking back of a command and control server, using stolen certificates to attribute Lazarus, and clues in timestamps and encoding errors that point to Lazarus having Korean origins.
4. Why Attribution?
What are the common questions people try to answer
with attribution?
- Am I a targeted victim or was hit by a random
chance?
- What should I do to address the situation?
4
5. Disclaimer
- We try not to attribute any of the activity to any
Nation State
- When we say “attribution” - we try to give answers
to the following questions:
- Are we dealing with any known threat actor?
- Is the actor commonly targeting specific targets?
- Learn more about the threat actor: working hours, possible
geographic location, association with other peers
- Targets of interest, tools and techniques commonly used by
the threat actor
- And more..
5
6. Attribution is Difficult: WHY?
Attribution of targeted attacks is often similar to
Intelligence/Counter-Intelligence
operations where for every question multiple
answers could be found and you need to make
an educated decision which answers are right.
6
Battle for Inchon
JUST LIKE WITH REAL SPIES, RIGHT? :)
7. There are multiple versions of Truth
7
https://
www.youtube.com/
watch?v=uTXs-
XKA4pI
8. Economical Metrics TO Attribution
Scientists like quantifiable results (show me the
numbers!)
Some artifacts are harder to fake than others.
Consistently faking certain artifacts to match a
particular threat actor is expensive.
Examining validity/truthfulness of certain artifacts
from the point of cost-evaluation can be effective.
8
9. Common Approaches in Attribution
• Use known data and metadata: IP, Domain, Character
Encoding, Strings, Time-Zones
• Based on Humint (human intelligence)
• Based on Known/Visible Actor Targets
• Based on Known/Visible Actor Objectives
• Binary structure and anomalies in binary structures or
implementation of algorithms
• Call back/Hack back/Interact with attacker
• Attacker Errors
• Average dwelling time (how long actor stays undetected)
• Data leaks, like Wiki Leaks (Often Reliable)
9
10. Cost?
“Mistakenly” leaking language encoding in binaries is
cheaper.
Consistently leaking the same encoding, as mistake,
across multiple campaigns, is expensive and requires
perfect OPSEC discipline.
10
11. Cost(of falsifying an Attribution Artifact)
Cost(Attribution Artifact) =
Expense required to fake given
artifact over period of time
11
12. Awesome talk on cost
of IOCs by pinkflawd
and blackswanburst:
IOCannon: Blasting
back on Attackers with
Economics
https://github.com/pinkflawd/IoCannon
17. Equation Group
What did Equation do wrong, and
how can we avoid doing the same?
...
17
https://wikileaks.org/ciav7p1/cms/page_14588809.html
18. Example of equation group mistakes
18
Source: https://securelist.com/files/2015/02/Equation_group_questions_and_answers.pdf
19. Difficulty of faking artifacts
C2 calling pattern: easy - just cut-n-paste
Domain name - harder, requires ability of taking over a
domain name
IP address - even harder. May need to be able to
compromise the hosting system.
19
21. How reliable approaches based on data and metadata
• IP, Domain, Encoding, Strings, Time-Zones
– Have you seen rental contract for hosting? How long it
typically lasts? Never heard about proxies? Innocent victims,
when c2 hosted on EDU, Gov resources or News outlets?
• Code snippets
– Could it be just public library?
– Have you seen it at the first time? Along one campaign? Inside
legit code?
– Rarely used programming languages? Crypto algorithms?
Encoding?
21
22. How reliable approaches based on Anomalies in Binary structure
• Host indicators and file dependencies
• How maps to killchain stages
• Anomalies in Binary structure (anti-forensics tricks)
– Coders and packers?
– Uniq compilation and optimization tools?
– Execution paths
– Code protection methods
22
Fuzzy Hashing algorithms can help to find related binaries:
https://github.com/trendmicro/tlsh - Locality Sensitive Hashing (opensource)
24. How reliable approaches based on non technical indicators
• Based on Humint
– How reliable the source? Does the source have enough capabilities to make a
judgement on this topic? Does the source have hidden agenda?
• Based on Visible actor Targets
– Actor targets particular industry? Particular countries?
– Almost everyone? Almost every country exept…?
• Based on Visible actor Objectives
– Money focused
– Tactical tasks
– Strategic tasks
• Based on Visible actor Objectives over angle
– Enhance actor capabilities
– Reduce victim capabilities
24
25. How reliable approaches based on Statistics and Long term
campaign analysis
• Attacker errors (faults - Hangover), especially crypto
• Average dwelling time (how long actor stays
undetected)
– Detected in days and weeks
– Detected in Months
– Detected in Years
• Data leaks, like Wiki Leaks (Often Reliable)
– Leaked source code of the binaries
– Leaked unique exploits
– Dumps of the (De)Classified documents
25
26. How reliable approaches based on Active actions
• Call back
– Feed attacker with something, that do passive or silent
fingerprinting
• Hack Back
– Mostly the field of LE agencies, but there are a number of
public examples
• Interact with attacker
– Force the attacker to make a mistake during the interaction
and win your time and try to reveal the attacker identity
26
27. Campaigns
Attributing binaries is difficult. No context. Binaries are
often shared between the groups.
Attribution activity campaigns can be a little bit more
simpler, because of presence of additional components:
a victim, attacker TTPs and so on, time, compromised
hosts, time zone of attacker activity and so on
27
37. Lazarus: Hints to Korea
Interesting encoding error was discovered by Kaspersky
Labs:
c05329f101979fa75ca297c4f77c8cd69fe8eb499d4f693550b734beb9f564b
9
37