The document provides information on operationalizing YARA for analyzing malware indicators. It begins with an introduction to YARA and discusses how YARA breaks rules down into atomic substrings called "atoms" to efficiently scan for malware patterns. Examples are given of atoms in regular expressions and hexadecimal strings. The document then demonstrates how YARA can be used to detect malware indicators in network traffic and static files by writing YARA rules with relevant string patterns and associated conditions.
2. CircleCityCon 2015 - TLP:WHITE
“YARA is to files what Snort is to network traffic.”
-- Victor Manuel Alvarez, YARA Developer
3. Bio
Chad Robertson
Threat Researcher
Fidelis Cybersecurity
YARA Exchange since 2012
CCE, GCIH Gold, GPEN Gold, GCFA Gold, CISA
Prior incident response lead
Authored research papers on HIPS, memory forensics, and malicious PDF obfuscation
9. YARA - Atoms
/(abc|efg)/
Sometimes a single atom is enough (like in the previous example "abc" is
enough for finding /abc.*ed[0-9]+fgh/), but sometimes a single atom isn't
enough like in the regexp /(abc|efg)/. In this case YARA must search for both
"abc" AND "efg" and fully evaluate the regexp whenever one of those atoms is
found.
Source: https://code.google.com/p/yara-project/source/browse/trunk/libyara/atoms.c?r=261
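The atom-then-regexp idea can be sketched in a few lines of Python. This is only an illustration of the principle, not YARA's implementation: the function name scan and the way atoms are passed in by hand are assumptions made for the example, and YARA itself extracts atoms automatically and searches for them far more efficiently.

```python
import re

def scan(data: bytes, pattern: bytes, atoms: list[bytes]) -> list[int]:
    """Return start offsets of regexp matches, using atoms as a cheap prefilter."""
    # Cheap step: if none of the atoms occurs in the data, skip the regexp.
    if not any(atom in data for atom in atoms):
        return []
    # Expensive step: fully evaluate the regexp only when an atom was found.
    return [m.start() for m in re.finditer(pattern, data)]

# Both atoms are searched for, so both alternatives are found.
hits = scan(b"xx efg yy abc", rb"(abc|efg)", [b"abc", b"efg"])

# With "abc" as the only atom, data that contains just "efg" is skipped
# even though the regexp would match it.
missed = scan(b"xx efg", rb"(abc|efg)", [b"abc"])

print(hits)    # [3, 10]
print(missed)  # []
```

The second call shows why a single atom is not enough for /(abc|efg)/: the prefilter would discard data in which only the other alternative appears.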
10. YARA - Atoms
Atom Tree:
/Look(at|into)this/
-AND
  |- "Look"
  |
  |- OR
  |   |
  |   |- "at"
  |    - "into"
  |
   - "this"
In the regexp /Look(at|into)this/ YARA can search for "Look", or search for
"this", or search for both "at" and "into".
Source: https://code.google.com/p/yara-project/source/browse/trunk/libyara/atoms.c?r=261
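A minimal sketch of how a set of atoms could be chosen from such a tree follows. It assumes the reading of the tree shown on the slide (AND for concatenated parts, OR for alternatives) and uses a toy quality score based only on atom length. The Node class, quality and choose are hypothetical names for this illustration and are not taken from libyara.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                      # "ATOM", "AND" (concatenation) or "OR" (alternation)
    atom: bytes = b""
    children: list = field(default_factory=list)

def quality(atoms):
    # Toy score (assumption): a set of atoms is only as good as its shortest member.
    return min(len(a) for a in atoms)

def choose(node):
    """Return a list of atoms that is sufficient to locate every possible match."""
    if node.kind == "ATOM":
        return [node.atom]
    options = [choose(child) for child in node.children]
    if node.kind == "OR":
        # Any alternative may be the one that matches, so all of them are needed.
        return [a for opt in options for a in opt]
    # AND: every part occurs in a match, so one part suffices. Prefer the
    # best-scoring option, then the one that needs the fewest atoms.
    return max(options, key=lambda opt: (quality(opt), -len(opt)))

# /Look(at|into)this/
tree = Node("AND", children=[
    Node("ATOM", b"Look"),
    Node("OR", children=[Node("ATOM", b"at"), Node("ATOM", b"into")]),
    Node("ATOM", b"this"),
])
chosen = choose(tree)
print(chosen)  # [b'Look']

# Illustrative second pattern, /a(bcd|efg)h/: the single-byte atoms lose.
short_ends = Node("AND", children=[
    Node("ATOM", b"a"),
    Node("OR", children=[Node("ATOM", b"bcd"), Node("ATOM", b"efg")]),
    Node("ATOM", b"h"),
])
print(choose(short_ends))  # [b'bcd', b'efg']
```

Under these assumptions "Look" alone is picked for the slide's pattern, while the alternation's atoms are picked when the surrounding literals are too short to be useful.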
12. YARA - Atoms
{00 00}
Atom 00 00 has a very low quality, because it's only two bytes long and both
bytes are zeroes.
{01 01 01 01}
Atom 01 01 01 01 is better but still not optimal, because the same byte is
repeated.
{01 02 03 04}
Atom 01 02 03 04 is an optimal one.
Source: https://code.google.com/p/yara-project/source/browse/trunk/libyara/atoms.c?r=261
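The ranking above can be reproduced with a toy scoring function. The weights below are arbitrary assumptions chosen only to illustrate the three criteria mentioned (length, zero bytes, repeated bytes); they are not YARA's actual quality formula, and atom_quality is a hypothetical name.

```python
def atom_quality(atom: bytes) -> int:
    """Toy score: reward length and distinct bytes, penalise zero bytes."""
    score = 0
    for b in atom:
        # Zero bytes are extremely common in files, so they count for less.
        score += 1 if b == 0 else 4
    # Reward byte diversity: repeated bytes add nothing here.
    score += 2 * len(set(atom))
    return score

examples = [bytes.fromhex("0000"), bytes.fromhex("01010101"), bytes.fromhex("01020304")]
scores = [atom_quality(a) for a in examples]
print(scores)  # [4, 18, 24]
```

Any scoring with the same three ingredients would give the same ordering for these examples; only the ordering matters for the point made on the slide.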
13. YARA - Atoms
The worst strings are those that contain no atoms at all:
/\d.*\d/
/[A-Za-z]{50,100}\w+/
Source: https://code.google.com/p/yara-project/source/browse/trunk/libyara/atoms.c?r=261
14. YARA - Atoms
FASTEST - only one atom is generated
$s1 = "cmd.exe" (ASCII only)
$s2 = "cmd.exe" ascii (ASCII only, same as $s1)
$s3 = "cmd.exe" wide (UTF-16 only)
FAST - two atoms are generated
$s4 = "cmd.exe" ascii wide (both ASCII and UTF-16)
SLOW - many atoms are generated
$s5 = "cmd.exe" nocase (all case variants, e.g. "Cmd.exe", "cMd.exe", "cmD.exe", ...)
Source: https://gist.github.com/Neo23x0/e3d4e316d7441d9143c7
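To get a feel for why the modifiers differ in cost, the sketch below counts the distinct byte sequences that each combination implies for the whole string "cmd.exe". It is a simplification under stated assumptions: variants is a hypothetical helper, wide is modelled as UTF-16 little-endian, and the full string is enumerated, whereas YARA works on short atoms taken from the string, so the real atom counts differ.

```python
from itertools import product

def variants(text: str, ascii_: bool = True, wide: bool = False,
             nocase: bool = False) -> set[bytes]:
    """Byte sequences a scanner would have to look for (illustrative only)."""
    if nocase:
        # Every upper/lower combination; characters without case contribute one form.
        texts = {"".join(p) for p in product(*[{c.lower(), c.upper()} for c in text])}
    else:
        texts = {text}
    out = set()
    for t in texts:
        if ascii_:
            out.add(t.encode("ascii"))
        if wide:
            out.add(t.encode("utf-16-le"))
    return out

print(len(variants("cmd.exe")))                            # 1  ($s1, $s2)
print(len(variants("cmd.exe", ascii_=False, wide=True)))   # 1  ($s3)
print(len(variants("cmd.exe", wide=True)))                 # 2  ($s4)
print(len(variants("cmd.exe", nocase=True)))               # 64 ($s5)
```

The six letters in "cmd.exe" give 2^6 = 64 case combinations, which is the kind of growth behind the "many atoms" warning for nocase.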
31. YarGen
Takes a directory of malware samples as input and outputs YARA rules that try to avoid known goodware strings and attempt to use blacklisted strings from PE Studio.
https://github.com/Neo23x0/yarGen