In this work we describe how threat actors may use AI algorithms to bypass AI phishing detection systems. We analyzed more than a million phishing URLs to understand the strategies that threat actors use to create them. Assuming the role of an attacker, we simulate how different threat actors may leverage deep neural networks to increase their effectiveness rate. Using Long Short-Term Memory (LSTM) networks, we created DeepPhish, an algorithm that learns to create better phishing attacks. Training the DeepPhish algorithm for two different threat actors increased their effectiveness from 0.69% to 20.9% and from 4.91% to 36.28%, respectively.
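To put the reported figures in perspective, the short sketch below computes the gain for each threat actor in percentage points and as a multiple of the baseline. The function names and the labels "threat actor 1" and "threat actor 2" are illustrative assumptions; only the four percentages come from the text above.

```python
# Effectiveness rates (in percent) quoted in the abstract: (before, after).
# The actor labels are placeholders, not names used by the authors.
reported = {
    "threat actor 1": (0.69, 20.9),
    "threat actor 2": (4.91, 36.28),
}


def point_increase(before_pct, after_pct):
    """Absolute gain in percentage points."""
    return after_pct - before_pct


def fold_increase(before_pct, after_pct):
    """Relative gain: how many times larger the new rate is."""
    return after_pct / before_pct


summary = {
    name: (round(point_increase(b, a), 2), round(fold_increase(b, a), 1))
    for name, (b, a) in reported.items()
}

for name, (points, fold) in summary.items():
    print(f"{name}: +{points} percentage points, about {fold}x the baseline")
```

Read this way, the first actor's rate grows by roughly a factor of 30 and the second's by roughly a factor of 7, although the second actor ends with the higher absolute rate.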
2. CYXTERA TECHNOLOGIES
Portfolio of cybersecurity software and services
Intelligent and adaptive
Cloud-native and hybrid-ready
Global colocation leader
57 data centers in 29 global markets
2.6M sq. feet of data center space
195 megawatts of power
3,500 customers
1,100 employees
Headquartered in Miami with offices globally
Experienced leadership in infrastructure and security
3. CYXTERA TECHNOLOGIES
80% of cyber crimes are being committed by sophisticated attackers.
The total USA market for cyber insurance is 3B in 2017.
6. CYXTERA TECHNOLOGIES
AI to Classify Phishing URLs
Identify & Classify Malicious URLs and Domains with Prediction - Not Blacklists.
The system uses Deep Neural Networks to calculate the probability that a URL is being used to host a phishing attack. It correctly classifies URLs with over 98% accuracy.
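As a rough illustration of what "probability" and "accuracy" mean here, the sketch below squashes a raw model score into a probability with a sigmoid, applies a decision threshold, and measures accuracy on a toy sample. The scores, labels, and the 0.5 threshold are made-up assumptions for illustration; they are not the authors' data or settings.

```python
import math


def sigmoid(score):
    """Map a raw model score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def is_phishing(score, threshold=0.5):
    """Flag a URL when its predicted probability reaches the threshold."""
    return sigmoid(score) >= threshold


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy values only: hypothetical raw scores and ground-truth labels.
toy_scores = [2.3, -1.7, 0.4, -3.0]
toy_labels = [True, False, False, False]

toy_predictions = [is_phishing(s) for s in toy_scores]
toy_accuracy = accuracy(toy_predictions, toy_labels)
print(toy_predictions, toy_accuracy)
```

The third toy URL is a false positive, so three of the four predictions are correct. Raising the threshold would trade false positives for missed detections.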
7. CYXTERA TECHNOLOGIES
Long Short-Term Memory Networks
[Figure: classifier architecture. The URL (example: http://www.papaya.com) is split into characters; each character is one-hot encoded, mapped to an embedding vector, and passed through a sequence of LSTM units whose output feeds a Sigmoid.]
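The one-hot encoding step in this architecture can be sketched as follows. This is a minimal illustration, not the authors' preprocessing: the character vocabulary `VOCAB`, the maximum length of 30, and the zero-padding scheme are all assumptions.

```python
import string

# Assumed vocabulary: lowercase letters, digits, and a few URL punctuation marks.
VOCAB = string.ascii_lowercase + string.digits + ":/.-_?=&%"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(VOCAB)}


def one_hot_encode(url, max_len=30):
    """Return max_len rows, one per character, each a one-hot vector over VOCAB.

    Characters outside VOCAB and padding positions become all-zero rows.
    """
    rows = []
    for ch in url.lower()[:max_len]:
        row = [0] * len(VOCAB)
        index = CHAR_TO_INDEX.get(ch)
        if index is not None:
            row[index] = 1
        rows.append(row)
    while len(rows) < max_len:
        rows.append([0] * len(VOCAB))
    return rows


encoded = one_hot_encode("http://www.papaya.com")
print(len(encoded), len(encoded[0]), encoded[0].index(1))
```

Each row is sparse and carries no notion of similarity between characters, which is why the diagram places a dense embedding between the encoding and the LSTM units.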
12. CYXTERA TECHNOLOGIES
Uncovering Threat Actors
Objective: We want to understand the effective patterns of each attacker in order to improve them through an AI model.
As we cannot observe the attackers directly, we must learn about them through their attacks.
Database of 1.1M confirmed phishing URLs collected from PhishTank.
17. CYXTERA TECHNOLOGIES
DeepPhish Algorithm - Training
[Figure: DeepPhish training pipeline. URLs are divided into Non Effective URLs and Effective URLs. Diagram labels: Concatenate and create, Rolling Window, Transform (Encoding), Train, Model.]
Example effective URLs:
http://www.naylorantiques.com/content/centrais/fone_facil
http://kisanart.com/arendivento/menu-opcoes-fone-facil/
http://naylorantiques.com/atendimento/menu-opcoes-fone-facil/3
Concatenated text:
http://www.naylorantiques.com/content/centrais/fone_facilhttp://kisanart.com/arendivento/menu-opcoes-fone-facil/http://naylorantiques.com/atendimento/menu-opcoes-fone-facil/3
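The concatenation and rolling-window steps named in the diagram might look like the sketch below. It joins the example URLs into one text, then slides a fixed-size window over it to produce (input sequence, next character) pairs. The window size of 20, the step of 1, and the function name `build_training_pairs` are assumptions; the slide does not state the values actually used.

```python
def build_training_pairs(urls, window=20, step=1):
    """Concatenate URLs into one text and cut it into rolling windows.

    Each pair holds `window` consecutive characters and the character that
    follows them, which is the target a next-character model learns to predict.
    """
    text = "".join(urls)  # no separator, matching the concatenated example
    pairs = []
    for start in range(0, len(text) - window, step):
        sequence = text[start:start + window]
        next_char = text[start + window]
        pairs.append((sequence, next_char))
    return text, pairs


effective_urls = [
    "http://www.naylorantiques.com/content/centrais/fone_facil",
    "http://kisanart.com/arendivento/menu-opcoes-fone-facil/",
    "http://naylorantiques.com/atendimento/menu-opcoes-fone-facil/3",
]

text, pairs = build_training_pairs(effective_urls)
print(len(text), len(pairs), pairs[0])
```

Each sequence and target would then be one-hot encoded (the Transform step) before being fed to the model.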
18. CYXTERA TECHNOLOGIES
DeepPhish LSTM Network
[Figure: DeepPhish LSTM network. The URL (example: http://www.papaya.com) is split into characters; each character is one-hot encoded and passed through a sequence of LSTM units with tanH activations, followed by a Softmax output.]
19. CYXTERA TECHNOLOGIES
DeepPhish Algorithm – Prediction
[Figure: DeepPhish prediction pipeline. Inputs: Compromised Domains + Allowed Paths. Diagram labels: Model, Predict Next Character Iteratively, Filter paths, Create Synthetic URLs.]
Example generated path: /arendipemto/nenu-opcines-fone-facil vfone/faci/Atondime
Example synthetic URL: http:// + www.naylorantiques.com + /arendipemto/nenu-opcines-fone-facilvone/facil/Atondime
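The generation loop can be sketched end to end with a much simpler stand-in model. The code below replaces the LSTM with a character bigram frequency table, so it only illustrates the flow (predict the next character iteratively, filter the paths, attach them to a compromised domain) and not the quality of DeepPhish's output. The filter rule, path length, random seed, and all function names are assumptions.

```python
import random
import string
from collections import defaultdict

ALLOWED_CHARS = set(string.ascii_letters + string.digits + "/-_.")  # assumed rule


def train_bigram_model(text):
    """Count which character follows which (a stand-in for the trained LSTM)."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate_path(model, rng, length=30, seed_char="/"):
    """Sample one character at a time, conditioned on the previous character."""
    out = [seed_char]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # the last character was never followed by anything
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


def is_allowed_path(path):
    """Hypothetical filter: must start with '/' and use only allowed characters."""
    return path.startswith("/") and all(c in ALLOWED_CHARS for c in path)


def create_synthetic_urls(model, domains, n_paths=5, seed=0):
    rng = random.Random(seed)
    paths = [generate_path(model, rng) for _ in range(n_paths)]
    paths = [p for p in paths if is_allowed_path(p)]
    return ["http://" + domain + path for domain in domains for path in paths]


training_text = (
    "/content/centrais/fone_facil"
    "/arendivento/menu-opcoes-fone-facil/"
    "/atendimento/menu-opcoes-fone-facil/3"
)
model = train_bigram_model(training_text)
urls = create_synthetic_urls(model, ["www.naylorantiques.com"])
for url in urls:
    print(url)
```

A bigram table only looks one character back, so its paths are far less coherent than those of a model that conditions on a longer window. The surrounding steps stay the same whichever model is plugged in.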
20. CYXTERA TECHNOLOGIES
Simulating Malicious AI using DeepPhish
We selected the two most effective threat actors. For each threat actor, we applied the DeepPhish algorithm to that actor's subsample of effective URLs.
Phishing is a form of fraud in which the attacker tries to learn information such as login credentials or account information by masquerading as a reputable entity or person in email, IM or other communication channels.