SlideShare une entreprise Scribd logo
1  sur  28
Towards a Threat Hunting Automation
Maturity Model
Alex Pinto - Chief Data Scientist – Niddel
@alexcpsec
@NiddelCorp
• Who am I?
• Why does this talk exist?
• The Automation Barrier
• The Context Barrier
• The Experience Barrier
• The Creativity Barrier
• Hunting Automation Maturity Model
Agenda
• Brazilian Immigrant (or US Resident)
• Security Data Scientist
• Capybara Enthusiast
• Co-Founder at Niddel (@NiddelCorp)
• Founder of MLSec Project (@MLSecProject)
• What is MLSec Project? - Community of like-minded infosec
professionals working to improve data science and machine learning
application in security.
• What is Niddel? – Niddel is a security vendor that provides a SaaS-
based Autonomous Threat Hunting System
Who am I?
Why does this talk exist?
Like any good story, it all started with a discussion on the Internet
David Bianco to the Rescue!
[This is my first presentation without citing the PoP in 3 years]
Why not describe hunting automation as a maturity model?
The Automation Barrier
Breaking the Automation Barrier
First Order
(Indicator
Matching)
• When 9 of 10 of you think of
automation, you think of this.
• File hashes, YARA Rules, IP
addresses, domain names
• Lowest possible bar for a
vendor to claim they automate
threat hunting
• Batch analysis / ”Retro-hunting”
Choosing Indicators – RIG EK
Active actor registering domains - NOT Domain Shadowing
 Yay! Let’s go block this!!
Choosing Indicators – RIG EK
AS48096 – ITGRAD (any Russian offices?)
AS16276 – OVH SAS  (maybe block?)
AS14576 – Hosting Solution Ltd
(actually king-servers.com)
Choosing Indicators – Context Matters
 Can’t block this one, lol
 Or this one either
Without context that ”.com” and ”.org” are usually ok, automation fails
 Would not touch this one
The Context Barrier
Breaking the Context Barrier
Second Order
(Context Analysis)
First Order
(Indicator Matching)
• Using internal and external enrichments
to improve decision making
• Internal:
• Statistical analysis internal data (a.k.a
all of the UEBA stuff, PCR, ”stacking”)
• Knowledge from internal incidents
• External:
• Pivoting / Visual Aids
• Statistical analysis from enrichment
data (pDNS / WHOIS)
Example - Maliciousness Ratio
Let’s build aggregation metrics for ”good places” and ”bad places” in traffic
We propose a ratio that compares the cardinality of the node connectedness:
• Bpp – count of ”bad entities” connected to a specific pivoting point
• Gpp – count of ”good entities” connected to a specific pivoting point
𝑀𝑅 𝑝𝑝 =
𝐵 𝑝𝑝
𝐺 𝑝𝑝+𝐵 𝑝𝑝
Example - Maliciousness Ratio
• Looking at the base rate:
• ASN Base Rate 0.6%
• Country Base Rate 0.58%
• TLD Base Rate 1.9%
• Telemetry from an pool of Niddel customers:
• AS48096 – ITGRAD 87.5% => 145.9x more likely
• Country RU 5.2% => 8.96x more likely
• .org TLD 2.9% => 1.52x more likely
Challenges with the Approach
• How can we best define the cutting scores on all those potential
maliciousness ratings?
• How to combine and weight the multivariate composition of these
pivoting points?
• Solution is unique per
company, including
understanding telemetry
patterns, risk appetite for
FPs / FNs and decision
points on when to block
and when to alert on
something.
The Experience Barrier
Breaking the Experience Barrier
Third Order
(Multivariate
Decision
Engine)
Second Order
(Context
Analysis)
First Order
(Indicator
Matching)
• Combining all the signals from the
hunting investigation and making a
”call”:
• Does being registered in REG-RU
and hosted in OVH enough for a
conviction?
• This shady thing is registered in Mark
Monitor. Viral legit campaign?
• This ”gut feeling” comes from years and
years of knowledge and experience of
handling alerts and incidents IRL.
Supervised Machine Learning!!
VS
A More Involved Example (1)
A More Involved Example (2)
Build the campaign based on the
relationships - they all share the
same support infrastructure on
the IP Address and Name
Servers.
The Creativity Barrier
Now what?
• As threats evolve, new types of signals
may be necessary for a conviction.
• If the system does not have access to
the data that it requires, it cannot
evaluate it for decision making.
• Some examples of recent ”new” threats -
Domain fronting, IDN phishing
• This is no different from ”Writing a new
Runbook” for your team
But what about Deep Learning?
• Convolutional Neural Networks are very
good at looking at unstructured data and
”figuring out” what the features should
be.
• Great success for image and voice
recognition:
• Needs a lot of samples
• Trivial to classify by a human
• Neither of these is the case for security
– run away from DL vendors
Introducing HAMM
Hunting Automation Maturity Model (HAMM)
Fourth Order
(Human
Domain)
Third Order
(Multivariate
Decision
Engine)
Second
Order
(Context
Analysis)
First Order
(Indicator
Matching)
First Order: most “automating hunting” plays - a
simple match. Prone to lots of false positives (badly
vetted lists) and false negatives (lists will naturally
be incomplete).
Second Order: evaluate individual pivoting points,
identify entries related to high maliciousness, and
even determine what they are related to based on
the connections to known indicators.
Third Order: multivariate decision making -
determining on the fly which are the most relevant
variables from First and Second Order for each
individual detection decision.
Fourth Order: The realm where analysts may add
the most value – new hypothesis and datasets - but
rarely find the time under prevailing conditions of
high-volume, low-value alerts.
Hunting Automation Maturity Model (HAMM)
• IOC Matching
• Signatures
• Anti-virus
• Security / Hunting
Analytics
• Stats methods
• (Some) UEBA –
maybe?
• Supervised
machine learning
with previous
signals
• Top analysts
unlocking,
organizing and
labeling new
datasets
Hunting Automation Maturity Model (HAMM)
[Continuous Monitoring?]
[LAME]
[MAGIC]
[Threat Farming?]
[Prescriptive Incident Response?]
Share, like, subscribe
Q&A and Feedback please!
Alex Pinto – alexcp@niddel.com
@alexcpsec
@NiddelCorp
"Computers are useless. They can only give you answers.” – Pablo Picasso

Contenu connexe

Tendances

Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Alex Pinto
 
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Alex Pinto
 
AI & ML in Cyber Security - Why Algorithms are Dangerous
AI & ML in Cyber Security - Why Algorithms are DangerousAI & ML in Cyber Security - Why Algorithms are Dangerous
AI & ML in Cyber Security - Why Algorithms are Dangerous
Raffael Marty
 
Delivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationDelivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and Visualization
Raffael Marty
 

Tendances (20)

Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
 
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
 
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
 
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Applying Machine Learning to Network Security Monitoring - BayThreat 2013Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
 
2016 FS-ISAC Annual Summit (Miami) - Developing Effective Encryption Strategies
2016 FS-ISAC Annual Summit (Miami) - Developing Effective Encryption Strategies2016 FS-ISAC Annual Summit (Miami) - Developing Effective Encryption Strategies
2016 FS-ISAC Annual Summit (Miami) - Developing Effective Encryption Strategies
 
AI & ML in Cyber Security - Why Algorithms are Dangerous
AI & ML in Cyber Security - Why Algorithms are DangerousAI & ML in Cyber Security - Why Algorithms are Dangerous
AI & ML in Cyber Security - Why Algorithms are Dangerous
 
Delivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationDelivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and Visualization
 
AI & ML in Cyber Security - Welcome Back to 1999 - Security Hasn't Changed
AI & ML in Cyber Security - Welcome Back to 1999 - Security Hasn't ChangedAI & ML in Cyber Security - Welcome Back to 1999 - Security Hasn't Changed
AI & ML in Cyber Security - Welcome Back to 1999 - Security Hasn't Changed
 
Enabling effective hunt teaming and incident response
Enabling effective hunt teaming and incident responseEnabling effective hunt teaming and incident response
Enabling effective hunt teaming and incident response
 
Threat Hunting with Elastic at SpectorOps: Welcome to HELK
Threat Hunting with Elastic at SpectorOps: Welcome to HELKThreat Hunting with Elastic at SpectorOps: Welcome to HELK
Threat Hunting with Elastic at SpectorOps: Welcome to HELK
 
Abstract Tools for Effective Threat Hunting
Abstract Tools for Effective Threat HuntingAbstract Tools for Effective Threat Hunting
Abstract Tools for Effective Threat Hunting
 
SOC2016 - The Investigation Labyrinth
SOC2016 - The Investigation LabyrinthSOC2016 - The Investigation Labyrinth
SOC2016 - The Investigation Labyrinth
 
AI & ML in Cyber Security - Why Algorithms Are Dangerous
AI & ML in Cyber Security - Why Algorithms Are DangerousAI & ML in Cyber Security - Why Algorithms Are Dangerous
AI & ML in Cyber Security - Why Algorithms Are Dangerous
 
Visualization in the Age of Big Data
Visualization in the Age of Big DataVisualization in the Age of Big Data
Visualization in the Age of Big Data
 
Luncheon 2016-07-16 - Topic 2 - Advanced Threat Hunting by Justin Falck
Luncheon 2016-07-16 -  Topic 2 - Advanced Threat Hunting by Justin FalckLuncheon 2016-07-16 -  Topic 2 - Advanced Threat Hunting by Justin Falck
Luncheon 2016-07-16 - Topic 2 - Advanced Threat Hunting by Justin Falck
 
Security Insights at Scale
Security Insights at ScaleSecurity Insights at Scale
Security Insights at Scale
 
MITRE ATT&CKcon 2.0: Prioritizing Data Sources for Minimum Viable Detection; ...
MITRE ATT&CKcon 2.0: Prioritizing Data Sources for Minimum Viable Detection; ...MITRE ATT&CKcon 2.0: Prioritizing Data Sources for Minimum Viable Detection; ...
MITRE ATT&CKcon 2.0: Prioritizing Data Sources for Minimum Viable Detection; ...
 
Cyber Threat Hunting with Phirelight
Cyber Threat Hunting with PhirelightCyber Threat Hunting with Phirelight
Cyber Threat Hunting with Phirelight
 
Building a Successful Threat Hunting Program
Building a Successful Threat Hunting ProgramBuilding a Successful Threat Hunting Program
Building a Successful Threat Hunting Program
 
How To Drive Value with Security Data
How To Drive Value with Security DataHow To Drive Value with Security Data
How To Drive Value with Security Data
 

Similaire à Towards a Threat Hunting Automation Maturity Model

BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6
Rod Soto
 

Similaire à Towards a Threat Hunting Automation Maturity Model (20)

Technical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvertTechnical track chris calvert-1 30 pm-issa conference-calvert
Technical track chris calvert-1 30 pm-issa conference-calvert
 
BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6BsidesLVPresso2016_JZeditsv6
BsidesLVPresso2016_JZeditsv6
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
 
Navy security contest-bigdataforsecurity
Navy security contest-bigdataforsecurityNavy security contest-bigdataforsecurity
Navy security contest-bigdataforsecurity
 
Rise of the machines -- Owasp israel -- June 2014 meetup
Rise of the machines -- Owasp israel -- June 2014 meetupRise of the machines -- Owasp israel -- June 2014 meetup
Rise of the machines -- Owasp israel -- June 2014 meetup
 
Are you ready for the next attack? Reviewing the SP Security Checklist
Are you ready for the next attack? Reviewing the SP Security ChecklistAre you ready for the next attack? Reviewing the SP Security Checklist
Are you ready for the next attack? Reviewing the SP Security Checklist
 
Are you ready for the next attack? reviewing the sp security checklist (apnic...
Are you ready for the next attack? reviewing the sp security checklist (apnic...Are you ready for the next attack? reviewing the sp security checklist (apnic...
Are you ready for the next attack? reviewing the sp security checklist (apnic...
 
Vulnerability Ass... Penetrate What?
Vulnerability Ass... Penetrate What?Vulnerability Ass... Penetrate What?
Vulnerability Ass... Penetrate What?
 
Demystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use CasesDemystifying Security Analytics: Data, Methods, Use Cases
Demystifying Security Analytics: Data, Methods, Use Cases
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)
 
Understanding Intrusion Detection & Prevention Systems (1).pptx
Understanding Intrusion Detection & Prevention Systems (1).pptxUnderstanding Intrusion Detection & Prevention Systems (1).pptx
Understanding Intrusion Detection & Prevention Systems (1).pptx
 
RSA 2016 Security Analytics Presentation
RSA 2016 Security Analytics PresentationRSA 2016 Security Analytics Presentation
RSA 2016 Security Analytics Presentation
 
SmartData Webinar: Applying Neocortical Research to Streaming Analytics
SmartData Webinar: Applying Neocortical Research to Streaming AnalyticsSmartData Webinar: Applying Neocortical Research to Streaming Analytics
SmartData Webinar: Applying Neocortical Research to Streaming Analytics
 
Putting the Human Back in the Loop for Analysis
Putting the Human Back in the Loop for AnalysisPutting the Human Back in the Loop for Analysis
Putting the Human Back in the Loop for Analysis
 
AI Cybersecurity: Pros & Cons. AI is reshaping cybersecurity
AI Cybersecurity: Pros & Cons. AI is reshaping cybersecurityAI Cybersecurity: Pros & Cons. AI is reshaping cybersecurity
AI Cybersecurity: Pros & Cons. AI is reshaping cybersecurity
 
The Future Of Threat Intelligence Platforms
The Future Of Threat Intelligence PlatformsThe Future Of Threat Intelligence Platforms
The Future Of Threat Intelligence Platforms
 
Digital Forensics for Artificial Intelligence (AI ) Systems.pdf
Digital Forensics for Artificial Intelligence (AI ) Systems.pdfDigital Forensics for Artificial Intelligence (AI ) Systems.pdf
Digital Forensics for Artificial Intelligence (AI ) Systems.pdf
 
SANSFIRE18: War Stories on Using Automated Threat Intelligence for Defense
SANSFIRE18: War Stories on Using Automated Threat Intelligence for DefenseSANSFIRE18: War Stories on Using Automated Threat Intelligence for Defense
SANSFIRE18: War Stories on Using Automated Threat Intelligence for Defense
 
Advanced malware analysis training session3 botnet analysis part2
Advanced malware analysis training session3 botnet analysis part2Advanced malware analysis training session3 botnet analysis part2
Advanced malware analysis training session3 botnet analysis part2
 
Forensic Analysis - Empower Tech Days 2013
Forensic Analysis - Empower Tech Days 2013Forensic Analysis - Empower Tech Days 2013
Forensic Analysis - Empower Tech Days 2013
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 

Towards a Threat Hunting Automation Maturity Model

  • 1. Towards a Threat Hunting Automation Maturity Model Alex Pinto - Chief Data Scientist – Niddel @alexcpsec @NiddelCorp
  • 2. • Who am I? • Why does this talk exist? • The Automation Barrier • The Context Barrier • The Experience Barrier • The Creativity Barrier • Hunting Automation Maturity Model Agenda
  • 3. • Brazilian Immigrant (or US Resident) • Security Data Scientist • Capybara Enthusiast • Co-Founder at Niddel (@NiddelCorp) • Founder of MLSec Project (@MLSecProject) • What is MLSec Project? - Community of like-minded infosec professionals working to improve data science and machine learning application in security. • What is Niddel? – Niddel is a security vendor that provides a SaaS- based Autonomous Threat Hunting System Who am I?
  • 4. Why does this talk exist? Like any good story, it all started with a discussion on the Internet
  • 5. David Bianco to the Rescue! [This is my first presentation without citing the PoP in 3 years] Why not describe hunting automation as a maturity model?
  • 7. Breaking the Automation Barrier First Order (Indicator Matching) • When 9 of 10 of you think of automation, you think of this. • File hashes, YARA Rules, IP addresses, domain names • Lowest possible bar for a vendor to claim they automate threat hunting • Batch analysis / ”Retro-hunting”
  • 8. Choosing Indicators – RIG EK Active actor registering domains - NOT Domain Shadowing  Yay! Let’s go block this!!
  • 9. Choosing Indicators – RIG EK AS48096 – ITGRAD (any Russian offices?) AS16276 – OVH SAS  (maybe block?) AS14576 – Hosting Solution Ltd (actually king-servers.com)
  • 10. Choosing Indicators – Context Matters  Can’t block this one, lol  Or this one either Without context that ”.com” and ”.org” are usually ok, automation fails  Would not touch this one
  • 12. Breaking the Context Barrier Second Order (Context Analysis) First Order (Indicator Matching) • Using internal and external enrichments to improve decision making • Internal: • Statistical analysis internal data (a.k.a all of the UEBA stuff, PCR, ”stacking”) • Knowledge from internal incidents • External: • Pivoting / Visual Aids • Statistical analysis from enrichment data (pDNS / WHOIS)
  • 13. Example - Maliciousness Ratio Let’s build aggregation metrics for ”good places” and ”bad places” in traffic We propose a ratio that compares the cardinality of the node connectedness: • Bpp – count of ”bad entities” connected to a specific pivoting point • Gpp – count of ”good entities” connected to a specific pivoting point 𝑀𝑅 𝑝𝑝 = 𝐵 𝑝𝑝 𝐺 𝑝𝑝+𝐵 𝑝𝑝
  • 14. Example - Maliciousness Ratio • Looking at the base rate: • ASN Base Rate 0.6% • Country Base Rate 0.58% • TLD Base Rate 1.9% • Telemetry from an pool of Niddel customers: • AS48096 – ITGRAD 87.5% => 145.9x more likely • Country RU 5.2% => 8.96x more likely • .org TLD 2.9% => 1.52x more likely
  • 15. Challenges with the Approach • How can we best define the cutting scores on all those potential maliciousness ratings? • How to combine and weight the multivariate composition of these pivoting points? • Solution is unique per company, including understanding telemetry patterns, risk appetite for FPs / FNs and decision points on when to block and when to alert on something.
  • 17. Breaking the Experience Barrier Third Order (Multivariate Decision Engine) Second Order (Context Analysis) First Order (Indicator Matching) • Combining all the signals from the hunting investigation and making a ”call”: • Does being registered in REG-RU and hosted in OVH enough for a conviction? • This shady thing is registered in Mark Monitor. Viral legit campaign? • This ”gut feeling” comes from years and years of knowledge and experience of handling alerts and incidents IRL.
  • 19. A More Involved Example (1)
  • 20. A More Involved Example (2) Build the campaign based on the relationships - they all share the same support infrastructure on the IP Address and Name Servers.
  • 22. Now what? • As threats evolve, new types of signals may be necessary for a conviction. • If the system does not have access to the data that it requires, it cannot evaluate it for decision making. • Some examples of recent ”new” threats - Domain fronting, IDN phishing • This is no different from ”Writing a new Runbook” for your team
  • 23. But what about Deep Learning? • Convolutional Neural Networks are very good at looking at unstructured data and ”figuring out” what the features should be. • Great success for image and voice recognition: • Needs a lot of samples • Trivial to classify by a human • Neither of these is the case for security – run away from DL vendors
  • 25. Hunting Automation Maturity Model (HAMM) Fourth Order (Human Domain) Third Order (Multivariate Decision Engine) Second Order (Context Analysis) First Order (Indicator Matching) First Order: most “automating hunting” plays - a simple match. Prone to lots of false positives (badly vetted lists) and false negatives (lists will naturally be incomplete). Second Order: evaluate individual pivoting points, identify entries related to high maliciousness, and even determine what they are related to based on the connections to known indicators. Third Order: multivariate decision making - determining on the fly which are the most relevant variables from First and Second Order for each individual detection decision. Fourth Order: The realm where analysts may add the most value – new hypothesis and datasets - but rarely find the time under prevailing conditions of high-volume, low-value alerts.
  • 26. Hunting Automation Maturity Model (HAMM) • IOC Matching • Signatures • Anti-virus • Security / Hunting Analytics • Stats methods • (Some) UEBA – maybe? • Supervised machine learning with previous signals • Top analysts unlocking, organizing and labeling new datasets
  • 27. Hunting Automation Maturity Model (HAMM) [Continuous Monitoring?] [LAME] [MAGIC] [Threat Farming?] [Prescriptive Incident Response?]
  • 28. Share, like, subscribe Q&A and Feedback please! Alex Pinto – alexcp@niddel.com @alexcpsec @NiddelCorp "Computers are useless. They can only give you answers.” – Pablo Picasso

Notes de l'éditeur

  1. Lack of guidance on CTI usage it a huge problem, so having aggregated views like this are sure to help.