Rise of the machines
2
Agenda
• Some pains of Ops./IT engineers and Security folk
• Rise of the machines: what’s behavioral analysis and what’s
Machine Learning
• (Example) Device fingerprinting from 802.11 traffic
• (Example) Browser fingerprinting from HTTP headers
• Wrap up and resources
3
Typical stuff Operations/IT/Security engineers
do to be in control
• Have important information logged to a logging service.
• Perhaps have some Security Information and Event Management
(SIEM) solution(s).
• If you want online awareness, you implement and deploy event
monitors and set up dashboards.
• If “something interesting” or “something important” happens, you
want to understand and respond such that you keep the business
running, fix what you can and wake up people to fix what you can’t.
• So we/they/someone-else writes rules (x happened y times, foo
happened with some relationship to bar…)
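A hand-written rule of the "x happened y times" kind could look like the sketch below. The event name, the threshold and the window length are hypothetical values chosen for illustration, not taken from any particular SIEM product.

```python
from collections import deque

class CountRule:
    """Fire when `event` is seen at least `threshold` times within `window` seconds."""

    def __init__(self, event, threshold, window):
        self.event = event
        self.threshold = threshold
        self.window = window
        self.times = deque()  # timestamps of matching events

    def observe(self, timestamp, event):
        if event != self.event:
            return False
        self.times.append(timestamp)
        # Drop timestamps that fell out of the sliding window.
        while self.times and timestamp - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) >= self.threshold

# Hypothetical rule: 3 failed logins within 60 seconds.
rule = CountRule("login_failed", threshold=3, window=60)
alerts = [rule.observe(t, "login_failed") for t in (0, 10, 20, 100)]
```

Note that both numbers (3 and 60) had to be picked by a person, which is exactly the maintenance burden the following slides complain about.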
4
Problems with the approach
• Services and websites change over time:
• Deployments (infrastructure change, design change, functionality
change…)
• Routine or occasional maintenance to frontend/backend
• A/B testing
• Security tests
• Etc.
5
Problems with the approach
• Traffic changes:
• Seasonality: hourly, daily, weekly, holidays, etc.
• Rare events: successful campaigns, bad reviews
Changes aren’t necessarily an unexpected anomaly
or an attack. So, how to distinguish?
6
Problems with the approach
Rules written in SIEM systems almost always either
• over-generalize
or
• under-generalize
So they are too simplistic to capture complex reality
or over-fitted such that they don’t generalize well
• After a while, they become too complicated to maintain
• What happens when you need to change a rule?
• What about managing ordering and dependencies?
• How to debug?
7
Problems with the approach
• Too many logs
• Machine-generated but handled by humans
• Too many signals to monitor and decide on simultaneously
8
Problems with the approach
Too few skilled people to handle. How to find them?
Are there both security analysts and data scientists?
9
Let the machines handle machine generated logs
So, we want to use Machine Learning in order to
automate adaptation to change and to be able to
handle volumes constantly and repeatedly.
10
3 hottest terms lately
• Cyber (Security)
• Big Data
• Machine Learning
Now… Imagine all three together…
You can probably think of a few names from only the last 3 or so months
raising millions of $$ when saying:
big-data//machine-learning//cyber-security
11
What's Behavioral?
When we use the term Behavioral we mean that we're looking at
attributes that are not (necessarily) related to the content but rather
to information that describes or that may describe the behavioral
properties of the actual content.
Behavioral analysis focuses on the observable relationship of behavior
to the environment.
So, instead of addressing particular properties of communication
content and context we use information about how that
communication takes place and is being used. For example, we look at
timing, at methods used, etc.
12
What's Behavioral?
• Some say
Behavioral Analysis
and actually mean
Anomaly Detection
• Others say
Behavioral Analysis
and actually mean
Machine Learning
13
What’s Machine Learning?
• You don’t develop rules, instead you develop software that
discovers the rules by itself.
• You sometimes don’t even design what input (features) to feed to
the learning algorithms, as those can be (sometimes) learned too
• You sometimes don’t even need to implement the feature
extraction, as such code can (sometimes) be auto-written too
Try reading A Few Useful Things to Know About Machine Learning by Pedro Domingos (CACM, Volume 55, Issue 10, October 2012, pp. 78-87); you may also want to read the great commentary written about this paper
14
When we say Machine Learning we mean
that…
• our solution need not be explicitly programmed or configured in
order to be well adapted and tuned to a particular installation, setup,
environment, changes in the application, changes in traffic, etc.
• we'd like our solution to work out of the box without need of
human guidance or intervention. We'd like it to work and to
adapt to changes by learning from examples instead of being explicitly
programmed.
• we hope to automate some (hopefully, most or all) of the network
analyst's domain-expertise work by learning from examples.
• we hope to be as good at the task as human experts, but to scale better.
15
Machine Learning
Machine learning systems automatically learn programs from data.
This is often a very attractive alternative to manually constructing
them.
16
Some facts about Machine Learning
• Learning/training:
• You try to fit a function or a family of functions from your input (think
that your input is ultimately a series of k-tuples)
• Applying:
• You feed a new k-tuple and get a result
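The learn/apply split can be sketched with the simplest possible case: 2-tuples (x, y) and a straight line as the family of functions. The function names and the training data are made up for illustration; real inputs have many more dimensions.

```python
def fit_line(samples):
    """Learning: find (slope, intercept) from (x, y) pairs by ordinary least squares."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apply_line(model, x):
    """Applying: feed a new input to the learned function and get a result."""
    slope, intercept = model
    return slope * x + intercept

training = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # generated from y = 2x + 1
model = fit_line(training)
prediction = apply_line(model, 4.0)
```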
17
But how does the model building process
actually work?
All machine learning algorithms (the ones that build the models)
basically consist of the following three things:
• A set of possible models to look through
• A way to test whether a model is good
• A clever way to find a really good model with only a few tests
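The three ingredients can be seen in a toy learner. In this sketch (names and data are hypothetical) the set of models is "x >= threshold", the test counts misclassified examples, and the search only tries thresholds that lie between neighboring data points instead of every possible number.

```python
def errors(threshold, data):
    """Test: number of labeled points that the threshold model misclassifies."""
    return sum((x >= threshold) != label for x, label in data)

def best_threshold(data):
    xs = sorted(x for x, _ in data)
    # Model set: one candidate threshold between each pair of neighboring values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    # Search: only len(data) - 1 candidates need to be tested.
    return min(candidates, key=lambda t: errors(t, data))

data = [(1, False), (2, False), (3, False), (7, True), (8, True)]
threshold = best_threshold(data)
```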
18
Ways to classify machine learning algorithms
Supervised vs. Unsupervised
Classification (vs. clustering) vs. Regression
Online vs. Offline: (Streaming (learn and apply as you go) vs. Iterations (down to
only 1))
All input is there vs. Missing/incomplete data
.
.
.
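As an illustration of the online (streaming) style, the sketch below keeps a running mean and variance that are updated one value at a time, so nothing has to be stored or re-read. It uses Welford's update; the class name and the sample values are assumptions for the example.

```python
class RunningStats:
    """Mean and variance learned incrementally, one observation at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Population variance of everything seen so far.
        return self.m2 / self.n if self.n else 0.0

stream = RunningStats()
for value in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
    stream.update(value)  # learn as you go
```

An offline learner would instead collect all values first and compute the same statistics in one or more passes over the stored data.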
19
Surprising facts about Machine Learning
• Although there are many off-the-shelf tools to help with machine
learning, it is almost always harder than it seems to do it right
• on real world problems,
• on real customer data,
• constantly,
• in scale and
• in quality
• Roughly 95% or more of the effort goes into data collection and
preparation (missing values, correctness, relevance, representation,
balancing, cleaning, …)
20
Surprising facts about Machine Learning
• Many times, simpler common-sense algorithms outperform (in quality,
scale, maintainability, …) complicated algorithms when big data is
available
• It is not so much about which algorithm you use but about how much data
you have and its quality
• Data representation often matters more (Deep Learning)
• Ensembles of simple-specialized algorithms usually do better than one
monolithic complex algorithm (Ensemble, Arbitration, …)
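A minimal sketch of the ensemble idea: several simple, specialized detectors vote, and the majority decides. The feature names and cut-off values are hypothetical and only serve the example.

```python
def majority_vote(detectors, sample):
    """Return True when more than half of the detectors flag the sample."""
    votes = sum(1 for detect in detectors if detect(sample))
    return votes * 2 > len(detectors)

# Three deliberately simple, specialized detectors (hypothetical features).
detectors = [
    lambda s: s["requests_per_min"] > 100,
    lambda s: s["error_ratio"] > 0.5,
    lambda s: s["distinct_paths"] > 50,
]

noisy = {"requests_per_min": 300, "error_ratio": 0.7, "distinct_paths": 3}
quiet = {"requests_per_min": 10, "error_ratio": 0.9, "distinct_paths": 1}
suspicious = majority_vote(detectors, noisy)
benign = majority_vote(detectors, quiet)
```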
21
We want learning: implicit, automatic, dynamic
• No need to write rules or to manage them
• Want the rules to be learned automatically and
• Don’t want to set/change thresholds
• Want the thresholds to be determined automatically and
dynamically
• I want to know about rare events that matter without needing to
define what rare means and without needing to define important
• Become better as more data becomes available
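One way to picture a threshold that is determined automatically and dynamically: derive it from the recent data itself. The sketch below flags values far above the mean of a sliding window. The window size and the factor k are assumptions of this sketch and are still knobs; a real system would aim to learn those too.

```python
from collections import deque
from statistics import mean, pstdev

def rare_events(values, window=5, k=3.0):
    """Yield (index, value) for values more than k std-devs above the recent mean."""
    recent = deque(maxlen=window)
    for i, value in enumerate(values):
        if len(recent) == window:
            # The threshold follows the data instead of being fixed by hand.
            threshold = mean(recent) + k * pstdev(recent)
            if value > threshold:
                yield i, value
        recent.append(value)

traffic = [10, 11, 9, 10, 10, 50, 10, 11]
flagged = list(rare_events(traffic))
```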
22
Example: 802.11 device fingerprinting
An empirical study of passive 802.11 Device Fingerprinting
Christoph Neumann, Olivier Heen, Stéphane Onno
Proceedings of 32nd International Conference on Distributed
Computing Systems Workshops (ICDCSW 2012), Workshop on
Network Forensics, Security and Privacy (NFSP'12)
23
802.11 device fingerprinting
• 802.11 device fingerprinting is the action of characterizing a target
device through its wireless traffic.
• This results in a signature that may be used for identification,
network monitoring or intrusion detection.
24
802.11 device fingerprinting
• The fingerprinting method is passive by just observing the traffic
sent by the target device.
• Focus on network parameters which can be easily extracted using
standard wireless cards
• Method should work also for encrypted 802.11 traffic
• Method should not be detectable by attackers and should be hard to cheat with
adversarial traffic
• Accurate
25
802.11 device fingerprinting
Many passive fingerprinting methods rely on the observation of one
particular network feature, such as the rate switching behavior or the
transmission pattern of probe requests.
In this work, the researchers evaluated a set of global wireless
network parameters with respect to their ability to identify 802.11
devices.
They restricted themselves to parameters that can be observed
passively using a standard wireless card.
They used information extracted from Radiotap or Prism headers.
26
802.11 device fingerprinting
Machine Learning? Show me the ML!
• Features: Network parameters – observable features
• Transmission rate [Mbit/s] – different card vendors and models
have variations
• Frame size [bytes] – differences in broadcast frame sizes implicitly
identify wireless devices
• Medium access time [µsec] – time from when the medium becomes idle until the device
starts sending its own frame
• Transmission time [µsec] – frame duration, i.e. the time it takes to send a
frame (approximated by frame size divided by transmission rate)
• Frame inter-arrival time [µsec] – time between the end (start) of one frame and
the end (start) of the next frame in the same direction
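The transmission time approximation (frame size divided by transmission rate) works out neatly in these units: bits divided by Mbit/s gives microseconds. A small sketch, with a hypothetical function name and example numbers, ignoring any per-frame overhead:

```python
def transmission_time_us(frame_size_bytes, rate_mbit_per_s):
    """Approximate frame duration in microseconds: bits / (Mbit/s) = µs."""
    return frame_size_bytes * 8 / rate_mbit_per_s

# Example: a 1500-byte frame sent at 54 Mbit/s takes roughly 222 µs.
duration = transmission_time_us(1500, 54)
```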
27
802.11 device fingerprinting
Machine Learning? Show me the ML!
Computed features:
• For each frame type (data frames, probe requests, …)
• For each sender over the medium
• Maintain frequency histograms per observable feature
• Periodically,
• transform frequency histograms to proportional histograms
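The bookkeeping above can be sketched as follows for a single observable feature (frame size). The bin width, the MAC address and the sample values are assumptions for illustration; the paper's actual binning may differ.

```python
from collections import Counter, defaultdict

# histograms[(sender, frame_type)] counts binned values of one observable feature.
histograms = defaultdict(Counter)

def observe(sender, frame_type, value, bin_width=100):
    """Add one observation to the frequency histogram of its sender and frame type."""
    histograms[(sender, frame_type)][int(value // bin_width)] += 1

def proportional(counter):
    """Turn a frequency histogram into proportions that sum to 1."""
    total = sum(counter.values())
    return {bin_: count / total for bin_, count in counter.items()}

for size in (60, 80, 1500, 1480):
    observe("aa:bb:cc:dd:ee:ff", "data", size)

signature = proportional(histograms[("aa:bb:cc:dd:ee:ff", "data")])
```

Proportions make histograms comparable between a chatty device and a quiet one, since the absolute number of frames no longer matters.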
28
802.11 device fingerprinting
Machine Learning? Show me the ML!
Researchers in the paper used a supervised learning approach.
Learn:
Assume fixed/known set of devices
Characterize devices using features
Apply:
Compare histograms with learned histograms and MAC addresses.
When an observation conflicts with the learned baselines – Alert!
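A minimal sketch of the learn/apply flow, assuming histograms are stored as dictionaries of proportions keyed by bin. The similarity cut-off of 0.8, the function names and the data are hypothetical; this is not the paper's code.

```python
import math

def cosine(p, q):
    """Cosine similarity of two sparse histograms (dicts of bin -> proportion)."""
    dot = sum(p[k] * q.get(k, 0.0) for k in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def check(mac, observed, learned, min_similarity=0.8):
    """Return an alert string if `observed` conflicts with the baseline learned for `mac`."""
    baseline = learned.get(mac)
    if baseline is None:
        return f"unknown device {mac}"
    if cosine(observed, baseline) < min_similarity:
        return f"fingerprint mismatch for {mac}"
    return None

# Learn: baselines for a fixed, known set of devices.
learned = {"aa:bb:cc:dd:ee:ff": {0: 0.5, 14: 0.25, 15: 0.25}}

# Apply: compare fresh histograms claiming the same MAC address.
same = check("aa:bb:cc:dd:ee:ff", {0: 0.45, 14: 0.3, 15: 0.25}, learned)
spoofed = check("aa:bb:cc:dd:ee:ff", {3: 0.9, 4: 0.1}, learned)
```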
29
802.11 device fingerprinting
How to compare histograms?
Researchers used Cosine similarity.
[Wikipedia]
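For reference, the standard definition of cosine similarity between two histograms A and B with n bins each is given below. The symbols A, B and n are notation chosen here, not taken from the slides.

```latex
\cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|}
             = \frac{\sum_{i=1}^{n} A_i B_i}
                    {\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}
```

For histograms, which have no negative entries, the value lies between 0 (no bins in common) and 1 (same shape).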
30
Distance – alternatives?
Depends on what you want to capture. Examples:
• Minkowski distance
• Kullback-Leibler divergence
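A sketch of both measures on histograms given as equal-length lists of proportions. The example vectors are made up. Note how the Kullback-Leibler divergence is asymmetric and becomes infinite when the second histogram has an empty bin that the first one uses.

```python
import math

def minkowski(p, q, order=2):
    """Minkowski distance; order=1 is Manhattan, order=2 is Euclidean."""
    return sum(abs(a - b) ** order for a, b in zip(p, q)) ** (1 / order)

def kl_divergence(p, q):
    """D(P || Q) in bits; asymmetric, and infinite where q is 0 but p is not."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log2(a / b)
    return total

p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]
manhattan = minkowski(p, q, order=1)
forward = kl_divergence(p, q)   # finite
backward = kl_divergence(q, p)  # infinite: q uses a bin that p never does
```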
31
Distance – More alternatives
• Triangular discrimination
• Jensen-Shannon divergence
There are many more distance, divergence and similarity measures –
what to use? It depends…
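Both measures from this slide can be sketched in a few lines, using their usual textbook definitions. Unlike Kullback-Leibler, both are symmetric and stay finite when a bin is empty in one of the histograms. The example vectors are made up.

```python
import math

def triangular_discrimination(p, q):
    """Sum over bins of (p_i - q_i)^2 / (p_i + q_i), skipping bins empty in both."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits; symmetric and bounded by 1."""
    def kl(x, m):
        return sum(a * math.log2(a / b) for a, b in zip(x, m) if a > 0)
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

# Two histograms with no bins in common give the maximum values.
p = [1.0, 0.0]
q = [0.0, 1.0]
delta = triangular_discrimination(p, q)
jsd = jensen_shannon(p, q)
```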
32
Alternative learning?
Instead of supervised learning (requires sterile learning time, needs
examples, rigid, …) let’s do unsupervised:
• Cluster observed histograms by distances
• Assign MAC addresses to clusters
• Look into clusters with more than one MAC address
• If False Positives – be more sensitive to precision or look into
divergences that better capture differences/similarities
• Robust, flexible, assumes very little
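The unsupervised variant can be sketched with a very simple greedy clustering: each histogram joins the first cluster whose first member is close enough, otherwise it starts a new cluster. The clustering method, the L1 distance, the 0.3 cut-off and the data are all assumptions of this sketch; any clustering algorithm and any of the distances above could be swapped in.

```python
def l1(p, q):
    """L1 distance between two sparse histograms (dicts of bin -> proportion)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def cluster(observations, max_distance=0.3):
    """Greedy clustering: join the first cluster whose leader is close enough."""
    clusters = []  # each cluster: {"leader": histogram, "macs": set of MAC addresses}
    for mac, histogram in observations:
        for c in clusters:
            if l1(histogram, c["leader"]) <= max_distance:
                c["macs"].add(mac)
                break
        else:
            clusters.append({"leader": histogram, "macs": {mac}})
    return clusters

observations = [
    ("mac-1", {0: 0.5, 1: 0.5}),
    ("mac-2", {0: 0.1, 1: 0.9}),
    ("mac-3", {0: 0.45, 1: 0.55}),
]
# Clusters holding more than one MAC address are the ones worth a closer look.
shared = [sorted(c["macs"]) for c in cluster(observations) if len(c["macs"]) > 1]
```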
33
But what makes a fingerprint?
• Check computed features by type?
• Create one big histogram?
• Create also histograms of inter-dependencies? (Cross product…)
• Hash histograms into something else? (What? How?)
What’s stable? What’s accurate?
It depends on your data, on your representation, on your algorithm…
34
From PHY to L7
An exercise in browser fingerprinting
35
Fingerprinting my browser? Really?
36
Whoa! How?
Mike Sconzo and Brian Wylie have reproducible research which they
presented at ShmooCon 2014:
http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/browser_fingerprinting/browser_fingerprinting.ipynb
You can learn from it a methodology for doing this from HTTP headers in a
scientific manner: a Data Scientific manner
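To give a feel for what behavioral features from HTTP headers might look like, here is a small sketch that ignores header values and looks only at which headers are sent and in what order. This is an illustration under my own assumptions (function name, chosen features, sample request), not the method used in the notebook.

```python
def header_features(raw_request):
    """Turn the headers of a raw HTTP request into simple behavioral features."""
    lines = raw_request.strip().split("\r\n")[1:]  # skip the request line
    names = [line.split(":", 1)[0].strip().lower() for line in lines if ":" in line]
    return {
        "header_order": tuple(names),  # order of headers, not their content
        "header_count": len(names),
        "has_dnt": "dnt" in names,
    }

request = "GET / HTTP/1.1\r\nHost: example.com\r\nUser-Agent: demo\r\nAccept: */*\r\n\r\n"
features = header_features(request)
```

Feature dictionaries like this one could then be counted into histograms or fed to a classifier, in the same spirit as the 802.11 example.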
37
OK. So, what does this have to do with
Rise of the Machines?
We can now automatically,
• Collect data
• Analyze data
• Organize data
• Draw insights and conclusions
Find what’s interesting – automatically.
TADA!
38
Usages?
• SPAM detection
• Fraud detection
• Malware detection
• Intrusion detection
• Abuse detection
• Fingerprinting (duh!)
• DDoS detection
• Network traffic classification
• … … … <your idea here>
39
Links and references
• An empirical study of passive 802.11 Device Fingerprinting // Christoph
Neumann, Olivier Heen, Stéphane Onno // Proceedings of 32nd
International Conference on Distributed Computing Systems Workshops
(ICDCSW 2012), Workshop on Network Forensics, Security and Privacy
(NFSP'12)
• https://panopticlick.eff.org/
• ShmooCon 2014: Practical Applications of Data Science in Detection
// Mike Sconzo and Brian Wylie //
http://www.youtube.com/watch?v=8lF5rBmKhWk [start at 35:36]
• http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/browser_fingerprinting/browser_fingerprinting.ipynb
• Wikipedia {just look for terms}
40
What would I want to learn?
Theory?
Data Science
Statistics
Machine Learning
Statistical Inference
Predictive Analytics
41
What would I want to learn?
Tools?
R (S?) // Octave (Matlab?) // Julia // Haskell // Perl // Python // …
42
How would I learn?
Resources?
• Google is your friend (youtube too)
• Coursera // Udacity // Iversity // EdX // … excellent online courses
• Meetups
• Good old reading books // university courses // reading academic papers
43
Thanks to…
Eran Goldstein -- http://www.linkedin.com/in/erangoldstein
Maydan Weinreb -- http://www.linkedin.com/in/maydanw
44
Thank you!
http://il.linkedin.com/in/shlomoyona
s.yona@f5.com
https://www.facebook.com/shlomo.yona

Rise of the machines -- OWASP Israel -- June 2014 meetup

  • 1. Rise of the machines
  • 2. Agenda • Some pains of Ops./IT engineers and Security folk • Rise of the machines: what’s behavioral analysis and what’s Machine Learning • (Example) Device fingerprinting from 802.11 traffic • (Example) Browser fingerprinting from HTTP headers • Wrap up and resources
  • 3. Typical stuff Operations/IT/Security engineers do to be in control • Have important information logged to a logging service. • Perhaps have some Security Information and Event Management (SIEM) solution(s). • If you want online awareness, you implement and deploy event monitors and set up dashboards. • If “something interesting” or “something important” happens, you want to understand and respond such that you keep the business running, fix what you can and wake up people to fix what you can’t. • So we/they/someone-else write rules (x happened y times, foo happened with some relationship to bar…)
  • 4. Problems with the approach • Services and websites change over time: • Deployments (infrastructure change, design change, functionality change…) • Routine or occasional maintenance to frontend/backend • A/B testing • Security tests • Etc.
  • 5. Problems with the approach • Traffic changes: • Seasonality: hourly, daily, weekly, holidays, etc. • Rare events: successful campaigns, bad reviews. Changes aren’t necessarily an unexpected anomaly or an attack. So, how to distinguish?
  • 6. Problems with the approach Rules written in SIEM systems almost always either • over-generalize or • under-generalize. So they are either too simplistic to capture complex reality or over-fitted such that they don’t generalize well • After a while they are too complicated to maintain • What happens when you need to change a rule? • What about managing ordering and dependencies? • How to debug?
  • 7. Problems with the approach • Too many logs • Machine generated but humanly handled • Too many signals to monitor and decide on simultaneously
  • 8. Problems with the approach Too few skilled people to handle. How to find them? Are there both security analysts and data scientists?
  • 9. Let the machines handle machine-generated logs: So, we want to use Machine Learning in order to automate adaptation to change and to be able to handle volumes constantly and repeatedly.
  • 10. 3 hottest terms lately • Cyber (Security) • Big Data • Machine Learning Now… Imagine all three together… You can probably think of a few names from only the last 3 or so months, raising millions of $$ when saying: big-data//machine-learning//cyber-security
  • 11. What's Behavioral? When we use the term Behavioral we mean that we're looking at attributes that are not (necessarily) related to the content but rather to information that describes, or may describe, the behavioral properties of the actual content. Behavioral analysis focuses on the observable relationship of behavior to the environment. So, instead of addressing particular properties of communication content and context, we use information about how that communication takes place and is used. For example, we look at timing, at methods used, etc.
  • 12. What's Behavioral? • Some say Behavioral Analysis and actually mean Anomaly Detection • Others say Behavioral Analysis and actually mean Machine Learning
  • 13. What’s Machine Learning? • You don’t develop rules, instead you develop software that discovers the rules by itself. • You sometimes don’t even design what input (features) to feed to the learning algorithms, as those can be (sometimes) learned too • You sometimes don’t even need to implement the feature extraction, as such code can (sometimes) be auto-written too Try reading A Few Useful Things to Know About Machine Learning by Pedro Domingos (CACM, Volume 55, Issue 10, October 2012, pp. 78-87); you may also want to read the great commentary written about this paper
  • 14. When we say Machine Learning we mean that… • our solution need not be explicitly programmed or configured in order to be well adapted and tuned to a particular installation, setup, environment, changes in the application, changes in traffic, etc. • we'd like our solution to work out of the box without need of human guidance or intervention. Instead, we'd like it to work and to adapt to changes by using examples instead of being explicitly programmed. • we hope to automate some (hopefully, most or all) of the domain-expert network analyst's work by learning from examples. • we hope to be as good at the task as human experts, but scale better.
  • 15. Machine Learning: Machine learning systems automatically learn programs from data. This is often a very attractive alternative to manually constructing them.
  • 16. Some facts about Machine Learning • Learning/training: • You try to fit a function or a family of functions from your input (think that your input is ultimately a series of k-tuples) • Applying: • You feed a new k-tuple and get a result
  • 17. But how does the model building process actually work? All machine learning algorithms (the ones that build the models) basically consist of the following three things: • A set of possible models to look through • A way to test whether a model is good • A clever way to find a really good model with only a few tests
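The three ingredients listed on this slide can be made concrete with a toy example. The sketch below is illustrative only and is not taken from the talk: the model family (a single threshold on one numeric feature), the accuracy test and the tiny training set are all assumptions chosen to keep the example short.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (feature value, label 0 or 1)


def make_model(threshold: float) -> Callable[[float], int]:
    """The set of possible models: one threshold classifier per threshold."""
    return lambda x: 1 if x >= threshold else 0


def accuracy(model: Callable[[float], int], data: List[Example]) -> float:
    """The test: fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def fit(data: List[Example]) -> float:
    """The search: only midpoints between neighbouring values need testing."""
    xs = sorted({x for x, _ in data})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or xs
    return max(candidates, key=lambda t: accuracy(make_model(t), data))


# Hypothetical training data: small values are labelled 0, large values 1.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
threshold = fit(train)         # learning/training
model = make_model(threshold)
prediction = model(5.0)        # applying the model to a new input
```

Real learners differ mainly in how rich the model family is and how cleverly the search avoids testing every candidate.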
  • 18. Ways to classify machine learning algorithms • Supervised vs. Unsupervised • Classification (vs. clustering) vs. Regression • Online vs. Offline: Streaming (learn and apply as you go) vs. Iterations (down to only 1) • All input is there vs. Missing/incomplete data • …
  • 19. Surprising facts about Machine Learning • Although there are many off-the-shelf tools to help with machine learning, it is almost always harder to do it right • on real world problems, • on real customer data, • constantly, • at scale and • in quality • Roughly 95% or more of the effort goes into data collection and preparation (missing values, correctness, relevance, representation, balancing, cleaning, …)
  • 20. Surprising facts about Machine Learning • Many times, simpler common-sense algorithms outperform (quality, scale, maintainability, …) complicated algorithms when big data is available • It's not so much about which algorithm you use but about how much data you have and its quality • Data representation many times matters more (Deep Learning) • Ensembles of simple, specialized algorithms usually do better than one monolithic complex algorithm (Ensemble, Arbitration, …)
  • 21. We want learning: implicit, automatic, dynamic • No need to write rules or manage them • Want the rules to be learned automatically and • Don’t want to set/change thresholds • Want the thresholds to be determined automatically and dynamically • I want to know about rare events that matter without needing to define what rare means and without needing to define important • Become better as more data becomes available
  • 22. Example: 802.11 device fingerprinting. An empirical study of passive 802.11 Device Fingerprinting // Christoph Neumann, Olivier Heen, Stéphane Onno // Proceedings of 32nd International Conference on Distributed Computing Systems Workshops (ICDCSW 2012), Workshop on Network Forensics, Security and Privacy (NFSP'12)
  • 23. 802.11 device fingerprinting • 802.11 device fingerprinting is the action of characterizing a target device through its wireless traffic. • This results in a signature that may be used for identification, network monitoring or intrusion detection.
  • 24. 802.11 device fingerprinting • The fingerprinting method is passive: it just observes the traffic sent by the target device. • Focus on network parameters which can be easily extracted using standard wireless cards • Method should also work for encrypted 802.11 traffic • Method should not be detectable by attackers and should be hard to cheat with adversarial traffic • Accurate
  • 25. 802.11 device fingerprinting Many passive fingerprinting methods rely on the observation of one particular network feature, such as the rate switching behavior or the transmission pattern of probe requests. In this work, the researchers evaluated a set of global wireless network parameters with respect to their ability to identify 802.11 devices. They restricted themselves to parameters that can be observed passively using a standard wireless card. They used information extracted from Radiotap or Prism headers.
  • 26. 802.11 device fingerprinting Machine Learning? Show me the ML! • Features: Network parameters – observable features • Transmission rate [Mbit/s] – different card vendors and models have variations • Frame size [bytes] – differences in broadcast frame sizes implicitly identify wireless devices • Medium access time [µsec] – time from when the medium becomes idle until the device starts sending its own frame • Transmission time [µsec] – frame duration: the time it takes to send a frame (approximated by frame size divided by transmission rate) • Frame inter-arrival time [µsec] – time from the end (start) of one frame to the end (start) of the next frame in the same direction
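Two of the features on this slide are simple arithmetic on captured values. The sketch below is a rough illustration, assuming the transmission rate is reported in Mbit/s (so 1 Mbit/s equals 1 bit per µsec) and that frame start times are already available in µsec; the function names are hypothetical, and the approximation ignores preambles and other physical-layer overhead.

```python
from typing import List


def transmission_time_us(frame_size_bytes: int, rate_mbit_s: float) -> float:
    """Approximate frame duration: frame size divided by transmission rate.

    1 Mbit/s is 1 bit per microsecond, so bits / (Mbit/s) yields microseconds.
    """
    return frame_size_bytes * 8 / rate_mbit_s


def inter_arrival_times_us(start_times_us: List[float]) -> List[float]:
    """Time from the start of one frame to the start of the next frame."""
    return [later - earlier
            for earlier, later in zip(start_times_us, start_times_us[1:])]


# Hypothetical values: a 1500-byte frame at 54 Mbit/s, three frame start times.
duration = transmission_time_us(1500, 54.0)
gaps = inter_arrival_times_us([0.0, 300.0, 750.0])
```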
  • 27. 802.11 device fingerprinting Machine Learning? Show me the ML! Computed features: • For each frame type (data frames, probe requests, …) • For each sender over the medium • Maintain frequency histograms per observable feature • Periodically, • transform frequency histograms to proportional histograms
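The bookkeeping described on this slide might look roughly like the following. This is a sketch under stated assumptions rather than the paper's implementation: the input is assumed to be a list of (sender, frame type, feature value) tuples for a single observable feature, and the fixed bin width and the sample values are made up.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Frame = Tuple[str, str, float]  # (sender MAC, frame type, feature value)


def frequency_histograms(frames: Iterable[Frame],
                         bin_width: float) -> Dict[Tuple[str, str], Counter]:
    """Count feature values per bin, separately per sender and frame type."""
    hists: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for sender, frame_type, value in frames:
        hists[(sender, frame_type)][int(value // bin_width)] += 1
    return hists


def to_proportional(hist: Counter) -> Dict[int, float]:
    """Turn bin counts into proportions that sum to 1."""
    total = sum(hist.values())
    return {bin_id: count / total for bin_id, count in hist.items()}


# Hypothetical inter-arrival times (µsec) observed for two senders.
frames = [
    ("aa:aa", "data", 100.0),
    ("aa:aa", "data", 110.0),
    ("aa:aa", "data", 250.0),
    ("bb:bb", "probe", 60.0),
]
hists = frequency_histograms(frames, bin_width=100.0)
proportional = to_proportional(hists[("aa:aa", "data")])
```

Normalizing to proportions makes histograms comparable between senders that transmitted different numbers of frames.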
  • 28. 802.11 device fingerprinting Machine Learning? Show me the ML! Researchers in the paper used a supervised learning approach. Learn: assume a fixed/known set of devices; characterize devices using features. Apply: compare histograms with the learned histograms and MAC addresses. When a conflict with the learned baselines is observed – Alert!
  • 29. 802.11 device fingerprinting How to compare histograms? Researchers used Cosine similarity. [Wikipedia]
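The learn/apply steps and the cosine comparison from the last two slides can be sketched together. This is a minimal illustration, not the paper's code: histograms are assumed to be dictionaries mapping a bin to a proportion, the reference values and MAC addresses are invented, and the 0.9 similarity threshold is an arbitrary assumption.

```python
import math
from typing import Dict

Histogram = Dict[int, float]  # bin -> proportion


def cosine_similarity(p: Histogram, q: Histogram) -> float:
    """Dot product of the two histograms divided by the product of their norms."""
    dot = sum(value * q.get(bin_id, 0.0) for bin_id, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def conflicts(claimed_mac: str, observed: Histogram,
              references: Dict[str, Histogram],
              min_similarity: float = 0.9) -> bool:
    """Return True (alert) when the observed histogram does not fit the claimed MAC."""
    if claimed_mac not in references:
        return True  # device is outside the fixed/known set
    best_mac = max(references,
                   key=lambda mac: cosine_similarity(observed, references[mac]))
    fits = cosine_similarity(observed, references[claimed_mac]) >= min_similarity
    return best_mac != claimed_mac or not fits


# Learn: reference histograms for a known set of devices (made-up values).
references = {"aa:aa": {1: 0.7, 2: 0.3}, "bb:bb": {5: 0.9, 6: 0.1}}

# Apply: the first observation fits its claimed MAC, the second looks like bb:bb.
alert_genuine = conflicts("aa:aa", {1: 0.68, 2: 0.32}, references)
alert_spoofed = conflicts("aa:aa", {5: 0.88, 6: 0.12}, references)
```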
  • 30. Distance – alternatives? Depends on what you want to capture. Examples: • Minkowski distance • Kullback-Leibler divergence
  • 31. Distance – More alternatives • Triangular discrimination • Jensen-Shannon divergence There are many more distance, divergence and similarity measures – what to use? It depends…
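For reference, the four measures named on these two slides can be written in a few lines each. The sketch below uses their standard textbook definitions and assumes histograms are dictionaries mapping a bin to a proportion; the base-2 logarithm is a choice made here so that the Jensen-Shannon divergence falls between 0 and 1.

```python
import math
from typing import Dict

Histogram = Dict[int, float]  # bin -> proportion


def minkowski(p: Histogram, q: Histogram, order: float = 2.0) -> float:
    """Minkowski distance; order=1 is Manhattan, order=2 is Euclidean."""
    bins = set(p) | set(q)
    return sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) ** order
               for b in bins) ** (1.0 / order)


def kl_divergence(p: Histogram, q: Histogram) -> float:
    """Kullback-Leibler divergence; asymmetric, infinite if q misses mass p has."""
    total = 0.0
    for b, pb in p.items():
        if pb == 0.0:
            continue
        qb = q.get(b, 0.0)
        if qb == 0.0:
            return math.inf
        total += pb * math.log2(pb / qb)
    return total


def jensen_shannon(p: Histogram, q: Histogram) -> float:
    """Jensen-Shannon divergence: symmetric and always finite."""
    bins = set(p) | set(q)
    mid = {b: 0.5 * (p.get(b, 0.0) + q.get(b, 0.0)) for b in bins}
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def triangular_discrimination(p: Histogram, q: Histogram) -> float:
    """Sum over bins of (p - q)^2 / (p + q), skipping bins empty in both."""
    total = 0.0
    for b in set(p) | set(q):
        pb, qb = p.get(b, 0.0), q.get(b, 0.0)
        if pb + qb > 0.0:
            total += (pb - qb) ** 2 / (pb + qb)
    return total
```

One practical difference: when two histograms share no bins, the Kullback-Leibler divergence is infinite, while the other three stay bounded, which matters when sparse histograms are compared.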
  • 32. Alternative learning? Instead of supervised learning (requires sterile learning time, needs examples, rigid, …) let’s do unsupervised: • Cluster observed histograms by distances • Assign MAC addresses to clusters • Look into clusters with more than one MAC address • If False Positives – be more sensitive to precision or look into divergences that better capture differences/similarities • Robust, flexible, assumes very little
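The unsupervised variant on this slide can be sketched with a very simple clustering scheme. The slide does not name an algorithm; the greedy "leader" clustering below, the Manhattan distance, the 0.2 distance cut-off and the sample histograms are all assumptions picked for brevity.

```python
from typing import Dict, List, Set, Tuple

Histogram = Dict[int, float]  # bin -> proportion


def l1_distance(p: Histogram, q: Histogram) -> float:
    """Manhattan distance between two histograms."""
    return sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in set(p) | set(q))


def leader_clustering(observations: List[Tuple[str, Histogram]],
                      max_distance: float) -> List[Tuple[Histogram, Set[str]]]:
    """Put each histogram in the first cluster whose leader is close enough."""
    clusters: List[Tuple[Histogram, Set[str]]] = []
    for mac, hist in observations:
        for leader, macs in clusters:
            if l1_distance(hist, leader) <= max_distance:
                macs.add(mac)  # assign the MAC address to this cluster
                break
        else:
            clusters.append((hist, {mac}))  # no close leader: start a new cluster
    return clusters


def clusters_to_inspect(clusters: List[Tuple[Histogram, Set[str]]]) -> List[Set[str]]:
    """Clusters holding more than one MAC address deserve a closer look."""
    return [macs for _, macs in clusters if len(macs) > 1]


# Hypothetical observations: aa:aa and cc:cc behave almost identically.
observations = [
    ("aa:aa", {1: 0.70, 2: 0.30}),
    ("bb:bb", {5: 0.90, 6: 0.10}),
    ("cc:cc", {1: 0.68, 2: 0.32}),
]
clusters = leader_clustering(observations, max_distance=0.2)
flagged = clusters_to_inspect(clusters)
```

A flagged cluster is only a lead: two MAC addresses with the same fingerprint may be one device changing its address or simply two devices of the same model.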
  • 33. But what makes a fingerprint? • Check computed features by type? • Create one big histogram? • Create also histograms of inter-dependencies? (Cross product…) • Hash histograms into something else? (What? How?) What’s stable? What’s accurate? It depends on your data, on your representation, on your algorithm…
  • 34. From PHY to L7: An exercise in browser fingerprinting
  • 36. Whoa! How? Mike Sconzo and Brian Wylie have reproducible research which they presented at ShmooCon 2014 http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/browser_fingerprinting/browser_fingerprinting.ipynb You can learn the methodology of how to do this from HTTP headers in a scientific manner: a Data Scientific manner
  • 37. OK. So, what does this have to do with Rise of the Machines? We can now automatically, • Collect data • Analyze data • Organize data • Reach insights and conclusions Find what’s interesting – automatically. TADA!
  • 38. Usages? • SPAM detection • Fraud detection • Malware detection • Intrusion detection • Abuse detection • Fingerprinting (duh!) • DDoS detection • Network traffic classification • … … … <your idea here>
  • 39. Links and references • An empirical study of passive 802.11 Device Fingerprinting // Christoph Neumann, Olivier Heen, Stéphane Onno // Proceedings of 32nd International Conference on Distributed Computing Systems Workshops (ICDCSW 2012), Workshop on Network Forensics, Security and Privacy (NFSP'12) • https://panopticlick.eff.org/ • ShmooCon 2014: Practical Applications of Data Science in Detection // Mike Sconzo and Brian Wylie // http://www.youtube.com/watch?v=8lF5rBmKhWk [start at 35:36] • http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/browser_fingerprinting/browser_fingerprinting.ipynb • Wikipedia {just look for terms}
  • 40. What would I want to learn? Theory? Data Science // Statistics // Machine Learning // Statistical Inference // Predictive Analytics
  • 41. What would I want to learn? Tools? R (S?) // Octave (Matlab?) // Julia // Haskell // Perl // Python // …
  • 42. How would I learn? Resources? • Google is your friend (YouTube too) • Coursera // Udacity // Iversity // EdX // … excellent online courses • Meetups • Good old reading books // university courses // reading academic papers
  • 43. Thanks to… Eran Goldstein -- http://www.linkedin.com/in/erangoldstein Maydan Weinreb -- http://www.linkedin.com/in/maydanw