A REVIEW OF MACHINE
LEARNING BASED
ANOMALY DETECTION
By Mohamed Elfadly
elfadly@aucegypt.edu
Outline
 Introduction
 CyberSecurity Systems
 Review of CyberSecurity Solutions
 Machine Learning
 Machine Learning for Anomaly Detection
 Machine Learning Based Techniques
 Machine Learning Applications
Introduction
 As technology moves forward, users have become more
technically aware than before. People communicate and
cooperate efficiently through the Internet using their
personal computers, PDAs, or mobile phones.
 Through these digital devices linked by the
Internet, hackers also attack personal privacy
using a variety of weapons, such as viruses,
Trojans, worms, botnet attacks, rootkits, adware,
spam, and social engineering platforms.
Introduction
 These different forms of attack are cyber-threats,
which can be categorized into one of three groups
according to the intruder’s purpose:
 Stealing confidential information
 Manipulating the components of cyber infrastructure
 Denying the functions of the infrastructure
CyberSecurity Systems
 However, building defense systems for discovered
attacks is not easy because cyber attacks are
constantly evolving.
 That is why higher-level, adaptive methodologies
are required to discover embedded cyber intrusions.
 Many higher-level adaptive cyber defense systems
can be partitioned into the following components [1]:
 Data-capturing tools, such as Libpcap for Linux and Winpcap for
Windows, capture events from the audit trails of resource information
sources (e.g., network).
 The data-preprocessing module filters out the attacks for which good
signatures have been learned.
 A feature extractor derives basic features that are useful in event analysis
engines, including a sequence of system calls, start time, duration of a
network flow, source IP and source port, destination IP and destination port,
protocol, number of bytes, and number of packets.
 In an analysis engine, various intrusion detection methods are
implemented to investigate the behavior of the cyber-infrastructure, which
may or may not have appeared before in the record, e.g., to detect
anomalous traffic.
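As a rough illustration of the feature extractor described above, the following Python sketch groups packet records into flows keyed by source/destination address, port, and protocol, then derives the start time, duration, byte count, and packet count of each flow. The `Packet` record and the `extract_flow_features` helper are hypothetical names introduced only for this example; a real system would obtain packets from a capture tool such as Libpcap.

```python
from collections import namedtuple

# Hypothetical packet record; the field names are assumptions for this sketch.
Packet = namedtuple("Packet", "ts src_ip src_port dst_ip dst_port proto size")

def extract_flow_features(packets):
    """Group packets by 5-tuple and derive basic per-flow features."""
    flows = {}
    for p in packets:
        key = (p.src_ip, p.src_port, p.dst_ip, p.dst_port, p.proto)
        f = flows.setdefault(key, {"start": p.ts, "end": p.ts, "bytes": 0, "packets": 0})
        f["start"] = min(f["start"], p.ts)
        f["end"] = max(f["end"], p.ts)
        f["bytes"] += p.size
        f["packets"] += 1
    return {
        key: {
            "start_time": f["start"],
            "duration": f["end"] - f["start"],
            "bytes": f["bytes"],
            "packets": f["packets"],
        }
        for key, f in flows.items()
    }

# Illustrative packets only.
packets = [
    Packet(0.0, "10.0.0.1", 1234, "10.0.0.2", 80, "tcp", 100),
    Packet(1.5, "10.0.0.1", 1234, "10.0.0.2", 80, "tcp", 300),
    Packet(2.0, "10.0.0.3", 5555, "10.0.0.2", 22, "tcp", 60),
]
features = extract_flow_features(packets)
```

Each resulting feature dictionary could then be passed to an analysis engine as one event.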
 Solutions to cybersecurity problems:
 Proactive Approaches: anticipate and eliminate
vulnerabilities in the cyber system, while remaining
prepared to defend effectively and rapidly against
attacks
 Reactive Approaches: such as intrusion detection
systems (IDSs). IDSs detect intrusions based on
information from log files and network flows, so that the
extent of damage can be determined, hackers can be
tracked down, and similar attacks can be prevented in
the future.
Review of Cyber Security
Solutions
 Proactive security solutions are designed to
maintain the overall security of a system, even if
individual components of the system have been
compromised by an attack.
 Researchers also consider data-mining algorithms
from the viewpoint of privacy preservation. This
research area, introduced by Verykios et al., is
called privacy-preserving data mining (PPDM) [4].
Reactive Security Systems
 An IDS intelligently monitors activities that occur
in a computing resource, e.g., network traffic and
computer usage, to analyze the events and to
generate reactions.
 Intrusion detection can be classified into the
following modules [1]:
 Misuse/Signature detection
 Anomaly Detection
 Hybrid Detection
 Scan detector and Profiling modules.
IDS Modules
 Misuse/Signature Detection: an IDS triggering method that
generates alarms when a known cyber misuse occurs.
 Anomaly Detection: triggers alarms when the detected object
behaves significantly differently from the predefined
normal patterns.
 Hybrid Detection: combines anomaly and misuse detection
techniques to overcome the drawbacks of each.
 Scan Detection and Profiling Module: Scan detection generates
alerts when attackers scan services or computer components in
network systems before launching attacks. The Profiling modules
group similar network connections and search for dominant
behaviors using clustering algorithms.
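To make the difference between the modules concrete, here is a minimal Python sketch of a misuse detector, an anomaly detector, and a hybrid of the two. The signature set, the normal-profile statistics, the threshold, and the choice to combine the detectors with a logical OR are all assumptions made for illustration, not details taken from the reviewed work.

```python
# Hypothetical signature set and normal profile; all values are illustrative.
KNOWN_BAD_PAYLOADS = {"' OR 1=1 --", "../../etc/passwd"}
NORMAL_MEAN_BYTES = 500.0
NORMAL_STD_BYTES = 100.0
THRESHOLD = 3.0  # flag events more than 3 standard deviations from the mean

def misuse_alarm(event):
    """Signature detection: alarm only on an exact, known-bad pattern."""
    return event["payload"] in KNOWN_BAD_PAYLOADS

def anomaly_alarm(event):
    """Anomaly detection: alarm when the event deviates strongly from the profile."""
    z = abs(event["bytes"] - NORMAL_MEAN_BYTES) / NORMAL_STD_BYTES
    return z > THRESHOLD

def hybrid_alarm(event):
    """Hybrid detection (assumed rule): alarm if either detector fires."""
    return misuse_alarm(event) or anomaly_alarm(event)

known_attack = {"payload": "../../etc/passwd", "bytes": 480}
novel_attack = {"payload": "never seen before", "bytes": 5000}
normal_event = {"payload": "GET /index.html", "bytes": 520}
```

In this toy setup the signature detector misses `novel_attack`, the anomaly detector misses `known_attack` because its size looks normal, and the hybrid catches both.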
Purpose
 Most reactive security solutions depend heavily
on machine learning approaches to solve
cybersecurity problems.
 For that reason, this literature review covers
anomaly detection using machine learning.
Machine Learning
 Machine learning is one of the cornerstone
fields of artificial intelligence, in which machines
learn to act autonomously and to react to new
situations without being explicitly programmed for
them. It is about designing algorithms that allow
computers to learn.
Machine Learning
 Machine learning algorithms are categorized
according to the desired outcome of the algorithm:
 Supervised Learning
 Unsupervised Learning
Machine Learning for Anomaly
Detection
Lust for victory will not give you the victory. You must receive the victory from
your opponent. He has no choice but to give it to you because he will sense
your heart as better or truer. Nature is your friend; it helps you to win. Your
enemy will have unnatural movement; therefore you will be able to know what
he is going to do before he does it.
Masaaki Hatsumi
Secret Ninjutsu
Anomaly Detection
 The goal of anomaly detection is to target any
event falling outside of a predefined set of normal
behaviors.
 Anomaly detection first defines a profile of normal
behaviors, which reflects the health and sensitivity
of a cyber-infrastructure. Correspondingly,
anomalous behavior is defined as a pattern in the data
that does not conform to the expected behaviors.
Anomaly Detection
 Anomaly detection relies on a clear boundary
between normal and anomalous behaviors, where
the profile of normal behaviors is defined as
distinct from anomalous events. The profile must fit
a set of criteria, as explained by Gong [10].
 For example, if a user who usually logs in around
10 am from a university dormitory logs in at 5:30 am
from an IP address in China, then an anomaly has
occurred.
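The login example above can be sketched as a simple profile check in Python. The `profile` dictionary, its hour window, and the `is_anomalous_login` helper are hypothetical and only illustrate the idea of comparing an event against a predefined normal profile.

```python
# Hypothetical user profile; the hour window and location are illustrative.
profile = {
    "usual_hours": range(9, 12),              # logins normally start between 9 and 11 am
    "usual_locations": {"university dormitory"},
}

def is_anomalous_login(hour, location, profile):
    """Flag a login whose hour or location falls outside the normal profile."""
    unusual_time = hour not in profile["usual_hours"]
    unusual_place = location not in profile["usual_locations"]
    return unusual_time or unusual_place

usual = is_anomalous_login(10, "university dormitory", profile)
suspicious = is_anomalous_login(5, "China", profile)
```

A learned profile would replace the hand-written window and location set, but the comparison step stays the same.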
Challenges
1. The key challenge is that a huge volume of data with a high-
dimensional feature space is difficult to analyze and monitor
manually. Such analysis and monitoring require highly efficient
computational algorithms for data processing and pattern learning.
2. In the huge volume of network data, the same malicious data
occur repeatedly, while the number of similar malicious data is
much smaller than the number of normal data.
3. Much of the data is streaming data, which requires online analysis.
4. The concept of an anomaly/outlier varies among application
domains, and labeled anomalies are not available for
training/validation.
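For the streaming challenge in item 3, one common way to analyze data online is to keep running statistics instead of storing the whole stream. The sketch below uses Welford's running mean and variance with a z-score test; the class name, threshold, warm-up length, and sample stream are assumptions chosen for illustration.

```python
import math

class StreamingDetector:
    """Online z-score detector using Welford's running mean and variance."""

    def __init__(self, threshold=3.0, warmup=5):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.threshold = threshold
        self.warmup = warmup  # number of points to observe before flagging

    def update(self, x):
        """Score x against the statistics seen so far, then fold x into them."""
        flagged = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            flagged = std > 0 and abs(x - self.mean) / std > self.threshold
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return flagged

detector = StreamingDetector()
stream = [10, 11, 9, 10, 12, 10, 11, 95, 10]  # illustrative values only
flags = [detector.update(x) for x in stream]
```

Note that this sketch also folds flagged points into the running statistics, which inflates the variance afterwards; whether to exclude them is a design decision that depends on the application.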
Machine Learning for Anomaly
Detection
 Workflow of an anomaly detection system [figure]
 However, anomaly detection approaches have a major
drawback: they may trigger high false-alarm rates, because
any significant deviation from the baseline can be flagged
as an intrusion.
 Hackers often modify malicious code or data to make it
resemble normal patterns. When such an attack occurs, it is
judged to be part of the normal profile and is missed, so a
false negative occurs.
 The persistent problem is how to minimize the false
negative and false positive rates.
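The two error rates mentioned above can be computed from labeled predictions as follows. This is a minimal sketch; the `error_rates` helper and the sample labels are made up for illustration and do not come from any experiment in this review.

```python
def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate); 1 = attack, 0 = normal."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = y_true.count(0)
    positives = y_true.count(1)
    fpr = fp / negatives if negatives else 0.0  # normal events raised as alarms
    fnr = fn / positives if positives else 0.0  # attacks that were missed
    return fpr, fnr

# Illustrative labels only.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 0, 1]
fpr, fnr = error_rates(y_true, y_pred)
```

Lowering the detection threshold typically reduces the false negative rate at the cost of more false positives, which is the trade-off described above.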
Machine Learning Based
Techniques
Technique: Pros/Cons
Fuzzy Logic:
- Reasoning is approximate rather than precise
- Effective, especially against port scans and probes
- High resource consumption
Genetic Algorithm:
- Biologically inspired; employs evolutionary algorithms
- Uses operators such as selection, crossover, and mutation
- Capable of deriving classification rules and selecting optimal parameters
Neural Network:
- Able to generalize from limited, noisy, and incomplete data
- Has the potential to recognize future unseen patterns
Bayesian Network:
- Encodes probabilistic relationships among the variables of interest
- Able to incorporate both prior knowledge and data
Machine Learning Applications
1. Fusion of BVM and ELM for Anomaly
Detection
2. Anomaly Detection Using Neural Network
Optimized with GSA Algorithm
Fusion of BVM and ELM for Anomaly
Detection
 Cai et al., in their paper “Fusion of BVM
and ELM for Anomaly Detection in Computer
Networks”, state that a fusion or ensemble of
classifiers is generally better than a single
classifier. The fusion of classifiers for anomaly
detection therefore not only improves accuracy
but also sustains low false alarm rates with
high reliability and scalability [13]. They use the
extreme learning machine (ELM) and the ball vector
machine (BVM) as the two kinds of single classifier.
 Suitable features for representing network
traffic flow can be divided into three groups:
 Content features: information about the data
content of packets that could be relevant to an anomaly
or intrusion.
 Intrinsic features: general information
related to the connection.
 Traffic features: for example, statistics on past
connections similar to the current one.
Fusion Method
 Step 1: Prepare three kinds of features, which should be labeled.
 Step 2: Train BVM and ELM separately on each kind of feature. The
classifiers are denoted bvm(i) and elm(i), i = 1, 2, 3. Label(i), i = 1, ..., 6,
is each classifier’s output.
 Step 3: Train a single-hidden-layer BP neural network with 6 input nodes,
30 hidden nodes, and 6 output nodes using the labeled outputs of BVM and ELM
from Step 2 (that is, using Label(i) of bvm(i) and elm(i) as the BP neural
network’s input).
 Step 4: Using the acquired Label(i) as input, train the BP neural network
to obtain Train U as the output.
 In the prediction process, the BP neural network receives the labels from
the trained ELM and BVM classifiers and obtains Label(i) and w(i), i = 1, ..., 6.
A weighted majority vote is then used to process the weight values, if
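The decision rule of the vote is not fully stated above, so the following Python sketch shows only a generic weighted majority vote over six classifier outputs, not the paper's exact procedure. The `weighted_majority_vote` helper, the labels, and the weights are all hypothetical.

```python
from collections import defaultdict

def weighted_majority_vote(labels, weights):
    """Return the class whose supporting classifiers have the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Hypothetical outputs of the six base classifiers (bvm(1..3), elm(1..3))
# and hypothetical weights; none of these values come from the paper.
labels = ["attack", "normal", "attack", "normal", "normal", "attack"]
weights = [0.30, 0.10, 0.25, 0.10, 0.05, 0.20]
decision = weighted_majority_vote(labels, weights)
```

With these numbers the three "attack" votes carry a total weight of 0.75 against 0.25, so the fused decision is "attack" even though the raw vote count is tied.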
Experiments & Results
                   BVM     ELM     BVM+ELM+BP
Accuracy           97.7%   93.32%  99.06%
False alarm rate   0.28%   0.36%   0.13%
They randomly selected 20000 examples from the whole dataset to compose an experiment dataset.
The features are divided into three parts: the content features, which have 13 attributes, intrinsic features, which
have 9 attributes, and the traffic features, which have 19 attributes.
Fusion Method VS SVM
 A comparison between the proposed fusion method
and another fusion method that uses SVM as the
single classifier with a BP neural network under the
same fusion scheme:
                   ELM+BVM+BP   SVM+BP
Training time      86s          102s
Accuracy           98.06%       98.02%
False alarm rate   0.13%        0.11%
Anomaly Detection Using Neural Network
Optimized with GSA Algorithm
 In their paper “Flow-Based Anomaly Detection
Using Neural Network Optimized with GSA
Algorithm” [11], the authors propose an
anomaly-based network IDS (NIDS), an
important tool for protecting computer networks
from attacks.
 Traditional packet-based NIDSs are time-intensive because
they analyze all network packets. A state-of-the-art
NIDS should be able to handle a high volume of traffic
in real time. Flow-based intrusion detection is an
effective method for high-speed networks since it
inspects only packet headers. Anomaly-based
intrusion detection is a well-known method capable of
detecting unknown attacks. The authors therefore propose
a GSA-based flow anomaly detection system (GFADS), built
on a multi-layer perceptron (MLP) neural network with one
hidden layer.
 They used GSA to overcome the slow
convergence and the local minima caused by
the back-propagation used to train the MLPs.
GSA is memory-less and uses distance to
agents in its updating procedure. It has an
adaptive learning rate and it also has faster
convergence.
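The following Python sketch illustrates the general idea, assuming GSA here refers to the gravitational search algorithm: agents attract one another with forces that grow with fitness-derived mass and depend on the distance between agents, and the gravitational constant decays over time. It is a simplified variant in which all agents exert force, it minimizes a toy function rather than an MLP's training error, and every parameter value and name (`gsa_minimize`, `g0`, `alpha`, `sphere`) is an assumption for illustration, not the authors' implementation.

```python
import math
import random

def gsa_minimize(fitness, dim, n_agents=20, iters=100, g0=1.0, alpha=20.0, seed=0):
    """Simplified gravitational search: returns the best position and value seen."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, float("inf")  # kept for reporting only, not used in updates
    for t in range(iters + 1):
        fit = [fitness(x) for x in pos]
        for x, f in zip(pos, fit):
            if f < best_f:
                best_x, best_f = list(x), f
        if t == iters:
            break  # final positions have been scored
        worst, best = max(fit), min(fit)
        # Lower fitness gives a heavier agent; masses are normalized to sum to 1.
        raw = [(worst - f) / (worst - best) if worst > best else 1.0 for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]
        g = g0 * math.exp(-alpha * t / iters)  # gravitational constant decays
        # Compute all accelerations first, then move every agent.
        acc = [[0.0] * dim for _ in range(n_agents)]
        for i in range(n_agents):
            for j in range(n_agents):
                if i == j:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    pull = g * mass[j] * (pos[j][d] - pos[i][d]) / (dist + 1e-9)
                    acc[i][d] += rng.random() * pull
        for i in range(n_agents):
            for d in range(dim):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] += vel[i][d]
    return best_x, best_f

def sphere(x):
    """Toy objective standing in for a network's training error."""
    return sum(v * v for v in x)

best_x, best_f = gsa_minimize(sphere, dim=2)
```

To train an MLP in this style, each agent's position would encode the network's weights and `fitness` would return the training error for those weights.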
Performance
They compared GSA with five gradient descent algorithms and
PSO:
1. Gradient descent with momentum and adaptive learning rate
backpropagation (traingdx)
2. Gradient descent backpropagation (traingd)
3. Gradient descent with adaptive learning rate
backpropagation (traingda)
4. Gradient descent with momentum backpropagation (traingdm)
5. Sequential order incremental training with learning functions
(trains)
6. Particle Swarm Optimization algorithm (PSO)
Future Work
 Review research on hybrid approaches, in which
anomaly and misuse (signature-based) detection are
combined, since each of these methods has pros and
cons.
 One of the most important disadvantages of anomaly
detection is its high false alarm ratio, whereas misuse
detection is incapable of recognizing new attacks.
 Thus, if they are combined in a smart way, the proposed
model could use the strengths of both methods to cover
the weaknesses of each.
References
1. Dua, S., and X. Du. Data Mining and Machine Learning in Cybersecurity. Auerbach Publications, April 25, 2011.
2. Canetti, R., R. Gennaro, A. Herzberg, and D. Naor. Proactive security: Long-term protection against break-ins. CryptoBytes 3 (1997): 1–8.
3. Barak, B., A. Herzberg, D. Naor, and E. Shai. The proactive security toolkit and applications. In: Proceedings of the 6th ACM Conference on Computer and Communications Security, Singapore, 1999, pp. 18–27.
4. Verykios, V.S., E. Bertino, I.N. Fovino, L.P. Provenza, Y. Saygin, and Y. Theodoridis. State-of-the-art in privacy preserving data mining. ACM SIGMOD Record 33 (2004): 50–57.
5. Denning, D. An intrusion-detection model. IEEE Transactions on Software Engineering 13 (2) (1987): 118–131.
6. Mitchell, T.M. Machine Learning, volume 4. Burr Ridge, IL: McGraw Hill, June 1997.
7. Simon, P. Too Big to Ignore: The Business Case for Big Data. Wiley, 2013.
8. Ayodele, T.O. New Advances in Machine Learning. InTech, 2010.
9. Kaur, H., G. Singh, and J. Minhas. A Review of Machine Learning based Anomaly Detection Techniques.
10. Gong, F. Deciphering detection techniques: Part II. Anomaly-based intrusion detection. White paper, McAfee Network Security Technologies Group, 2003.
11. Jadidi, Z., and M. Sheikhan. Flow-Based Anomaly Detection Using Neural Network Optimized with GSA Algorithm.
12. Eskin, E., A. Arnold, M. Prerau, L. Portnoy, and S. Stolfo. A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data. In: Applications of Data Mining in Computer Security, edited by S. Jajodia and D. Barbara. Dordrecht: Kluwer, 2002, Chap. 4.
13. Cai, C., G. Cheng, and H. Pan. Fusion of BVM and ELM for Anomaly Detection in Computer Networks.

 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 

Dernier (20)

Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 

A review of machine learning based anomaly detection

  • 1. A REVIEW OF MACHINE LEARNING BASED ANOMALY DETECTION By Mohamed Elfadly elfadly@aucegypt.edu
  • 2. Outline
      - Introduction
      - CyberSecurity Systems
      - Review of CyberSecurity Solutions
      - Machine Learning
      - Machine Learning for Anomaly Detection
      - Machine Learning Based Techniques
      - Machine Learning Applications
  • 3. Introduction
      - As technology moves forward, users have become more technically aware than before. People communicate and cooperate efficiently through the Internet using their personal computers, PDAs, or mobile phones.
      - Through these same Internet-connected devices, hackers attack personal privacy using a variety of weapons, such as viruses, Trojans, worms, botnet attacks, rootkits, adware, spam, and social engineering platforms.
  • 4. Introduction
      - These different forms of attack are considered cyber-threats, which can be categorized into one of three groups according to the intruder’s purpose:
          - Stealing confidential information
          - Manipulating the components of cyber infrastructure
          - Denying the functions of the infrastructure
  • 6. CyberSecurity Systems
      - However, building defense systems against discovered attacks is not easy, because cyber attacks evolve constantly.
      - That is why higher-level, adaptive methodologies are required to discover embedded cyber intrusions.
  • 7. Many higher-level adaptive cyber defense systems can be partitioned into components [1].
  • 8. Components of an adaptive cyber defense system
      - Data-capturing tools, such as Libpcap for Linux and Winpcap for Windows, capture events from the audit trails of resource information sources (e.g., network).
      - The data-preprocessing module filters out the attacks for which good signatures have been learned.
      - A feature extractor derives basic features that are useful in event analysis engines, including a sequence of system calls, start time, duration of a network flow, source IP and source port, destination IP and destination port, protocol, number of bytes, and number of packets.
      - In an analysis engine, various intrusion detection methods are implemented to investigate the behavior of the cyber-infrastructure, which may or may not have appeared before in the record, e.g., to detect anomalous traffic.
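The per-flow features listed on this slide can be pictured as a small record built by grouping packets on their 5-tuple. The sketch below is only illustrative: the `FlowFeatures` field names, the `extract_flows` helper, and the packet tuple layout are assumptions made for this example and are not taken from Libpcap, Winpcap, or the cited book.

```python
from dataclasses import dataclass


@dataclass
class FlowFeatures:
    # Hypothetical record holding the basic flow features named on the slide.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    start_time: float
    duration: float
    n_bytes: int
    n_packets: int


def extract_flows(packets):
    """Group packets by 5-tuple and derive per-flow features.

    Each packet is assumed to be a tuple:
    (timestamp, src_ip, src_port, dst_ip, dst_port, protocol, size_in_bytes)
    """
    grouped = {}
    for ts, sip, sport, dip, dport, proto, size in packets:
        key = (sip, sport, dip, dport, proto)
        grouped.setdefault(key, []).append((ts, size))

    flows = []
    for (sip, sport, dip, dport, proto), items in grouped.items():
        times = [t for t, _ in items]
        flows.append(FlowFeatures(
            sip, sport, dip, dport, proto,
            start_time=min(times),
            duration=max(times) - min(times),
            n_bytes=sum(s for _, s in items),
            n_packets=len(items),
        ))
    return flows
```

Records of this kind would then be handed to the analysis engine; a real extractor would also need flow timeouts and handling of bidirectional traffic, which are left out here.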
  • 9. Solutions to cybersecurity problems:
      - Proactive approaches: anticipate and eliminate vulnerabilities in the cyber system, while remaining prepared to defend effectively and rapidly against attacks.
      - Reactive approaches: such as intrusion detection systems (IDSs). IDSs detect intrusions based on the information from log files and network flow, so that the extent of damage can be determined, hackers can be tracked down, and similar attacks can be prevented in
  • 10. Review of Cyber Security Solutions
      - Proactive security solutions are designed to maintain the overall security of a system, even if individual components of the system have been compromised by an attack.
      - Researchers also consider data-mining algorithms from the viewpoint of privacy preservation. This research area, introduced by Verykios et al., is called privacy-preserving data mining (PPDM) [4].
  • 11. Reactive Security Systems
      - An IDS intelligently monitors activities that occur in a computing resource, e.g., network traffic and computer usage, to analyze the events and to generate reactions.
      - Intrusion detection can be classified into the following modules [1]:
          - Misuse/signature detection
          - Anomaly detection
          - Hybrid detection
          - Scan detector and profiling modules
  • 12. IDS Modules
      - Misuse/signature detection: an IDS triggering method that generates alarms when a known cyber misuse occurs.
      - Anomaly detection: triggers alarms when the detected object behaves significantly differently from the predefined normal patterns.
      - Hybrid detection: combines anomaly and misuse detection techniques to overcome the drawbacks of each.
      - Scan detection and profiling modules: scan detection generates alerts when attackers scan services or computer components in network systems before launching attacks. The profiling modules group similar network connections and search for dominant behaviors using clustering algorithms.
  • 13. Purpose
      - Most reactive security solutions depend heavily on machine learning approaches to solve cybersecurity problems.
      - For that reason, this literature review covers anomaly detection using machine learning.
  • 14. Machine Learning
      - Machine learning is one of the cornerstone fields of Artificial Intelligence, in which machines learn to act autonomously and react to new situations without being pre-programmed. It is about designing algorithms that allow computers to learn.
  • 15. Machine Learning
      - Machine learning algorithms are categorized by the desired outcome of the algorithm:
          - Supervised learning
          - Unsupervised learning
  • 16. Machine Learning for Anomaly Detection
      "Lust for victory will not give you the victory. You must receive the victory from your opponent. He has no choice but to give it to you because he will sense your heart as better or truer. Nature is your friend; it helps you to win. Your enemy will have unnatural movement; therefore you will be able to know what he is going to do before he does it."
      Masaaki Hatsumi, Secret Ninjutsu
  • 17. Anomaly Detection
      - The goal of anomaly detection is to flag any event that falls outside a predefined set of normal behaviors.
      - Anomaly detection first defines a profile of normal behaviors, which reflects the health and sensitivity of a cyber-infrastructure. Correspondingly, anomalous behavior is defined as a pattern in the data that does not conform to the expected behaviors.
  • 18. Anomaly Detection
      - Anomaly detection relies on a clear boundary between normal and anomalous behaviors, where the profile of normal behaviors is defined as different from anomalous events. The profile must fit a set of criteria, as explained by Gong [10].
      - For example, if a user who usually logs in around 10 am from a university dormitory logs in at 5:30 am from an IP address in China, then an anomaly has occurred.
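The login example can be written as a toy profile check. This is a minimal sketch under stated assumptions: the `LoginProfile` fields, the two-hour tolerance, and the use of a location label instead of a resolved IP address are all hypothetical choices made for illustration, not criteria taken from Gong [10].

```python
from dataclasses import dataclass, field


@dataclass
class LoginProfile:
    usual_hour: float       # typical login time in hours, e.g. 10.0 for 10 am
    hour_tolerance: float   # allowed deviation in hours (assumed threshold)
    usual_locations: set = field(default_factory=set)


def is_anomalous(profile, hour, location):
    """Flag a login whose time or origin falls outside the normal profile."""
    # Circular distance on a 24-hour clock, so 23:30 and 00:30 are one hour apart.
    diff = abs(hour - profile.usual_hour) % 24
    hour_gap = min(diff, 24 - diff)
    unusual_time = hour_gap > profile.hour_tolerance
    unusual_place = location not in profile.usual_locations
    return unusual_time or unusual_place
```

With a profile of `LoginProfile(10.0, 2.0, {"dormitory"})`, the 5:30 am login from China is flagged on both criteria, while a 10:30 am login from the dormitory is not.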
  • 19. Challenges
      1. The key challenge is that the huge volume of data, with its high-dimensional feature space, is difficult to analyze and monitor manually. Such analysis and monitoring require highly efficient computational algorithms for data processing and pattern learning.
      2. In the huge volume of network data, the same malicious data occur repeatedly, while the amount of similar malicious data is much smaller than the amount of normal data.
      3. Much of the data is streaming data, which requires online analysis.
      4. The concept of an anomaly/outlier varies among application domains, and labeled anomalies are not available for training/validation.
  • 20. Machine Learning for Anomaly Detection
      - Workflow of an anomaly detection system
  • 21. Drawbacks of anomaly detection
      - Anomaly detection approaches have a major drawback: they may trigger high rates of false alarms, because any significant deviation from the baseline can be flagged as an intrusion.
      - Hackers often modify malicious code or data to make it resemble normal patterns. When such an attack occurs, it is judged to be part of the normal profile and is missed, which is a false negative.
      - The persistent problem is how to minimize both the false negative and the false positive rates.
  • 23. Technique: Pros/Cons
      - Fuzzy Logic
          - Reasoning is approximate rather than precise
          - Effective, especially against port scans and probes
          - High resource consumption involved
      - Genetic Algorithm
          - Biologically inspired; employs an evolutionary algorithm
          - Uses properties such as selection, crossover, and mutation
          - Capable of deriving classification rules and selecting optimal parameters
      - Neural Network
          - Ability to generalize from limited, noisy, and incomplete data
          - Has the potential to recognize future unseen patterns
      - Bayesian Network
          - Encodes probabilistic relationships among the variables of interest
          - Ability to incorporate both prior knowledge and data
  • 24. Machine Learning Applications
      1. Fusion of BVM and ELM for Anomaly Detection
      2. Anomaly Detection Using Neural Network Optimized with GSA Algorithm
  • 25. Fusion of BVM and ELM for Anomaly Detection
      - Cai et al., in their paper “Fusion of BVM and ELM for Anomaly Detection in Computer Networks”, state that a fusion or ensemble of classifiers is generally better than a single classifier. Fusing classifiers for anomaly detection therefore not only improves accuracy but also keeps false alarm rates low, with high reliability and scalability [13]. They use the extreme learning machine (ELM) and the ball vector machine (BVM) as the two kinds of single classifiers.
  • 26. The features extracted to represent network traffic flow can be divided into three groups:
      - Content features: information about the data content of packets that could be relevant to an anomaly or intrusion.
      - Intrinsic features: general information related to the connection.
      - Traffic features: for example, statistics on past connections similar to the current one.
  • 27. Fusion Method
      - Step 1: Prepare the three kinds of features, which should be labeled.
      - Step 2: Train BVM and ELM separately on each kind of feature. The classifiers are denoted bvm(i) and elm(i), i = 1, 2, 3. Lable(i), i = 1,...,6, is each classifier’s output.
      - Step 3: Train a single-hidden-layer BP neural network with 6 input nodes, 30 hidden nodes, and 6 output nodes using the labeled data of BVM and ELM from Step 2 (using Lable(i) of bvm(i) and elm(i) as the BP neural network’s input).
      - Step 4: Using the acquired Lable(i) as the input of the neural network, train a BP neural network and obtain Train U as the output.
      - In the prediction process, the BP neural network receives the labels from the trained ELM and BVM classifiers and obtains Lable(i) and w(i), i = 1,...,6. A weighted majority vote is then applied to the weights, if
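The final voting rule is cut off on the slide, so the sketch below shows only a generic weighted majority vote over the six base-classifier labels, not the authors' exact rule. The 0/1 label encoding, the one-half threshold, and the function name are assumptions for illustration; how the weights w(i) are obtained from the BP network is not modeled.

```python
def weighted_majority_vote(labels, weights):
    """Return 1 (anomaly) if the weighted share of anomaly votes exceeds one half.

    labels  : 0/1 outputs of the six base classifiers, bvm(i) and elm(i), i = 1..3
    weights : non-negative weights w(i), one per classifier (assumed given)
    """
    if len(labels) != len(weights):
        raise ValueError("labels and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Sum the weight of every classifier that voted "anomaly".
    anomaly_mass = sum(w for y, w in zip(labels, weights) if y == 1)
    return 1 if anomaly_mass > total / 2 else 0
```

With weights `[0.4, 0.3, 0.1, 0.1, 0.05, 0.05]`, two heavily weighted classifiers voting "anomaly" outvote four lightly weighted ones voting "normal". This is the practical difference between a weighted vote and a plain head count.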
  • 28. Experiments & Results

      Metric             | BVM    | ELM    | BVM+ELM+BP
      Accuracy           | 97.7%  | 93.32% | 99.06%
      False alarm rate   | 0.28%  | 0.36%  | 0.13%

      They randomly selected 20000 examples from the whole dataset to compose an experiment dataset. The features are divided into three parts: the content features, which have 13 attributes; the intrinsic features, which have 9 attributes; and the traffic features, which have 19 attributes.
  • 29. Fusion Method vs. SVM
      - A comparison of the proposed fusion method with an alternative that uses SVM as the single classifier together with a BP neural network, under the same fusion scheme.

      Metric             | ELM+BVM+BP | SVM+BP
      Training time      | 86 s       | 102 s
      Accuracy           | 98.06%     | 98.02%
      False alarm rate   | 0.13%      | 0.11%
  • 30. Anomaly Detection Using Neural Network Optimized with GSA Algorithm
      - In their paper “Flow-Based Anomaly Detection Using Neural Network Optimized with GSA Algorithm” [11], the authors propose an anomaly-based network IDS (NIDS), an important tool for protecting computer networks from attacks.
  • 31. Traditional packet-based NIDSs are time-intensive because they analyze all network packets. A state-of-the-art NIDS should be able to handle a high volume of traffic in real time. Flow-based intrusion detection is an effective method for high-speed networks since it inspects only packet headers. Anomaly-based intrusion detection is a well-known method capable of detecting unknown attacks. The authors therefore proposed a GSA-based flow anomaly detection system (GFADS), built on a multi-layer perceptron (MLP) neural network with one hidden layer.
  • 32. They used GSA to overcome the slow convergence and the local minima associated with the back-propagation used to train MLPs. GSA is memory-less and uses the distance between agents in its updating procedure. It has an adaptive learning rate and converges faster.
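To make the updating procedure concrete, here is a minimal sketch of a gravitational search loop, taking GSA to mean the gravitational search algorithm. It is a generic textbook-style version, not the configuration used in [11]: the parameter values (`g0`, `alpha`, population size, bounds) are arbitrary assumptions, and the fitness function is left to the caller. In the paper's setting the position vector would hold the MLP's connection weights and the fitness would be a training error.

```python
import math
import random


def gsa_minimize(fitness, dim, n_agents=20, iters=100, g0=100.0, alpha=20.0,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` over a box with a basic gravitational search loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    # The best-so-far point is kept for reporting only; the update itself
    # uses just current positions and masses (the "memory-less" property).
    best_x, best_f, history = None, math.inf, []

    for t in range(iters):
        fit = [fitness(p) for p in pos]
        i_best = min(range(n_agents), key=fit.__getitem__)
        if fit[i_best] < best_f:
            best_f, best_x = fit[i_best], list(pos[i_best])
        history.append(best_f)

        # Masses: lower (better) fitness gives a heavier agent.
        worst, best = max(fit), min(fit)
        if worst == best:
            raw = [1.0] * n_agents
        else:
            raw = [(worst - f) / (worst - best) for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]

        # The gravitational "constant" decays over time.
        g = g0 * math.exp(-alpha * t / iters)
        # Only the k heaviest agents attract; k shrinks from n_agents to 1.
        k = max(1, round(n_agents - (n_agents - 1) * t / max(1, iters - 1)))
        kbest = sorted(range(n_agents), key=lambda i: mass[i], reverse=True)[:k]

        # Accelerations from distance-weighted attraction toward heavy agents.
        accs = []
        for i in range(n_agents):
            acc = [0.0] * dim
            for j in kbest:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    acc[d] += (rng.random() * g * mass[j]
                               * (pos[j][d] - pos[i][d]) / (dist + 1e-12))
            accs.append(acc)

        # Velocity and position update, clipped to the search box.
        for i in range(n_agents):
            for d in range(dim):
                vel[i][d] = rng.random() * vel[i][d] + accs[i][d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

    return best_x, best_f, history
```

A call such as `gsa_minimize(lambda x: sum(v * v for v in x), dim=3)` runs the loop on a simple sphere function as a stand-in objective. The decaying `g` makes the search move from exploration toward exploitation, which is one way to read the slide's remark about an adaptive learning rate.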
  • 34. Performance
      They compared GSA with five gradient descent algorithms and PSO:
      1. Gradient descent with momentum and an adaptive learning rate (Train Gdx)
      2. Gradient descent backpropagation (Train gd)
      3. Gradient descent with adaptive learning rate backpropagation (Train Gda)
      4. Gradient descent with momentum backpropagation (Train gdm)
      5. Sequential order incremental training with learning function (Trains)
      6. Particle Swarm Optimization algorithm (PSO)
  • 36. Future Work
      - Review research on hybrid approaches, in which anomaly and misuse (signature-based) detection are combined, since each of these methods has pros and cons.
      - One of the most important disadvantages of anomaly detection is its high false alarm ratio, whereas misuse detection is incapable of recognizing new attacks.
      - Thus, if the two are combined in a smart way, the resulting model could use the strengths of each method to cover the weaknesses of the other.
  • 37. References
      1. Sumeet Dua and Xian Du. Data Mining and Machine Learning in Cybersecurity. Auerbach Publications, April 25, 2011.
      2. Canetti, R., R. Gennaro, A. Herzberg, and D. Naor. Proactive security: Long-term protection against break-ins. CryptoBytes 3 (1997): 1–8.
      3. Barak, B., A. Herzberg, D. Naor, and E. Shai. The proactive security toolkit and applications. In: Proceedings of the 6th ACM Conference on Computer and Communications Security, Singapore, 1999, pp. 18–27.
      4. Verykios, V.S., E. Bertino, I.N. Fovino, L.P. Provenza, Y. Saygin, and Y. Theodoridis. State-of-the-art in privacy preserving data mining. ACM SIGMOD Record 33 (2004): 50–57.
      5. Denning, D. An intrusion-detection model. IEEE Transactions on Software Engineering 13 (2) (1987): 118–131.
      6. Tom M. Mitchell. Machine Learning, volume 4. Burr Ridge, IL: McGraw Hill, June 1997.
      7. Phil Simon. Too Big to Ignore: The Business Case for Big Data. Wiley, 2013.
      8. Taiwo Oladipupo Ayodele. New Advances in Machine Learning. InTech, 2010.
      9. Harjinder Kaur, Gurpreet Singh, Jaspreet Minhas. “A Review of Machine Learning based Anomaly Detection Techniques”.
      10. Gong, F. Deciphering detection techniques: Part II. Anomaly-based intrusion detection. White paper, McAfee Network Security Technologies Group, 2003.
      11. Zahra Jadidi, Mansour Sheikhan. “Flow-Based Anomaly Detection Using Neural Network Optimized with GSA Algorithm”.
      12. Eskin, E., A. Arnold, M. Prerau, L. Portnoy, and S. Stolfo. A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data. In: Applications of Data Mining in Computer Security, edited by S. Jajodia and D. Barbara. Dordrecht: Kluwer, 2002, Chap. 4.
      13. Changning Cai, Guojian Cheng, Huaxian Pan. “Fusion of BVM and ELM for Anomaly Detection in Computer Networks”.

Editor's notes

  1. To secure cyber infrastructure against intentional and potentially malicious threats, a growing collaborative effort between cybersecurity professionals and researchers from institutions, private industries, academia, and government agencies has engaged in exploiting and designing a variety of cyber defense systems.
  2. As shown in the above figure, cybersecurity systems combat cybersecurity threats at two levels. Network-based defense systems control network flow using network firewalls, spam filters, antivirus, and network intrusion detection techniques. Host-based defense systems control incoming data in a workstation using firewalls, antivirus, and intrusion detection techniques installed on hosts.
  3. Recent improvements in data-mining techniques and information technology give Internet and other media users unlimited chances to explore new information. That information may include sensitive information, which has opened a new research domain in which researchers consider data-mining algorithms from the viewpoint of privacy preservation. This research area, introduced by Verykios et al., is called privacy-preserving data mining (PPDM) [4]. PPDM is designed to protect private data and knowledge in data mining. Its methods can be characterized by data distribution, data modification, data-mining algorithms, rule hiding, and privacy preservation techniques.
  4. Most reactive security solutions depend heavily on machine learning approaches to solve cybersecurity problems. For that reason, this paper presents a literature review of anomaly detection using machine learning.
  5. Supervised learning: the machine is trained with labeled data, and the algorithm generates a function that maps inputs to desired outputs. Although this method is widely used, obtaining labeled data is often difficult and expensive. Popular methods include artificial neural networks (ANN), support vector machines (SVM), and decision trees. Unsupervised learning: no target or label is given in the sample data. Unsupervised learning methods are designed to summarize the key features of the data and to form natural clusters of input patterns given a particular cost function. The best-known unsupervised learning methods include k-means clustering, hierarchical clustering, and the self-organizing map. Unsupervised learning is difficult to evaluate, because it has no explicit teacher and thus no labeled data for testing.
  6. The profile must contain robustly characterized normal behavior, such as a host/IP address or VLAN segment, and be able to track the normal behaviors of the target environment sensitively. It should also include the following information: occurrence patterns of specific commands in application protocols, association of content types with different fields of application protocols, connectivity patterns between protected servers and the outside world, and rate and burst length distributions for all types of traffic [10].
  7. In data collection, the volume of data is extremely large, so data reduction is required during preprocessing. Additionally, most of the data in the network are streaming data, which require further data reduction. Thus, the data preprocessing step includes feature selection, feature extraction, or a dimensionality reduction technique, and an information-theoretic method. Machine-learning methods play key roles in building normal profiles and in intrusion detection within anomaly detection systems. In anomaly detection, labeled data corresponding to normal behavior are usually available, while labeled data for anomalous behavior are not. Supervised machine-learning methods need attack-free training data. However, this kind of training data is difficult to obtain in real-world network environments. This lack of training data leads to the well-known unbalanced data distribution in machine learning. Eskin et al. stated that unsupervised anomaly detection can overcome the drawbacks of supervised anomaly detection [12]. Thus, semi-supervised and unsupervised machine-learning methods are employed frequently.
  8. Anomaly detection techniques can be subcategorized into statistical approaches, cognition, and machine learning. Machine learning techniques are based on an explicit or implicit model that enables the analyzed patterns to be categorized. They can be divided into genetic algorithms, fuzzy logic, neural networks, Bayesian networks, and outlier detection.
  9. ELM is a single-hidden-layer feedforward neural network (SLFN) that randomly chooses its hidden nodes and analytically determines the output weights of the SLFN. BVM, a variant of SVM, scales up kernel methods based on the notion of the enclosing ball problem and does not require any numerical solver. Both algorithms can produce good generalization performance in large-scale applications.
  10. The best performance is obtained by the fusion of BVM and ELM. BVM, the best single classifier, achieves an accuracy close to that of the fusion method but with a higher false alarm rate. ELM has the lowest accuracy and also a high false alarm rate.
  11. They classified the non-linearly separable patterns using an MLP with two layers (one hidden layer and an output layer). They start by using GSA to optimize the interconnection weights of the two-layer MLP, and then use this optimized MLP to detect anomalies in flow-based traffic. The output layer has two nodes that classify the flow-based traffic as either malicious or benign. A hidden layer with three nodes, together with the selected GSA parameters, gave the best performance, as indicated in [11].
  12. For measuring performance, they used the following metrics: accuracy, error rate (ER), miss rate (MR), and false alarm rate (FAR).
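These four metrics can all be derived from a confusion matrix. The sketch below uses the common textbook definitions, treating "malicious" as the positive class; the notes do not spell out the formulas, so the exact definitions used in [11] are an assumption here, and the function name is hypothetical.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, ER, MR, and FAR from confusion-matrix counts.

    tp: attacks correctly flagged      fn: attacks missed
    tn: benign traffic correctly passed fp: benign traffic flagged (false alarms)
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,                        # ER = 1 - accuracy
        "miss_rate": fn / (tp + fn) if (tp + fn) else 0.0,       # MR: share of attacks missed
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,  # FAR: share of benign flagged
    }
```

Note that accuracy alone can look high on unbalanced traffic, where benign flows dominate. Reporting MR and FAR alongside it separates missed attacks from false alarms.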
  13. The results show that GSA has the highest accuracy, 99.43, in classifying traffic as benign or malicious, and it also has the lowest FAR. The table below shows the accuracy over 10 experimental runs.