SlideShare une entreprise Scribd logo
1  sur  25
Télécharger pour lire hors ligne
Anomaly and fraud detection with
Machine Learning
Ahmed Rebai
Esprit Prépa & Esprit School of Business
Plan
● Anomaly/Fraud detection with machine learning/Deep
learning
● Outlier detection general applications
● Applications to financial sector (banking, insurance): Towards
Fintech
● Supervised vs. Unsupervised
● Algorithms and methods
● Conclusions Ensemble method (Isolation forest) and Density
method (DBSCAN/HDBSCAN)
Anomaly/Fraud detection with
machine learning/Deep learning
● Machine learning : data mining, predictive analytics, artificial
intelligence (in practice : statistics and numerical methods for
statistical analysis)
● Anomaly: deviation from “normal”/”expected”
Anomaly=Outlier=Deviant or Unusual Data Point
● Anomaly detection : detection of outlier events or observations:
Detecting deviations from the expected pattern of a data set.
The real challenge in anomaly detection is to construct the right
data model to separate outliers from noise and normal data.
In one dimension: what anomalies
look like?
Real Example: Stock market :
The May 6, 2010, Flash Crash at 2:45 pm
Real Example from the Stock market :
A real financial fraud!
● On April 21, 2015, nearly five years after the
incident, the U.S. Department of Justice laid "22
criminal counts, including fraud and market
manipulation" against Navinder Singh Sarao, a
trader. Among the charges included was the
use of spoofing algorithms; just prior to the
Flash Crash, he placed thousands of E-mini
S&P 500 stock index futures contracts which he
planned on canceling later.
Stock market : Flash Crash
continues
In two dimension: what anomalies
look like?
Results with DBSCAN algorithm from sklearn.cluster
In two dimension: what anomalies
look like?
Results with 2D PCA method from sklear.decomposition
also you can do kernel PCA
January 28 1986 Challenger spatial mission:
Dalal & al., 1989 Journal of the American Statistical Association
Real Example: Rocket science
In two dimension: what anomalies
look like?
In two dimension: what anomalies
look like?
Using Logarithmic, log-log plots to
display outliers can help
pyplot.hist can "log" y axis for you with keyword argument log=True
Example : plt.hist(numpy.log10(data), log=True)
Linear histogram Frequency vs log10
Price
Log-log histogram
log10 freq vs log10
price
Using Logarithmic, log-log plots to
display outliers can help
Linear plot Log-log plot
matplotlib.pyplot.loglog
Outlier detection general applications
● Cyber security: Detect cyber-attacks on networks: more
trusted connections
● Equipment failure (risk theory/théorie de la fiabilité)
industrial sector
● Fraud detection
● Detecting cheaters in mobile gaming
● Preprocessing task for analysis or machine learning
● Reduce false declines then grow the revenue
● Detecting French regions where Le Pen’s scores at the
2017 presidential election deviate from predictions based
on socio-economic variables (towardsdatascience)
Applications to financial sector (banking,
insurance): Towards Fintech
● Retail bank: Credit card fraud
● Private bank: Market abuse, Anti-money laundering
● Investment bank: Market abuse (Flash Crash), Anti-money
laundering
●
Insurance companies: Fraudulent operations
●
Central banks: detecting tax havens (panama papers)
Requires different approaches because Red flags are banking-type
specific (specific business expertise)
Methodology?
Supervised learning vs.
Unsupervised learning
Supervised vs. Unsupervised
● In credit card fraud detection one knows the target
variable.
How? Customers tell us. Can use supervised approach
because the true class is self-revealing.
● In market abuse or money laundering detection we don’t
really know classes (how money is laundered changes
all the time and by the same criminal organisation)
Why supervised learning is difficult?
● Severe class imbalance : we estimate that
99.9% are trusted operations vs. 0.1% of
fraudulent operations
● Problem during the train test split
● Solution you can use the stratification option in
the train_test_split function of sklearn
And now why unsupervised is
difficult?
● Sever class overlap : money laundering is mixed with legal
financial activity, especially in Investment banks
● Uncertainty around the data model
● The complexity of data
● The huge volume of data (time complexity)
● Next we will discuss the data models
● To avoid time complexity: use dask, numba (used to speed
numpy), ray and the newly module modin (used to speed
pandas) + demonstration
Algorithms
● We can differentiate between three methods:
- Distance based algorithms (similarity)
K-NN for classification
K-Means for clustering
- Density-based (fitting a density)
DBSCAN and HDBSCAN
Local outlier factor (LOF)
- Parametric
Gaussian mixture models (GMM)
Single class SVM
Extreme value theory: Tukey outlier labeling
Tree ensemble algorithm : Isolation forest
(used in Credit-swiss)
(F. T. Liu, et al., Isolation Forest, Data Mining, 2008. ICDM’08, Eighth
IEEE International Conference)
from sklearn.ensemble import IsolationForest
Ensemble regressor uses the concept of isolation to explain/separate-
away anomalies
● No point based distance calculation
● Instead I.F. builds an ensemble of random trees for a given data set
and anomalies are points with the shortest average path length
Unsupervised learning for anomaly
detection (conclusion)
● Unsupervised learning is all about finding structure in data
● Techniques: Clustering (K-means, spectral clustering)
● Principle Components Analysis
● Support Vector machine
● Autoencoder Deep Neural Networks : DNN
autoencoder anomaly detection (exotic)
● Filtering, Sequential Bayesian Filtering
● Gaussian mixture Model clustering via EM
● LightGBM (Exotic)
Fraud detection ML

Contenu connexe

Tendances

Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detectionPEIPEI HAN
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learningwgyn
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAndrea Dal Pozzolo
 
CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION K Srinivas Rao
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learningSandeep Garg
 
Real-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment TransactionsReal-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment TransactionsChristian Gügi
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithmsankit panigrahy
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learningdataalcott
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud DetectionBinayakreddy
 
How to identify credit card fraud
How to identify credit card fraudHow to identify credit card fraud
How to identify credit card fraudHenley Walls
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
Short-Story-Writing 255.pptx
Short-Story-Writing 255.pptxShort-Story-Writing 255.pptx
Short-Story-Writing 255.pptxSravaniRaparla
 
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationAnomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationImpetus Technologies
 
Analysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detectionAnalysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detectionJustluk Luk
 
Anomaly Detection Technique
Anomaly Detection TechniqueAnomaly Detection Technique
Anomaly Detection TechniqueChakrit Phain
 
Build Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and GraphsBuild Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and GraphsNeo4j
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)k.surya kumar
 

Tendances (20)

Fraud detection
Fraud detectionFraud detection
Fraud detection
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detection
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learning
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learning
 
Real-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment TransactionsReal-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment Transactions
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithms
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud Detection
 
How to identify credit card fraud
How to identify credit card fraudHow to identify credit card fraud
How to identify credit card fraud
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
Short-Story-Writing 255.pptx
Short-Story-Writing 255.pptxShort-Story-Writing 255.pptx
Short-Story-Writing 255.pptx
 
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationAnomaly Detection - Real World Scenarios, Approaches and Live Implementation
Anomaly Detection - Real World Scenarios, Approaches and Live Implementation
 
Analysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detectionAnalysis of-credit-card-fault-detection
Analysis of-credit-card-fault-detection
 
Anomaly Detection Technique
Anomaly Detection TechniqueAnomaly Detection Technique
Anomaly Detection Technique
 
CREDIT_CARD.ppt
CREDIT_CARD.pptCREDIT_CARD.ppt
CREDIT_CARD.ppt
 
Build Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and GraphsBuild Intelligent Fraud Prevention with Machine Learning and Graphs
Build Intelligent Fraud Prevention with Machine Learning and Graphs
 
Deep Learning for Fraud Detection
Deep Learning for Fraud DetectionDeep Learning for Fraud Detection
Deep Learning for Fraud Detection
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)
 

Similaire à Fraud detection ML

Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Mathieu DESPRIEE
 
Big Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloBig Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloOCTO Technology
 
IRJET - Fraud Detection in Credit Card using Machine Learning Techniques
IRJET -  	  Fraud Detection in Credit Card using Machine Learning TechniquesIRJET -  	  Fraud Detection in Credit Card using Machine Learning Techniques
IRJET - Fraud Detection in Credit Card using Machine Learning TechniquesIRJET Journal
 
Unsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaUnsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaPyData
 
Analysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTAnalysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTIJERA Editor
 
Eick/Alpaydin Introduction
Eick/Alpaydin IntroductionEick/Alpaydin Introduction
Eick/Alpaydin Introductionbutest
 
2011 02-04 - d sallier - prévision probabiliste
2011 02-04 - d sallier - prévision probabiliste2011 02-04 - d sallier - prévision probabiliste
2011 02-04 - d sallier - prévision probabilisteCdiscount
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceIRJET Journal
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceIRJET Journal
 
Ml topic1 a
Ml topic1 aMl topic1 a
Ml topic1 abosycs1
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET Journal
 
Machine Learning
Machine LearningMachine Learning
Machine LearningVivek Garg
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systemsOlivier Teytaud
 
IRJET - Cardless ATM
IRJET -  	  Cardless ATMIRJET -  	  Cardless ATM
IRJET - Cardless ATMIRJET Journal
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesCodePolitan
 
Log Message Anomaly Detection with Oversampling
Log Message Anomaly Detection with Oversampling Log Message Anomaly Detection with Oversampling
Log Message Anomaly Detection with Oversampling gerogepatton
 
A Review of deep learning techniques in detection of anomaly incredit card tr...
A Review of deep learning techniques in detection of anomaly incredit card tr...A Review of deep learning techniques in detection of anomaly incredit card tr...
A Review of deep learning techniques in detection of anomaly incredit card tr...IRJET Journal
 
Trading outlier detection machine learning approach
Trading outlier detection  machine learning approachTrading outlier detection  machine learning approach
Trading outlier detection machine learning approachEditorIJAERD
 

Similaire à Fraud detection ML (20)

Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
Big Data & Machine Learning - TDC2013 São Paulo - 12/0713
 
Big Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao PauloBig Data & Machine Learning - TDC2013 Sao Paulo
Big Data & Machine Learning - TDC2013 Sao Paulo
 
IRJET - Fraud Detection in Credit Card using Machine Learning Techniques
IRJET -  	  Fraud Detection in Credit Card using Machine Learning TechniquesIRJET -  	  Fraud Detection in Credit Card using Machine Learning Techniques
IRJET - Fraud Detection in Credit Card using Machine Learning Techniques
 
Unsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaUnsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena Sharova
 
Analysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOTAnalysis on different Data mining Techniques and algorithms used in IOT
Analysis on different Data mining Techniques and algorithms used in IOT
 
Eick/Alpaydin Introduction
Eick/Alpaydin IntroductionEick/Alpaydin Introduction
Eick/Alpaydin Introduction
 
2011 02-04 - d sallier - prévision probabiliste
2011 02-04 - d sallier - prévision probabiliste2011 02-04 - d sallier - prévision probabiliste
2011 02-04 - d sallier - prévision probabiliste
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data Science
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data Science
 
Ml topic1 a
Ml topic1 aMl topic1 a
Ml topic1 a
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection Analysis
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systems
 
IRJET - Cardless ATM
IRJET -  	  Cardless ATMIRJET -  	  Cardless ATM
IRJET - Cardless ATM
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & Opportunities
 
Why am I doing this???
Why am I doing this???Why am I doing this???
Why am I doing this???
 
Log Message Anomaly Detection with Oversampling
Log Message Anomaly Detection with Oversampling Log Message Anomaly Detection with Oversampling
Log Message Anomaly Detection with Oversampling
 
Cerdit card
Cerdit cardCerdit card
Cerdit card
 
A Review of deep learning techniques in detection of anomaly incredit card tr...
A Review of deep learning techniques in detection of anomaly incredit card tr...A Review of deep learning techniques in detection of anomaly incredit card tr...
A Review of deep learning techniques in detection of anomaly incredit card tr...
 
Trading outlier detection machine learning approach
Trading outlier detection  machine learning approachTrading outlier detection  machine learning approach
Trading outlier detection machine learning approach
 

Dernier

MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 

Dernier (20)

MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 

Fraud detection ML

  • 1. Anomaly and fraud detection with Machine Learning Ahmed Rebai Esprit Prépa & Esprit School of Business
  • 2. Plan ● Anomaly/Fraud detection with machine learning/Deep learning ● Outlier detection general applications ● Applications to financial sector (banking, insurance): Towards Fintech ● Supervised vs. Unsupervised ● Algorithms and methods ● Conclusions Ensemble method (Isolation forest) and Density method (DBSCAN/HDBSCAN)
  • 3. Anomaly/Fraud detection with machine learning/Deep learning ● Machine learning : data mining, predictive analytics, artificial intelligence (in practice : statistics and numerical methods for statistical analysis) ● Anomaly: deviation from “normal”/”expected” Anomaly=Outlier=Deviant or Unusual Data Point ● Anomaly detection : detection of outlier events or observations: Detecting deviations from the expected pattern of a data set. The real challenge in anomaly detection is to construct the right data model to separate outliers from noise and normal data.
  • 4. In one dimension: what anomalies look like?
  • 5. Real Example: Stock market : The May 6, 2010, Flash Crash at 2:45 pm
  • 6. Real Example from the Stock market : A real financial fraud! ● On April 21, 2015, nearly five years after the incident, the U.S. Department of Justice laid "22 criminal counts, including fraud and market manipulation" against Navinder Singh Sarao, a trader. Among the charges included was the use of spoofing algorithms; just prior to the Flash Crash, he placed thousands of E-mini S&P 500 stock index futures contracts which he planned on canceling later.
  • 7. Stock market : Flash Crash continues
  • 8. In two dimension: what anomalies look like? Results with DBSCAN algorithm from sklearn.cluster
  • 9. In two dimension: what anomalies look like? Results with 2D PCA method from sklear.decomposition also you can do kernel PCA
  • 10. January 28 1986 Challenger spatial mission: Dalal & al., 1989 Journal of the American Statistical Association Real Example: Rocket science
  • 11. In two dimension: what anomalies look like?
  • 12. In two dimension: what anomalies look like?
  • 13. Using Logarithmic, log-log plots to display outliers can help pyplot.hist can "log" y axis for you with keyword argument log=True Example : plt.hist(numpy.log10(data), log=True) Linear histogram Frequency vs log10 Price Log-log histogram log10 freq vs log10 price
  • 14. Using Logarithmic, log-log plots to display outliers can help Linear plot Log-log plot matplotlib.pyplot.loglog
  • 15. Outlier detection general applications ● Cyber security: Detect cyber-attacks on networks: more trusted connections ● Equipment failure (risk theory/théorie de la fiabilité) industrial sector ● Fraud detection ● Detecting cheaters in mobile gaming ● Preprocessing task for analysis or machine learning ● Reduce false declines then grow the revenue ● Detecting French regions where Le Pen’s scores at the 2017 presidential election deviate from predictions based on socio-economic variables (towardsdatascience)
  • 16. Applications to financial sector (banking, insurance): Towards Fintech ● Retail bank: Credit card fraud ● Private bank: Market abuse, Anti-money laundering ● Investment bank: Market abuse (Flash Crash), Anti-money laundering ● Insurance companies: Fraudulent operations ● Central banks: detecting tax havens (panama papers) Requires different approaches because Red flags are banking-type specific (specific business expertise)
  • 18. Supervised vs. Unsupervised ● In credit card fraud detection one knows the target variable. How? Customers tell us. Can use supervised approach because the true class is self-revealing. ● In market abuse or money laundering detection we don’t really know classes (how money is laundered changes all the time and by the same criminal organisation)
  • 19. Why supervised learning is difficult? ● Severe class imbalance : we estimate that 99.9% are trusted operations vs. 0.1% of fraudulent operations ● Problem during the train test split ● Solution you can use the stratification option in the train_test_split function of sklearn
  • 20. And now why unsupervised is difficult? ● Sever class overlap : money laundering is mixed with legal financial activity, especially in Investment banks ● Uncertainty around the data model ● The complexity of data ● The huge volume of data (time complexity) ● Next we will discuss the data models ● To avoid time complexity: use dask, numba (used to speed numpy), ray and the newly module modin (used to speed pandas) + demonstration
  • 21. Algorithms ● We can differentiate between three methods: - Distance based algorithms (similarity) K-NN for classification K-Means for clustering - Density-based (fitting a density) DBSCAN and HDBSCAN Local outlier factor (LOF) - Parametric Gaussian mixture models (GMM) Single class SVM Extreme value theory: Tukey outlier labeling
  • 22. Tree ensemble algorithm : Isolation forest (used in Credit-swiss) (F. T. Liu, et al., Isolation Forest, Data Mining, 2008. ICDM’08, Eighth IEEE International Conference) from sklearn.ensemble import IsolationForest Ensemble regressor uses the concept of isolation to explain/separate- away anomalies ● No point based distance calculation ● Instead I.F. builds an ensemble of random trees for a given data set and anomalies are points with the shortest average path length
  • 23.
  • 24. Unsupervised learning for anomaly detection (conclusion) ● Unsupervised learning is all about finding structure in data ● Techniques: Clustering (K-means, spectral clustering) ● Principle Components Analysis ● Support Vector machine ● Autoencoder Deep Neural Networks : DNN autoencoder anomaly detection (exotic) ● Filtering, Sequential Bayesian Filtering ● Gaussian mixture Model clustering via EM ● LightGBM (Exotic)