Banulescu-Radu (LEO) WiMLDS 13/04/2021 1 / 39
Data Science for Financial Fraud Detection
Denisa BANULESCU-RADU
University of Orléans, LEO
WiMLDS 13th of April 2021
Background
• Since 2015: Associate Professor – University of Orléans, LEO
• 2016: Young Researcher Award in Economics – Autorité des Marchés Financiers
• 2015: Thesis Prize – Fondation Banque de France
• 2014-2015: Max Weber Postdoctoral Fellow – European University Institute
• 2011-2014: PhD in Economics – Maastricht University and University of Orléans
Dissertation title: "Four essays in financial econometrics"
Main research interests
Outline
1 Econometrics vs Machine Learning
2 General aspects of fraud
3 Main challenges and solutions
4 Case studies
4.1 Case 1: Insurance fraud detection
4.2 Case 2: Social fraud detection
5 Conclusion
Econometrics vs Machine Learning
“there are a number of areas where there would be opportunities for fruitful collaboration between econometrics and machine learning”
Hal Varian (2014) - Professor of Economics (University of Michigan) & Chief Economist (Google)
General aspects of fraud
Fraud detection - Why is it important?
Definition of fraud
Definition
• Baesens et al. (2015)
Fraud is an uncommon, well-considered, imperceptibly
concealed, time-evolving, and often carefully organized crime
which appears in many types of forms.
Typologies of fraud
Main challenges and solutions
Main CHALLENGES and solutions
Main challenges and SOLUTIONS
1. Main tools used to fight fraud
2. Deal with imbalanced datasets
3. Evaluation of fraud detection models
4. Improving the interpretability of fraud detection models
“if the users do not trust a model or a prediction, they will not use it”
(Ribeiro et al., 2016)
• LIME (Local Interpretable Model-agnostic Explanations) method, Ribeiro et al. (2016)
• SHAP (SHapley Additive exPlanations) values, Lundberg and Lee (2017)
BUT ... to what extent do we need fraud detection models to be interpretable?
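The Shapley attribution that SHAP builds on can be illustrated on a toy model. The sketch below is neither the `shap` library nor code from the talk: it enumerates every feature coalition exactly, and it makes the simplifying assumption that features outside a coalition are replaced by a fixed baseline value. The names `toy_score`, `x` and `baseline` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features absent from a coalition take their baseline value
    (an assumption made for this illustration).
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(z):
    # Hypothetical fraud score: two linear terms plus one interaction.
    return 0.2 * z[0] + 0.5 * z[1] + 0.3 * z[0] * z[2]


x = [1.0, 1.0, 1.0]          # claim to explain (made-up values)
baseline = [0.0, 0.0, 0.0]   # reference claim (made-up values)
phi = shapley_values(toy_score, x, baseline)
```

The interaction term is split equally between the two features involved, and the attributions add up to the gap between the score of `x` and the score of `baseline`. Exhaustive enumeration is only feasible for a handful of features, which is why practical tools rely on approximations.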
Case studies
Case studies Case 1: Insurance fraud detection
General framework
• Fraudulent claims represented 10% of the total number of claims in 2019 (Insurance Europe)
• Negative record for France: €2.5 billion in 2014. Only €219 million recovered. (ALFA)
Methodology
DATA
• 45,954 house claims for the period 2013 to 2017
• French insurance company
• 0.76% of claims are fraudulent
Technical tools
• Logistic LASSO (Cox, 1958; Tibshirani, 1996)
• Random forest (Breiman, 2001)
• Extreme Gradient Boosting or XGBoost (Chen and Guestrin, 2016)
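As an illustration of the first tool, here is a minimal pure-Python sketch of an L1-penalised (LASSO) logistic regression. It swaps in plain proximal gradient descent with soft-thresholding for whatever solver the study actually used, and the data set, `lam`, `step` and `n_iter` are all made-up values chosen for the example.

```python
import math


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v towards zero by t.
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_logistic_lasso(X, y, lam=0.1, step=0.1, n_iter=2000):
    """Minimise mean log-loss + lam * sum(|beta_j|) by proximal gradient steps."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        g0 = sum(resid) / n
        g = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * g0  # the intercept is not penalised
        beta = [soft_threshold(bj - step * gj, step * lam)
                for bj, gj in zip(beta, g)]
    return b0, beta


# Toy data: the first feature drives the label, the second is close to noise.
X = [[-2.0, 1.0], [-2.0, -1.0], [-1.0, 1.0], [-1.0, -1.0],
     [1.0, 1.0], [1.0, -1.0], [2.0, 1.0], [2.0, 0.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
b0, beta = fit_logistic_lasso(X, y, lam=0.1)
```

On this toy data the penalty keeps the coefficient of the informative feature finite even though the classes are separable, and it sets the coefficient of the second feature exactly to zero, which is the variable-selection behaviour that motivates the LASSO.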
Resampling techniques to deal with imbalanced data
• Random Oversampling
• Synthetic Minority Oversampling TEchnique or SMOTE (Chawla et al., 2002)
• ADAptive SYNthetic sampling or ADASYN (He et al., 2008)
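The simplest of these techniques, random oversampling, can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the study's code: fraud is coded as label 1 and assumed to be the minority class, and the toy data are invented. For SMOTE and ADASYN, the imbalanced-learn package provides implementations that would normally be used instead of hand-written code.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes are the same size.

    Assumes binary labels with 1 = fraud (minority) and 0 = non-fraud (majority).
    """
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(labels) if lab == 1]
    majority = [i for i, lab in enumerate(labels) if lab == 0]
    if not minority or len(minority) >= len(majority):
        return list(rows), list(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = majority + minority + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Toy training set: 2 fraudulent claims out of 10 (not the study's data).
X_train = [[float(i), float(i % 3)] for i in range(10)]
y_train = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
X_bal, y_bal = random_oversample(X_train, y_train, seed=42)
```

Resampling is applied to the training data only; the evaluation data keep their original class proportions so that the performance metrics reflect the real fraud rate.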
Performance metrics
• AUC-ROC, AUC-PR, Brier score, Log-Loss, F-measure
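To make the metrics concrete, the sketch below computes each of them in plain Python on a small invented example. The labels and predicted probabilities are hypothetical, the AUC-PR is computed as step-wise average precision (one common convention, which assumes no tied scores), and the F-measure uses an assumed decision threshold of 0.5.

```python
import math


def auc_roc(y_true, scores):
    """Share of (fraud, non-fraud) pairs ranked correctly; ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / sum(y_true)


def brier_score(y_true, probs):
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def log_loss(y_true, probs, eps=1e-15):
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    return -sum(t * math.log(p) + (1 - t) * math.log(1.0 - p)
                for t, p in zip(y_true, clipped)) / len(y_true)


def f_measure(y_true, probs, threshold=0.5):
    pred = [int(p >= threshold) for p in probs]
    tp = sum(1 for t, h in zip(y_true, pred) if t == 1 and h == 1)
    fp = sum(1 for t, h in zip(y_true, pred) if t == 0 and h == 1)
    fn = sum(1 for t, h in zip(y_true, pred) if t == 1 and h == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Invented test set: 2 frauds among 10 claims, with predicted fraud probabilities.
y_test = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
p_test = [0.05, 0.10, 0.20, 0.15, 0.30, 0.02, 0.60, 0.70, 0.08, 0.40]

results = {
    "AUC-ROC": auc_roc(y_test, p_test),
    "AUC-PR": average_precision(y_test, p_test),
    "Brier": brier_score(y_test, p_test),
    "Log-Loss": log_loss(y_test, p_test),
    "F-measure": f_measure(y_test, p_test),
}
```

AUC-ROC and AUC-PR depend only on how the claims are ranked, the Brier score and Log-Loss also penalise poorly calibrated probabilities, and the F-measure depends on the chosen threshold.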
• Interpretation of results: SHAP value method (global/individual level)
Figure 1: Fraudulent case
Figure 2: Non Fraudulent case
Case studies Case 2: Social fraud detection
General framework
• Controlling the risks of social and fiscal fraud and combating illegal work are also important problems for social justice and economic efficiency
• French mutual organization that
  • collects data systematically from its beneficiaries
  • organizes regular controls on a subsample of its taxpayers
  • manages a fraud detection system to identify those who do not pay their contributions
Objective: Estimate the tax shortfall.
Definition
The tax shortfall is defined as the potential sum of the tax adjustments that could have been imposed on companies that defrauded or made erroneous social declarations, had they actually been audited, when in reality they were not.
Remarks
• the two decisions (control and fraud) are neither sequential nor conditional
• the decisions are linked
Methodology: Estimation by Maximum Likelihood
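Control decision
\[
C_i =
\begin{cases}
1 & \text{if } C_i^{*} = X_{c,i}\beta_c + \varepsilon_{c,i} > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad \forall i = 1, \dots, n \tag{1}
\]
Fraud decision
\[
\tilde{D}_i =
\begin{cases}
1 & \text{if } D_i^{*} = X_{d,i}\beta_d + \varepsilon_{d,i} > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad \forall i = 1, \dots, n \tag{2}
\]
Potential tax shortfall
\[
M_i^{*} =
\begin{cases}
X_{m,i}\beta_m + \varepsilon_{m,i} & \text{if } \tilde{D}_i = 1 \\
0 & \text{otherwise}
\end{cases}
\qquad \forall i = 1, \dots, n \tag{3}
\]
\[
\begin{pmatrix}
\varepsilon_{c,i} \\
\varepsilon_{d,i} \\
\varepsilon_{m,i}
\end{pmatrix}
\sim N(0, \Sigma)
\quad \text{with} \quad
\Sigma = DRD \tag{4}
\]
\[
D =
\begin{pmatrix}
\sigma_c & 0 & 0 \\
0 & \sigma_d & 0 \\
0 & 0 & \sigma_m
\end{pmatrix},
\qquad
R =
\begin{pmatrix}
1 & \rho_{cd} & \rho_{cm} \\
\rho_{cd} & 1 & \rho_{dm} \\
\rho_{cm} & \rho_{dm} & 1
\end{pmatrix} \tag{5}
\]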
Conclusion
Thank you for your attention!

Fraud detection by Denisa Banulescu-Radu

  • 1. Banulescu-Radu (LEO) WiMLDS 13/04/2021 1 / 39
  • 2. Data Science for Financial Fraud Detection Denisa BANULESCU-RADU University of Orléans, LEO WiMLDS 13th of April 2021 Banulescu-Radu (LEO) WiMLDS 13/04/2021 2 / 39
  • 3. Background • Since 2015: Associate Professor – University of Orléans, LEO • 2016: Young Researcher Award in Economics – Autorité des Marchés Financiers • 2015: Thesis Prize – Fondation Banque de France • 2014-2015: Max Weber Postdoctoral Fellow – European University Institute • 2011-2014: PhD in Economics – Maastricht University and University of Orléans Title dissertation: "Four essays in financial econometrics" Banulescu-Radu (LEO) WiMLDS 13/04/2021 3 / 39
  • 4. Main research interests Banulescu-Radu (LEO) WiMLDS 13/04/2021 4 / 39
  • 5. Outline 1 Econometrics vs Machine Learning 2 General aspects of fraud 3 Main challenges and solutions 4 Case studies 4.1 Case 1: Insurance fraud detection 4.2 Case 2: Social fraud detection 5 Conclusion Banulescu-Radu (LEO) WiMLDS 13/04/2021 5 / 39
  • 6. Econometrics vs Machine Learning Outline 1 Econometrics vs Machine Learning 2 General aspects of fraud 3 Main challenges and solutions 4 Case studies 4.1 Case 1: Insurance fraud detection 4.2 Case 2: Social fraud detection 5 Conclusion Banulescu-Radu (LEO) WiMLDS 13/04/2021 6 / 39
  • 7. Econometrics vs Machine Learning Econometrics vs Machine Learning Banulescu-Radu (LEO) WiMLDS 13/04/2021 7 / 39
  • 8. Econometrics vs Machine Learning Econometrics vs Machine Learning Banulescu-Radu (LEO) WiMLDS 13/04/2021 8 / 39
  • 9. Econometrics vs Machine Learning “there are a number of areas where there would be opportunities for fruitful collaboration between econometrics and machine learning ” Hal Varian (2014) - Professor of Economics (University of Michigan) & Chief Economist (Google) Banulescu-Radu (LEO) WiMLDS 13/04/2021 9 / 39
  • 10. General aspects of fraud Outline 1 Econometrics vs Machine Learning 2 General aspects of fraud 3 Main challenges and solutions 4 Case studies 4.1 Case 1: Insurance fraud detection 4.2 Case 2: Social fraud detection 5 Conclusion Banulescu-Radu (LEO) WiMLDS 13/04/2021 10 / 39
  • 11. General aspects of fraud Fraud detection - Why is it important? Banulescu-Radu (LEO) WiMLDS 13/04/2021 11 / 39
  • 12. General aspects of fraud. Definition of fraud (Baesens et al., 2015): "Fraud is an uncommon, well-considered, imperceptibly concealed, time-evolving, and often carefully organized crime which appears in many types of forms."
  • 13. General aspects of fraud. Typologies of fraud
  • 14. Main challenges and solutions. Outline (same as slide 5)
  • 15. Main CHALLENGES and solutions
  • 16. Main CHALLENGES and solutions
  • 17. Main CHALLENGES and solutions
  • 18. Main CHALLENGES and solutions
  • 19. Main challenges and SOLUTIONS. 1. Main tools used to fight fraud
  • 20. Main challenges and SOLUTIONS. 2. Deal with imbalanced datasets
  • 21. Main challenges and SOLUTIONS. 2. Deal with imbalanced datasets
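      To make the idea of rebalancing concrete, here is a minimal sketch of two resampling ideas, random oversampling and a SMOTE-style interpolation. It is an illustration under stated assumptions, not the procedure used in the talk: the function names `random_oversample` and `smote_like`, the toy data, and the convention that label 1 marks the (rarer) fraud class are all hypothetical. In practice one would typically resample only the training split.

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate minority rows (label 1) at random until both classes have equal size."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]

def smote_like(X, y, k=3, seed=0):
    """Add synthetic minority rows by interpolating between a minority row
    and one of its k nearest minority neighbours (needs >= 2 minority rows)."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    n_new = sum(1 for label in y if label == 0) - len(minority)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return X + synthetic, y + [1] * len(synthetic)

# Toy data (hypothetical): six legitimate claims, two fraudulent ones.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.1, 0.0], [0.0, 0.3], [2.0, 2.0], [2.2, 1.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_ros, y_ros = random_oversample(X, y)
X_sm, y_sm = smote_like(X, y, k=1)
```

      Random oversampling only repeats existing fraud cases, whereas the interpolation step creates new points on the segment between two fraud cases.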
  • 22. Main challenges and SOLUTIONS
  • 23. Main challenges and SOLUTIONS. 3. Evaluation of fraud detection models
  • 24. Main challenges and SOLUTIONS. 4. Improving the interpretability of fraud detection models
      “if the users do not trust a model or a prediction, they will not use it” (Ribeiro et al., 2016)
      • LIME method (Ribeiro et al., 2016)
      • SHAP (SHapley Additive exPlanations) values (Lundberg and Lee, 2017)
      BUT ... to what extent do we need fraud detection models to be interpretable?
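      The Shapley value underlying SHAP can be illustrated on a toy model. The sketch below computes exact Shapley values by enumerating feature subsets; it does not use the `shap` library (which relies on faster approximations), and the scoring function, the claim, and the all-zero baseline are hypothetical choices made only for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, n_features):
    """Exact Shapley values for a set function value(set_of_feature_indices)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def score(x):
    """Hypothetical fraud score with one interaction term (not from the talk)."""
    return 0.1 + 0.3 * x[0] + 0.2 * x[1] + 0.4 * x[0] * x[2]

claim = [1.0, 1.0, 1.0]      # the case being explained (assumed values)
baseline = [0.0, 0.0, 0.0]   # reference input for "absent" features (assumed)

def value(subset):
    """Score when features in `subset` take the claim's values, others the baseline's."""
    x = [claim[j] if j in subset else baseline[j] for j in range(len(claim))]
    return score(x)

phi = shapley_values(value, n_features=3)
```

      The contributions add up to the gap between the score of the explained case and the baseline score, and the interaction term is split equally between the two features involved.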
  • 25. Case studies. Outline (same as slide 5)
  • 26. Case studies. Case 1: Insurance fraud detection. Outline (same as slide 5)
  • 27. Case studies. Case 1: Insurance fraud detection. General framework
      • Fraudulent claims represented 10% of the total number of claims in 2019 (Insurance Europe)
      • Negative record for France: €2.5 billion in 2014, of which only €219 million was recovered (ALFA)
  • 28. Case studies. Case 1: Insurance fraud detection. Methodology
      Data
      • 45,954 house claims for the period 2013 to 2017
      • French insurance company
      • 0.76% of claims are fraudulent
      Technical tools
      • Logistic LASSO (Cox, 1958; Tibshirani, 1996)
      • Random forest (Breiman, 2001)
      • Extreme Gradient Boosting or XGBoost (Chen and Guestrin, 2016)
      Resampling techniques to deal with imbalanced data
      • Random Oversampling
      • Synthetic Minority Oversampling TEchnique or SMOTE (Chawla et al., 2002)
      • ADAptive SYNthetic sampling or ADASYN (He et al., 2008)
      Performance metrics
      • AUC-ROC, AUC-PR, Brier score, Log-Loss, F-measure
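      The performance metrics listed above can be written out in a few lines. The following is a sketch with hypothetical labels and scores, not the talk's data or results; average precision is used here as one common estimate of AUC-PR, and the 0.5 threshold for the F-measure is an assumption.

```python
import math

def auc_roc(y, p):
    """Probability that a random fraud case outscores a random legitimate one (ties count 1/2)."""
    pos = [s for s, t in zip(p, y) if t == 1]
    neg = [s for s, t in zip(p, y) if t == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y, p):
    """Step-wise area under the precision-recall curve."""
    ranked = sorted(zip(p, y), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t == 1:
            hits += 1
            total += hits / rank  # precision at each recalled fraud case
    return total / sum(y)

def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - t) ** 2 for s, t in zip(p, y)) / len(y)

def log_loss(y, p, eps=1e-15):
    """Negative mean log-likelihood, with probabilities clipped away from 0 and 1."""
    clipped = [min(max(s, eps), 1 - eps) for s in p]
    return -sum(t * math.log(s) + (1 - t) * math.log(1 - s)
                for s, t in zip(clipped, y)) / len(y)

def f1_score(y, p, threshold=0.5):
    """F-measure (F1) after thresholding the scores."""
    pred = [int(s >= threshold) for s in p]
    tp = sum(1 for a, t in zip(pred, y) if a == 1 and t == 1)
    fp = sum(1 for a, t in zip(pred, y) if a == 1 and t == 0)
    fn = sum(1 for a, t in zip(pred, y) if a == 0 and t == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Hypothetical labels (1 = fraud) and predicted fraud probabilities.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
p_hat = [0.05, 0.10, 0.10, 0.20, 0.02, 0.30, 0.60, 0.70, 0.15, 0.40]

metrics = {
    "auc_roc": auc_roc(y_true, p_hat),
    "auc_pr": average_precision(y_true, p_hat),
    "brier": brier_score(y_true, p_hat),
    "log_loss": log_loss(y_true, p_hat),
    "f1": f1_score(y_true, p_hat),
}
```

      On this toy sample a model that never flags fraud would reach 80% accuracy, which is why ranking-based and probability-based metrics are reported instead.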
  • 29. Case studies. Case 1: Insurance fraud detection. Methodology
  • 30. Case studies. Case 1: Insurance fraud detection. Interpretation of results: SHAP value method (global/individual level). Figure 1: Fraudulent case. Figure 2: Non-fraudulent case.
  • 31. Case studies. Case 2: Social fraud detection. Outline (same as slide 5)
  • 32. Case studies. Case 2: Social fraud detection. General framework
      • Controlling the risks of social and fiscal fraud and combating illegal work are also important problems for social justice and economic efficiency
      • French mutual organization that:
          • collects data systematically from its beneficiaries
          • organizes regular controls on a subsample of its taxpayers
          • manages a fraud detection system to identify those who do not pay their contributions
  • 33. Case studies. Case 2: Social fraud detection. General framework
      Objective: Estimate the tax shortfall.
      Definition: The tax shortfall is the potential sum of the tax adjustments that could have been imposed on companies that committed fraud or made erroneous social declarations, had they actually been audited, when in reality they were not.
  • 34. Case studies. Case 2: Social fraud detection. Remarks
      • the two decisions are neither sequential nor conditional
      • the decisions are linked
  • 35. Case studies. Case 2: Social fraud detection
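  • 36. Case studies. Case 2: Social fraud detection. Methodology: Estimation by Maximum Likelihood
      Control decision:
      \[ C_i = \begin{cases} 1 & \text{if } C_i^* = X_{c,i}\beta_c + \varepsilon_{c,i} > 0 \\ 0 & \text{otherwise} \end{cases} \qquad \forall i = 1, \dots, n \tag{1} \]
      Fraud decision:
      \[ \tilde{D}_i = \begin{cases} 1 & \text{if } D_i^* = X_{d,i}\beta_d + \varepsilon_{d,i} > 0 \\ 0 & \text{otherwise} \end{cases} \qquad \forall i = 1, \dots, n \tag{2} \]
      Potential tax shortfall:
      \[ M_i^* = \begin{cases} X_{m,i}\beta_m + \varepsilon_{m,i} & \text{if } \tilde{D}_i = 1 \\ 0 & \text{otherwise} \end{cases} \qquad \forall i = 1, \dots, n \tag{3} \]
      Joint distribution of the errors:
      \[ \begin{pmatrix} \varepsilon_{c,i} \\ \varepsilon_{d,i} \\ \varepsilon_{m,i} \end{pmatrix} \sim N(0, \Sigma) \quad \text{with } \Sigma = DRD \tag{4} \]
      \[ D = \begin{pmatrix} \sigma_c & 0 & 0 \\ 0 & \sigma_d & 0 \\ 0 & 0 & \sigma_m \end{pmatrix}, \qquad R = \begin{pmatrix} 1 & \rho_{cd} & \rho_{cm} \\ \rho_{cd} & 1 & \rho_{dm} \\ \rho_{cm} & \rho_{dm} & 1 \end{pmatrix} \tag{5} \]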
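      To see how the three equations interact, the sketch below simulates data from a model of this form: correlated normal errors are drawn with covariance Σ = DRD, then the control decision, the fraud decision, and the potential adjustment are generated. It is only a simulation of the data-generating process, not the maximum likelihood estimator from the talk. The single shared covariate, all coefficient values, and the function names `cholesky` and `simulate` are assumptions made for illustration.

```python
import math
import random

def cholesky(a):
    """Lower-triangular factor L of a symmetric positive definite matrix (a = L L')."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate(n, beta_c, beta_d, beta_m, sigmas, rho_cd, rho_cm, rho_dm, seed=0):
    """Draw n firms from the three-equation model; returns (Sigma, rows of (C, D, M*))."""
    rng = random.Random(seed)
    R = [[1.0, rho_cd, rho_cm], [rho_cd, 1.0, rho_dm], [rho_cm, rho_dm, 1.0]]
    # Sigma = D R D with D = diag(sigma_c, sigma_d, sigma_m).
    sigma = [[sigmas[i] * R[i][j] * sigmas[j] for j in range(3)] for i in range(3)]
    low = cholesky(sigma)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)  # one covariate shared by all equations (assumption)
        z = [rng.gauss(0, 1) for _ in range(3)]
        eps = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(3)]
        c = int(beta_c[0] + beta_c[1] * x + eps[0] > 0)                   # control decision
        d = int(beta_d[0] + beta_d[1] * x + eps[1] > 0)                   # fraud decision
        m = beta_m[0] + beta_m[1] * x + eps[2] if d == 1 else 0.0         # potential adjustment
        rows.append((c, d, m))
    return sigma, rows

# Arbitrary illustrative parameter values (not estimates from the talk).
sigma, firms = simulate(
    n=1000, beta_c=(-1.0, 0.5), beta_d=(-0.5, 0.8), beta_m=(3.0, 1.0),
    sigmas=(1.0, 1.0, 2.0), rho_cd=0.5, rho_cm=0.3, rho_dm=0.4,
)

# In the simulation the shortfall among non-audited firms is directly observable.
shortfall = sum(m for c, d, m in firms if c == 0)
```

      In real data the adjustment is unobserved for firms with C = 0, which is why the model has to be estimated jointly rather than read off as in this simulation.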
  • 37. Conclusion. Outline (same as slide 5)
  • 38. Conclusion. Thank you for your attention!