Build Intelligent Fraud Prevention with Machine Learning and Graphs

See how financial services, banking and retail are using graph-enhanced machine learning to thwart fraud. Fraudsters are becoming increasingly sophisticated, organized and adaptive; traditional, rule-based solutions are not broad or nimble enough to deal with this reality. This session will cover several demonstrations and real-world technical examples including preventing credit card fraud, identifying money laundering and reducing false positives.

Published in: Technology

  1. 1. Fraud Analysis Using Graph and Machine Learning February 13th, 2018
  2. 2. Who We Are GRAHAM GANSSLE - Ph.D., P.G. Data Science Lead, Expero Graham.Ganssle@experoinc.com Deep learning expert Financial analytics specialist NAV MATHUR Sr. Director - Global Solutions, Neo4j nav@neo4j.com @nav_mathur
  3. 3. Agenda • Who are Today’s Fraudsters • How to Fight Fraud Rings with Graphs • Different Types of Credit Card Fraud & Neo4j Demo • How Neo4j Fits in a Typical Architecture • Summary • Q&A
  4. 4. Who Are Today’s Fraudsters?
  5. 5. Who Are Today’s Fraudsters? • Organized in groups • Synthetic Identities • Stolen Identities • Hijacked Devices
  6. 6. Types of Fraud • Credit Card Fraud • Rogue Merchants • Fraud Rings • Insurance Fraud • eCommerce Fraud • Fraud we don’t know about yet…
  7. 7. Fraud Detection (From a data-modeling perspective)
  8. 8. Raw Data
  9. 9. Anomalies
  10. 10. Fraud Prevention is About Reacting to Patterns (And doing it fast!) 1) Detect 2) Respond
  11. 11. Choosing Underlying Technology: Relational Database
  12. 12. Graph Database: Data Modelled as a Graph!
  13. 13. Neo4j Graph Platform
  14. 14. Graph Transactions Graph Analytics Data Integration Development & Admin Analytics Tooling Drivers & APIs Discovery & Visualization Developers Admins Applications Business Users Data Analysts Data Scientists
  15. 15. Neo4j Native Graph Database Analytics Integrations Cypher Query Language Wide Range of APOC Procedures Optimized Graph Algorithms
  16. 16. Finds the optimal path or evaluates route availability and quality Evaluates how a group is clustered or partitioned Determines the importance of distinct nodes in the network
  17. 17. How Neo4j Differentiates from other Databases: Visualization, Queries, Processing, Storage • Non-Native Graph DB • Native Graph DB • RDBMS • Optimized for graph workloads
  18. 18. Key Neo4j Architecture Components • Index-Free Adjacency: In memory and on flash/disk • ACID Foundation: Required for safe writes • Full-Stack Clustering: Causal consistency • Language, Drivers, Tooling: Developer Experience, Graph Efficiency, Type Safety • Graph Engine: Cost-Based Optimizer, Graph Statistics, Cypher Runtime • Hardware Optimizations: For next-gen infrastructure
  19. 19. Examples of Prevalent Fraud Types
  20. 20. Fraud Rings
  21. 21. “Don’t consider traditional technology adequate to keep up with criminal trends” Market Guide for Online Fraud Detection, April 27, 2015
  22. 22. Traditional Fraud Detection Methods 1. Endpoint-Centric: Analysis of users and their end-points 2. Navigation-Centric: Analysis of navigation behavior and suspect patterns 3. Account-Centric: Analysis of anomaly behavior by channel • PCs • Mobile Phones • IP addresses • User IDs • Comparing Transaction • Identity Vetting
  23. 23. Fraud Detection with Discrete Analysis [Chart: Revolving Debt vs. Number of Accounts; regions labeled Normal behavior and INVESTIGATE] Weaknesses: Unable to detect • Fraud rings • Fake IP addresses • Hijacked devices • Synthetic Identities • Stolen Identities • And more…
  24. 24. Fraud Detection with Connected Analysis [Chart: Revolving Debt vs. Number of Accounts; regions labeled Normal behavior and Fraudulent pattern]
  25. 25. Augmented Fraud Detection. DISCRETE ANALYSIS: 1. Endpoint-Centric: Analysis of users and their end-points 2. Navigation-Centric: Analysis of navigation behavior and suspect patterns 3. Account-Centric: Analysis of anomaly behavior by channel. CONNECTED ANALYSIS: 4. Cross Channel: Analysis of anomaly behavior correlated across channels 5. Entity Linking: Analysis of relationships to detect organized crime and collusion
  26. 26. Modeling a fraud ring as a graph: ACCOUNT HOLDER 1, ACCOUNT HOLDER 2, ACCOUNT HOLDER 3
  27. 27. Modeling a fraud ring as a graph: ACCOUNT HOLDER 1, ACCOUNT HOLDER 2, ACCOUNT HOLDER 3, CREDIT CARD, BANK ACCOUNT, BANK ACCOUNT, BANK ACCOUNT, PHONE NUMBER, UNSECURED LOAN, SSN 2, UNSECURED LOAN
  28. 28. Modeling a fraud ring as a graph: ACCOUNT HOLDER 1, ACCOUNT HOLDER 2, ACCOUNT HOLDER 3, CREDIT CARD, BANK ACCOUNT, BANK ACCOUNT, BANK ACCOUNT, ADDRESS, PHONE NUMBER, PHONE NUMBER, SSN 2, UNSECURED LOAN, SSN 2, UNSECURED LOAN
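The fraud-ring model in the slides above links account holders through shared identifiers such as phone numbers, addresses, and SSNs. A minimal sketch of that idea in plain Python follows. It is not the presenters' implementation and does not use Neo4j: the sample holders, the identifier strings, and the `find_rings` helper are all hypothetical, and a union-find structure stands in for a graph database query.

```python
from collections import defaultdict

# Hypothetical sample data: each account holder and the identifiers they use.
HOLDERS = {
    "holder_1": {"phone:555-0100", "address:12 Oak St", "ssn:A"},
    "holder_2": {"phone:555-0100", "ssn:B"},
    "holder_3": {"address:12 Oak St", "ssn:B"},
    "holder_4": {"phone:555-0199", "ssn:C"},
}


def find_rings(holders, min_size=2):
    """Group account holders that are linked through shared identifiers."""
    parent = {holder: holder for holder in holders}

    def find(x):
        # Walk up to the root of x's group, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index holders by identifier so shared identifiers become links.
    by_identifier = defaultdict(list)
    for holder, identifiers in holders.items():
        for identifier in identifiers:
            by_identifier[identifier].append(holder)

    for linked in by_identifier.values():
        for other in linked[1:]:
            union(linked[0], other)

    groups = defaultdict(set)
    for holder in holders:
        groups[find(holder)].add(holder)
    return sorted(sorted(group) for group in groups.values() if len(group) >= min_size)


rings = find_rings(HOLDERS)
print(rings)
```

Holders that share no identifier with anyone else fall below `min_size` and are left out, so only connected groups are reported as candidate rings.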
  29. 29. Ring-Based Fraud Classification ● You can’t use standard deep learning techniques to learn about rings ● Let’s leverage the power of graph to do this ● The above case does consider spatial relationships of entities to one another, but the following case does so for multiple entities simultaneously
  30. 30. Graph Topology Metrics. Is this group of businesses actually a money laundering ring? Measures for detecting ring-based fraud: ● Connectedness ● Degree ● Betweenness ● Node count ● Edge count ● Eigenvalues ● Centrality ● Clique size ● Diameter ● Triangles ● PageRank ● Closeness ● Community value ● Average clustering coefficient ● Minimum edge dominating set size ● Maximum edge independent set size ● Degree associativity coefficient ● Degree assortativity coefficient ● Betweenness centrality sum ● Closeness centrality sum ● Eigenvector centrality
  31. 31. Deep Neural Network Analysis - Network Topology Fraud Analysis: financial ring metrics (Connectedness, Degree, Betweenness, Node count, Edge count, Eigenvalues, Centrality, Clique size, Diameter, Triangles, PageRank, Closeness, Community value, Average clustering coefficient, Minimum edge dominating set size) are fed to a deep neural network, which outputs a laundering / not laundering confidence
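A few of the topology metrics listed above (node count, edge count, degree, triangles, average clustering coefficient) can be computed directly from an edge list. The sketch below is an illustration in plain Python only: the `topology_features` helper and the sample edges are hypothetical, and the resulting dictionary merely stands in for the feature vector that the slides feed to a neural network.

```python
from itertools import combinations

# Hypothetical ring of four businesses with one extra link between A and C.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]


def topology_features(edges):
    """Compute a small set of topology metrics for an undirected graph."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    node_count = len(adjacency)
    edge_count = sum(len(neighbors) for neighbors in adjacency.values()) // 2

    # Triangles through each node: pairs of its neighbors that are also linked.
    triangles_at = {
        node: sum(1 for a, b in combinations(sorted(neighbors), 2) if b in adjacency[a])
        for node, neighbors in adjacency.items()
    }

    def local_clustering(node):
        degree = len(adjacency[node])
        if degree < 2:
            return 0.0
        return 2.0 * triangles_at[node] / (degree * (degree - 1))

    avg_clustering = (
        sum(local_clustering(node) for node in adjacency) / node_count
        if node_count
        else 0.0
    )
    return {
        "node_count": node_count,
        "edge_count": edge_count,
        "max_degree": max((len(n) for n in adjacency.values()), default=0),
        # Each triangle is seen once from each of its three corners.
        "triangles": sum(triangles_at.values()) // 3,
        "avg_clustering": avg_clustering,
    }


features = topology_features(EDGES)
print(features)
```

In practice a graph library or database procedure would compute these metrics at scale; the point here is only that each candidate ring collapses into a fixed-length numeric vector that a classifier can consume.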
  32. 32. Credit Card Fraud Example #1 “Credit Card Testing”
  33. 33. Retrieval of Credit Card Information: Manual skimming of an ATM, Sophisticated Data Breaches, Rogue Merchant
  34. 34. [Diagram: graph of Card Issuer, Card Holder, Fraudster, Terminal (ATM-skimming), Data Breach, and Testing Merchants, linked by ISSUES, USE, MAKES, and AT relationships; transactions (Tx) of $5, $10, $2, and $4000]
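The card-testing pattern in Example #1 shows a few small transactions followed by a large one on the same card. A minimal rule-based sketch of that pattern follows. The thresholds (`small_max`, `large_min`, `min_probes`), the card and merchant names, and the `flag_card_testing` helper are all assumptions for illustration, not values or code from the presentation.

```python
# Hypothetical, time-ordered transactions: (card, merchant, amount).
TRANSACTIONS = [
    ("card_1", "testing_merchant_a", 5.0),
    ("card_2", "coffee_shop", 8.0),
    ("card_1", "testing_merchant_b", 10.0),
    ("card_2", "grocery", 45.0),
    ("card_1", "testing_merchant_c", 2.0),
    ("card_2", "airline", 1200.0),
    ("card_1", "electronics_store", 4000.0),
]


def flag_card_testing(transactions, small_max=10.0, large_min=1000.0, min_probes=3):
    """Flag large purchases that directly follow a run of small 'probe' charges.

    All thresholds are illustrative assumptions, not tuned values.
    """
    probes = {}  # card -> number of consecutive small transactions seen so far
    flagged = []
    for card, merchant, amount in transactions:
        if amount <= small_max:
            probes[card] = probes.get(card, 0) + 1
            continue
        if amount >= large_min and probes.get(card, 0) >= min_probes:
            flagged.append((card, merchant, amount))
        probes[card] = 0  # any non-small transaction ends the run
    return flagged


flagged = flag_card_testing(TRANSACTIONS)
print(flagged)
```

A fixed rule like this is the kind of discrete check the deck argues is too narrow on its own; in the graph model the same pattern would be expressed as a path from a card through several testing merchants to a large transaction.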
  35. 35. 35 Example #2 “Fraud Origination and Assessing Loss Magnitude”
  36. 36. [Diagram: transactions (Tx) for John]
  37. 37. [Diagram: transactions (Tx) for John; labels: Computer Store; amounts: $2000]
  38. 38. [Diagram: transactions (Tx) for John; labels: Computer Store, Gas Station; amounts: $2000, $25, $10, $4]
  39. 39. [Diagram: transactions (Tx) for John and Sheila; labels: Computer Store, Gas Station, Jewelry Store; amounts: $2000, $25, $10, $4, $2, $3, $3000]
  40. 40. [Diagram: transactions (Tx) for John, Sheila, and Robert; labels: Computer Store, Gas Station, Jewelry Store; amounts: $2000, $25, $10, $4, $2, $3, $3000]
  41. 41. [Diagram: transactions (Tx) for John, Sheila, Robert, and Karen; labels: Computer Store, Gas Station, Jewelry Store, Furniture Store; amounts: $2000, $25, $10, $4, $2, $3, $3000, $8, $12, $1500]
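Example #2 builds up transaction histories for several cardholders until a shared merchant stands out as the likely point of origin. The sketch below illustrates that reasoning in plain Python. The cardholder names John, Sheila, Robert, and Karen come from the slides; the merchant lists, the extra cardholders Maria and Omar, and both helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical merchant histories for cardholders who reported fraud.
VICTIM_HISTORIES = {
    "John": ["grocery", "gas_station", "computer_store"],
    "Sheila": ["gas_station", "jewelry_store", "pharmacy"],
    "Robert": ["cafe", "gas_station"],
    "Karen": ["gas_station", "furniture_store", "grocery"],
}

# Hypothetical wider population, including cardholders with no reported fraud.
ALL_HISTORIES = dict(
    VICTIM_HISTORIES,
    Maria=["gas_station", "bookstore"],
    Omar=["grocery", "cafe"],
)


def rank_common_merchants(histories):
    """Rank merchants by how many distinct cardholders visited them."""
    counts = Counter()
    for merchants in histories.values():
        counts.update(set(merchants))  # count each merchant once per cardholder
    # Sort by descending count, then by name, so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def exposed_cardholders(all_histories, merchant, known_victims):
    """List cardholders who used the suspect merchant but have not reported fraud."""
    return sorted(
        holder
        for holder, merchants in all_histories.items()
        if merchant in merchants and holder not in known_victims
    )


ranked = rank_common_merchants(VICTIM_HISTORIES)
suspect = ranked[0][0]
exposed = exposed_cardholders(ALL_HISTORIES, suspect, VICTIM_HISTORIES)
print(ranked[0], exposed)
```

The second helper is one simple way to approach the loss-magnitude half of the example: once a suspect merchant is identified, every other card that passed through it is potentially compromised.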
  42. 42. Credit Card Fraud Classification ● $118 billion lost each year on fraud false positives ● This is the reactive case. There’s a predictive case, too. (we’ll get there in a few slides) ● Scott talked in webinar #1 about fraud analysis for individual entities and organizational entities. Here’s how we actually do that stuff
  43. 43. Graph Embeddings: a high-dimensional CC information graph is squished into 2-dimensional collapsed CC information
  44. 44. Deep Neural Network Analysis - Embeddings Networks: embedded CC info is fed to a deep convolutional neural network, which outputs a fraud / not fraud confidence; embedded CC info is fed to a deep neural network, which outputs a fraud / not fraud confidence
  45. 45. Individual and Organizational Fraud Prediction ● We can combine the above analysis to predict both individual and organizational acts of fraud using graph convolutional networks ● This is a much more sophisticated architecture which (when applied to the right types of problems) can dramatically increase accuracy
  46. 46. Graph Convolutional Network Diagram: graph convolution, graph convolution, ReLU, dropout, softmax, class
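The diagram names the building blocks of a graph convolutional network: graph convolution layers, ReLU, dropout, and a softmax over classes. A minimal forward-pass sketch in plain Python follows. It assumes the common symmetric normalization of the adjacency matrix with added self-loops, which the slides do not specify; the layer order, the tiny graph, the features, and the weights are all made up for illustration, and dropout is omitted because it only applies during training.

```python
import math

# Hypothetical 3-node graph (a path 0 - 1 - 2), node features, and weights.
ADJ = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
FEATURES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W1 = [[0.5, -0.2], [0.1, 0.3]]
W2 = [[0.4, -0.6], [-0.3, 0.8]]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in with_loops]
    return [
        [with_loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def relu(matrix):
    return [[max(0.0, value) for value in row] for row in matrix]


def softmax_rows(matrix):
    result = []
    for row in matrix:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def gcn_forward(adj, features, w1, w2):
    """Two graph convolutions with a ReLU in between, then a per-node softmax."""
    a_norm = normalized_adjacency(adj)
    hidden = relu(matmul(matmul(a_norm, features), w1))
    # Dropout would be applied to `hidden` during training; skipped at inference.
    logits = matmul(matmul(a_norm, hidden), w2)
    return softmax_rows(logits)


probs = gcn_forward(ADJ, FEATURES, W1, W2)
print(probs)
```

Each row of `probs` is a class distribution for one node, which corresponds to the single-entity (node classification) case; classifying a whole ring would need an additional pooling step over the nodes of the subgraph.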
  47. 47. Node Classification - Single Entity Fraud Analysis Is this business committing fraud? It depends on where the money is going.
  48. 48. Subgraph Classification - Entity Ring Fraud Analysis: Full Company Graph, Company Supernode Graph. Is this group of businesses committing fraud?
  49. 49. How Neo4j fits in
  50. 50. [Architecture diagram: Money Transferring, Purchases, Bank Services; Relational database; Develop Patterns (Data Science-team)] + Good for Discrete Analysis – No Holistic View of Data-Relationships – Slow query speed for connections
  51. 51. [Architecture diagram: Money Transferring, Purchases, Bank Services; Relational database; Data Lake; Merchant Data, Credit Score Data, Other 3rd Party Data; Develop Patterns (Data Science-team)] + Good for Map Reduce + Good for Analytical Workloads – No holistic view – Non-operational workloads – Weeks-to-months processes
  52. 52. [Architecture diagram: Money Transferring, Purchases, Bank Services; Neo4j Cluster with SENSE (Transaction stream), RESPOND (Alerts & notification), and LOAD RELEVANT DATA; Relational database; Data Lake; Merchant Data, Credit Score Data, Other 3rd Party Data; Develop Patterns (Data Science-team); Data-set used to explore new insights]
  53. 53. Neo4j powers 360° view of transactions in real-time [Architecture diagram: Money Transferring, Purchases, Bank Services; Neo4j Cluster with SENSE (Transaction stream), RESPOND (Alerts & notification), and LOAD RELEVANT DATA; Relational database; Data Lake; Visualization UI; Fine Tune Patterns; Merchant Data, Credit Score Data, Other 3rd Party Data; Develop Patterns (Data Science-team); Data-set used to explore new insights]
  54. 54. Why Neo4j? Who’s using it? FINANCE, Government, Online Retail. Financial institutions use Neo4j to: • Detect & prevent fraud in real-time • Faster credit risk analysis and transactions • Reduce chargebacks • Quickly adapt to new methods of fraud
  55. 55. Neo4j + Expero Complete Fraud Solutions PRESENTATION LOGIC DATA
  56. 56. Insight for Graph Methodology DISCOVERY INVENTION REALIZATION TRACK & MEASURE ONGOING SUPPORT PROOF OF CONCEPT PILOT TURN-KEY MVP DEVELOPMENT TECHNOLOGY LIFE CYCLE ASSESSMENTS : DIAGNOSE & PRESCRIBE - DATA, ARCHITECTURE, CODE, USER EXPERIENCE (Any Stage) SUPPORT - EXPERT SERVICES
  57. 57. Playbook: What are the Next Steps? Prototype Pilot Delivery Data Loading DSE Platform Data Discovery Craft Visualization Key Business Functions Build Rapid Pilot - Prototype Validate Business Case and Platform Technology ● Key Customer Functionality ● Graph Data Platform - Specifications ● Working Graph System ● Real Data Set Business Problem Go LiveDevelopmentDiscovery & Requirements Testing PLAY: Rapid Prototype
  58. 58. RAPID PILOT: See and Experience Your Data Web UI framework React Visualizations EXPERO GRAPH TOOLS + (Open Source) Graph Platform App Server (Generic Server) Provisioning EXPERO GRAPH TOOLS Ansible + Cloudburst Compute Cloud AWS EC2 Data Sources CUSTOMER Data or (Synthetic Data)
  59. 59. Join Us - Webinar Series Thwart Fraud Using Graph-Enhanced ML & AI You Are Here Build Intelligent Fraud Prevention with ML and Graphs Overview Technical Aspects Understand Business Impact Delivered Available on Neo4j YouTube Channel Lock Down Funding for Graph-Enhanced Fraud Solutions Get Funding Feb 20 9:00 PST / 12:00 EST
  60. 60. Thank You! GRAHAM GANSSLE, Ph.D. Data Science Lead, Expero Graham.Ganssle@experoinc.com @GrahamGanssle Nav MATHUR Sr. Director - Global Solutions, Neo4j nav@neo4j.com @nav_mathur www.Neo4j.com /use-cases/fraud-detection info@neo4j.com @neo4j www.ExperoInc.com /practices/ai info@experoinc.com @experoinc
