
Graphs and Financial Services Analytics


EY delivers a talk on using graphs for AML and Anomaly detection in Financial Services.

Published in: Technology

  1. Graphs and Financial Services Analytics
     • Michael Moore, Ph.D., Executive Director, Enterprise Knowledge Graphs + AI, EY Performance Improvement Advisory
     • Omar Azhar, M.S., Manager, Machine Learning and Advanced Analytics, EY Financial Services Organization
     • Miguel Perez, Ph.D. (DND), M.S., Senior, Machine Learning and Advanced Analytics, EY Financial Services Organization
  2. Falling Memory Cost: 1990-2016
  3. [Chart: EC2 RAM (GB) by instance type, 2007-2018, moving from scale-out to scale-up. Instance types shown: m1.xlarge, m2.4xlarge, hs1.8xlarge, r3.8xlarge, x1e.16xlarge, x1e.32xlarge. Data labels: 2, 15, 68, 117, 244, 2000, 4000.]
     Continued increases in capacity and dropping compute costs are challenging scale-out commodity server assumptions, particularly for database workloads.
  4. This is a Graph.
  5. This is a Graph.
  6. This is a Graph.
  7. This is a Graph.
  8. Graph Analytics Use Cases
     ► Common use cases for graph analytics
       ► Recommendation engines
       ► Supply chain and network optimization
       ► Fraud networks
       ► Community detection (social network analysis)
       ► Impact analysis / network contagion
       ► Anomaly detection
     Focus of this talk
  9. Anomalous Behavior Detection in Dynamic Graphs in Financial Services
     Anomalies are not always about finding bad behavior. We are trying to find a change in a network or in behavior that indicates a significant change in our assumptions.
     • Customer Behavior: A life event such as a new job, a new house, or marriage. Significant life changes show up as customers behaving in ways they previously did not. These are points of opportunity for providing new services.
     • Transaction Networks at Scale: What defines an efficient flow of funds versus an inefficient one? Are these flows correlated with the type of behavior?
     • How should we think of structuring this as a graph problem?
  10. Let’s start with a model everyone is familiar with: Customer 360
      Diagram labels: FA Hub; Corporate Wiki; Call Logs; E-mail Logs; Social network data; Financial Hub; Accounts Hub; Transaction Logs
      Now that we have our graph model, we need to consider scale.
  11. Scaling determines what snapshots you take of the graph for analysis
      Micro: looking at my graph at the account level
  12. Scaling determines what snapshots you take of the graph for analysis
      Moving up the scale: looking at the customer level
  13. Scaling determines what snapshots you take of the graph for analysis
      Household level
  14. How do we think about scaling in a graph problem? Consider the business-defined scale.
      • Scaling by collections of nodes: clumping nodes together -> household node
        • Generally defined by business and domain expertise
      • Scaling by collections of edges: clumping edges together -> geometric time-averaging
        • Requires both business / domain knowledge and a little investigation. How do you tell what a full time cycle is?
      Scale axis: Micro (Account) to Macro (Firm). Coarse- versus fine-grain tuning.
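The node-clumping idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the account names, the amounts, and the `account_to_household` mapping are hypothetical, and the helper `coarsen` is not from the talk. It covers the node case only, not the time-averaging of edges.

```python
from collections import defaultdict

# Hypothetical account-level edges: (source_account, target_account, amount).
account_edges = [
    ("acct_1", "acct_3", 100.0),
    ("acct_2", "acct_3", 50.0),
    ("acct_1", "acct_2", 25.0),   # both accounts sit in the same household
    ("acct_3", "acct_4", 75.0),
]

# Hypothetical mapping supplied by business / domain experts: account -> household.
account_to_household = {
    "acct_1": "hh_A",
    "acct_2": "hh_A",
    "acct_3": "hh_B",
    "acct_4": "hh_C",
}

def coarsen(edges, node_to_group):
    """Clump nodes into groups and sum edge weights between groups."""
    grouped = defaultdict(float)
    for src, dst, weight in edges:
        g_src, g_dst = node_to_group[src], node_to_group[dst]
        if g_src == g_dst:
            continue  # flows that become internal to one group are dropped
        grouped[(g_src, g_dst)] += weight
    return dict(grouped)

household_edges = coarsen(account_edges, account_to_household)
print(household_edges)
```

Dropping intra-household flows is a design choice made here for brevity; keeping them as self-loops would be equally defensible, depending on what the household-level snapshot is meant to show.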
  15. Understanding your graph snapshot: different data models of the same underlying knowledge graph
      Explore your graph snapshots. You will notice natural separation, or clusters / segments, in each snapshot. Most of this is already done through current segmentation models at most firms.
      Can we use similar graph snapshots to describe expected behavior?
      Figure labels: Checking accounts; Credit cards; Similar customers by spend; Households with similar incomes
  16. But how is this any different from what is already done today? Why Graph?
      Let’s investigate how a single household shows up through two separate snapshots.
      Figure labels: They all belong to this household; The college student; The parents
  17. How should change in one snapshot change the nodes in another snapshot?
      What does it mean for a node in one snapshot to change its data and move to another location in its snapshot? Can we model that?
  18. We should expect diffusion of information across our graph data models (GDMs)
      Household moves to a lower-cost state -> household retains its income but is wealthier in the new state
  19. Information should spread across GDMs. It should go both ways, but not necessarily with the same weight.
      College student graduates and moves back in with his parents
  20. Can we now model this as expected change across our GDMs?
      Identify node changes. What other types of change have a small impact in one GDM and a large impact in the other GDM?
      One family member moves -> household income is represented differently in one model versus another
  21. Expressing Behavior with Graph Snapshots
      Compare graph snapshots to identify node behavioral change
      • Similar GDMs can give you a context-dependent way of expressing behavioral change! This means we can self-compute it.
  22. Expressing Behavior with Graph Snapshots
      Compare graph snapshots to identify node behavioral change
      • Similar GDMs can give you a context-dependent way of expressing behavioral change! This means we can self-compute it.
      • Expressing behavioral change is now deeply connected to expressing the structural change on similar GDMs that are supported by the same underlying knowledge graph.
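One simple way to make "compare graph snapshots to identify node behavioral change" concrete is to score each node by how much its neighborhood differs between two snapshots. The sketch below uses the Jaccard distance between neighbor sets; that metric is an assumption chosen for illustration, not the measure used in the talk, and the node names are hypothetical.

```python
def neighbors(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def node_change(edges_before, edges_after):
    """Jaccard distance between each node's neighbor sets in two snapshots."""
    before, after = neighbors(edges_before), neighbors(edges_after)
    scores = {}
    for node in before.keys() | after.keys():
        a, b = before.get(node, set()), after.get(node, set())
        union = a | b
        scores[node] = (1.0 - len(a & b) / len(union)) if union else 0.0
    return scores

# Hypothetical snapshots: the student stops shopping on campus and
# starts using the same grocer as the parents.
snapshot_t0 = [("student", "campus_store"), ("parents", "grocer"), ("parents", "student")]
snapshot_t1 = [("student", "grocer"), ("parents", "grocer"), ("parents", "student")]

change = node_change(snapshot_t0, snapshot_t1)
for node, score in sorted(change.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.2f}")
```

A score of 0 means the node's neighborhood is unchanged; a score of 1 means it shares no neighbors across the two snapshots.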
  23. Behavioral Change over Time: Time-Sequenced Graph Data Models (TSGDM)
      • A sequence of graph data models provides the context for behavioral change over time.
  24. TSGDM - Assumptions - Semantic Compatibility
      Time-Sequenced Graph Data Models - Necessary Conditions
      • (1) Intuitive edges that are semantically compatible with the parent KG and entity resolution
  25. TSGDM - Assumptions - Semantic Compatibility
      Time-Sequenced Graph Data Models - Necessary Conditions
      • (1) Intuitive edges that are semantically compatible with the parent KG and entity resolution
      • (2) Obeys information-theoretic concerns about “information propagation on a geometric structure”
  26. TSGDM - Assumptions - Semantic Compatibility
      Time-Sequenced Graph Data Models - Necessary Conditions
      • (1) Intuitive edges that are semantically compatible with the parent KG and entity resolution
      • (2) Obeys information-theoretic concerns about “information propagation on a geometric structure”
      • (3) Use an unsupervised architecture that correctly diffuses information in each time step
  27. TSGDM - Assumptions - Semantic Compatibility
      Time-Sequenced Graph Data Models - Necessary Conditions
      • (1) Intuitive edges that are semantically compatible with the parent KG and entity resolution
      • (2) Obeys information-theoretic concerns about “information propagation on a geometric structure”
      • (3) Use an unsupervised architecture that correctly diffuses information in each time step
      • (4) The architecture learns how we should be describing behavioral change, not the other way around
  28. TSGDM - Assumptions - Semantic Compatibility
      Time-Sequenced Graph Data Models - Necessary Conditions
      • (1) Intuitive edges that are semantically compatible with the parent KG and entity resolution
      • (2) Obeys information-theoretic concerns about “information propagation on a geometric structure”
      • (3) Use an unsupervised architecture that correctly diffuses information in each time step
      • (4) The architecture learns how we should be describing behavioral change, not the other way around
      • (5) Use the statistical distribution learned to identify outliers
      • (6) Rank those outliers
  29. TSGDM - Using a learned statistical distribution to identify outliers
      Take your customer transaction data and build a Parent Knowledge Graph.
  30. TSGDM - Using a learned statistical distribution to identify outliers
      Scaling experimentation lets us study different schemas for a candidate TSGDM.
      Comparing two similar GDMs provides context for the behavioral change of a node.
  31. TSGDM - Using a learned statistical distribution to identify outliers
      Apply the selected schema to each month of data (or another appropriate time scale). Memory constraints will fix the number of time windows your architecture can learn from.
      Figure labels: Month 1, Month 2, Month 3, Month 4, ... Month X
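Applying one schema to each month of data amounts to bucketing transactions by time window and building one graph snapshot per bucket. The following is a minimal sketch, assuming a hypothetical record layout of (date, source, target, amount) and a hypothetical `window_size` that stands in for whatever limit memory imposes; neither comes from the talk.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (date, source, target, amount).
transactions = [
    (date(2018, 1, 5), "acct_1", "acct_2", 40.0),
    (date(2018, 1, 20), "acct_1", "acct_2", 60.0),
    (date(2018, 2, 3), "acct_2", "acct_3", 10.0),
    (date(2018, 3, 9), "acct_1", "acct_3", 30.0),
]

def monthly_snapshots(records):
    """Group transactions by (year, month) and sum amounts per directed edge."""
    snapshots = defaultdict(lambda: defaultdict(float))
    for day, src, dst, amount in records:
        snapshots[(day.year, day.month)][(src, dst)] += amount
    return {month: dict(edges) for month, edges in sorted(snapshots.items())}

def window_batches(months, window_size):
    """Split the ordered months into consecutive batches of at most window_size."""
    return [months[i:i + window_size] for i in range(0, len(months), window_size)]

snapshots = monthly_snapshots(transactions)
batches = window_batches(list(snapshots), window_size=2)
print(snapshots)
print(batches)
```

Each batch is then the set of snapshots a single model would be trained on.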
  32. TSGDM - Using a learned statistical distribution to identify outliers
      Learn a Champion Model on each time window batch.
  33. TSGDM - Using a learned statistical distribution to identify outliers
      Apply the Champion Model to each TSGDM and investigate the tail of each distribution.
      The log-scale compression error, or reconstruction error, tends to follow a power-law distribution. Graph structural changes that are harder to reproduce tend to be outliers!
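The talk does not specify the Champion Model's architecture, so the sketch below swaps in a deliberately trivial stand-in: the "model" is just the mean weight of each edge over the training snapshots, and the reconstruction error of a node is the squared residual summed over its incident edges, taken on a log scale. All names and numbers are hypothetical; the point is only to show how a per-node error can be computed and its tail inspected.

```python
import math
from collections import defaultdict

def fit_mean_model(snapshots):
    """Stand-in 'champion model': the mean weight of each edge across snapshots."""
    totals = defaultdict(float)
    for snap in snapshots:
        for edge, weight in snap.items():
            totals[edge] += weight
    return {edge: total / len(snapshots) for edge, total in totals.items()}

def node_log_errors(model, snapshot):
    """Log-scale squared reconstruction error, accumulated per node."""
    errors = defaultdict(float)
    for edge in model.keys() | snapshot.keys():
        residual = snapshot.get(edge, 0.0) - model.get(edge, 0.0)
        for node in edge:
            errors[node] += residual ** 2
    return {node: math.log1p(err) for node, err in errors.items()}

# Hypothetical training snapshots: {(source, target): amount}.
training = [
    {("a", "b"): 10.0, ("b", "c"): 5.0},
    {("a", "b"): 12.0, ("b", "c"): 5.0},
]
# New snapshot: node "d" appears with a large, previously unseen flow.
observed = {("a", "b"): 11.0, ("b", "c"): 5.0, ("c", "d"): 500.0}

model = fit_mean_model(training)
errors = node_log_errors(model, observed)
ranked = sorted(errors, key=errors.get, reverse=True)
print(ranked)
```

A real unsupervised architecture would replace `fit_mean_model`; the ranking step at the end stays the same.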
  34. TSGDM - Using a learned statistical distribution to identify outliers
      • Create multiple champion models with some overlap in their time windows
      • The overlap in the cumulative error between champion models will be the outliers of interest
      • Rank all nodes by their cumulative error for each Champion Model
      • Key takeaway: the harder a financial behavior is to replicate in this framework, the more likely it is an anomaly
      …
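The overlap step above can be read as: take the high-error tail from each champion model, keep the nodes that appear in every tail, and rank them by summed error. The sketch below follows that reading; the top-k cutoff, the helper names, and the error values are assumptions for illustration rather than details from the talk.

```python
def top_k(errors, k):
    """Return the k nodes with the largest cumulative error."""
    return set(sorted(errors, key=errors.get, reverse=True)[:k])

def consensus_outliers(errors_by_model, k):
    """Nodes in the top-k tail of every model, ranked by summed error."""
    tails = [top_k(errors, k) for errors in errors_by_model]
    shared = set.intersection(*tails)
    total = {n: sum(e[n] for e in errors_by_model) for n in shared}
    return sorted(shared, key=lambda n: (-total[n], n))

# Hypothetical cumulative errors from two champion models with overlapping windows.
model_a = {"acct_1": 0.2, "acct_2": 9.1, "acct_3": 7.5, "acct_4": 0.1}
model_b = {"acct_1": 6.0, "acct_2": 8.4, "acct_3": 0.3, "acct_4": 0.2}

outliers = consensus_outliers([model_a, model_b], k=2)
print(outliers)
```

Requiring a node to appear in every tail is the strictest form of overlap; a vote threshold (for example, present in most tails) would be a looser alternative.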
  35. Use Case: Anti-Money Laundering
      Existing business problem: Financial institutions are responsible for monitoring the transaction activity of client accounts in order to detect the presence of money laundering activity. Rule-based systems generate too many false-positive alerts that require expensive and subjective manual review. Industry standard performance is 1:1000.
  36. Aggregated activity in real-world networks can demonstrate the efficiency of money flow in certain pockets of our economy
  37. Aggregated activity in real-world networks can demonstrate the efficiency of money flow in certain pockets of our economy
      Normal and random dispersion of money flow that follows a natural path
  38. A few regions of high interconnectivity connected to spoke-like hubs. Low reproducibility, potentially anomalous
  39. A few regions of high interconnectivity connected to spoke-like hubs. Low reproducibility, potentially anomalous
      Potentially higher connectedness than normal
  40. EY Cross-Sector Graph Experience: MDM, 360°, AML/Fraud, Recommenders
      • Fortune 100 Tech Company
        Use Case: Global B2B Account 360° view and marketing attribution
        Approach: Neo4j graph with 500M nodes and 2.2B relationships, representing all known business accounts, contacts and marketing touches. Mastered data from 17 disparate transactional sources in Azure Data Lake. Supported in-graph analytics for marketing attribution and next best action recommendations across global geographies.
        Duration: 16 weeks to working graph
      • Fortune 100 Footwear Company
        Use Case: Converged Brick & Mortar + Online Shopper 360° View
        Approach: Neo4j graph with 2B nodes and relationships, representing sales transactions for 40M shoppers across 275 physical stores and the ecommerce platform. Algorithmic extraction and profiling from raw XML records in AWS Hadoop, MDM record concordance and in-graph analytics for product associations, store analytics and recommendation services.
        Duration: 12 weeks to working graph
      • Fortune 500 Cruise Line Company
        Use Case: Shipboard and Shoreside Recommendation Engine
        Approach: Neo4j graph deployable to shipboard VMware data centers, with streaming updates from a large shoreside Neo4j graph integrating data from Azure Cerebro, Adobe Experience Manager and legacy transactional systems. In-graph analytics, services API, and a recommendation engine for next best activity for passengers, surfaced via mobile app.
        Duration: 12 weeks to working graph
      • Fortune 100 Investment Firm
        Use Case: Enhanced Anti-Money Laundering and Fraud Detection using Graph+AI
        Approach: Neo4j graph of account 360° view representing activity of 2M accounts over 4 years. MDM and entity extraction for account and party identity elements from enterprise Oracle system. Network clustering, feature engineering and graph embedding in a TensorFlow deep learning classifier for suspicious activity patterns across accounts and between parties.
        Duration: 16 weeks to working graph
      • Fortune 100 Tech Company
        Use Case: B2B Local Marketing Events Recommendation Engine
        Approach: Neo4j graph and personalized next best event recommendation engine for B2B field marketers. Reconciles physical and digital event attendees with corporate account structures for 10K accounts and 5M contacts. Entities mastered from transactional data in SQL Server and Azure Data Lake. Microservices APIs support data syndication to martech applications and Power BI reporting.
        Duration: 10 weeks to working graph
