Managing Large Scale Financial Time-Series Data with Graphs

Slides from a recent webinar by Objectivity showing how the ThingSpan platform is ideal for graph analytics to uncover patterns and insights within large, complex data sets in order to make efficient decisions.

  1. Managing Large Scale Financial Time-Series Data with Graph
     November 2016
     © Copyright 2016 Objectivity, Inc.
  2. Overview
     • Financial Data Challenges
     • Distributed Graph Platform
     • Demonstration Use Case
     • Live Demo
  3. The Challenge
     • Volume, Velocity and Variety
       • Current systems produce billions of transactions and events per day
       • Combined streaming, operational and historical data
     • Analytic challenge
       • Statistical analysis is limited
       • The need to discover complex relationships and patterns
       • Deeper insight from the relationship value
       • Time-based query and graph analysis
       • Reusability for multiple use cases
  4. Graph Use Cases
     • Risk Management
       • Money Laundering
       • Insider Threat
       • Fraud Detection
       • Communication Graph
     • Operational
       • Smart Trading Optimization
       • Portfolio/Customer Management
       • Regulatory Compliance Systems
       • System/Process Optimization
  5. Performance and Scale
     • In small graphs, insights can be lost due to limited RAM or machine size
     • Big graphs (trillions of nodes and edges) scale UP and scale OUT to reveal subtle insights and hidden relationships in ALL data
  6. ThingSpan Technology
  7. ThingSpan Platform
     [Architecture diagram. Labels: Graph Analytics; Data Analytics; ThingSpan Distributed Graph; Spark; Spark Streaming; Kafka, Storm; MLlib analytics; HDFS/POSIX; YARN/Mesos; REST server; Java, C++, C# API; DO declarative query language; workflow design GUI; BI and visualization. Legend: Objectivity, open source, partner.]
  8. Spark + ThingSpan = Parallelism
     [Diagram. Labels: Spark cluster; HDFS; driver application; five worker nodes, each with a DataFrame; ThingSpan Distributed Graph.]
  9. Distributed Ingest
     • Inbound event streaming using Kafka
     • Each event is formed into vertices and edges
     • Vertices and edges are inserted into the pipeline and processed using Samza
     • Inserts/upserts are:
       • Consistent
       • Idempotent
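As an illustration of what idempotent upserts buy during ingest, here is a minimal Python sketch. It is not the ThingSpan API: `GraphStore`, `upsert_vertex`, `upsert_edge`, `ingest` and the `hasTransaction` edge label are all hypothetical names, and the store is a plain in-memory dictionary. The point is only that replaying the same event (for example after a retried delivery) leaves the graph unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """Toy in-memory stand-in for a graph store (hypothetical, not ThingSpan)."""
    vertices: dict = field(default_factory=dict)  # (label, id) -> properties
    edges: set = field(default_factory=set)       # (src_key, edge_label, dst_key)

    def upsert_vertex(self, label, vertex_id, **props):
        key = (label, vertex_id)
        # Insert if absent, otherwise merge properties; a replay changes nothing.
        self.vertices.setdefault(key, {}).update(props)
        return key

    def upsert_edge(self, src, edge_label, dst):
        # A set keeps a single copy of an edge however often it is re-sent.
        self.edges.add((src, edge_label, dst))


def ingest(store, event):
    """Form one event into vertices and edges and upsert them."""
    account = store.upsert_vertex("Account", event["account_id"])
    txn = store.upsert_vertex(
        "Transaction", event["transaction_id"], type=event["transaction_type"]
    )
    store.upsert_edge(account, "hasTransaction", txn)  # assumed edge label


store = GraphStore()
event = {
    "account_id": "ACCT0001",
    "transaction_id": "ALG_20160311_5",
    "transaction_type": "8",
}
ingest(store, event)
ingest(store, event)  # replayed delivery of the same event
print(len(store.vertices), len(store.edges))  # 2 1
```

Keying vertices by (label, id) and storing edges in a set is one simple way to get idempotence; a distributed store would need an equivalent uniqueness guarantee on its own keys.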
  10. Distributed Query
      • Data scientists and analysts use the same language
      • DO queries run in parallel
      • Spark DataFrames allow data to be processed with SparkSQL
  11. DO - The Query Language
      • Familiar to data scientists
      • Adopts best-of-breed techniques from SQL and Cypher
      • Extends SQL-like queries with graph navigation capabilities
      • Value-based queries and complex graph queries
      • Query data without having to write or compile code
      • Support for weighted graph queries
        • Weights are assigned at query time regardless of the model
      • Support for paths and trails
        • A path is a walk with distinct vertices
        • A trail is a walk with distinct edges
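The path and trail definitions above can be checked mechanically. The following Python sketch assumes a walk is given as a list of vertex names, that edges are directed, and that an edge is identified by its (source, target) pair, so parallel edges are not modelled. The function names `is_path` and `is_trail` are illustrative and not part of DO.

```python
def is_path(walk):
    """A path is a walk in which no vertex repeats."""
    return len(walk) == len(set(walk))


def is_trail(walk):
    """A trail is a walk in which no edge repeats."""
    edges = list(zip(walk, walk[1:]))  # consecutive vertex pairs
    return len(edges) == len(set(edges))


print(is_path(["A", "B", "C"]))             # True
print(is_path(["A", "B", "C", "A", "D"]))   # False: vertex A repeats
print(is_trail(["A", "B", "C", "A", "D"]))  # True: all four edges differ
print(is_trail(["A", "B", "A", "B"]))       # False: edge A->B repeats
```

Every path is also a trail, since repeating an edge would require repeating its endpoints; the reverse does not hold, as the second and third calls show.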
  12. Demonstration Overview
  13. Use Case
      • A financial institution needs to process a massive number of events per day
      • The current system produces at least one billion transaction events, with a target of five billion in the near future
      • Events represent both business and operational information
      • Statistical analysis is possible, but certain graph (navigational) queries are hard to do
      • Time-based query and analysis
  14. Financial Transaction Event
      <TransactionProcessed>
        <start_timestamp>2016-03-11 00:54:58.301</start_timestamp>
        <start_epoch_ms>1457657698301</start_epoch_ms>
        <end_timestamp>2016-03-11 00:54:58.343</end_timestamp>
        <end_epoch_ms>1457657698343</end_epoch_ms>
        <service_type>storm</service_type>
        <service_instance_id>hadoop02.oktaylabs.com_6703_16</service_instance_id>
        <task_type>ParseFIXBolt</task_type>
        <transaction_type>8</transaction_type>
        <transaction_id>ALG_20160311_5</transaction_id>
        <transaction_timestamp>2016-03-11 00:54:57.637066</transaction_timestamp>
        <transaction_epoch_ms>1457657697637</transaction_epoch_ms>
        <parent_transaction_id></parent_transaction_id>
        <security_id>USB</security_id>
        <mutual_account_id>ACCT0001</mutual_account_id>
        <firm_id>client2</firm_id>
        <sender_id>acct1</sender_id>
        <basket_id></basket_id>
      </TransactionProcessed>
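To show how such an event might be read, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The XML is an abridged copy of the event above; `parse_event` and the derived `duration_ms` field are illustrative additions, not part of the original pipeline. Empty elements such as `basket_id` are mapped to `None`.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the event shown on the slide.
EVENT_XML = """<TransactionProcessed>
  <start_epoch_ms>1457657698301</start_epoch_ms>
  <end_epoch_ms>1457657698343</end_epoch_ms>
  <task_type>ParseFIXBolt</task_type>
  <transaction_id>ALG_20160311_5</transaction_id>
  <parent_transaction_id></parent_transaction_id>
  <basket_id></basket_id>
</TransactionProcessed>"""


def parse_event(xml_text):
    """Return the event's fields as a dict, plus the task's processing time."""
    root = ET.fromstring(xml_text)
    # Strip stray whitespace; empty elements become None.
    fields = {child.tag: (child.text or "").strip() or None for child in root}
    fields["duration_ms"] = (
        int(fields["end_epoch_ms"]) - int(fields["start_epoch_ms"])
    )
    return fields


event = parse_event(EVENT_XML)
print(event["task_type"], event["duration_ms"])  # ParseFIXBolt 42
print(event["basket_id"])                        # None
```

The 42 ms figure is simply the difference between the end and start epoch values in the sample event.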
  15. Data Model
      • Business Entities
        • Account - The entity that is requesting the transaction
        • Firm - The firm involved in the transaction
        • Sender - Firm entity acting on behalf of an account
        • Basket - Bundle or batch of related transactions
        • Transaction - The Buy (order), Fill, Cancel or Cancel and Replace order
      • System Entities
        • Task - The operational task that processes the transaction
        • Service - The operational service that owns one or more Tasks
        • Transaction event - Time-based event information for financial transaction processing
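One way to picture the data model is as a mapping from event fields to entity vertices. The Python sketch below is an assumption throughout: the slides do not state which field feeds which entity, so `FIELD_TO_LABEL`, `Vertex` and `vertices_from_event` are hypothetical. Fields that are empty in an event (such as a missing basket) produce no vertex.

```python
from typing import NamedTuple


class Vertex(NamedTuple):
    label: str
    id: str


# Assumed mapping from event field names to data-model entities.
FIELD_TO_LABEL = {
    "mutual_account_id": "Account",
    "firm_id": "Firm",
    "sender_id": "Sender",
    "basket_id": "Basket",
    "transaction_id": "Transaction",
    "task_type": "Task",
    "service_instance_id": "Service",
}


def vertices_from_event(fields):
    """Build one vertex per populated entity field in the event."""
    return [
        Vertex(label, fields[name])
        for name, label in FIELD_TO_LABEL.items()
        if fields.get(name)  # skip absent or empty fields
    ]


sample = {
    "mutual_account_id": "ACCT0001",
    "firm_id": "client2",
    "sender_id": "acct1",
    "basket_id": "",
    "transaction_id": "ALG_20160311_5",
    "task_type": "ParseFIXBolt",
    "service_instance_id": "hadoop02.oktaylabs.com_6703_16",
}
for vertex in vertices_from_event(sample):
    print(vertex.label, vertex.id)
```

With the sample values, six vertices are produced and no Basket vertex, because `basket_id` is empty.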
  16. The Results
      • Financial transaction events ingested in real time
      • Concurrent graph queries during ingest
      • 1 billion financial transaction events in ~12 hours (~23k per second)
      • Each transaction event produces a sub-graph
      • Graph size: 1.38 billion vertices and 5.25 billion edges
      • Cluster: EC2, 16 instances of m4.4xlarge
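The headline figures can be cross-checked with a few lines of Python arithmetic. The inputs are the rounded values quoted above, so the derived numbers are approximate; the per-instance rate assumes the load was spread evenly across the 16 instances, which the slides do not state.

```python
events = 1_000_000_000   # transaction events ingested
hours = 12               # approximate ingest window
vertices = 1.38e9
edges = 5.25e9
instances = 16

events_per_second = events / (hours * 3600)
per_instance = events_per_second / instances  # assumes an even spread
vertices_per_event = vertices / events
edges_per_event = edges / events

print(round(events_per_second))      # 23148, matching the quoted ~23k
print(round(per_instance))           # 1447
print(round(vertices_per_event, 2))  # 1.38
print(round(edges_per_event, 2))     # 5.25
```

So each event contributes, on average, about 1.4 new vertices and a little over 5 new edges to the graph.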
  17. Vertices Ingest - 1 Billion
      [Charts: rate per process; overall rate for all processes]
  18. Edges Ingest - 1 Billion
      [Charts: rate per process; overall rate for all processes]
  19. Queries: Processing a Basket
      For a client’s basket, show all system tasks used to process the basket, including processing time.
      Match p=(:Basket{m_Id=="ALG12"})-->(:Transaction)-[:m_Children*1..5]->(:Transaction)-->(:TransactionEvent)-->(:Task) return p
      [Result graph: Basket, TransactionEvent, Task]
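The `-[:m_Children*1..5]->` step in the query follows child edges for between one and five hops. A rough Python sketch of that bounded traversal is shown below, assuming the child relationships are available as an adjacency dictionary. It is not how DO executes the query, and it returns only the vertices reached, not the full paths. The function name and the transaction ids `T1` to `T7` are made up for illustration.

```python
def descendants_within(children, start, min_hops=1, max_hops=5):
    """Vertices reachable from start in min_hops..max_hops child edges."""
    found = set()
    frontier = {start}
    for hop in range(1, max_hops + 1):
        # Advance every frontier vertex by one child edge.
        frontier = {c for v in frontier for c in children.get(v, ())}
        if not frontier:
            break
        if hop >= min_hops:
            found |= frontier
    return found


# Hypothetical chain of parent -> child transactions.
children = {
    "T1": ["T2"], "T2": ["T3"], "T3": ["T4"],
    "T4": ["T5"], "T5": ["T6"], "T6": ["T7"],
}
print(sorted(descendants_within(children, "T1")))
# ['T2', 'T3', 'T4', 'T5', 'T6']
```

`T7` is six hops from `T1`, so it falls outside the 1..5 bound and is not returned.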
  20. Queries: Comparing Accounts
      Compare two accounts by their algorithmic baskets that produce a fill order for ‘CBS’.
      Match p=shortest((:Account{m_Id=='client2.ACCT0005' OR m_Id=='client3.ACCT0003'})-[:m_Baskets]->(:Basket{m_Id=~~'ALG.*'})-[:m_Transactions]->(:Transaction{m_Type=='D'})-[:m_Children]->(:Transaction{m_Type=='8'})-[:m_Security]->(:Security{m_Id=='CBS'})) return p
      [Result graph: Account, Basket, Fill Order, Security (CBS)]
  21. Basket Comparison with Tableau
      [Chart: transactions, tasks, etc. per basket, viewed collectively]
  22. Live Demo
  23. Why?
      • Scale and performance
      • High-speed concurrent ingest and queries during mixed workloads
      • Scales to massive, complex graphs
      • Enables real-time pattern/anomaly detection and discovery
      • Sub-graph similarity (capture the behavior, not just the statistics)
      • Data governance and lineage
      • Open source integration
      • Fast navigation / path finding
      • Visualization and BI tool integration
      • DO query language: data scientists and analysts use the same language
  24. For more information: www.objectivity.com