© Copyright - 2016 Objectivity, Inc.
November 2016
Managing Large Scale Financial Time-Series Data with Graph
Overview
• Financial Data Challenges
• Distributed Graph Platform
• Demonstration Use Case
• Live Demo
The Challenge
• Volume, Velocity and Variety
  • Current systems produce billions of transactions and events per day
  • Combined streaming, operational and historical data
• Analytic challenge
  • Statistical analysis is limited
  • The need to discover complex relationships and patterns
  • Deeper insight from the value of relationships
  • Time-based query and graph analysis
• Reusability for multiple use cases
Graph Use Cases
• Risk Management
  • Money Laundering
  • Insider Threat
  • Fraud Detection
  • Communication Graph
• Operational
  • Smart Trading Optimization
  • Portfolio/Customer Management
  • Regulatory Compliance Systems
  • System/Process Optimization
Performance and Scale
In small graphs, insights can be lost due to limited RAM or machine size.
Big graphs (trillions of nodes and edges) scale UP and scale OUT to reveal subtle insights and hidden relationships in ALL data.
ThingSpan Technology
ThingSpan Platform
[Architecture diagram. Legend: Objectivity, Open source, Partner. Labeled components: Graph Analytics; Data Analytics; Spark; Streaming (Kafka, Storm); Workflow Design GUI; HDFS / POSIX; Analytics (MLlib); REST Server; Java, C++, C# API; BI Visualization; DO Declarative Query Language; YARN / Mesos; ThingSpan Distributed Graph]
Spark + ThingSpan = Parallelism
[Diagram labels: Spark Cluster; HDFS; Driver Application; five Worker Nodes, each with a DataFrame; ThingSpan Distributed Graph]
Distributed Ingest
• Inbound events are streamed in through Kafka
• Each event is converted into vertices and edges
• Vertices and edges are inserted into the pipeline and processed using Samza
• Inserts/upserts are:
  • Consistent
  • Idempotent
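The consistent, idempotent upserts listed for distributed ingest can be illustrated with a small Python sketch. This is an in-memory illustration only, not the ThingSpan, Kafka, or Samza API: the `GraphStore` class, its methods, and the event field names are hypothetical. It shows why keying every vertex and edge by a stable identifier means a replayed event (for example, after a redelivery from the stream) leaves the graph unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """Hypothetical in-memory stand-in for a graph store (not a real API)."""
    vertices: dict = field(default_factory=dict)  # (label, id) -> properties
    edges: set = field(default_factory=set)       # (src_key, edge_type, dst_key)

    def upsert_vertex(self, label, vertex_id, **props):
        # Insert the vertex if it is new, otherwise merge the properties.
        key = (label, vertex_id)
        self.vertices.setdefault(key, {}).update(props)
        return key

    def upsert_edge(self, src_key, edge_type, dst_key):
        # A set makes repeated inserts of the same edge a no-op.
        self.edges.add((src_key, edge_type, dst_key))


def ingest(store, event):
    """Turn one event into vertices and edges (field names are illustrative)."""
    basket = store.upsert_vertex("Basket", event["basket_id"])
    txn = store.upsert_vertex("Transaction", event["transaction_id"],
                              type=event["transaction_type"])
    store.upsert_edge(basket, "m_Transactions", txn)


store = GraphStore()
event = {"basket_id": "ALG12", "transaction_id": "ALG_20160311_5",
         "transaction_type": "8"}
ingest(store, event)
ingest(store, event)  # replaying the same event changes nothing
print(len(store.vertices), len(store.edges))
```

Running `ingest` twice on the same event still leaves two vertices and one edge, which is the property a stream processor with at-least-once delivery would rely on.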
Distributed Query
• Data scientists and analysts use the same language
• DO queries run in parallel
• Spark DataFrames allow data to be processed with Spark SQL
DO – The Query Language
• Familiar to data scientists
• Adopts best-of-breed techniques from SQL and Cypher
• Extends SQL-like queries with graph navigation capabilities
• Value-based queries and complex graph queries
• Query data without having to write or compile code
• Support for weighted graph queries
  • Weights are assigned at query time regardless of the model
• Support for paths and trails
  • A path is a walk with distinct vertices
  • A trail is a walk with distinct edges
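The path and trail definitions given for DO can be checked mechanically. The sketch below is plain Python, not DO syntax, and assumes a directed graph without parallel edges, so that an edge is identified by its pair of endpoints. The function names and sample walks are made up for illustration.

```python
def is_trail(walk):
    """A trail is a walk that never repeats an edge."""
    edges = list(zip(walk, walk[1:]))
    return len(edges) == len(set(edges))


def is_path(walk):
    """A path is a walk that never repeats a vertex."""
    return len(walk) == len(set(walk))


# Walks are written as vertex sequences over directed edges.
walk_a = ["A", "B", "C", "D"]       # no repeats at all
walk_b = ["A", "B", "C", "A", "D"]  # vertex A repeats, every edge is distinct
walk_c = ["A", "B", "A", "B"]       # edge A->B repeats

for walk in (walk_a, walk_b, walk_c):
    print(walk, "path:", is_path(walk), "trail:", is_trail(walk))
```

Every path is also a trail, but not the reverse: `walk_b` revisits vertex A without reusing an edge, so it is a trail and not a path.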
Demonstration Overview
Use Case
• A financial institution needs to process a massive number of events per day
• The current system produces at least one billion transaction events, with a target of five billion in the near future
• Events represent both business and operational information
• Statistical analysis is possible, but certain graph (navigational) queries are hard to do
• Time-based query and analysis
Financial Transaction Event
<TransactionProcessed>
<start_timestamp>2016-03-11 00:54:58.301</start_timestamp>
<start_epoch_ms>1457657698301</start_epoch_ms>
<end_timestamp>2016-03-11 00:54:58.343</end_timestamp>
<end_epoch_ms>1457657698343</end_epoch_ms>
<service_type>storm</service_type>
<service_instance_id>hadoop02.oktaylabs.com_6703_16</service_instance_id>
<task_type>ParseFIXBolt</task_type>
<transaction_type>8</transaction_type>
<transaction_id>ALG_20160311_5</transaction_id>
<transaction_timestamp>2016-03-11 00:54:57.637066</transaction_timestamp>
<transaction_epoch_ms>1457657697637</transaction_epoch_ms>
<parent_transaction_id></parent_transaction_id>
<security_id>USB</security_id>
<mutual_account_id>ACCT0001</mutual_account_id>
<firm_id>client2</firm_id>
<sender_id>acct1</sender_id>
<basket_id></basket_id>
</TransactionProcessed>
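As a rough illustration of how such an event could be consumed, the following Python sketch parses a trimmed copy of the sample event with the standard library and derives the task's processing time from the two epoch fields. The `parse_event` helper is hypothetical; only the element names and values come from the sample above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample event; element names and values are unchanged.
EVENT_XML = """<TransactionProcessed>
  <start_epoch_ms>1457657698301</start_epoch_ms>
  <end_epoch_ms>1457657698343</end_epoch_ms>
  <task_type>ParseFIXBolt</task_type>
  <transaction_id>ALG_20160311_5</transaction_id>
  <parent_transaction_id></parent_transaction_id>
  <security_id>USB</security_id>
  <firm_id>client2</firm_id>
</TransactionProcessed>"""


def parse_event(xml_text):
    """Return the event as a dict; empty elements become None."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() or None for child in root}


event = parse_event(EVENT_XML)
duration_ms = int(event["end_epoch_ms"]) - int(event["start_epoch_ms"])
print(event["task_type"], "took", duration_ms, "ms")
```

For this event the difference between `end_epoch_ms` and `start_epoch_ms` is 42 ms, which matches the two human-readable timestamps (00:54:58.301 and 00:54:58.343).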
Data Model
• Business Entities
  • Account - The entity that is requesting the transaction
  • Firm - The firm involved in the transaction
  • Sender - The firm entity acting on behalf of an account
  • Basket - A bundle or batch of related transactions
  • Transaction - A Buy (order), Fill, Cancel, or Cancel-and-Replace order
• System Entities
  • Task - The operational task that processes the transaction
  • Service - The operational service that owns one or more Tasks
  • Transaction event - Time-based event information for financial transaction processing
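One way to picture how the data model's entities relate to the fields of the sample transaction event is the Python sketch below. The field-to-entity mapping is an assumption inferred from the field names; the slides do not state it, and `FIELD_TO_ENTITY` and `entities_in_event` are hypothetical names.

```python
# Assumed mapping from sample-event fields to the entity types in the data model.
FIELD_TO_ENTITY = {
    "mutual_account_id": "Account",
    "firm_id": "Firm",
    "sender_id": "Sender",
    "basket_id": "Basket",
    "transaction_id": "Transaction",
    "task_type": "Task",
    "service_instance_id": "Service",
}


def entities_in_event(event):
    """List (entity_type, identifier) pairs for the non-empty fields of an event."""
    return [(entity, event[name])
            for name, entity in FIELD_TO_ENTITY.items()
            if event.get(name)]


# Values copied from the sample event; basket_id is empty there.
event = {
    "mutual_account_id": "ACCT0001", "firm_id": "client2", "sender_id": "acct1",
    "basket_id": "", "transaction_id": "ALG_20160311_5",
    "task_type": "ParseFIXBolt",
    "service_instance_id": "hadoop02.oktaylabs.com_6703_16",
}
for entity, identifier in entities_in_event(event):
    print(f"{entity}: {identifier}")
```

Empty fields such as `basket_id` contribute no entity, so the number of vertices an event touches can vary from one event to the next.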
The Results
• Financial transaction events ingested in real time
• Concurrent graph queries during ingest
• 1 billion financial transaction events in ~12 hours (~23k per second)
• Each transaction event produces a sub-graph
• Graph size: 1.38 billion vertices and 5.25 billion edges
• Cluster: 16 EC2 m4.4xlarge instances
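The headline figures in the results can be cross-checked with a little arithmetic, sketched here in Python. The inputs are the rounded numbers from the slide, so the outputs are approximate. Treating the five-billion target from the use case as a per-day figure is an assumption.

```python
EVENTS = 1_000_000_000
INGEST_HOURS = 12   # "~12 hours" on the slide, so results are approximate
VERTICES = 1.38e9
EDGES = 5.25e9

events_per_second = EVENTS / (INGEST_HOURS * 3600)
vertices_per_event = VERTICES / EVENTS
edges_per_event = EDGES / EVENTS

# Sustained rate needed for the five-billion target,
# assuming that target is measured per day.
target_per_second = 5_000_000_000 / (24 * 3600)

print(f"{events_per_second:,.0f} events/s")
print(f"{vertices_per_event:.2f} vertices and {edges_per_event:.2f} edges per event")
print(f"{target_per_second:,.0f} events/s for 5 billion events per day")
```

One billion events over 12 hours works out to about 23,148 events per second, consistent with the quoted ~23k, and each event adds on average 1.38 vertices and 5.25 edges. Under the per-day assumption, the five-billion target would need roughly 57,870 events per second sustained.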
Vertices Ingest: 1 Billion
[Charts: rate per process; overall rate for all processes]
Edges Ingest: 1 Billion
[Charts: rate per process; overall rate for all processes]
Queries: Processing a Basket
For a client’s basket, show all system tasks used to process the basket, including processing time.
Match p=(:Basket{m_Id=="ALG12"})-->(:Transaction)-[:m_Children*1..5]->(:Transaction)-->(:TransactionEvent)-->(:Task) return p
[Result graph labels: Basket, TransactionEvent, Task]
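The `-[:m_Children*1..5]->` step in the basket query reads like a variable-length hop in the Cypher style, that is, follow child links for between one and five hops. That reading is an assumption based on the Cypher convention rather than something the slide states. The Python sketch below mimics the step on a hypothetical parent-to-children map; it returns the set of reachable transactions, whereas the query itself returns whole paths.

```python
def descendants_within(children, start, min_hops=1, max_hops=5):
    """Return nodes reachable from start in min_hops..max_hops child hops."""
    found = set()
    frontier = {start}
    for hop in range(1, max_hops + 1):
        # Advance one hop along the child links.
        frontier = {c for node in frontier for c in children.get(node, ())}
        if not frontier:
            break
        if hop >= min_hops:
            found |= frontier
    return found


# Hypothetical parent -> children links between transactions.
children = {
    "T1": ["T2", "T3"],
    "T2": ["T4"],
    "T4": ["T5"],
}
print(sorted(descendants_within(children, "T1")))
print(sorted(descendants_within(children, "T1", max_hops=2)))
```

With the default bound of five hops, all four descendants of `T1` are found. Lowering `max_hops` to 2 drops `T5`, which sits three hops away.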
Queries: Comparing Accounts
Compare two Accounts for their algorithmic baskets that produce a fill order for ‘CBS’.
Match p=shortest((:Account{m_Id=='client2.ACCT0005' OR m_Id=='client3.ACCT0003'})-[:m_Baskets]->(:Basket{m_Id=~~'ALG.*'})
-[:m_Transactions]->(:Transaction{m_Type=='D'})-[:m_Children]->(:Transaction{m_Type=='8'})-[:m_Security]->(:Security{m_Id=='CBS'})) return p
[Result graph labels: Basket, Account, Fill Order, Security (CBS)]
Basket Comparison with Tableau
Transactions, Tasks, etc. per basket, viewed collectively
Live Demo
Why?
• Scale and performance
• High-speed concurrent ingest and queries during mixed workloads
• Scalable to massive and complex graphs
• Enables real-time pattern/anomaly detection and discovery
• Sub-graph similarity (capture the behavior, not just the statistics)
• Data governance and lineage
• Open source integration
• Fast navigation / path finding
• Visualization and BI tool integration
• DO query language - data scientists and analysts use the same language
For more information:
www.objectivity.com

Contenu connexe

Tendances

2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)Albert Wong
 
A taste of Snowplow Analytics data
A taste of Snowplow Analytics dataA taste of Snowplow Analytics data
A taste of Snowplow Analytics dataRobert Kingston
 
Life is but a Stream
Life is but a StreamLife is but a Stream
Life is but a StreamDatabricks
 
Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...Aljoscha Krettek
 
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike SpicerKafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicerconfluent
 
The Stream is the Database - Revolutionizing Healthcare Data Architecture
The Stream is the Database - Revolutionizing Healthcare Data ArchitectureThe Stream is the Database - Revolutionizing Healthcare Data Architecture
The Stream is the Database - Revolutionizing Healthcare Data ArchitectureDataWorks Summit/Hadoop Summit
 
Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...
Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...
Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...StampedeCon
 
Simply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution AnalysisSimply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution AnalysisStewart Duncan
 
Moving Beyond Batch: Transactional Databases for Real-time Data
Moving Beyond Batch: Transactional Databases for Real-time DataMoving Beyond Batch: Transactional Databases for Real-time Data
Moving Beyond Batch: Transactional Databases for Real-time DataVoltDB
 
Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason PohlBuilding a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason PohlSpark Summit
 
xGem Data Stream Processing
xGem Data Stream ProcessingxGem Data Stream Processing
xGem Data Stream ProcessingJorge Hirtz
 
Driving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive AnalyticsDriving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive AnalyticsSingleStore
 
Snowplow the evolving data pipeline
Snowplow   the evolving data pipelineSnowplow   the evolving data pipeline
Snowplow the evolving data pipelineyalisassoon
 
2016 09 measurecamp - event data modeling
2016 09 measurecamp - event data modeling2016 09 measurecamp - event data modeling
2016 09 measurecamp - event data modelingyalisassoon
 
Big data meetup budapest adding data schemas to snowplow
Big data meetup budapest   adding data schemas to snowplowBig data meetup budapest   adding data schemas to snowplow
Big data meetup budapest adding data schemas to snowplowyalisassoon
 
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr... Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...Databricks
 
WSO2 Big Data Analytics Platform
WSO2 Big Data Analytics PlatformWSO2 Big Data Analytics Platform
WSO2 Big Data Analytics PlatformSamisa Abeysinghe
 
Data analytics at a petabyte scale final
Data analytics at a petabyte scale   finalData analytics at a petabyte scale   final
Data analytics at a petabyte scale finalOri Reshef
 
Simply Business - Near Real Time Event Processing
Simply Business - Near Real Time Event ProcessingSimply Business - Near Real Time Event Processing
Simply Business - Near Real Time Event Processingidan_by
 
Real-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkReal-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkSingleStore
 

Tendances (20)

2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
 
A taste of Snowplow Analytics data
A taste of Snowplow Analytics dataA taste of Snowplow Analytics data
A taste of Snowplow Analytics data
 
Life is but a Stream
Life is but a StreamLife is but a Stream
Life is but a Stream
 
Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...
 
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike SpicerKafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
 
The Stream is the Database - Revolutionizing Healthcare Data Architecture
The Stream is the Database - Revolutionizing Healthcare Data ArchitectureThe Stream is the Database - Revolutionizing Healthcare Data Architecture
The Stream is the Database - Revolutionizing Healthcare Data Architecture
 
Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...
Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...
Real Time Event Processing and In-­memory analysis of Big Data - StampedeCon ...
 
Simply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution AnalysisSimply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution Analysis
 
Moving Beyond Batch: Transactional Databases for Real-time Data
Moving Beyond Batch: Transactional Databases for Real-time DataMoving Beyond Batch: Transactional Databases for Real-time Data
Moving Beyond Batch: Transactional Databases for Real-time Data
 
Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason PohlBuilding a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
 
xGem Data Stream Processing
xGem Data Stream ProcessingxGem Data Stream Processing
xGem Data Stream Processing
 
Driving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive AnalyticsDriving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive Analytics
 
Snowplow the evolving data pipeline
Snowplow   the evolving data pipelineSnowplow   the evolving data pipeline
Snowplow the evolving data pipeline
 
2016 09 measurecamp - event data modeling
2016 09 measurecamp - event data modeling2016 09 measurecamp - event data modeling
2016 09 measurecamp - event data modeling
 
Big data meetup budapest adding data schemas to snowplow
Big data meetup budapest   adding data schemas to snowplowBig data meetup budapest   adding data schemas to snowplow
Big data meetup budapest adding data schemas to snowplow
 
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr... Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 
WSO2 Big Data Analytics Platform
WSO2 Big Data Analytics PlatformWSO2 Big Data Analytics Platform
WSO2 Big Data Analytics Platform
 
Data analytics at a petabyte scale final
Data analytics at a petabyte scale   finalData analytics at a petabyte scale   final
Data analytics at a petabyte scale final
 
Simply Business - Near Real Time Event Processing
Simply Business - Near Real Time Event ProcessingSimply Business - Near Real Time Event Processing
Simply Business - Near Real Time Event Processing
 
Real-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkReal-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and Spark
 

En vedette

Financial Time Series: Concept and Forecast (dsth Meetup#2)
Financial Time Series: Concept and Forecast (dsth Meetup#2)Financial Time Series: Concept and Forecast (dsth Meetup#2)
Financial Time Series: Concept and Forecast (dsth Meetup#2)Data Science Thailand
 
On Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationOn Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationGautier Marti
 
Changing organizational mindset
Changing organizational mindsetChanging organizational mindset
Changing organizational mindsetAllison Pollard
 
Financial time series_forecasting_svm
Financial time series_forecasting_svmFinancial time series_forecasting_svm
Financial time series_forecasting_svmMohamed DHAOUI
 
Lean Implementation Overview
Lean Implementation OverviewLean Implementation Overview
Lean Implementation Overviewbroper
 
Financial forecasting by time series 55660701
Financial forecasting by time series 55660701Financial forecasting by time series 55660701
Financial forecasting by time series 55660701Pongsiri Nontasak
 
Lean production System - TPS
Lean production System - TPSLean production System - TPS
Lean production System - TPSPrakash Prakash
 
Lean Strategy Implementation Methodology.
Lean Strategy Implementation Methodology.Lean Strategy Implementation Methodology.
Lean Strategy Implementation Methodology.Yadhu Gopinath
 
Mindset for positive change
Mindset for positive changeMindset for positive change
Mindset for positive changeIEI GSC
 
Positive change = mindset x tools
Positive change = mindset x toolsPositive change = mindset x tools
Positive change = mindset x toolssrprs.me
 
Objective and subjective performance measures
Objective and subjective performance measuresObjective and subjective performance measures
Objective and subjective performance measuresJhun Ar Ar Ramos
 
Lean manufacturing concepts and tools and quality management1
Lean manufacturing concepts and tools and quality management1Lean manufacturing concepts and tools and quality management1
Lean manufacturing concepts and tools and quality management1hgalinova
 

En vedette (20)

Financial Time Series: Concept and Forecast (dsth Meetup#2)
Financial Time Series: Concept and Forecast (dsth Meetup#2)Financial Time Series: Concept and Forecast (dsth Meetup#2)
Financial Time Series: Concept and Forecast (dsth Meetup#2)
 
trading
tradingtrading
trading
 
What is lean management?
What is lean management?What is lean management?
What is lean management?
 
quantmachine
quantmachinequantmachine
quantmachine
 
On Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationOn Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond Correlation
 
Changing organizational mindset
Changing organizational mindsetChanging organizational mindset
Changing organizational mindset
 
Financial time series_forecasting_svm
Financial time series_forecasting_svmFinancial time series_forecasting_svm
Financial time series_forecasting_svm
 
Building a Lean Management System
Building a Lean Management System Building a Lean Management System
Building a Lean Management System
 
Lean Implementation Overview
Lean Implementation OverviewLean Implementation Overview
Lean Implementation Overview
 
Financial forecasting by time series 55660701
Financial forecasting by time series 55660701Financial forecasting by time series 55660701
Financial forecasting by time series 55660701
 
Lean Management System
Lean Management SystemLean Management System
Lean Management System
 
Lean production System - TPS
Lean production System - TPSLean production System - TPS
Lean production System - TPS
 
Lean Strategy Implementation Methodology.
Lean Strategy Implementation Methodology.Lean Strategy Implementation Methodology.
Lean Strategy Implementation Methodology.
 
Mindset for positive change
Mindset for positive changeMindset for positive change
Mindset for positive change
 
Positive change = mindset x tools
Positive change = mindset x toolsPositive change = mindset x tools
Positive change = mindset x tools
 
Change Your Mindset in 6 Steps
Change Your Mindset in 6 StepsChange Your Mindset in 6 Steps
Change Your Mindset in 6 Steps
 
Objective and subjective performance measures
Objective and subjective performance measuresObjective and subjective performance measures
Objective and subjective performance measures
 
Change Management Framework
Change Management FrameworkChange Management Framework
Change Management Framework
 
Support Vector Machine
Support Vector MachineSupport Vector Machine
Support Vector Machine
 
Lean manufacturing concepts and tools and quality management1
Lean manufacturing concepts and tools and quality management1Lean manufacturing concepts and tools and quality management1
Lean manufacturing concepts and tools and quality management1
 

Similaire à Managing Large Scale Financial Time-Series Data with Graphs

Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Big Data Spain
 
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Kai Wähner
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes StrategicMapR Technologies
 
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big DataVoxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big DataStavros Kontopoulos
 
Voxxed Days Thesaloniki 2016 - Streaming Engines for Big Data
Voxxed Days Thesaloniki 2016 - Streaming Engines for Big DataVoxxed Days Thesaloniki 2016 - Streaming Engines for Big Data
Voxxed Days Thesaloniki 2016 - Streaming Engines for Big DataVoxxed Days Thessaloniki
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futuremarkgrover
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesKarthik Murugesan
 
Analytics in Your Enterprise
Analytics in Your EnterpriseAnalytics in Your Enterprise
Analytics in Your EnterpriseWSO2
 
Delivering Services Powered by Operational Data - Connected Services
Delivering Services Powered by Operational Data -  Connected ServicesDelivering Services Powered by Operational Data -  Connected Services
Delivering Services Powered by Operational Data - Connected ServicesOSIsoft, LLC
 
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkData-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkDatabricks
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersVoltDB
 
conf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalytics
conf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalyticsconf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalytics
conf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalyticsTom LaGatta
 
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics PlatformWSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics PlatformWSO2
 
Assessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesAssessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesDATAVERSITY
 
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...Amazon Web Services
 
Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...
Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...
Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...Flink Forward
 
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...Nelson Petracek
 
Streaming analytics state of the art
Streaming analytics state of the artStreaming analytics state of the art
Streaming analytics state of the artStavros Kontopoulos
 
Analytical Innovation: How to Build the Next Generation Data Platform
Analytical Innovation: How to Build the Next Generation Data PlatformAnalytical Innovation: How to Build the Next Generation Data Platform
Analytical Innovation: How to Build the Next Generation Data PlatformVMware Tanzu
 

Similaire à Managing Large Scale Financial Time-Series Data with Graphs (20)

Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
 
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes Strategic
 
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big DataVoxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
 
Voxxed Days Thesaloniki 2016 - Streaming Engines for Big Data
Voxxed Days Thesaloniki 2016 - Streaming Engines for Big DataVoxxed Days Thesaloniki 2016 - Streaming Engines for Big Data
Voxxed Days Thesaloniki 2016 - Streaming Engines for Big Data
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
 
Analytics in Your Enterprise
Analytics in Your EnterpriseAnalytics in Your Enterprise
Analytics in Your Enterprise
 
Gcp dataflow
Gcp dataflowGcp dataflow
Gcp dataflow
 
Delivering Services Powered by Operational Data - Connected Services
Delivering Services Powered by Operational Data -  Connected ServicesDelivering Services Powered by Operational Data -  Connected Services
Delivering Services Powered by Operational Data - Connected Services
 
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkData-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
 
conf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalytics
conf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalyticsconf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalytics
conf2015_TLaGatta_CHarris_Splunk_BusinessAnalytics_DeliveringHighLevelAnalytics
 
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics PlatformWSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
 
Assessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesAssessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use Cases
 
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
Emerging Prevalence of Data Streaming in Analytics and it's Business Signific...
 
Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...
Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...
Flink Forward Berlin 2017 Keynote: Ferd Scheepers - Taking away customer fric...
 
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
 
Streaming analytics state of the art
Streaming analytics state of the artStreaming analytics state of the art
Streaming analytics state of the art
 
Analytical Innovation: How to Build the Next Generation Data Platform
Analytical Innovation: How to Build the Next Generation Data PlatformAnalytical Innovation: How to Build the Next Generation Data Platform
Analytical Innovation: How to Build the Next Generation Data Platform
 

Managing Large Scale Financial Time-Series Data with Graphs

  • 1. © Copyright - 2016 Objectivity, Inc. November 2016. Managing Large Scale Financial Time-Series Data with Graph
  • 2. Overview: Financial Data Challenges; Distributed Graph Platform; Demonstration Use Case; Live Demo
  • 3. The Challenge
      • Volume, velocity and variety
          • Current systems produce billions of transactions and events per day
          • Combined streaming, operational and historical data
      • Analytic challenge
          • Statistical analysis is limited
          • The need to discover complex relationships and patterns
          • Deeper insight from the relationship value
          • Time-based query and graph analysis
          • Reusability for multiple use cases
  • 4. Graph Use Cases
      • Risk Management
          • Money Laundering
          • Insider Threat
          • Fraud Detection
          • Communication Graph
      • Operational
          • Smart Trading Optimization
          • Portfolio/Customer Management
          • Regulatory Compliance Systems
          • System/Process Optimization
  • 5. Performance and Scale
      • In small graphs, insights can be lost due to limited RAM or machine size
      • Big graphs (trillions of nodes and edges) scale UP and scale OUT to reveal subtle insights and hidden relationships in ALL data
  • 6. ThingSpan Technology
  • 7. ThingSpan Platform
      [Architecture diagram. Legend: Objectivity, Open source, Partner. Labels: Graph Analytics, Data Analytics, Spark, Streaming, Kafka, Storm, Workflow Design GUI, HDFS / POSIX, Analytics, MLlib, REST Server, Java, C++, C# API, BI, Visualization, DO Declarative Query Language, YARN / Mesos, ThingSpan Distributed Graph]
  • 8. Spark + ThingSpan = Parallelism
      [Diagram. Labels: Spark Cluster, HDFS, Driver Application, Worker Node with DataFrame (five instances), ThingSpan Distributed Graph]
  • 9. Distributed Ingest
      • Inbound event streaming using Kafka
      • Each event is formed into vertices and edges
      • Vertices and edges are inserted into the pipeline and processed using Samza
      • Inserts/upserts are consistent and idempotent
  • 10. Distributed Query
      • Data scientists and analysts use the same language
      • DO queries run in parallel
      • Spark DataFrames allow data to be processed with Spark SQL
  • 11. DO: The Query Language
      • Familiar to data scientists
      • Adopts best-of-breed techniques from SQL and Cypher
      • Extends SQL-like queries with graph navigation capabilities
      • Value-based queries and complex graph queries
      • Query data without having to write or compile code
      • Support for weighted graph queries
          • Weights are assigned at query time regardless of the model
      • Support for paths and trails
          • A path is a walk with distinct vertices
          • A trail is a walk with distinct edges
  • 12. Demonstration Overview
  • 13. Use Case
      • A financial institution needs to process a massive number of events per day
      • The current system produces at least one billion transaction events, with a target of five billion in the near future
      • Events represent both business and operational information
      • Statistical analysis is possible, but certain graph (navigational) queries are hard to do
      • Time-based query and analysis
  • 14. Financial Transaction Event
      <TransactionProcessed>
        <start_timestamp>2016-03-11 00:54:58.301</start_timestamp>
        <start_epoch_ms>1457657698301</start_epoch_ms>
        <end_timestamp>2016-03-11 00:54:58.343</end_timestamp>
        <end_epoch_ms>1457657698343</end_epoch_ms>
        <service_type>storm</service_type>
        <service_instance_id>hadoop02.oktaylabs.com_6703_16</service_instance_id>
        <task_type>ParseFIXBolt</task_type>
        <transaction_type>8</transaction_type>
        <transaction_id>ALG_20160311_5</transaction_id>
        <transaction_timestamp>2016-03-11 00:54:57.637066</transaction_timestamp>
        <transaction_epoch_ms>1457657697637</transaction_epoch_ms>
        <parent_transaction_id></parent_transaction_id>
        <security_id>USB</security_id>
        <mutual_account_id>ACCT0001</mutual_account_id>
        <firm_id>client2</firm_id>
        <sender_id>acct1</sender_id>
        <basket_id></basket_id>
      </TransactionProcessed>
  • 15. Data Model
      • Business Entities
          • Account: the entity that is requesting the transaction
          • Firm: the firm involved in the transaction
          • Sender: the firm entity acting on behalf of an account
          • Basket: a bundle or batch of related transactions
          • Transaction: the Buy (order), Fill, Cancel, or Cancel and Replace order
      • System Entities
          • Task: the operational task that processes the transaction
          • Service: the operational service that owns one or more Tasks
          • Transaction event: time-based event information for financial transaction processing
  • 16. The Results
      • Financial transaction events ingested in real time
      • Concurrent graph queries during ingest
      • 1 billion financial transaction events in ~12 hours (~23k per second)
      • Each transaction event produces a sub-graph
      • Graph size: 1.38 billion vertices and 5.25 billion edges
      • Cluster: EC2, 16 instances of m4.4xlarge
  • 17. Vertices Ingest: 1 Billion
      [Charts: rate per process; overall rate for all processes]
  • 18. Edges Ingest: 1 Billion
      [Charts: rate per process; overall rate for all processes]
  • 19. Queries: Processing a Basket
      For a client's basket, show all system tasks used to process the basket, including processing time.
          Match p=(:Basket{m_Id=="ALG12"})-->(:Transaction)-[:m_Children*1..5]->(:Transaction)-->(:TransactionEvent)-->(:Task) return p
      [Result graph. Labels: Basket, TransactionEvent, Task]
  • 20. Queries: Comparing Accounts
      Compare two Accounts for their algorithmic baskets that produce a fill order for 'CBS'.
          Match p=shortest((:Account{m_Id=='client2.ACCT0005' OR m_Id=='client3.ACCT0003'})-[:m_Baskets]->(:Basket{m_Id=~~'ALG.*'})-[:m_Transactions]->(:Transaction{m_Type=='D'})-[:m_Children]->(:Transaction{m_Type=='8'})-[:m_Security]->(:Security{m_Id=='CBS'})) return p
      [Result graph. Labels: Basket, Account, Fill Order, Security (CBS)]
  • 21. Basket Comparison with Tableau
      Transactions, Tasks, etc. per basket, viewed collectively
  • 22. Live Demo
  • 23. Why?
      • Scale and performance
      • High-speed concurrent ingest and queries during mixed workloads
      • Scalable, massive and complex graph
      • Enable real-time pattern/anomaly detection and discovery
      • Sub-graph similarity (capture the behavior, not just the statistics)
      • Data governance and lineage
      • Open source integration
      • Fast navigation / path finding
      • Visualization and BI tool integration
      • DO query language: data scientists and analysts use the same language
  • 24. For more information: www.objectivity.com
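
The sample event on slide 14 carries start and end epoch timestamps, so the per-task processing time that the basket query reports can be derived from the event itself. The following is a minimal Python sketch, not part of the original deck: it parses an abbreviated copy of that event with the standard library and computes two durations. The function names parse_event, task_duration_ms and queue_delay_ms are hypothetical, and reading the gap between transaction_epoch_ms and start_epoch_ms as a queueing delay is an assumption.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample event from slide 14 (values unchanged).
EVENT_XML = """
<TransactionProcessed>
  <start_epoch_ms>1457657698301</start_epoch_ms>
  <end_epoch_ms>1457657698343</end_epoch_ms>
  <service_type>storm</service_type>
  <task_type>ParseFIXBolt</task_type>
  <transaction_type>8</transaction_type>
  <transaction_id>ALG_20160311_5</transaction_id>
  <transaction_epoch_ms>1457657697637</transaction_epoch_ms>
  <parent_transaction_id></parent_transaction_id>
  <security_id>USB</security_id>
  <mutual_account_id>ACCT0001</mutual_account_id>
  <firm_id>client2</firm_id>
  <sender_id>acct1</sender_id>
  <basket_id></basket_id>
</TransactionProcessed>
"""


def parse_event(xml_text):
    """Return the event's child elements as a dict of tag -> stripped text."""
    root = ET.fromstring(xml_text)
    # Empty elements such as <basket_id></basket_id> have text None.
    return {child.tag: (child.text or "").strip() for child in root}


def task_duration_ms(fields):
    """Time the task spent on this event, in milliseconds."""
    return int(fields["end_epoch_ms"]) - int(fields["start_epoch_ms"])


def queue_delay_ms(fields):
    """Assumed meaning: delay between the transaction and the task start."""
    return int(fields["start_epoch_ms"]) - int(fields["transaction_epoch_ms"])


if __name__ == "__main__":
    fields = parse_event(EVENT_XML)
    print(fields["task_type"], task_duration_ms(fields), "ms")  # ParseFIXBolt 42 ms
    print("delay before task start:", queue_delay_ms(fields), "ms")  # 664 ms
```

Empty elements (here parent_transaction_id and basket_id) come back as empty strings, which lets a caller skip the corresponding vertices when building the sub-graph.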

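Slide 11 distinguishes a path (a walk with distinct vertices) from a trail (a walk with distinct edges). As an illustration only, the Python sketch below checks both properties for a walk given as a list of vertex names. It is not how DO evaluates queries; the helper names are hypothetical, and it assumes a directed graph with at most one edge per ordered vertex pair, so that an edge can be identified by its two endpoints.

```python
def edges_of(walk):
    """Directed edges traversed by a walk given as a list of vertices."""
    return list(zip(walk, walk[1:]))


def is_path(walk):
    """A path is a walk in which no vertex is visited twice."""
    return len(set(walk)) == len(walk)


def is_trail(walk):
    """A trail is a walk in which no edge is traversed twice."""
    edges = edges_of(walk)
    return len(set(edges)) == len(edges)


if __name__ == "__main__":
    # Revisits vertex A but never reuses an edge: a trail, not a path.
    walk = ["A", "B", "C", "A", "D"]
    print(is_path(walk), is_trail(walk))  # False True
```

Every path is also a trail, since repeating an edge would require repeating its endpoints; the converse does not hold, as the example shows.
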
Editor's notes

  1. Graph size for a billion FIX transaction events
  2. Graph size for a billion FIX transaction events
  3. Transaction events are read from a Kafka topic, but they could just as easily have been read from other streaming technologies such as Spark Streaming, Flume or StreamSets. Each event is formed into vertices and edges. The edges are further decomposed into triples to reduce lock contention and allow each edge to be processed in parallel. This lowers the latency of each operation and increases throughput. The upserts of vertices and the inserts of edges (decomposed into triples) are funneled into Samza tasks running on the cluster and managed by YARN. These upserts are consistent and idempotent.
  4. ThingSpan runs queries in parallel. Each query is partitioned into parts, and a part, or partition, of the query is sent to each machine where it is executed as a YARN job. The query returns multiple paths from each partition, and these are collated into a single result. In the Spark world, this process can be described as transforming (mapPartitions) each input partition into an RDD or DataFrame. Using Spark DataFrames allows results from ThingSpan to be processed even further. Spark SQL statements can join, aggregate, and select from multiple tables. DataFrame operations are processed in parallel across the cluster. The parallelism of queries allows near-linear scaling of query throughput by "scaling out" the cluster.
  5. Graph size for a billion FIX transaction events
  6. For a client basket, show all tasks that processed it and the timing. Start point: Basket (m_Id). End point: Service. DevOps uses: discovery of hotspots; how much resource is used; metrics.
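
Note 3 describes ingest as vertex upserts plus edge inserts decomposed into triples, with the whole operation idempotent. The Python sketch below illustrates that property with a toy in-memory store; it is not ThingSpan's API. The GraphStore class, the ingest function, the LINKS table and the edge labels are all hypothetical, and only the field names and values are taken from the sample event on slide 14.

```python
class GraphStore:
    """Toy in-memory graph; a stand-in for illustration, not a real client."""

    def __init__(self):
        self.vertices = {}  # (label, id) -> property dict
        self.edges = set()  # (source key, edge label, target key) triples

    def upsert_vertex(self, label, vertex_id, **props):
        """Create the vertex if missing, otherwise merge in the properties."""
        key = (label, vertex_id)
        self.vertices.setdefault(key, {}).update(props)
        return key

    def insert_edge(self, source, edge_label, target):
        """Store the edge as a triple; re-inserting the same triple is a no-op."""
        self.edges.add((source, edge_label, target))


# Subset of the sample event; an empty basket_id means no Basket vertex.
EVENT = {
    "transaction_id": "ALG_20160311_5",
    "transaction_type": "8",
    "mutual_account_id": "ACCT0001",
    "firm_id": "client2",
    "security_id": "USB",
    "task_type": "ParseFIXBolt",
    "basket_id": "",
}

# (vertex label, event field, hypothetical edge label)
LINKS = [
    ("Account", "mutual_account_id", "account"),
    ("Firm", "firm_id", "firm"),
    ("Security", "security_id", "security"),
    ("Task", "task_type", "task"),
    ("Basket", "basket_id", "basket"),
]


def ingest(store, event):
    """Form one event into a small sub-graph around its Transaction vertex."""
    txn = store.upsert_vertex(
        "Transaction", event["transaction_id"], type=event["transaction_type"]
    )
    for label, field, edge_label in LINKS:
        value = event.get(field, "")
        if not value:
            continue  # skip entities the event does not reference
        target = store.upsert_vertex(label, value)
        store.insert_edge(txn, edge_label, target)


if __name__ == "__main__":
    store = GraphStore()
    ingest(store, EVENT)
    ingest(store, EVENT)  # replaying the same event changes nothing
    print(len(store.vertices), len(store.edges))  # 5 4
```

Keying vertices by (label, id) and storing edges as a set of triples is what makes a replayed event harmless here; a real pipeline would need the same guarantee from the store when a stream processor redelivers a message.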