2017 © Parametric Portfolio Associates® LLC
HOW TO USE INNOVATIVE DATA HANDLING AND
PROCESSING TECHNIQUES TO DRIVE ALPHA IN
THE FINANCIAL MARKETS
PARAMETRIC’S PROFILE:
We provide systematic, disciplined portfolio management solutions
> Parametric Portfolio Associates® LLC (“Parametric”) is a majority-owned subsidiary of Eaton Vance Corp.
> Approximately $197.6 Billion (3/31/2017) in assets under management*.
We offer investment solutions through our three investment centers:
Seattle, WA
• Leaders in rules-based, engineered portfolio solutions
• Strategies ranging from index tracking portfolios to managed smart beta
• Ability to incorporate responsible investing themes
• Founded 1987
• A subsidiary of Eaton Vance Corp. since 2003
Minneapolis, MN
• Pioneers in overlay strategies and custom risk management solutions (formerly The Clifton Group)
• Innovative product solutions in real asset and liquid alternatives
• Founded 1972
• Acquired by Parametric in 2012
Westport, CT
• Specialists in option portfolio management
• Provide product-based and custom option overlay solutions
• Founded 2002
• A part of Parametric since 2007
*As of 3/31/2017. Includes AUM of Parametric Investment & Overlay Strategies and Parametric Custom Tax-Managed & Centralized Portfolio Management.
PARAMETRIC INVESTMENT PLATFORM*
*For illustrative purposes only
PARAMETRIC’S BIG DATA JOURNEY
2016
• MDM Launch
• Decision: Data Centralization
2017
• Data Lake Implementation
• Focus: Mastering Data Sources and Data Discovery
2018
• Modernizing Data Usage
• Focus: Transition Data Silos to Data Lake
OVERVIEW - IT ENVIRONMENT
Our Hadoop Environment:
Two Separate Clusters
‒ Production 10 Node Cluster
• Common + Hive
• Clustered NiFi
‒ Development 10 Node Cluster
• Common + Hive + Spark
• Non-Clustered NiFi
NiFi in both dev and production
We build NiFi workflows in dev and promote to
production
Our Environment:
‒ Before Hadoop
• Primarily Windows
‒ C#, MS SQL, PowerShell, etc.
• All automation done using CA
‒ With Hadoop
• Still primarily Windows… + Hadoop
• Transition ETL automation to NiFi
THE DATA MANAGEMENT OFFICE CHALLENGE
Stop processing on time boundaries – Process as soon as data is available!
• Previously, processes were triggered at specific times of day
• Vendor data availability is generally good but not perfect
‒Delayed data has cascading negative effects
‒Manual intervention typically required for delayed or missing data files
‒Requires after hours support
• Data consumption pushed to nightly jobs to ensure most complete data sets
• Insurance premium for waiting on data
‒Loss of potential processing hours
‒Loss of processing during business hours
‒Potential loss of pre-processing work
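To make the contrast with time-triggered jobs concrete, here is a minimal sketch of "process as soon as data is available". It is not how NiFi is implemented; the function names, the `seen` set, and the sample file names are all hypothetical. Each call handles only the files that have arrived since the previous call, so work starts when a file lands rather than at a fixed hour.

```python
import tempfile
from pathlib import Path


def process_new_files(in_dir: Path, seen: set, handler) -> list:
    """Handle every file in in_dir that has not been handled yet."""
    handled = []
    for path in sorted(in_dir.iterdir()):
        if path.is_file() and path.name not in seen:
            handler(path)           # the actual work, supplied by the caller
            seen.add(path.name)     # remember it so it is processed only once
            handled.append(path.name)
    return handled


def demo() -> list:
    """Simulate two vendor files arriving at different times."""
    log = []
    seen = set()
    with tempfile.TemporaryDirectory() as tmp:
        in_dir = Path(tmp)
        # Nothing has arrived yet, so nothing is processed.
        log.append(process_new_files(in_dir, seen, lambda p: None))
        (in_dir / "prices.csv").write_text("id,price\n")
        log.append(process_new_files(in_dir, seen, lambda p: None))
        (in_dir / "holdings.csv").write_text("id,qty\n")
        log.append(process_new_files(in_dir, seen, lambda p: None))
    return log
```

In this sketch a late file delays only its own processing; nothing waits for a nightly window.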
RESOLUTION: NIFI
NiFi’s Immediate Benefits:
• Event Based Processing
• Data Provenance
• Queueing
• Back Pressure
• Rapid Development
• Large Selection of Processors
QUICK WIN – FINDING RESTRICTED ISSUERS
• Background
• An account defines an ownership restriction for one or more issuers (companies). The client provides one or more security identifiers with an issuer name. The list may be incomplete, but the mandate is that ANY security from that issuer cannot be held in the account.
• Problem Statement
• How do we identify the issuer so that none of its security types can be held?
• Solution:
• Use the client-provided security identifiers to search our Bloomberg data and map each one to an id_bb_global so that it is restricted by our compliance system.
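The lookup described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the `Security` record, the `issuer_key` field, the `identifiers` set, and the sample rows are all hypothetical, and only the name `id_bb_global` comes from the slide. The idea is to match the client's identifiers to securities, collect the issuers behind them, and then return every security of those issuers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Security:
    id_bb_global: str       # identifier named on the slide
    issuer_key: str         # hypothetical key shared by all securities of one issuer
    identifiers: frozenset  # hypothetical: tickers, CUSIPs, etc. for this security


def restricted_ids(client_identifiers, universe):
    """Return id_bb_global values for every security of any matched issuer."""
    wanted = set(client_identifiers)
    # Step 1: find the issuers behind the identifiers the client supplied.
    issuers = {s.issuer_key for s in universe if s.identifiers & wanted}
    # Step 2: restrict all securities of those issuers, not just the matched ones.
    return sorted(s.id_bb_global for s in universe if s.issuer_key in issuers)


# Made-up sample rows; not real Bloomberg data.
UNIVERSE = [
    Security("BBG-A-EQUITY", "ISSUER-A", frozenset({"AAA", "111111111"})),
    Security("BBG-A-BOND", "ISSUER-A", frozenset({"222222222"})),
    Security("BBG-B-EQUITY", "ISSUER-B", frozenset({"BBB"})),
]
```

Here `restricted_ids(["AAA"], UNIVERSE)` also returns the bond of the same issuer, even though the client listed only the equity identifier, which is the incomplete-list case the slide describes.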
OLD PROCESS SOLUTION
Old Process Steps:
• Client provides a spreadsheet of market identifiers
• Spreadsheet is reviewed and then run through several independent processes – mostly manual
• Spreadsheet is returned to the requestor with the company identifier, if found, appended to the original spreadsheet
• Requestor then formats the results so that they can be digested by the target system
Old Process Requirements:
• 3 people, an average of 3 to 5 hours
‒ Requestor – 1 to 1 ½ hours
• Review the client submission, write an email to kick-start the process, follow up on expected completion, and reformat results
‒ HelpDesk
• Process ticket and assign to app support
‒ App Support
• Research and query generation
NIFI SOLUTION
NiFi Steps:
• Client provides a spreadsheet of market identifiers
• Spreadsheet is reviewed, identifiers are cut and pasted into a standard formatted spreadsheet
• Spreadsheet is saved as a CSV file to the In directory on public drive
• Requestor receives an email when NiFi process is done
• Requestor picks up a ready to load CSV file in the Out directory
Achieved Targets:
• Minimal manual processing and IT intervention
• All self-service, and it’s easy to do
• Bonus: Better results by searching all available Bloomberg data instead of only the investible universe
WHAT IT LOOKS LIKE TO THE REQUESTOR
WHAT IT LOOKS LIKE IN NIFI TODAY (REFINED)
HOW IT STARTED
OUR TOP 10 NIFI BEST PRACTICES
#10
Adjust Processor Run Schedule
-When developing flows, the first priority should be to adjust the Run Schedule
-We have had cases where this wasn’t done and massive amounts of data were generated
-Back Pressure will keep things from running completely out of control
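A simplified model of why back pressure limits the damage: the sketch below is not NiFi code, and the `Connection` class and `run_producer` function are hypothetical. It assumes only that a connection has an object-count threshold and that the upstream step is not run while the connection is at or above it.

```python
from collections import deque


class Connection:
    """A queue between two steps with an object-count back pressure threshold."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.items = deque()

    def is_full(self) -> bool:
        return len(self.items) >= self.threshold


def run_producer(conn: Connection, runs: int) -> int:
    """Try to produce one item per scheduled run; skip runs while conn is full."""
    produced = 0
    for i in range(runs):
        if conn.is_full():
            continue  # back pressure: the upstream step does not run
        conn.items.append(i)
        produced += 1
    return produced
```

With a threshold of 3, a producer scheduled 100 times adds only 3 items until something downstream drains the queue. A sensible Run Schedule is still the first line of defense, because the producer keeps being scheduled even when it can do no work.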
#9
Make sure you have plenty of storage space for the NiFi
databases
•NiFi’s Data Provenance and Queuing require sufficient storage space
•Place these databases on separate mount points
•Configure Provenance expiration to meet your business requirements
#8
Use Process Groups
•Process groups allow you to create modular flows
•Keeps flows organized and readable
•Authorization can be set for process groups
•The root page is a process group
#8 USE PROCESS GROUPS
1. Each developer has their own process group in our development
environment
2. We work in our own process groups when doing initial development / POC
type of work
3. We have process groups that encapsulate “releasable code”
#8 USE PROCESS GROUPS
#7
Create templates of a single Processor for easy reuse
•Often you add a Processor to a flow and then need to update a number of properties
•Streamline this by creating a template that has just the Processor with the properties prepopulated
•Use the template instead of the Processor
#7 PROCESSOR REUSE
Configure a Processor
• In this example, PutHDFS configured with a Kerberos Principal
Create a template that only contains the processor
1. Select the processor
2. Click the create template icon
3. Give it a good name and description
#7 PROCESSOR REUSE
Once a template has been created, you can add that template to newly developed workflows
#6
The Data Provenance Search Facility
#5
If you Cluster in Prod then Cluster in Dev
•NiFi supports clustering
•If you are going to cluster NiFi in Production, then have your Dev NiFi clustered as well
•Certain Processors should only run on a single node in the cluster
•It is possible to create a single-node cluster in dev, but it is still best to have your dev setup match your production setup.
#4
Set expirations on Success queues
•Often we want to capture a flow’s successful completion
•We route the final Success output of a process to a funnel
•Make sure you set the FlowFile expiration on that queue; otherwise, back pressure will cause your flows to stop until the queue is drained
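The effect of an expiration on a terminal Success queue can be illustrated with a small model. This is a sketch, not NiFi's implementation; the `ExpiringQueue` class is hypothetical and time is passed in explicitly so the behavior is easy to follow. Entries older than the configured age are dropped, so the queue cannot grow until it hits a back pressure threshold.

```python
from collections import deque


class ExpiringQueue:
    """Queue that drops entries older than max_age_seconds."""

    def __init__(self, max_age_seconds: float):
        self.max_age = max_age_seconds
        self.entries = deque()  # (enqueue_time, item), oldest first

    def put(self, item, now: float) -> None:
        self.entries.append((now, item))

    def expire(self, now: float) -> int:
        """Remove expired entries and return how many were dropped."""
        dropped = 0
        while self.entries and now - self.entries[0][0] > self.max_age:
            self.entries.popleft()
            dropped += 1
        return dropped

    def __len__(self) -> int:
        return len(self.entries)
```

With a 60-second expiration, an item queued at time 0 is gone by time 61, while one queued at time 30 is kept until it too ages out.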
#4 EXPIRE SUCCESS QUEUES
#3
Use Custom Properties per environment
•NiFi flows have access to Custom properties and environment variables
•We use these to make our flows environment agnostic
•Not perfect
• Not all properties support expression language
• Custom properties are read at startup
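The idea of an environment-agnostic flow can be sketched as below. It assumes only that a property value may contain `${name}` placeholders. The `resolve` function, the property names, and the DEV/PROD values are hypothetical, and this handles plain lookups only, not the full NiFi expression language.

```python
import re

# Matches ${name}; the name may contain dots, e.g. ${hdfs.root}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(value: str, properties: dict) -> str:
    """Replace each ${name} in value with properties[name]."""
    def lookup(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"undefined property: {name}")
        return properties[name]
    return PLACEHOLDER.sub(lookup, value)


# Hypothetical per-environment property sets.
DEV = {"hdfs.root": "/dev/landing", "mail.to": "dev-team@example.com"}
PROD = {"hdfs.root": "/prod/landing", "mail.to": "ops@example.com"}
```

The same flow setting, `"${hdfs.root}/bloomberg"`, resolves to `/dev/landing/bloomberg` with `DEV` and `/prod/landing/bloomberg` with `PROD`, so the flow itself does not change when it is promoted.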
#2
Create Small, Modular, Disconnected Workflows
•Large, complex, interconnected flows are:
• Difficult to debug
• Difficult to deploy
•Decouple flows
• Create each flow with a specific functional purpose
• If other flows are dependent, use queues as the coupling mechanism
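A minimal sketch of two flows coupled only by a queue follows. The function names are hypothetical, and upper-casing the file name is just a stand-in for real processing. Neither function calls the other, so each can be tested, deployed, or stopped on its own.

```python
from collections import deque


def download_flow(filenames, landing_queue: deque) -> None:
    """Flow 1: 'download' files and announce each one on the queue."""
    for name in filenames:
        landing_queue.append(name)


def process_flow(landing_queue: deque) -> list:
    """Flow 2: drain whatever is waiting; knows nothing about flow 1."""
    processed = []
    while landing_queue:
        # Stand-in for real work on the downloaded file.
        processed.append(landing_queue.popleft().upper())
    return processed
```

If the processing flow is stopped, downloads still succeed and simply accumulate in the queue until it resumes.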
#2 SMALL, MODULAR AND DISCONNECTED
Download and
put to HDFS
Process
downloaded files
#2 SMALL, MODULAR AND DISCONNECTED
Decouple
#1
Update the Names of the Processors
•Do not use the default Processor name
•Give each processor in a flow a friendly, well understood name
•NiFi Summary and other features that show processor names will be more usable if processors are named
#1 UPDATE PROCESSOR NAMES
ORGANIZATIONAL CHALLENGES - NIFI
EVENT BASED MINDSET – AS A COMPANY
• To fully leverage a modern data architecture, companies need to think in terms of events, not a schedule
• An event could be anything from a file arriving on an FTP server to a RESTful API call or a database being updated
• The concept of jobs running at specific times causes unnecessary strain on systems and reduces overall throughput
‒ Like waiting until the last minute to do all your homework
‒ 50 MB/s = >4 TB/day
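The throughput figure on this slide can be checked with a short calculation. The function name is hypothetical, and decimal units (1 TB = 1,000,000 MB) are an assumption, since the slide does not say which convention it uses.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Convert a sustained MB/s rate to TB/day using decimal units."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000
```

A steady 50 MB/s over 86,400 seconds is 4,320,000 MB, or about 4.32 TB per day, which matches the slide's ">4 TB/day" when the work is spread across the whole day.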
EASY TO GET STARTED – DON’T BE LAZY
• It is a lot easier to go fast and just get an application working
‒ This leads to bad programs
• Just because it works doesn’t mean it’s efficient
‒ Double the hardware for the same amount of work
• Repeated for importance: rename your processors like you would comment your code
• Check your back pressure settings: it is very easy to make a flow that overwhelms a specific step
‒ Particularly with “GenerateFlowFile”
‒ Debugging steps need to be evaluated/removed before production
INCLUDE “BUSINESS” PEOPLE AND SPECIALISTS
• With a more interactive graphical interface, business people & specialists can understand the program’s flow a lot better
• Have you ever tried to go line by line through Java code with someone? Eyes glaze over
• Collaborate earlier and more often; it will decrease the cycles needed to get on target
DON’T BE AFRAID TO CUSTOM CODE
• The majority of use cases don’t need custom code, but when one does, don’t be afraid to make new processors
QUESTIONS?
DISCLOSURE
Parametric Portfolio Associates LLC (“Parametric”), headquartered in Seattle, Washington, is registered as an investment adviser with the U.S. Securities
and Exchange Commission under the Investment Advisers Act of 1940. Parametric is a leading global asset management firm, providing investment
strategies and customized exposure management directly to institutional investors and indirectly to individual investors through financial intermediaries.
Parametric offers a variety of rules-based investment strategies, including alpha-seeking equity, alternative and options strategies, as well as
implementation services, including customized equity, traditional overlay and centralized portfolio management. Parametric is a majority-owned
subsidiary of Eaton Vance Corp. and offers these capabilities through investment centers in Seattle, WA, Minneapolis, MN and Westport, CT. This
material may not be forwarded or reproduced, in whole or in part, without the written consent of Parametric Compliance. Parametric and its affiliates
are not responsible for its use by other parties.
All contents copyright 2017 Parametric Portfolio Associates LLC. All rights reserved. Parametric Portfolio Associates, PIOS, and Parametric with the iris
flower logo are all trademarks registered in the US Patent and Trademark Office.
Parametric is located at 1918 8th Avenue, Suite 3100, Seattle, WA 98101. For more information regarding Parametric and its investment strategies, or to
request a copy of Parametric’s Form ADV Brochure, please contact us at 206.694.5575 or visit our website, www.parametricportfolio.com.

More Related Content

What's hot

GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...DataWorks Summit
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data InsightsDataWorks Summit
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoopgregchanan
 
Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...
Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...
Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...DataWorks Summit
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopDataWorks Summit/Hadoop Summit
 
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Data Con LA
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseDataWorks Summit
 
Real time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackReal time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackDataWorks Summit/Hadoop Summit
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...DataWorks Summit
 
The DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to ProductionThe DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to ProductionDataWorks Summit/Hadoop Summit
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheDremio Corporation
 
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...DataWorks Summit
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Implementing Security on a Large Multi-Tenant Cluster the Right Way
Implementing Security on a Large Multi-Tenant Cluster the Right WayImplementing Security on a Large Multi-Tenant Cluster the Right Way
Implementing Security on a Large Multi-Tenant Cluster the Right WayDataWorks Summit
 
Ingesting Data at Blazing Speed Using Apache Orc
Ingesting Data at Blazing Speed Using Apache OrcIngesting Data at Blazing Speed Using Apache Orc
Ingesting Data at Blazing Speed Using Apache OrcDataWorks Summit
 
Realizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache BeamRealizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache BeamDataWorks Summit
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsDataWorks Summit
 

What's hot (20)

GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data Insights
 
HDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and SupportabilityHDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and Supportability
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoop
 
Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...
Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...
Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger...
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
 
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data Warehouse
 
Real time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackReal time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stack
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
The DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to ProductionThe DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to Production
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
 
Implementing Security on a Large Multi-Tenant Cluster the Right Way
Implementing Security on a Large Multi-Tenant Cluster the Right WayImplementing Security on a Large Multi-Tenant Cluster the Right Way
Implementing Security on a Large Multi-Tenant Cluster the Right Way
 
Ingesting Data at Blazing Speed Using Apache Orc
Ingesting Data at Blazing Speed Using Apache OrcIngesting Data at Blazing Speed Using Apache Orc
Ingesting Data at Blazing Speed Using Apache Orc
 
Realizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache BeamRealizing the Promise of Portable Data Processing with Apache Beam
Realizing the Promise of Portable Data Processing with Apache Beam
 
Hybrid Data Platform
Hybrid Data Platform Hybrid Data Platform
Hybrid Data Platform
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerations
 

Similar to How to Use Innovative Data Handling and Processing Techniques to Drive Alpha in the Financial Markets

Implement DevOps Like a Unicorn—Even If You’re Not One
Implement DevOps Like a Unicorn—Even If You’re Not OneImplement DevOps Like a Unicorn—Even If You’re Not One
Implement DevOps Like a Unicorn—Even If You’re Not OneTechWell
 
Automating Infrastructure as a Service Deployments and monitoring – TEC213
Automating Infrastructure as a Service Deployments and monitoring – TEC213Automating Infrastructure as a Service Deployments and monitoring – TEC213
Automating Infrastructure as a Service Deployments and monitoring – TEC213Chris Kernaghan
 
DevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud DatabaseDevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud DatabaseEDB
 
The Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryThe Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryVMware Tanzu
 
451 Research: Data Is the Key to Friction in DevOps
451 Research: Data Is the Key to Friction in DevOps451 Research: Data Is the Key to Friction in DevOps
451 Research: Data Is the Key to Friction in DevOpsDelphix
 
Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...
Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...
Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...Joseph Alaimo Jr
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Change Management in Hybrid landscapes 2017
Change Management in Hybrid landscapes 2017Change Management in Hybrid landscapes 2017
Change Management in Hybrid landscapes 2017Chris Kernaghan
 
SAP Teched 2012 Session Tec3438 Automate IaaS SAP deployments
SAP Teched 2012 Session Tec3438 Automate IaaS SAP deploymentsSAP Teched 2012 Session Tec3438 Automate IaaS SAP deployments
SAP Teched 2012 Session Tec3438 Automate IaaS SAP deploymentsChris Kernaghan
 
Webinar: Ten Ways to Enhance Your Salesforce.com Application in 2013
Webinar: Ten Ways to Enhance Your Salesforce.com Application in 2013Webinar: Ten Ways to Enhance Your Salesforce.com Application in 2013
Webinar: Ten Ways to Enhance Your Salesforce.com Application in 2013Emtec Inc.
 
ATAGTR2017 Performance Testing and Non-Functional Testing Strategy for Big Da...
ATAGTR2017 Performance Testing and Non-Functional Testing Strategy for Big Da...ATAGTR2017 Performance Testing and Non-Functional Testing Strategy for Big Da...
ATAGTR2017 Performance Testing and Non-Functional Testing Strategy for Big Da...Agile Testing Alliance
 
Роман Новиков "Best Practices for MySQL Performance & Troubleshooting with th...
Роман Новиков "Best Practices for MySQL Performance & Troubleshooting with th...Роман Новиков "Best Practices for MySQL Performance & Troubleshooting with th...
Роман Новиков "Best Practices for MySQL Performance & Troubleshooting with th...Fwdays
 
18. Madhur Hemnani - Result Orientated Innovation with Oracle HR Analytics
18. Madhur Hemnani - Result Orientated Innovation with Oracle HR Analytics18. Madhur Hemnani - Result Orientated Innovation with Oracle HR Analytics
18. Madhur Hemnani - Result Orientated Innovation with Oracle HR AnalyticsCedar Consulting
 
Risks & Rewards of Upgrading to the Latest Version of Siebel CTMS
Risks & Rewards of Upgrading to the Latest Version of Siebel CTMSRisks & Rewards of Upgrading to the Latest Version of Siebel CTMS
Risks & Rewards of Upgrading to the Latest Version of Siebel CTMSPerficient, Inc.
 
Redefining the Role of IT in a Self-Help Data Integration Environment
Redefining the Role of IT in a Self-Help Data Integration EnvironmentRedefining the Role of IT in a Self-Help Data Integration Environment
Redefining the Role of IT in a Self-Help Data Integration EnvironmentUNIFI Software
 
Agile Data Architecture
Agile Data ArchitectureAgile Data Architecture
Agile Data ArchitectureCprime
 
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...Amazon Web Services
 
Optimizing Open Source for Greater Database Savings and Control
Optimizing Open Source for Greater Database Savings and ControlOptimizing Open Source for Greater Database Savings and Control
Optimizing Open Source for Greater Database Savings and ControlEDB
 

Similar to How to Use Innovative Data Handling and Processing Techniques to Drive Alpha in the Financial Markets (20)

Implement DevOps Like a Unicorn—Even If You’re Not One
Implement DevOps Like a Unicorn—Even If You’re Not OneImplement DevOps Like a Unicorn—Even If You’re Not One
Implement DevOps Like a Unicorn—Even If You’re Not One
 
Automating Infrastructure as a Service Deployments and monitoring – TEC213
Automating Infrastructure as a Service Deployments and monitoring – TEC213Automating Infrastructure as a Service Deployments and monitoring – TEC213
Automating Infrastructure as a Service Deployments and monitoring – TEC213
 
DevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud DatabaseDevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud Database
 
The Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryThe Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud Foundry
 
DevOps for the DBA- Jax Style!
DevOps for the DBA-  Jax Style!DevOps for the DBA-  Jax Style!
DevOps for the DBA- Jax Style!
 
451 Research: Data Is the Key to Friction in DevOps
451 Research: Data Is the Key to Friction in DevOps451 Research: Data Is the Key to Friction in DevOps
451 Research: Data Is the Key to Friction in DevOps
 
Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...
Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...
Baha Mar's All in Bet on Red: The Story of Integrating Data and Master Data w...
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Change Management in Hybrid landscapes 2017





How to Use Innovative Data Handling and Processing Techniques to Drive Alpha in the Financial Markets

  • 1. 2017 © Parametric Portfolio Associates® LLC HOW TO USE INNOVATIVE DATA HANDLING AND PROCESSING TECHNIQUES TO DRIVE ALPHA IN THE FINANCIAL MARKETS
  • 2. PARAMETRIC’S PROFILE: We provide systematic, disciplined portfolio management solutions. > Parametric Portfolio Associates® LLC (“Parametric”) is a majority-owned subsidiary of Eaton Vance Corp. > Approximately $197.6 billion (3/31/2017) in assets under management*. We offer investment solutions through our three investment centers: Seattle, WA • Leaders in rules-based, engineered portfolio solutions • Strategies ranging from index tracking portfolios to managed smart beta • Ability to incorporate responsible investing themes • Founded 1987 • A subsidiary of Eaton Vance Corp. since 2003. Minneapolis, MN • Pioneers in overlay strategies and custom risk management solutions (formerly The Clifton Group) • Innovative product solutions in real asset and liquid alternatives • Founded 1972 • Acquired by Parametric in 2012. Westport, CT • Specialists in option portfolio management • Provide product-based and custom option overlay solutions • Founded 2002 • A part of Parametric since 2007. *As of 3/31/2017. Includes AUM of Parametric Investment & Overlay Strategies and Parametric Custom Tax-Managed & Centralized Portfolio Management.
  • 3. PARAMETRIC INVESTMENT PLATFORM* *For illustrative purposes only
  • 4. PARAMETRIC’S BIG DATA JOURNEY 2016 • MDM Launch • Decision: Data Centralization 2017 • Data Lake Implementation • Focus: Mastering Data Sources and Data Discovery 2018 • Modernizing Data Usage • Focus: Transition Data Silos to Data Lake
  • 5. OVERVIEW - IT ENVIRONMENT Our Hadoop Environment: Two Separate Clusters ‒ Production 10 Node Cluster • Common + Hive • Clustered NiFi ‒ Development 10 Node Cluster • Common + Hive + Spark • Non-Clustered NiFi. NiFi in both dev and production: we build NiFi workflows in dev and promote to production. Our Environment: ‒ Before Hadoop • Primarily Windows (C#, MS SQL, PowerShell, etc.) • All automation done using CA ‒ With Hadoop • Still primarily Windows… + Hadoop • Transition ETL automation to NiFi
  • 6. THE DATA MANAGEMENT OFFICE CHALLENGE Stop processing on time boundaries: process as soon as data is available! • Previously, processes were triggered at specific times of day • Vendor data availability is generally good but not perfect ‒ Delayed data has cascading negative effects ‒ Manual intervention is typically required for delayed or missing data files ‒ Requires after-hours support • Data consumption is pushed to nightly jobs to ensure the most complete data sets • Insurance premium for waiting on data ‒ Loss of potential processing hours ‒ Loss of processing during business hours ‒ Potential loss of pre-processing work
  • 7. RESOLUTION: NIFI NiFi’s Immediate Benefits: • Event Based Processing • Data Provenance • Queueing • Back Pressure • Rapid Development • Large Selection of Processors
  • 8. QUICK WIN – FINDING RESTRICTED ISSUERS • Background: An account defines an ownership restriction for one or more issuers (companies). The client provides one or more security identifiers with an issuer name. The list may be incomplete, but the mandate is that ANY security from that issuer cannot be held in their account. • Problem Statement: How do we identify the issuer so that none of its security types can be held? • Solution: Use the client-provided security identifiers to search our Bloomberg data and map each one to an id_bb_global so that it is restricted by our compliance system.
  • 9. OLD PROCESS SOLUTION Old Process Steps: • Client provides a spreadsheet of market identifiers • Spreadsheet is reviewed and then run through several independent processes, mostly manual • Spreadsheet is returned to the requestor with the company identifier, if found, tacked onto the original spreadsheet • Requestor then formats the results so that they can be digested by the target system Old Process Requirements: • 3 people, an average of 3 to 5 hours ‒ Requestor: 1 to 1½ hours • Review the client's submission, write an email to kick-start the process, follow up on expected completion, and reformat results ‒ HelpDesk • Process ticket and assign to app support ‒ App Support • Research and query generation
  • 10. NIFI SOLUTION NiFi Steps: • Client provides a spreadsheet of market identifiers • Spreadsheet is reviewed, and identifiers are cut and pasted into a standard formatted spreadsheet • Spreadsheet is saved as a CSV file to the In directory on the public drive • Requestor receives an email when the NiFi process is done • Requestor picks up a ready-to-load CSV file in the Out directory Achieved Targets: • Minimal manual processing and IT intervention • All self-service, and it’s easy to do • Bonus: Better results by searching all available Bloomberg data instead of only the investible universe
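The mapping step described on slides 8 to 10 can be sketched in Python. This is only an illustration under stated assumptions, not Parametric's NiFi flow: the `REFERENCE` table, the input column names `identifier` and `issuer_name`, and the functions `map_identifiers` and `to_csv` are hypothetical stand-ins, and the identifier values are invented. Only the `id_bb_global` column name comes from the slides.

```python
import csv
import io

# Hypothetical reference data: client-style identifier -> id_bb_global.
# The values are made up for illustration and are not real Bloomberg identifiers.
REFERENCE = {
    "AAA111": "BBG-EXAMPLE-1",
    "BBB222": "BBG-EXAMPLE-1",
    "CCC333": "BBG-EXAMPLE-2",
}


def map_identifiers(csv_text, reference=REFERENCE):
    """Append an id_bb_global column to each input row, blank if not found."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        key = row["identifier"].strip().upper()
        rows.append({
            "identifier": key,
            "issuer_name": row["issuer_name"].strip(),
            "id_bb_global": reference.get(key, ""),
        })
    return rows


def to_csv(rows):
    """Serialize the mapped rows as a ready-to-load CSV string."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["identifier", "issuer_name", "id_bb_global"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Stand-in for the CSV a requestor would save to the In directory.
sample = "identifier,issuer_name\naaa111,Example Corp\nzzz999,Unknown Co\n"
mapped = map_identifiers(sample)
print(to_csv(mapped))
```

Rows whose identifier is not found keep a blank `id_bb_global`, which mirrors the "if found" wording of the old process and leaves unmatched entries visible to the requestor.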
  • 11. WHAT IT LOOKS LIKE TO THE REQUESTOR
  • 12. WHAT IT LOOKS LIKE IN NIFI TODAY (REFINED)
  • 13. HOW IT STARTED
  • 14. OUR TOP 10 NIFI BEST PRACTICES
  • 15. #10 Adjust Processor Run Schedule • When developing flows, the first priority should be to adjust the Run Schedule • We have had cases where this wasn’t done and massive amounts of data were generated • Back Pressure will keep things from running completely out of control
  • 16. #9 Make sure you have plenty of storage space for the NiFi databases • NiFi’s Data Provenance and Queuing require sufficient storage space • Place these databases on separate mount points • Configure Provenance expiration to meet your business requirements
  • 17. #8 Use Process Groups • Process groups allow you to create modular flows • They keep flows organized and readable • Authorization can be set for process groups • The root page is a process group
  • 18. #8 USE PROCESS GROUPS 1. Each developer has their own process group in our development environment 2. We work in our own process groups when doing initial development / POC type of work 3. We have process groups that encapsulate “releasable code”
  • 19. #8 USE PROCESS GROUPS
  • 20. #7 Create templates of a single Processor for easy reuse • Often you add a Processor to a flow and then need to update a number of properties • Streamline this by creating a template that has just the Processor with the properties prepopulated • Use the template instead of the Processor
  • 21. #7 PROCESSOR REUSE Configure a Processor (in this example, PutHDFS configured with a Kerberos principal). Create a template that contains only the processor: 1. Select the processor 2. Click the create template icon 3. Give it a good name and description
  • 22. #7 PROCESSOR REUSE Once a template has been created, you can add that template to newly developed workflows
  • 23. #6 The Data Provenance Search Facility
  • 24. #5 If you Cluster in Prod then Cluster in Dev • NiFi supports clustering • If you are going to run clustered NiFi in production, then cluster your dev NiFi as well • Certain Processors should only run on a single node in the cluster • It is possible to create a single-node cluster in dev, but it is still best to have your dev setup match your production setup
  • 25. #4 Set expirations on Success queues • Often we want to capture a flow’s successful completion • We route the final Success output of a process to a funnel • Make sure you set the FlowFile expiration on the queue; otherwise back pressure will cause your flows to stop until the queue is drained
  • 26. #4 EXPIRE SUCCESS QUEUES
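Why an unexpired Success queue eventually stalls a flow can be illustrated with a toy model. The sketch below is not NiFi code and does not use any NiFi API: `ToyQueue`, `offer`, and `run` are hypothetical names, time is a simple integer tick, and the only behaviours modelled are an object-count limit (standing in for back pressure) and an optional age limit (standing in for FlowFile expiration). Nothing ever consumes from the queue, as with a terminal funnel.

```python
from collections import deque


class ToyQueue:
    """Toy model of a queue with a back pressure limit and optional expiration."""

    def __init__(self, max_objects, expiration_seconds=None):
        self.max_objects = max_objects
        self.expiration_seconds = expiration_seconds  # None means never expire
        self._items = deque()  # entries are (enqueue_time, payload)

    def _expire(self, now):
        """Drop entries that have waited at least expiration_seconds."""
        if self.expiration_seconds is None:
            return
        while self._items and now - self._items[0][0] >= self.expiration_seconds:
            self._items.popleft()

    def offer(self, payload, now):
        """Return True if accepted, False if the full queue blocks the upstream."""
        self._expire(now)
        if len(self._items) >= self.max_objects:
            return False
        self._items.append((now, payload))
        return True


def run(queue, ticks):
    """Offer one item per tick with no consumer; count how many are accepted."""
    return sum(queue.offer(f"flowfile-{t}", now=t) for t in range(ticks))


accepted_without_expiry = run(ToyQueue(max_objects=3), ticks=10)
accepted_with_expiry = run(ToyQueue(max_objects=3, expiration_seconds=2), ticks=10)
print(accepted_without_expiry, accepted_with_expiry)
```

In this model the queue without expiration accepts only the first 3 of 10 items and then blocks, while the queue with a 2-tick expiration accepts all 10, because old entries age out before the limit is reached.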
  • 27. #3 Use Custom Properties per environment • NiFi flows have access to custom properties and environment variables • We use these to make our flows environment agnostic • Not perfect ‒ Not all properties support expression language ‒ Custom properties are read at startup
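The idea of keeping a flow environment agnostic can be sketched outside NiFi. The example below uses Python's standard `string.Template`, whose `${name}` placeholder syntax resembles a simple property reference; it is not NiFi's expression language. The property names, paths, and email addresses are all hypothetical.

```python
from string import Template

# Hypothetical per-environment values; names and paths are illustrative only.
ENVIRONMENTS = {
    "dev": {
        "hdfs_landing_dir": "/dev/landing",
        "notify_email": "dev-team@example.com",
    },
    "prod": {
        "hdfs_landing_dir": "/prod/landing",
        "notify_email": "data-office@example.com",
    },
}

# Flow settings written once, with ${...} placeholders instead of literal values.
FLOW_SETTINGS = {
    "Directory": "${hdfs_landing_dir}/vendor",
    "To": "${notify_email}",
}


def resolve(settings, environment):
    """Substitute placeholders; raises KeyError if a property is undefined."""
    values = ENVIRONMENTS[environment]
    return {key: Template(text).substitute(values) for key, text in settings.items()}


dev_settings = resolve(FLOW_SETTINGS, "dev")
prod_settings = resolve(FLOW_SETTINGS, "prod")
print(dev_settings)
print(prod_settings)
```

The same `FLOW_SETTINGS` definition is promoted unchanged; only the per-environment values differ, which is the property the slide is after.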
  • 28. #2 Create Small Modular Disconnected workflows • Large, complex, interconnected flows are ‒ Difficult to debug ‒ Difficult to deploy • Decouple flows ‒ Create flows with a specific functional purpose ‒ If other flows are dependent, use queues as the coupling mechanism
  • 29. #2 SMALL, MODULAR AND DISCONNECTED • Download and put to HDFS • Process downloaded files
  • 30. #2 SMALL, MODULAR AND DISCONNECTED Decouple
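The decoupling in practice #2 can be sketched as two functions that share nothing except a queue. This is a plain Python illustration rather than a NiFi flow; `landed_files`, `download_flow`, and `processing_flow` are hypothetical names, and the upper-casing step is a placeholder for real processing.

```python
import queue

# The queue is the only thing the two flows share.
landed_files = queue.Queue()


def download_flow(file_names):
    """First flow: 'download' each file and announce it on the queue."""
    for name in file_names:
        # A real flow would fetch the file and write it to storage here.
        landed_files.put(name)


def processing_flow():
    """Second flow: drain whatever has landed, knowing nothing of the downloader."""
    processed = []
    while True:
        try:
            name = landed_files.get_nowait()
        except queue.Empty:
            break
        processed.append(name.upper())  # placeholder for real processing
    return processed


download_flow(["prices.csv", "holdings.csv"])
result = processing_flow()
print(result)
```

Because the only contract between the two functions is the queue, either one can be changed, tested, or redeployed without touching the other.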
  • 31. #1 Update the Names of the Processors • Do not use the default Processor name • Give each processor in a flow a friendly, well-understood name • NiFi Summary and other features that show processor names will be more usable if processors are named
  • 32. #1 UPDATE PROCESSOR NAMES
  • 34. EVENT BASED MINDSET – AS A COMPANY • To leverage a modern data architecture fully, companies need to think in terms of events, not a schedule • An event could be anything from a file arriving on an FTP server, to a RESTful API call, to a database being updated • The concept of jobs running at specific times causes unnecessary strain on systems and reduces overall throughput ‒ Like waiting until the last minute to do all your homework ‒ 50 MB/s = >4 TB/day
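The "50 MB/s = >4 TB/day" figure on slide 34 follows from simple arithmetic, worked through below in decimal units (1 TB = 1,000,000 MB). The 4-hour nightly window in the second calculation is an assumption added here purely for contrast; it does not appear in the slides, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
MB_PER_TB = 1_000_000           # decimal units


def daily_volume_tb(rate_mb_per_s):
    """Volume moved in one day at a sustained rate, in TB."""
    return rate_mb_per_s * SECONDS_PER_DAY / MB_PER_TB


def required_rate_mb_per_s(volume_tb, window_hours):
    """Rate needed to move volume_tb within a fixed window, in MB/s."""
    return volume_tb * MB_PER_TB / (window_hours * 60 * 60)


steady_tb = daily_volume_tb(50)                         # 50 MB/s around the clock
nightly_rate = required_rate_mb_per_s(steady_tb, 4)     # same volume in 4 hours
print(steady_tb, nightly_rate)
```

At a steady 50 MB/s the daily volume is 4.32 TB, which matches the slide's ">4 TB/day". Squeezing the same volume into the assumed 4-hour window would need a sustained 300 MB/s, six times the rate, which illustrates the strain that scheduled batch windows put on systems.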
  • 35. EASY TO GET STARTED – DON’T BE LAZY • It is a lot easier to go fast and just get an application working ‒ This leads to bad programs • Just because it works doesn’t mean it’s efficient ‒ Double the hardware for the same amount of work • Repeated for importance: rename your processors like you would comment your code • Check your back pressure settings: it is very easy to make a flow that overwhelms a specific step ‒ Particularly with “GenerateFlowFile” ‒ Debugging steps need to be evaluated/removed before production
  • 36. INCLUDE “BUSINESS” PEOPLE AND SPECIALISTS • With a more interactive graphical interface, business people and specialists can understand the program’s flow a lot better • Have you ever tried to go line by line through Java code with someone? Eyes glaze over • Collaborate earlier and more often; it will reduce the number of cycles needed to get on target
  • 37. DON’T BE AFRAID TO CUSTOM CODE • The majority of use cases don’t need custom code, but when one does, don’t be afraid to write a new processor
  • 38. QUESTIONS?
  • 39. 39 2017 © Parametric Portfolio Associates® LLC DISCLOSURE Parametric Portfolio Associates LLC (“Parametric”), headquartered in Seattle, Washington, is registered as an investment adviser with the U.S. Securities and Exchange Commission under the Investment Advisers Act of 1940. Parametric is a leading global asset management firm, providing investment strategies and customized exposure management directly to institutional investors and indirectly to individual investors through financial intermediaries. Parametric offers a variety of rules-based investment strategies, including alpha-seeking equity, alternative and options strategies, as well as implementation services, including customized equity, traditional overlay and centralized portfolio management. Parametric is a majority-owned subsidiary of Eaton Vance Corp. and offers these capabilities through investment centers in Seattle, WA, Minneapolis, MN and Westport, CT. This material may not be forwarded or reproduced, in whole or in part, without the written consent of Parametric Compliance. Parametric and its affiliates are not responsible for its use by other parties. All contents copyright 2017 Parametric Portfolio Associates LLC. All rights reserved. Parametric Portfolio Associates, PIOS, and Parametric with the iris flower logo are all trademarks registered in the US Patent and Trademark Office. Parametric is located at 1918 8th Avenue, Suite 3100, Seattle, WA 98101. For more information regarding Parametric and its investment strategies, or to request a copy of Parametric’s Form ADV Brochure, please contact us at 206.694.5575 or visit our website, www.parametricportfolio.com.