The proliferation of novel data sources has awoken quantitative investors to the promise of “Big Data”. Billions of dollars in venture capital funding have created an ecosystem of companies that help investors extract information from unstructured text, sensors, and other sources. A “Vision for Quants in the Data Economy” is nice, but what does it take to turn that vision into reality? Join Data Capital Management as we discuss some of the breakthroughs by companies like Twitter, Google and Facebook that are empowering quantitative investors to extract alpha from “Big Data”.
2. EMPOWERING QUANTITATIVE INVESTORS
IN THE DATA ECONOMY
QuantCon 2016
Saturday, April 9, 2016
| DATA CAPITAL MANAGEMENT | 1
Dr. Napoleon Hernandez
Got API?
dcm@datacapitalmanagement.com
3. IMPORTANT INFORMATION
The information contained in these documents is confidential. The compiled package is property of Data Capital
Management LLC. It is intended only for the use of the addressed party (the “Recipient”). It is directed at professional
clients and eligible counterparties only and is not intended for retail clients. If you are not the intended Recipient, please
notify us immediately so that we may arrange for return of the document at no expense to you. You should not make
any copies, nor disclose the contents to any other person. The Recipient agrees to hold confidential any and all
Confidential Information disclosed to the Recipient by Data Capital Management LLC and agrees not to use or to
disclose to any third party, either directly or indirectly, all or any portion of the Confidential Information or to disclose the
fact that the Recipient has received Confidential Information from Data Capital Management LLC except with prior
written consent of the Disclosing Party.
This is not an offer or solicitation with respect to the purchase or sale of any security. The material is intended only to
facilitate the Recipient’s discussions with Data Capital Management LLC as to the opportunities available to our clients.
The given material is subject to change and, although based upon information which we consider reliable, it is not
guaranteed as to accuracy or completeness and it should not be relied upon as such.
The material is not intended to be used as a general guide to investing, or as a source of any specific investment
recommendations, and makes no implied or express recommendations concerning the manner in which any client’s
account should or would be handled, as appropriate investment strategies depend upon client’s investment objectives.
The price and value of the investments referred to in this material and the income from them may go down as well as
up and investors may not receive back the amount originally invested. Past performance is not a guide to future
performance. Future returns are not guaranteed and a loss of principal may occur.
4. TABLE OF CONTENTS
THE OPPORTUNITIES FOR INVESTMENT IN THE DATA ECONOMY
THE CHALLENGES FOR SUCCESSFUL INVESTORS IN THE DATA ECONOMY
CHALLENGE 1: DATA ACQUISITION AND INTEGRATION
CHALLENGE 2: CONTEXTUALIZATION AND ANALYSIS OF DATA
CHALLENGE 3: INFRASTRUCTURE & ARCHITECTURE
WHAT WILL THE FUTURE LOOK LIKE?
5. THE AMOUNT OF INFORMATION AVAILABLE HAS EXPLODED…
DATA IS EVERYWHERE
The amount of digital information created and shared in the world increased twenty-fold in just ten years, to almost 8 zettabytes.
EVERY INTERNET MINUTE…
624 TB of data transferred
1,300 new mobile users
$83,000 USD in sales on Amazon
100+ new LinkedIn accounts
2+ million queries on Google
1.3M views of video on YouTube
UNPRECEDENTED DATA ACCESS ALLOWS THE DISCOVERY OF INVESTMENT OPPORTUNITIES FROM INDIVIDUAL, MICROECONOMIC ACTIVITY
“Big Data” is a disruptive force in financial investing due to the exponential growth in asset-price-relevant information.
…while the value of any given individual piece of data has shrunk
Source: http://archive.tiecon.org/content/big-data-landscape-%C3%A2%E2%82%AC%E2%80%9C-why-should-you-care
http://www.techandinnovationdaily.com/2013/05/31/six-tech-statistics-mary-meeker/
For illustrative purposes only. Back test results are not indicative of future returns.
6. HOW TO INVEST IN A DATA FILLED WORLD
Operations for profit should be based not on optimism but on arithmetic – Benjamin Graham
Operations for profit should be based not on optimism but on careful analysis of relevant data – DCM Portfolio Manager
7. 3 STEPS FOR DATA DRIVEN INVESTMENTS
In essence, all professional investment managers share the same process to make decisions.

DATA ACQUISITION (VOLUME & VARIETY)
Breadth: Data is data; novel data sources including news, images, social networks, macroeconomic feeds, etc. are linked to price movements in securities, currencies, commodities, etc.

ANALYSIS (VERACITY)
Depth: Analyzes, prioritizes and monitors big-data input and its impact using systematic and quantitative metrics

DECISION (VELOCITY)
Speed: Real-time, rules-based extraction and interpretation of information based on event triggers from over 20,000 leading global newswires, online newspapers, aggregators, and blogs in under 5 seconds
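The "Speed" step above describes rules-based extraction driven by event triggers. The slides do not show how DCM implements this, so the following is only a minimal sketch of the general idea. The rule table `RULES`, the `Headline` record and the `match_events` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical rule table (not from the slides): event type -> trigger keywords.
RULES = {
    "merger": ("acquire", "merger", "takeover"),
    "earnings": ("earnings", "quarterly results", "guidance"),
}


@dataclass(frozen=True)
class Headline:
    source: str  # e.g. a newswire or blog name
    text: str    # raw headline text


def match_events(headline: Headline) -> List[str]:
    """Return the event types whose keywords appear in the headline text."""
    lowered = headline.text.lower()
    return [
        event
        for event, keywords in RULES.items()
        if any(keyword in lowered for keyword in keywords)
    ]
```

With these assumed rules, `match_events(Headline("newswire", "Acme to acquire Widget Co"))` returns `["merger"]`. A production system would presumably need entity resolution and far richer rules than substring matching.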
8. VALUE FOR INVESTMENT IS CLEAR, WHAT IS THE PROBLEM?
New technologies are required for news-aware investing
Source: http://www.zdnet.com/article/big-data-big-hype-or-big-hope/
For illustrative purposes only. Back test results are not indicative of future returns. Strategies are preliminary and are not necessarily those that will be deployed in the market.
DCM Technology Stack
DCM Data Services™: Novel and traditional data integration
DCM Corporate Graph™: Tracking of economic relationships
DCM Event Almanac™: Identification of events that matter
DCM Machine Learning™: Model adaptation to changing key drivers
DCM Intelligent Traders™: Fully automated, scalable execution
INSPIRED BY SILICON VALLEY SUCCESS IN OTHER INDUSTRIES, DCM TECHNOLOGIES EMPOWER QUANTITATIVE INVESTORS IN THE DATA ECONOMY
9. CHALLENGE 1: DATA ACQUISITION
I never guess. It is a capital mistake to theorize before one has data – Sir Arthur Conan Doyle
OUTCOME: Faster access to a broader set of data
Data Models
Traditional approach: Relational databases (SQL) and/or proprietary specialized DB (e.g. kdb)
Silicon Valley approach: SQL, key-value storage, document based, NoSQL, NewSQL, Graph models…
DCM approach: Application/task specific data architecture (NewSQL, key-value storage and graph databases)

Data storage
Traditional approach: SQL-like solutions running on high-end machines, vertically scalable
Silicon Valley approach: Polyglot solutions running on cloud services, horizontally scalable over commodity machines
DCM approach: Polyglot solutions running on cloud services, horizontally scalable over commodity machines

Data acquisition
Traditional approach: ETL. Overnight feed based, queuing systems for real-time data
Silicon Valley approach: ELT. 24x7 data acquisition using stream based, distributed messaging systems
DCM approach: Hybrid, focused on stream based, distributed messaging systems
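The slide contrasts ETL (transform before loading) with ELT (load raw data first, transform later). The difference can be sketched in a few lines of Python. This is an illustration under assumed names (`etl_load`, `elt_load`, `elt_view`) and an assumed JSON message format, not a description of any particular vendor's or DCM's pipeline.

```python
import json
from typing import Dict, Iterable, List


def etl_load(raw_records: Iterable[str], store: List[Dict]) -> None:
    """ETL: transform first, then load only the transformed rows."""
    for raw in raw_records:
        record = json.loads(raw)
        # Fields not selected here are lost once the raw message is discarded.
        store.append({"ticker": record["ticker"], "price": float(record["price"])})


def elt_load(raw_records: Iterable[str], store: List[str]) -> None:
    """ELT: load raw payloads untouched; transformation is deferred."""
    store.extend(raw_records)


def elt_view(store: List[str], fields: List[str]) -> List[Dict]:
    """Transform at query time, so fields can be chosen after ingestion."""
    return [{field: json.loads(raw).get(field) for field in fields} for raw in store]
```

For example, after loading `'{"ticker": "ABC", "price": "10.5", "venue": "X"}'` both ways, the ETL store holds only `ticker` and `price`, while `elt_view(lake, ["ticker", "venue"])` can still recover `venue` because the raw payload was kept.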
Source http://www.infosysblogs.com/testing-services/2013/02/etl_elt_etlthow_to_devise_the_.html
10. CHALLENGE 1: DATA INTEGRATION
Data are just summaries of thousands of stories – Chip & Dan Heath
OUTCOME: Broader, deeper, faster access to relevant data
Data Querying
Traditional approach: Relational, using SQL queries over fact and dimension tables of star schemas
Silicon Valley approach: Task specialized: relational, graph-like, key-based with lightly integrated query layers
DCM approach: Unified query layer on top of multiple data storage engines

Data Bi-temporality
Traditional approach: Data columns added and treated as regular data, with redundant indexing
Silicon Valley approach: Not really emphasized; geo-location of data is a similar problem in essence
DCM approach: Treatment of bi-temporality as a fundamental property of all data, with support over distributed systems

Data linkage
Traditional approach: Implicit in encoded business logic; all linkage done while ingesting data
Silicon Valley approach: Machine learning based clustering of identifiable information
DCM approach: Hybrid, explicit relationships for traditional data, and machine learning based for novel data where appropriate
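Bi-temporality means every data point carries two time axes: when it became true in the world, and when the system learned about it. The sketch below shows one possible way to model this in Python. The `Fact` record and the `as_of` query are hypothetical names chosen for illustration; the slides do not specify DCM's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    key: str
    value: float
    valid_from: date   # when the value became true in the world
    recorded_at: date  # when the system learned about it


def as_of(facts: List[Fact], key: str, valid_on: date, known_on: date) -> Optional[float]:
    """Value for `key` effective on `valid_on`, using only facts recorded by `known_on`."""
    visible = [
        f for f in facts
        if f.key == key and f.valid_from <= valid_on and f.recorded_at <= known_on
    ]
    if not visible:
        return None
    # Latest effective date wins; among restatements, the latest recording wins.
    return max(visible, key=lambda f: (f.valid_from, f.recorded_at)).value
```

Suppose a figure of 1.00 valid from 2016-01-01 is recorded on 2016-01-20 and restated to 0.90 on 2016-03-01. Querying with `known_on=date(2016, 2, 1)` returns 1.00, while `known_on=date(2016, 3, 15)` returns 0.90. Keeping both axes is what lets a back test see only what was known at the time.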
Source: http://www.slideshare.net/DavidColebatch/20121029-graph-tointro-to-pacer
11. CHALLENGE 2: DATA ANALYSIS
Data by itself is useless. Data is only useful if you apply it – Todd Park
OUTCOME: Non-obvious data-driven investment opportunities
Entity interconnections
Traditional approach: Pre-defined relationships through sector, region, client relationships, etc.
Silicon Valley approach: Data driven relationships, using unsupervised machine learning on top of connection information
DCM approach: Data driven relationships, using unsupervised machine learning on top of available information

Regime determination
Traditional approach: Static models with ad-hoc re-calibration (usually batch based)
Silicon Valley approach: Finance specific, no analog
DCM approach: Adaptive model calibration as information becomes available

Strategy development
Traditional approach: Generative models: parametric models are fitted to data
Silicon Valley approach: Pattern recognition breakthroughs in complete information games (Go)
DCM approach: Guided pattern recognition based on financial specific feature engineering built by traditional methods
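The contrast between static models with batch re-calibration and adaptive calibration "as information becomes available" can be illustrated with a one-parameter model that is updated on every observation. The class below is a generic exponentially weighted least-squares sketch, not DCM's method. The `AdaptiveBeta` name and the `decay` parameter are assumptions made for illustration.

```python
class AdaptiveBeta:
    """Exponentially weighted estimate of y ~ beta * x, updated per observation."""

    def __init__(self, decay: float = 0.9) -> None:
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay  # closer to 1.0 means longer memory
        self.sxy = 0.0      # decayed sum of x * y
        self.sxx = 0.0      # decayed sum of x * x

    def update(self, x: float, y: float) -> float:
        """Fold in one new observation and return the refreshed estimate."""
        self.sxy = self.decay * self.sxy + x * y
        self.sxx = self.decay * self.sxx + x * x
        return self.beta

    @property
    def beta(self) -> float:
        """Current slope estimate; 0.0 until some non-zero x has been seen."""
        return self.sxy / self.sxx if self.sxx > 0.0 else 0.0
```

Because old observations are down-weighted geometrically, the estimate drifts toward a new relationship once the incoming data changes, without a separate batch re-fit. A smaller `decay` adapts faster but is noisier.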
Source: http://cs231n.github.io/convolutional-networks/
http://www.businessinsider.com/magic-mushrooms-change-brain-connections-2014-10
12. CHALLENGE 2: DATA ANALYSIS
Demo time
13. CHALLENGE 3: INFRASTRUCTURE AND ARCHITECTURE
If you torture the data long enough, it will confess – Ronald Coase
OUTCOME: Reliable and cost efficient operations
Resource management
Traditional approach: Static, upfront pre-allocation of resources (upfront fixed costs)
Silicon Valley approach: Elastic, demand based cloud virtualized instances (variable costs)
DCM approach: Elastic, demand based cloud virtualized instances (variable costs) using containers

Software architecture
Traditional approach: Monolithic applications running on silo’d systems
Silicon Valley approach: Distributed modular service oriented architectures (micro services)
DCM approach: Distributed modular service oriented architectures (micro services) specialized on parallel processing of financial data

Fault Tolerance
Traditional approach: Redundancy at the machine and system level. Network load balancing and failover
Silicon Valley approach: Redundancy at the data block level. Reliability through consensus and presence protocols.
DCM approach: Redundancy at the data and service level. Reliability through consensus and presence protocols.
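"Reliability through consensus" can be illustrated, in a very reduced form, by a majority-quorum read over redundant replicas. The function below is a simple majority vote, not a full consensus protocol such as Paxos or Raft, and the slides do not say which protocols DCM uses. `quorum_read` is a hypothetical name.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def quorum_read(replies: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None.

    A `None` entry stands for a replica that did not answer.
    """
    answered = [reply for reply in replies if reply is not None]
    if not answered:
        return None
    value, votes = Counter(answered).most_common(1)[0]
    # The majority is measured against all replicas, including silent ones.
    return value if votes > len(replies) // 2 else None
```

With three replicas, `quorum_read([101.5, 101.5, None])` still returns `101.5` even though one replica is down, whereas `quorum_read([101.5, 99.0, None])` returns `None` because no value has a majority.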
Source: https://www.linkedin.com/pulse/microservices-reference-architecture-spring-boot-cloud-anil-allewar
https://nexa.polito.it/nexafiles/above_the_clouds.ppt.pdf
14. PUTTING IT ALL TOGETHER: DCM ARCHITECTURE
Within 5 seconds, DCM is able to complete end-to-end identification, analysis and trade execution on new event catalysts.
[Architecture diagram on a timeline from 0:00.00 to 0:05.00. Main blocks: REAL-TIME DATA API, DCM DATA LAKE AND DISTRIBUTED SERVICES, DCM DEVELOPMENT ENVIRONMENT, STRATEGY BACK TESTING, TRADING PLATFORM.]
[Data extract inputs: Economic, Fundamental, Prices, News/Events, Novel Data.]
[Data handling labels: Load Meta Data Reference Table, Score Data Quality, Cyber Security & Controls, Understand the API, Understand the Data, Permission & Entitlement.]
[Tool labels: Visual Tools, Machine Learning, Statistical Tools, Valuation Tools, Natural Language Processing, Back-Testing, Visualization Tools, Paper Trading, Live Trading.]
Efficient Markets Hypothesis = New Information Drives Price Changes
DCM Edge = Data is “On-Demand”; Analysis in under 5 seconds
15. THE FUTURE OF QUANTITATIVE INVESTMENT
DATA WILL BECOME MORE PREVALENT THAN EVER
The Internet of Things and other technology advancements will create more data than ever in the world
COGNITIVE COMPUTING AND ADVANCED AI WILL DISRUPT CAPITAL MARKETS
Advances in AI make computers capable of learning with experience, similar to how humans do. Knowledge fields like Finance are ripe to be disrupted by these changes
“ADVANCES IN ALGORITHMS, HARDWARE, NETWORKS AND BIG DATA, SMART MACHINES ARE PROVING READY TO DISRUPT CONVENTIONAL APPROACHES TO MUCH OF WHAT THE IT ORGANIZATION DOES”
Sources: http://thefuturesagency.com/category/banking-financial-services-money/
http://www.pbs.org/newshour/rundown/carnegie-mellon-wagers-computer-can-take-top-poker-players/
https://gogameguru.com/alphago-shows-true-strength-3rd-victory-lee-sedol/
https://en.wikipedia.org/wiki/Watson_(computer)#/media/File:Watson_Jeopardy.jpg
We have come regretfully to the conclusion that the current algorithmically driven market environment is one which is increasingly incompatible with our fundamental, research orientated, investment process – Martin Taylor, Nevsky Capital
16. QUESTIONS
Extremely Confidential. Not for Public Distribution.