Replacing Traditional Technologies with MongoDB: A Single Platform for All Financial Data at AHL
2. Opinions expressed are those of the author and may not be shared by all personnel of Man Group plc
(‘Man’). These opinions are subject to change without notice, and are for information purposes only and do not
constitute an offer or invitation to make an investment in any financial instrument or in any product to which any
member of Man’s group of companies provides investment advisory or any other services. Any forward-looking
statements speak only as of the date on which they are made and are subject to risks and uncertainties that may
cause actual results to differ materially from those contained in the statements. Unless stated otherwise this
information is communicated by Man Investments Limited and AHL Partners LLP which are both authorised and
regulated in the UK by the Financial Conduct Authority.
© Man 2014 2
Legal Stuff
3. Introductions
Gary Collier James Blackburn
4. Agenda
The Story of MongoDB at AHL
1. What is a Systematic Fund Manager?
2. Low Frequency Futures and FX Data
3. Single Stock Equity Trading
4. Building a Tick Store
5. Now and the Future?
7. Removing the first impedance mismatch…
Quants and Techies Speak the Same Language
9. But…
All Data is Behind an API
Performance
User Experience
Cluster Compute
Onboarding New Data
Impedance Mismatch
Mix of Technologies
Is there one Technology which could address?
Many Moving Parts
Reliability
10. Chapter 1
Starting Small: Low Frequency Data
11. The Data
8,000 rows x 200 markets: 100 MB
5,000,000 rows x 250 markets: 500 GB
14. MongoDB Solution
[Diagram labels: nodes (node 1 to node 96); SSD; MongoDB Cluster with three rows of shard 1 to shard 4; Linux, 24 cores, 96 GB RAM; data adapters: Bloomberg, JPM, Markit, GS]
15. Performance: 200 Futures Markets
[Chart: Previous Solution vs. MongoDB]
100x faster to retrieve data
Consistent retrieval times
16. Performance: EURUSD 1-Minute Data
[Chart: Previous Solution vs. MongoDB]
2-5x faster to retrieve data
Consistent retrieval times
17. Low Frequency Data - Conclusions
MongoDB faster than previous RDBMS/File Solution at…
• ALL data sizes and ALL client load levels
• …consistently
Game changing new features:
• No impedance mismatch: onboard new data in minutes
• Version Store: can ask “What did the data look like?”
Cost Savings:
• Proprietary parallel filesystem replaced by commodity SSDs
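The Version Store idea above ("What did the data look like?") can be illustrated with a minimal in-memory sketch. This is not AHL's implementation and does not use MongoDB; the class name `VersionStore`, its methods, and the sample EURUSD values are all hypothetical. It only shows the principle that writes append a new version rather than overwrite, so a reader can ask for the data as it stood at an earlier point in time.

```python
from bisect import bisect_right
from datetime import datetime


class VersionStore:
    """Toy in-memory version store: writes never overwrite, they append."""

    def __init__(self):
        # symbol -> list of (written_at, data), kept in ascending time order
        self._versions = {}

    def write(self, symbol, data, written_at):
        history = self._versions.setdefault(symbol, [])
        if history and written_at <= history[-1][0]:
            raise ValueError("versions must be written in increasing time order")
        history.append((written_at, data))

    def read(self, symbol, as_of=None):
        """Return the latest version, or the version visible at `as_of`."""
        history = self._versions[symbol]
        if as_of is None:
            return history[-1][1]
        times = [t for t, _ in history]
        idx = bisect_right(times, as_of)  # number of versions written at or before as_of
        if idx == 0:
            raise KeyError(f"no version of {symbol!r} at or before {as_of}")
        return history[idx - 1][1]


# Hypothetical sample data: an original write and a later correction.
store = VersionStore()
store.write("EURUSD", [1.10, 1.11], datetime(2014, 1, 1))
store.write("EURUSD", [1.10, 1.12], datetime(2014, 2, 1))

latest = store.read("EURUSD")
as_it_was = store.read("EURUSD", as_of=datetime(2014, 1, 15))
```

Reading with `as_of` set to mid-January returns the first version even though a correction was written later, which is the kind of question the slide says the Version Store can answer.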
18. Chapter 2
Getting Bigger: Single Stock Equities
19. Single Stock Data - Scale
Thousands of Stocks
Many years of Time-series Data
Tens of different Data Items for each Stock
Complex trading models with many Quants sharing the Data
20. Single Stock Data
Multi-user, versioned, interactive graph-based computation
[Diagram: Source Data (Managed RDBMS) → Raw Data Items → Derived Data Items → Trading Signal, with a MongoDB Cluster shown as three rows of shard 1 to shard 4]
~1TB Data
~10,000 Stocks
~20 Years
250 Data Items, each Item is 600 MB
Single model ~150GB
Many Quants and models
Hours → Minutes
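The graph-based computation on this slide (raw items feeding derived items feeding a trading signal) can be sketched as a small dependency graph evaluated in topological order. The item names (`close`, `returns`, `volatility`, `signal`) and the formulas are hypothetical and chosen only for illustration; the slides do not describe the actual items or how AHL schedules the computation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical item names: each derived item maps to the items it depends on.
dependencies = {
    "returns": {"close"},
    "volatility": {"returns"},
    "signal": {"returns", "volatility"},
}

# How to compute each derived item from its already-computed inputs.
compute = {
    "returns": lambda d: [b / a - 1.0 for a, b in zip(d["close"], d["close"][1:])],
    "volatility": lambda d: (sum(r * r for r in d["returns"]) / len(d["returns"])) ** 0.5,
    "signal": lambda d: d["returns"][-1] / d["volatility"],
}


def build(raw):
    """Evaluate every derived item after its inputs, starting from raw items."""
    data = dict(raw)
    for item in TopologicalSorter(dependencies).static_order():
        if item not in data:  # raw items are already present
            data[item] = compute[item](data)
    return data


model = build({"close": [100.0, 110.0, 99.0]})
```

Because each item is only computed once its inputs exist, a store that can read and write individual items quickly lets one quant rebuild a single derived item without recomputing the whole model.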
21. Single Stock Trading - Conclusions
MongoDB faster than previous RDBMS/File Solution at…
• Fast interactive research
• Read/write a 600MB Data item in < 1 second
• Rebuild complex model: hours → minutes
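The figures quoted on these slides can be cross-checked with some quick arithmetic. The sketch below uses only numbers from the slides (250 items, 600 MB each, under 1 second per item), plus one clearly labelled assumption about how a 600 MB item might arise; that layout is a guess for illustration, not something the slides state.

```python
MB = 10**6
GB = 10**9

item_size = 600 * MB        # "Each Item is 600 MB"
items_per_model = 250       # "250 Data Items"

# 250 items x 600 MB matches the slide's "Single model ~150GB".
model_size = items_per_model * item_size

# Reading a 600 MB item in under 1 second implies more than this throughput.
min_throughput_mb_per_s = item_size / MB / 1.0

# If items were read one after another at 1 second each (an upper bound from
# the "< 1 second" claim), loading a whole model would take about 4 minutes.
serial_minutes = items_per_model * 1.0 / 60

# ASSUMPTION (not from the slides): one 8-byte float per stock per calendar
# day, for ~10,000 stocks over ~20 years. This lands close to 600 MB.
approx_item_bytes = 10_000 * 20 * 365 * 8

print(model_size / GB, min_throughput_mb_per_s, round(serial_minutes, 1), approx_item_bytes / MB)
```

Under these numbers, a full model load measured in minutes rather than hours is consistent with the per-item read time the slide reports.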
22. Chapter 3
MongoDB as a Tick Store
23. Almost, but not quite
Big Data?
30TB Historic Data
Ticks/1000 per second
Sparse Data
24. Third-Party Tick Stores
Typically…
• Expensive
• Proprietary query languages
• Database-centric architectures, so…
• Not ideal for cluster compute
• Unless you pay for lots of cores…
• Expensive!
So…
• A real $$$ saving opportunity!
25. Architecture
Reuters
RMDSMessageBus
Bloomberg
Banks
Kafka Queue
Kafka Queue
Kafka Queue
16 shard cluster
Master + 1 replica
Linux
12 cores
256 GB RAM
96TB Disk
Infiniband network
LZ4 compressed data
MongoDB Cluster
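The architecture slide mentions LZ4-compressed data but gives no detail on the storage layout. As a rough sketch of the general idea, the code below batches ticks into a chunk, packs them column-wise, and compresses the result. Two things are assumptions: the tick layout `(timestamp_ms, bid, ask)` and the chunk fields are hypothetical, and `zlib` from the Python standard library is used here as a stand-in for LZ4 so the example runs without third-party packages.

```python
import struct
import zlib

# Hypothetical ticks: (timestamp in ms, bid, ask).
ticks = [
    (1_400_000_000_000 + i * 250, 1.3600 + i * 1e-5, 1.3602 + i * 1e-5)
    for i in range(1000)
]


def pack_chunk(ticks):
    """Pack ticks column-wise and compress them into one chunk document."""
    n = len(ticks)
    ts = struct.pack(f"<{n}q", *(t[0] for t in ticks))   # int64 timestamps
    bid = struct.pack(f"<{n}d", *(t[1] for t in ticks))  # float64 bids
    ask = struct.pack(f"<{n}d", *(t[2] for t in ticks))  # float64 asks
    return {
        "n": n,
        "start": ticks[0][0],  # first timestamp, usable for range lookups
        "end": ticks[-1][0],   # last timestamp
        "data": zlib.compress(ts + bid + ask),  # zlib standing in for LZ4
    }


def unpack_chunk(chunk):
    """Decompress a chunk and rebuild the list of ticks."""
    n = chunk["n"]
    raw = zlib.decompress(chunk["data"])
    ts = struct.unpack_from(f"<{n}q", raw, 0)
    bid = struct.unpack_from(f"<{n}d", raw, 8 * n)
    ask = struct.unpack_from(f"<{n}d", raw, 16 * n)
    return list(zip(ts, bid, ask))


chunk = pack_chunk(ticks)
restored = unpack_chunk(chunk)
```

Storing a compressed chunk per time range, rather than one record per tick, keeps the number of stored documents small and moves decompression work to the client, which fits the slides' emphasis on cluster compute.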
26. Parallel Access
Tick Store Performance
Infiniband saturated
25x greater tick throughput
With just 2 machines!
27. Tick Store: System Load
[Chart: Other Tick vs. Mongo (x2), N Tasks = 32]
28. Tick Store - Conclusions
Happy Quants!
• 25x improvement in tick throughput
• So fit models 25x as fast
Happy Accountants!
• >40x cost saving of MongoDB Support compared to previous Tick Store licensing.
29. Epilogue
Where are we now and where next?
30. Key Facts
Performance
Low Frequency Data: 100x faster
Equities Models: Hours → Seconds
Tick Data: 25x faster
Cost Savings
Parallel File System → Commodity SSDs
Proprietary Tick Store → MongoDB
Orders of magnitude $$$ savings…
Efficiencies
4 storage technologies → 1
Fully utilise expensive HPC resources
Support load on team down > 50%
Game Changers
Onboard Data: Days → Minutes
Data Versioning
The technology is no longer the bottleneck
“Peopleware”
Attract and retain great Quants
Attract and retain great Techies
And attend a great conference
31. Where Next?
1. Extend the data ecosystem further
2. Broader application across the company as a whole
3. Open Source?
32. Questions
Gary Collier
gcollier@ahl.com
James Blackburn
jblackburn@ahl.com
Editor's notes: Everything running orders of magnitude faster
Moving from proprietary tech to commodity hardware and MongoDB has realised significant cost savings
Complexity down, and getting more out of what we have, both in hardware and people
Including onboarding new data.
Peopleware: often overlooked, but really the most important factor in our sorts of industries
“The reason I love working here so much is because the technology is soooo good”