1. MongoDB is well-suited for reference data solutions due to its dynamic and flexible schema, built-in replication and high availability features, and tag aware sharding which allows for geographic distribution of data.
2. A case study of a global broker dealer showed how MongoDB could replace expensive and complex ETL processes for distributing reference data, with projected savings of about $40 million in costs and penalties over 5 years.
3. Key benefits included real-time data distribution, faster querying of local data, and avoiding regulatory penalties from delays in data distribution.
6. Relational Database Challenges
Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data
Agile Development
• Iterative
• Short development cycles
• New workloads
Volume of Data
• Petabytes of data
• Trillions of records
• Tens of millions of queries per second
New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing
7. Financial Services Use Cases
1. Risk Analysis & Reporting
2. Tick Data Capture & Analysis
3. Portfolio and P&L Reporting
4. Product Catalog and Trade Lifecycle Management
5. Trade Repository
6. Quantitative Analysis & Automated Trading
7. Order Capture
8. Reference Data
8. Reference Data
• How do you globally distribute reference data?
– Polymorphic data
• Price / Products / Securities Master
• Counterparty information - KYC
• Corporate Actions
• Golden / single source of truth
– Often changing in structure
• e.g. new products
– Often high volume
• How is this typically solved today?
9. Current Implementations
• What do reference data solutions look like today?
• Storage
– Relational Database or Caching Technologies
• Replication
– ETL or Messaging
• Complex, Costly and Brittle
– Maintenance
• schema changes
• infrastructure
– Multiple technologies
10. Why MongoDB?
• What features in MongoDB are ideally suited for globally replicated reference data systems?
1. Dynamic and flexible schema
11. Relational: All Data is Column/Row
IssID    IssuerName                   PVCurrency
117883   DWS Vietnam Fund             USD
69461    Independence III Cdo Ltd     USD
102862   Zamano Plc                   EUR
73277    Green Way                    BMD
65134    First European Growth Inc.   CHF

SecID    EventID   Company_Meeting   IssID
762288   407341    AGM               117883
81198    243459    SDCHG             69461
422999   410626    AGM               102862
422999   243440    SDCHG             102862
75128    20056     ISCHG             65134
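The two relational tables above could be folded into one document per issuer, with that issuer's events embedded as an array. The following is a minimal sketch in plain Python dictionaries, which mirror the JSON-like shape of a MongoDB document. The row values come from the slide; the field name `events` and the helper `to_documents` are illustrative assumptions, not something the slides define.

```python
# Rows copied from the two relational tables on this slide.
issuers = [
    # (IssID, IssuerName, PVCurrency)
    (117883, "DWS Vietnam Fund", "USD"),
    (69461, "Independence III Cdo Ltd", "USD"),
    (102862, "Zamano Plc", "EUR"),
    (73277, "Green Way", "BMD"),
    (65134, "First European Growth Inc.", "CHF"),
]
events = [
    # (SecID, EventID, Company_Meeting, IssID)
    (762288, 407341, "AGM", 117883),
    (81198, 243459, "SDCHG", 69461),
    (422999, 410626, "AGM", 102862),
    (422999, 243440, "SDCHG", 102862),
    (75128, 20056, "ISCHG", 65134),
]


def to_documents(issuers, events):
    """Fold event rows into their issuer, giving one document per issuer."""
    docs = {}
    for iss_id, name, currency in issuers:
        docs[iss_id] = {
            "IssID": iss_id,
            "IssuerName": name,
            "PVCurrency": currency,
            "events": [],  # hypothetical field name for the embedded rows
        }
    for sec_id, event_id, meeting, iss_id in events:
        docs[iss_id]["events"].append(
            {"SecID": sec_id, "EventID": event_id, "Company_Meeting": meeting}
        )
    return list(docs.values())


documents = to_documents(issuers, events)
zamano = next(d for d in documents if d["IssID"] == 102862)
print(zamano["IssuerName"], len(zamano["events"]))  # Zamano Plc 2
```

Reading an issuer together with its events then touches a single document instead of joining two tables on IssID.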
13. Benefits of MongoDB’s Document Model
• Expressiveness of Data Modeling
– A single document can express and encompass a wide variety of notions
• Flexible Modeling
– No need to migrate for simple extensions
• Simplification of Data Modeling
– Fewer collections, as most data can be encapsulated in a single document
• Easier Development
– Developers understand documents, as they map well to their data structures
• Faster Time to Market
– Agile development means faster results
And enables better data locality => faster performance and scaling
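The "no need to migrate for simple extensions" point can be illustrated with a small sketch. It models a collection as a plain Python list of dictionaries rather than talking to a real MongoDB server; every field name and value, and the helper `find`, are hypothetical and chosen only to show documents of differing shapes living side by side.

```python
# A "collection" modeled as a plain list of dictionaries. All field names
# and values below are hypothetical, chosen only to show differing shapes.
securities = [
    {"type": "equity", "ticker": "ABC", "currency": "USD"},
    {"type": "bond", "issuer": "XYZ", "currency": "EUR", "coupon": 4.5},
]

# A new product type arrives with a field no existing document has.
# Nothing about the existing documents needs to change.
securities.append(
    {"type": "swap", "currency": "USD",
     "legs": [{"rate": "fixed"}, {"rate": "floating"}]}
)


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion.

    Documents that lack a requested field do not match.
    """
    return [
        doc for doc in collection
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]


usd = find(securities, currency="USD")
print([doc["type"] for doc in usd])  # ['equity', 'swap']
```

In a relational schema the same extension would typically need a new column or table before the first swap could be stored.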
14. Why MongoDB?
• What features in MongoDB are ideally suited for globally replicated reference data systems?
1. Dynamic and flexible schema
2. Built-in replication and high availability
15. High Availability
• Automated replication and failover
• Multi-data center support
• Improved operational simplicity (e.g., HW swaps)
• Data durability and consistency
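As a rough illustration of multi-data center replication, the sketch below builds a replica set configuration document as a plain Python dictionary, in the `_id` / `members` shape that a replica set is initiated with. The set name, host names, the `dc` tag name, and both helper functions are assumptions made for this example, and the code only constructs the dictionary; it does not contact a server.

```python
def build_replica_set_config(set_name, hosts_by_datacenter):
    """Build a replica set configuration document as a plain dictionary.

    hosts_by_datacenter maps a data center label to a list of "host:port"
    strings. Labels and hosts are hypothetical placeholders.
    """
    members = []
    for datacenter, hosts in hosts_by_datacenter.items():
        for host in hosts:
            members.append({
                "_id": len(members),         # member ids must be unique
                "host": host,
                "tags": {"dc": datacenter},  # "dc" is an assumed tag name
            })
    return {"_id": set_name, "members": members}


def votes_needed_for_primary(config):
    """Majority needed to elect a primary, assuming every member votes."""
    return len(config["members"]) // 2 + 1


config = build_replica_set_config("refdata", {
    "nyc": ["nyc-1.example.com:27017", "nyc-2.example.com:27017"],
    "ldn": ["ldn-1.example.com:27017"],
})
print(len(config["members"]), votes_needed_for_primary(config))  # 3 2
```

With three voting members, the set can lose any single member and still elect a primary, which is what makes automated failover possible during a hardware swap.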
17. Why MongoDB?
• What features in MongoDB are ideally suited for globally replicated reference data systems?
1. Dynamic and flexible schema
2. Built-in replication and high availability
3. Tag Aware Sharding (Geo)
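The idea behind tag aware sharding can be sketched as a small simulation: shards carry tags, ranges of the shard key are assigned to tags, and a document is placed only on shards whose tag covers its key. The code below is a pure-Python model of that rule, not MongoDB's implementation; the shard names, tags, region keys, and the function `shards_for_key` are all hypothetical. In the mongo shell, the corresponding helpers are `sh.addShardTag` and `sh.addTagRange`, which this sketch does not call.

```python
# Shards labeled with tags, and shard key ranges assigned to tags.
# All shard names, tags, and key values here are hypothetical.
shard_tags = {
    "shard0": {"NYC"},
    "shard1": {"LDN"},
    "shard2": {"TKY"},
}

# Each entry: (inclusive lower bound, exclusive upper bound, tag).
tag_ranges = [
    ("AMER", "AMER~", "NYC"),
    ("APAC", "APAC~", "TKY"),
    ("EMEA", "EMEA~", "LDN"),
]


def shards_for_key(region_key):
    """Return the shards allowed to hold a document with this shard key."""
    for lower, upper, tag in tag_ranges:
        if lower <= region_key < upper:
            return sorted(s for s, tags in shard_tags.items() if tag in tags)
    # Keys outside every tagged range may live on any shard.
    return sorted(shard_tags)


print(shards_for_key("EMEA"))   # ['shard1']
print(shards_for_key("LATAM"))  # ['shard0', 'shard1', 'shard2']
```

Pinning a region's key range to shards in that region keeps the data physically close to the users who query it.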
20. Case Study: Global Broker Dealer - Reference Data Management
[Diagram: a source master data RDBMS feeds several destination data RDBMSs, each through its own ETL process]
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
Each represents
• People $
• Hardware $
• License $
• Reg penalty $
• & other downstream problems
21. Solution with MongoDB
[Diagram: a MongoDB primary replicates in real time to several MongoDB secondaries]
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
Each represents
• No people $
• Less hardware $
• Less license $
• No penalty $
• & far fewer problems
22. Case Study: Global investment bank
Distribute reference data globally in real time for fast local access and querying
Problem
• Delays up to 20 hours in distributing data via ETL
• Had to manage 20 distributed systems with same data
• Incurring regulatory penalties from missing SLAs
• Stale data caused operational issues
Why MongoDB
• Dynamic schema management: update immediately & in one place
• Auto-replication: data distributed in real time
• Both cache and database: cache always up-to-date
• Simple data modeling & analysis: easy changes and understanding
Results
• Will save about $40,000,000 in costs and penalties over 5 years
• Greater throughput means charging more to internal groups
• Network and disk speed is the bottleneck, not software and applications
23. Summary
• Why MongoDB for Reference Data solutions?
1. Dynamic and flexible schema
2. Built-in replication and high availability
3. Tag Aware Sharding (Geo)
24. Q&A
Up And Coming
FS webinar in April - Tick database
• http://www.10gen.com/webinar/using-mongodb-as-tick-database
FS webinar in April - Risk
• http://www.10gen.com/webinar/mitigate-risk-with-mongodb
MongoDB Days - London, San Francisco, and NYC
• http://www.10gen.com/events
MongoDB 2.4 Release
• http://www.mongodb.org/downloads
25. Key Features
• JSON Data Model with Dynamic Schema
• Auto-Sharding for Horizontal Scalability
• Rich, Document-Based Queries
• Flexible, Full Index Support
• Built-In Replication and High Availability
• Fast, In-Place Updates
• Aggregation Framework and Map/Reduce
• GridFS for Large File Storage
26. For More Information
Resource Location
MongoDB Downloads www.mongodb.org/download
Free Online Training education.10gen.com
Webinars and Events www.10gen.com/events
White Papers www.10gen.com/white-papers
Customer Case Studies www.10gen.com/customers
Presentations www.10gen.com/presentations
Documentation docs.mongodb.org
Additional Info info@10gen.com