2. Welcome to MongoDB Evenings Minneapolis!
Agenda
5:30pm: Pizza, Beer & Soft Drinks
6:00pm: Welcome
Joey Taralson, User Engagement, Vidku
Chris Moses, Regional Director, MongoDB
6:10pm: MongoDB is Cool, But When Should I Use It?
Matt Kalan, Senior Solutions Architect, MongoDB
7:00pm: Medtronic’s Journey With MongoDB
Jeff Lemmerman, Principal Software Engineer, Medtronic
Matthew Chimento, Engineering Manager, Medtronic
7:45pm: Announcements
Q&A
3. MongoDB is Cool,
But When Should I Use It?
Matt Kalan
Sr. Solutions Architect
matt.kalan@mongodb.com
@matthewkalan
7. The World Has Changed
Data
• Volume
• Velocity
• Variety
Time
• Iterative
• Agile
• Short Cycles
Risk
• Always On
• Scale
• Global
Cost
• Open-Source
• Cloud
• Commodity
10. Document Data Model
Relational
Customer ID  First Name  Last Name  City
0            John        Doe        New York
1            Mark        Smith      San Francisco
2            Jay         Black      Newark
3            Meagan      White      London
4            Edward      Daniels    Boston
Phone Number    Type  DNC     Customer ID
1-212-555-1212  home  T       0
1-212-555-1213  home  T       0
1-212-555-1214  cell  F       0
1-212-777-1212  home  T       1
1-212-777-1213  cell  (null)  1
1-212-888-1212  home  F       2
MongoDB
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones: [
    {
      number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    {
      number : "1-212-777-1213",
      type : "cell"
    }]
}
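The mapping between the two models can be sketched in a few lines of Python. This is only an illustration using the rows shown on the slide; the function name `to_document` is hypothetical, and plain tuples and dicts stand in for real tables and collections.

```python
# Rows copied from the two relational tables on the slide.
customers = [
    (0, "John", "Doe", "New York"),
    (1, "Mark", "Smith", "San Francisco"),
    (2, "Jay", "Black", "Newark"),
]
# (number, type, dnc, customer_id); None stands for the (null) DNC value.
phones = [
    ("1-212-555-1212", "home", True, 0),
    ("1-212-555-1213", "home", True, 0),
    ("1-212-555-1214", "cell", False, 0),
    ("1-212-777-1212", "home", True, 1),
    ("1-212-777-1213", "cell", None, 1),
    ("1-212-888-1212", "home", False, 2),
]


def to_document(customer, phone_rows):
    """Fold one customer row and its phone rows into a single nested document."""
    customer_id, first_name, last_name, city = customer
    embedded = []
    for number, phone_type, dnc, owner_id in phone_rows:
        if owner_id != customer_id:
            continue
        phone = {"number": number}
        if dnc is not None:  # a NULL column simply becomes an absent field
            phone["dnc"] = dnc
        phone["type"] = phone_type
        embedded.append(phone)
    return {
        "customer_id": customer_id,
        "first_name": first_name,
        "last_name": last_name,
        "city": city,
        "phones": embedded,
    }


mark = to_document(customers[1], phones)
```

Here `mark` has the same shape as the MongoDB document on the slide: the join on Customer ID is done once, when the document is built, instead of on every read.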
11. Document Model Benefits
{
  customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones: [
    {
      number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    {
      number : "1-212-777-1213",
      type : "cell"
    }]
}
Agility and flexibility
• Data model supports business change
• Rapidly iterate to meet new requirements
Intuitive, natural data representation
• Eliminates ORM layer
• Developers are more productive
Reduces the need for joins, disk seeks
• Programming is simpler
• Performance delivered at scale
12. Rich Functionality
MongoDB
{ customer_id : 1,
  first_name : "Mark",
  last_name : "Smith",
  city : "San Francisco",
  phones: [ {
      number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    {
      number : "1-212-777-1213",
      type : "cell"
    }]
}
Expressive Queries
• Find anyone with phone # “1-212…”
• Check if the person with number “555…” is on the “do not call” list
Geospatial
• Find the best offer for the customer at geo coordinates of 42nd St. and 6th Ave
Text Search
• Find all tweets that mention the firm within the last 2 days
Aggregation
• Count and sort number of customers by city
Native Binary JSON support
• Add an additional phone number to Mark Smith’s record without rewriting the document
• Select just the mobile phone number in the list
• Sort on the modified date
Left outer join ($lookup)
• Query for all San Francisco residents, look up their transactions, and sum the amount by person
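A few of the operations listed on this slide can be sketched as follows. The dictionaries use MongoDB's query, aggregation, and update operators (`$regex`, `$group`, `$sort`, `$push`) but are written out only as data and not sent to a server; the helper functions are hypothetical plain-Python equivalents that show what each operation would return, and the extra "work" phone number is made up for illustration.

```python
from collections import Counter

customers = [
    {"customer_id": 0, "last_name": "Doe", "city": "New York",
     "phones": [{"number": "1-212-555-1212", "dnc": True, "type": "home"}]},
    {"customer_id": 1, "last_name": "Smith", "city": "San Francisco",
     "phones": [{"number": "1-212-777-1212", "dnc": True, "type": "home"},
                {"number": "1-212-777-1213", "type": "cell"}]},
    {"customer_id": 2, "last_name": "Black", "city": "Newark",
     "phones": [{"number": "1-212-888-1212", "dnc": False, "type": "home"}]},
]

# MongoDB-style documents, shown as plain data (not executed here).
phone_prefix_filter = {"phones.number": {"$regex": "^1-212-777"}}
count_by_city_pipeline = [
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]
add_phone_update = {"$push": {"phones": {"number": "1-212-777-1214", "type": "work"}}}


def names_with_phone_prefix(docs, prefix):
    """Plain-Python equivalent of the prefix filter on the embedded phones array."""
    return [d["last_name"] for d in docs
            if any(p["number"].startswith(prefix) for p in d["phones"])]


def count_by_city(docs):
    """Plain-Python equivalent of the group-and-sort pipeline."""
    return Counter(d["city"] for d in docs).most_common()


def push_phone(doc, phone):
    """Plain-Python equivalent of $push: only the array grows, other fields stay."""
    doc["phones"].append(phone)


push_phone(customers[1], add_phone_update["$push"]["phones"])
```

The point of the sketch is that every operation works directly on the nested `phones` array, without a second table or a join.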
22. Common MongoDB Use Cases
• Single View
• Internet of Things
• Mobile
• Real-Time Analytics
• Catalog
• Personalization
• Content Management
23. Architecture Patterns
1. Operational Data Store (ODS)
2. Enterprise Data Service
3. Datamart/Cache
4. Master Data Distribution
5. Single Operational View
6. Operationalizing Hadoop
System of Record
System of Engagement
24. Top 15 Global Bank: Kicking Out Oracle
Global bank with 48M customers in 50 countries terminates Oracle ULA and makes MongoDB database of choice
Problem
• Slow development cycles due to the RDBMS’s rigid data model, hindering ability to meet business demands
• High TCO for hardware, licenses, development, and support (>$50M Oracle ULA)
• Poor overall performance of customer-facing and internal applications
Solution
• Building dozens of apps on MongoDB, both net new and migrations from Oracle (e.g., significant portion of retail banking, including customer-facing and back-office apps, fraud detection, card activation, equity research content mgt.)
• Flexible data model to develop apps quickly and accommodate diverse data
• Ability to scale infrastructure and costs elastically
Results
• Able to cancel Oracle ULA. Evaluating what apps can be migrated to MongoDB. For new apps, MongoDB is default choice
• Apps built in weeks instead of months or years, e.g., ebanking app prototyped in 2 weeks and in production in 4 weeks
• 70% TCO reduction
26. Best Fit for MongoDB over RDBMSs
Data
• Variably structured or unstructured
• Hierarchical objects
• Geo-coordinates
• Disparate sources
• Schema changes often
Querying
• Real-time analytics & aggregations
• Location-based
• Lowest latency
• Performance affects user experience
• Known relationships between entities
• Local reading/writing globally
Other requirements
• Agile development
• Fastest time-to-market
• Cloud infrastructure
• Data will grow quickly
• Lowest latency
• Highest throughput
• Always on (~99.999%) availability
• Lowest TCO
• Challenges today with RDBMS
27. Best Fit for MongoDB over NoSQL
Data
• Hierarchical objects
• Geo-coordinates
• Disparate sources
• Schema changes often
Querying
• Secondary indexes useful
• Strong consistency desired
• In-DB analytics & aggregations
• Geospatial (location-based)
• SQL-based access & BI
Other requirements
• Robust management tools
• Highest read/write concurrency
• Lowest TCO
• Full application DB
• Largest ecosystem
• Future proofing & recruiting
• Want to influence roadmap
• Commercial license desired
28. Potential Challenges with MongoDB Alone
(often can add another component)
Read/write patterns
• Graph queries (graph data OK)
• Advanced search queries (can add search engine)
• Pure data discovery (unknown relationships of entities)
• Disconnected consumers writing to DB server
• After schema design, still require atomic cross-document writes
• Require index intersection of many indexes (can add search engine)
Other requirements
• Drop-in replacement for RDBMS (often a schema change)
• Multiple threads desired for each query per partition (can shard)
30. Notes about transactional requirements
1. Consider the impact of writing without transactions for your use case; today many people are used to a slight delay online
2. Typically, writes for a user in one thread still happen before that same user reads the same data (e.g., web page confirmation)
3. With RDBMSs, transactions only work well within one DB; you will have the same issues among distributed systems and at scale
4. There are high-performance design patterns, e.g., Event Sourcing, that exist because RDBMSs do not perform well enough; they are a perfect fit for MongoDB
5. There could be some more app logic with MongoDB for a transaction, but consider whether the benefits outweigh that
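Event Sourcing, mentioned in note 4, can be illustrated with a minimal in-memory sketch. It assumes a transfer is recorded as one event document and that balances are derived by replaying the log; the names `record_transfer` and `balance` and the field names are hypothetical, and a Python list stands in for an append-only collection.

```python
def record_transfer(log, src, dst, amount):
    """Append one event describing both sides of the transfer.

    Because the whole transfer is a single document, writing it is a single
    insert rather than two coordinated updates.
    """
    event = {"seq": len(log), "type": "transfer", "from": src, "to": dst, "amount": amount}
    log.append(event)
    return event


def balance(log, account, opening=0):
    """Derive the current balance by replaying every event that touches the account."""
    total = opening
    for event in log:
        if event["from"] == account:
            total -= event["amount"]
        if event["to"] == account:
            total += event["amount"]
    return total


log = []
record_transfer(log, "A", "B", 30)
record_transfer(log, "B", "C", 10)
```

In this sketch no balance is ever updated in place, so there is no moment at which money has left one account but not yet arrived in the other. A real design would usually also keep a periodically refreshed balance so reads do not replay the full log.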
32. Logical Entity-Relationships
Contacts
• name
• company
• title
Addresses (N per contact)
• type
• street
• city
• state
• zip_code
Phones (N per contact)
• type
• number
Emails (N per contact)
• type
• address
Thumbnails (1 per contact)
• mime_type
• data
Portraits (1 per contact)
• mime_type
• data
Groups (N groups to N contacts)
• name
Twitters (1 per contact)
• name
• location
• web
• bio
33. Most common operations
Most common reads
1. (40%) View a list of contacts
2. (40%) View one contact record
3. (10%) View all members in a group
4. (10%) Other reads
Most common writes
1. (70%) Update contact
2. (15%) Update group membership
3. (15%) Other writes
90% Reads, 10% Writes
34. One-to-One Schema Design Choices
contact 1:1 twitter, referenced from contact
• contact.twitter_id
contact 1:1 twitter, referenced from twitter
• twitter.contact_id
Redundant to track relationship on both sides
• Both references must be updated for consistency
• May save a fetch?
Contact with embedded twitter (1)
• Contact.twitter
35. One-to-Many Schema Design Choices
contact 1:N phone, referenced from contact
• contact.phone_ids: [ ]
contact 1:N phone, referenced from phone
• phone.contact_id
Redundant to track relationship on both sides
• Both references must be updated for consistency
• Not possible in relational DBs
• Save a fetch?
Contact with embedded phones (N)
• Contact.phones
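The "save a fetch?" question can be made concrete with a small sketch. It assumes dictionaries keyed by id stand in for collections and that looking up all of a contact's phones by id counts as one additional fetch; the function names and sample values are hypothetical.

```python
# Referenced design: phones live in their own collection.
phones = {
    10: {"_id": 10, "type": "home", "number": "1-212-777-1212"},
    11: {"_id": 11, "type": "cell", "number": "1-212-777-1213"},
}
contacts_referenced = {1: {"_id": 1, "name": "Mark Smith", "phone_ids": [10, 11]}}

# Embedded design: the same phones are stored inside the contact.
contacts_embedded = {
    1: {"_id": 1, "name": "Mark Smith",
        "phones": [{"type": "home", "number": "1-212-777-1212"},
                   {"type": "cell", "number": "1-212-777-1213"}]},
}


def numbers_via_references(contact_id):
    """Return (numbers, fetches): one fetch for the contact, one for its phones."""
    contact = contacts_referenced[contact_id]                       # fetch 1
    numbers = [phones[pid]["number"] for pid in contact["phone_ids"]]  # fetch 2
    return numbers, 2


def numbers_via_embedding(contact_id):
    """Return (numbers, fetches): the phones arrive with the contact itself."""
    contact = contacts_embedded[contact_id]                         # fetch 1
    return [p["number"] for p in contact["phones"]], 1
```

Both functions return the same numbers; the embedded design needs one round trip instead of two, which matters most for the read-heavy "view one contact record" operation.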
36. Many-to-Many Schema Design Choices
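group N:N contact, referenced from group
• group.contact_ids: [ ]
group N:N contact, referenced from contact
• contact.group_ids: [ ]
Redundant to track relationship on both sides
• Both references must be updated for consistency
group with embedded contacts (N)
• group.contacts
contact with embedded groups (N)
• contact.groups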
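The consistency cost of tracking the relationship on both sides can be sketched as follows. This is an in-memory illustration with hypothetical helper names (`add_member`, `is_consistent`); dictionaries keyed by id stand in for the group and contact collections.

```python
groups = {"g1": {"name": "Friends", "contact_ids": []}}
contacts = {"c1": {"name": "Mark Smith", "group_ids": []},
            "c2": {"name": "John Doe", "group_ids": []}}


def add_member(group_id, contact_id):
    """Update BOTH sides of the relationship; skipping either leaves it inconsistent."""
    group = groups[group_id]
    contact = contacts[contact_id]
    if contact_id not in group["contact_ids"]:
        group["contact_ids"].append(contact_id)
    if group_id not in contact["group_ids"]:
        contact["group_ids"].append(group_id)


def is_consistent():
    """True when every group-to-contact reference has a matching contact-to-group one."""
    forward = {(g, c) for g, doc in groups.items() for c in doc["contact_ids"]}
    backward = {(g, c) for c, doc in contacts.items() for g in doc["group_ids"]}
    return forward == backward


add_member("g1", "c1")
consistent_after_add = is_consistent()

# A one-sided write (only the group is updated) breaks the invariant.
groups["g1"]["contact_ids"].append("c2")
consistent_after_one_sided_write = is_consistent()
```

Keeping the reference on only one side removes the second write, at the price of a query (rather than a direct read) when navigating from the other side.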
37. Optimized for workload
Contacts
• name
• company
• title
addresses (N per contact)
• type
• street
• city
• state
• zip_code
phones (N per contact)
• type
• number
emails (N per contact)
• type
• address
thumbnail (1 per contact)
• mime_type
• data
twitter (1 per contact)
• name
• location
• web
• bio
Portraits (1 per contact)
• mime_type
• data
Groups (N groups to N contacts)
• name
40. Logical Entity-Relationships
User
• user_id
• name
• company
• title
Addresses (N per user)
• type
• street
• city
• state
• zip_code
Phones (N per user)
• type
• number
Emails (N per user)
• type
• address
Shopping Cart (1 per user)
• cart_id
• num_items
• total_value
Item (N per shopping cart)
• item_id
• name
• price
• description
• amount
• quantity
41. Most common operations
Most common reads
1. (100%) View cart
Most common writes
1. (50%) Add item
2. (35%) Remove item
3. (15%) Change quantity
50% Reads 50% Writes
42. Logical Entity-Relationships
Shopping Cart
• cart_id
• num_items
• total_value
Item (N per shopping cart)
• item_id
• name
• price
• description
• amount
• quantity
Users (1 per shopping cart)
• user_id
• name
• company
• title
Addresses (N per user)
• type
• street
• city
• state
• zip_code
Phones (N per user)
• type
• number
Emails (N per user)
• type
• address
44. Logical Entity-Relationships
Customer
• customer_id
• name
• company
• title
Addresses (N per customer)
• type
• street
• city
• state
• zip_code
Phones (N per customer)
• type
• number
Emails (N per customer)
• type
• address
Account (N per customer)
• account_id
• acct_type
• open_date
• region_id
• balance
• last_txn_date
Transactions (N per customer, N per account)
• account_id
• customer_id
• date
• description
• amount
45. Most common operations
Most common reads
1. (70%) View account balance
2. (10%) View transactions
3. (20%) Other reads
Most common writes
1. (40%) Update balance
2. (20%) Insert debit card transaction
3. (20%) Transfer between accounts
4. (20%) Other writes
40% Reads, 60% Writes
46. Satisfies transactional requirements
Account Balances
• Balances: Array[1..*]
  • account_id
  • balance
  • last_txn_dates
Contacts
• customer_id
• name
• company
• title
Addresses
• type
• street
• city
• state
• zip_code
Phones
• type
• number
Emails
• type
• address
Transactions
• account_id
• customer_id
• date
• description
• amount
Account
• account_id
• acct_type
• open_date
• region_id
(Diagram: 1 and N cardinality markers connect the entities above.)
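One reading of this slide is that a customer's balances are grouped into an array inside a single Account Balances document, so a transfer between that customer's accounts changes only one document (MongoDB applies a write to a single document atomically). The sketch below illustrates that reading in memory; the `customer_id` key, the account ids, and the `transfer` function are assumptions for illustration and are not taken from the slide.

```python
import copy

account_balances = {
    "customer_id": 1,  # hypothetical key tying the document to one customer
    "balances": [
        {"account_id": "chk-1", "balance": 500},
        {"account_id": "sav-1", "balance": 1000},
    ],
}


def transfer(doc, src, dst, amount):
    """Return a new version of the document with the transfer applied.

    Both balances change together in the returned copy, or an error is raised
    and the original document is left untouched.
    """
    updated = copy.deepcopy(doc)
    by_id = {entry["account_id"]: entry for entry in updated["balances"]}
    if by_id[src]["balance"] < amount:
        raise ValueError("insufficient funds")
    by_id[src]["balance"] -= amount
    by_id[dst]["balance"] += amount
    return updated


after = transfer(account_balances, "chk-1", "sav-1", 200)
```

Because debit and credit live in the same document, there is no intermediate state in which only one of them has been applied. Transfers to another customer's accounts span two documents and are outside this sketch.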
48. Adding Instant Transfers to Friends
Most common reads
1. (70%) View account balance
2. (10%) View transactions
3. (20%) Other reads
Most common writes
1. (40%) Update balance
2. (20%) Insert debit card transaction
3. (20%) Transfer between accounts
4. (10%) Instant Transfer to Friends
5. (10%) Other writes
40% Reads, 60% Writes
51. Common Batch Loading Questions
What about loading 100,000 new documents atomically?
1. App queries only for data older than a timestamp marking the beginning of the batch load
2. Once the load is complete, notify the app to update the timestamp
What about upserting 100,000 documents?
1. Copy the collection and do the upserts on the copied collection
2. Once complete, change the variable that holds the name of the collection so it points to the copy
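The timestamp approach for batch inserts can be sketched in memory as follows. It assumes each document carries a hypothetical `loaded_at` field, that integers stand in for timestamps, and that the app keeps a cutoff it applies to every query.

```python
def visible_documents(docs, cutoff):
    """Documents the app may see: only those stamped before its cutoff."""
    return [d for d in docs if d["loaded_at"] < cutoff]


collection = [{"_id": 1, "loaded_at": 10}, {"_id": 2, "loaded_at": 10}]

app_cutoff = 20   # timestamp marking the beginning of the batch load
batch_stamp = 20  # every document in the new batch gets this stamp

# Mid-load: only part of the batch has been written so far.
collection.append({"_id": 3, "loaded_at": batch_stamp})
mid_load_ids = [d["_id"] for d in visible_documents(collection, app_cutoff)]

# The rest of the batch arrives, then the app is told to move its timestamp.
collection.append({"_id": 4, "loaded_at": batch_stamp})
app_cutoff = 30
after_load_ids = [d["_id"] for d in visible_documents(collection, app_cutoff)]
```

Readers never observe a half-loaded batch: the new documents become visible all at once when the single cutoff value changes.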
52. Questions About Versioning?
What if I want to upsert 1000 documents but most of the collection stays the same?
1. While loading, app queries for documents with a timestamp not later than the start of the load
2. Insert new documents/versions with a later timestamp
3. Once the job is done, queries include the timestamp at the end of the load and return the latest document (sort & limit if appropriate)
What if I want to read and write different historical versions of data?
1. Have a transaction log collection for writing all versions with system & business timestamps
2. Often also have a materialized view collection with the latest values (queried most often)
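The "latest version as of a timestamp" read can be sketched in memory. The field names `key`, `ts`, and `value`, the sample data, and the function `latest_as_of` are hypothetical; the sort-then-take-first step mirrors the "sort & limit" hint above.

```python
versions = [
    {"key": "acct-1", "ts": 100, "value": "v1"},
    {"key": "acct-1", "ts": 200, "value": "v2"},  # new version written by the load
    {"key": "acct-2", "ts": 100, "value": "w1"},  # unchanged by the load
]


def latest_as_of(docs, key, as_of):
    """Return the newest version of `key` with a timestamp not later than `as_of`."""
    candidates = [d for d in docs if d["key"] == key and d["ts"] <= as_of]
    candidates.sort(key=lambda d: d["ts"], reverse=True)  # sort newest first
    return candidates[0] if candidates else None          # limit 1


during_load = latest_as_of(versions, "acct-1", as_of=150)
after_load = latest_as_of(versions, "acct-1", as_of=200)
unchanged = latest_as_of(versions, "acct-2", as_of=200)
```

While the load is running the app keeps `as_of` at the load's start and sees `v1`; once the job is done it moves `as_of` forward and sees `v2`, while documents that were not touched are returned unchanged.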
53. Summary Thoughts
• MongoDB is a general purpose database covering most use cases, with better agility, performance, TCO, and HA
• Often schema design is necessary to determine whether it is a good fit (esp. for transactional requirements)
• Cross-document transactions can always be done if necessary, but consider the effort vs. benefits of MongoDB
• MongoDB, Inc. is happy to help you determine if it’s a good fit
54. For More Information
Resource                          Location
Case Studies                      mongodb.com/customers
Presentations                     mongodb.com/presentations
Free Online Training              education.mongodb.com
Webinars and Events               mongodb.com/events
Documentation                     docs.mongodb.org
MongoDB Downloads                 mongodb.com/download
Additional Info                   info@mongodb.com
Options for Transactional Writes  Blog Post
Data Modeling Documentation       https://docs.mongodb.org/manual/data-modeling/
55. 2nd annual Open Source North Conference — a gathering of
open source enterprise developers and industry experts,
where you can learn, share and connect.
• Thursday, June 9, Normandale Community College
• Speakers from Google, Microsoft, Red Hat, Elastic, MongoDB, Target, Mozilla, SmartThings, and Code 42, just to name a few
• Early Bird Registration opens Monday, April 4 ($175)
• Drawing tonight for 2 free passes!