#MDBlocal
ETL for Pros: Getting Data Into MongoDB
Sam Harley, Senior Solutions Architect
Agenda
● Discuss the MongoDB document model
● Discuss how data can be migrated to MongoDB
● Discuss important MongoDB schema design considerations
● Talk through common approaches to MongoDB schema migration
● Outline how threading and batching can be leveraged for ETL purposes
Introduction
● At some point, most applications need to batch-load large amounts of data
○ Billions of rows
○ Huge initial load
○ Daily updates (CDC)
● This is typically facilitated using an ETL platform
Schema Design Principles
The Document Model
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: {
    type: 'Point',
    coordinates: [45.123, 47.232]
  },
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Documents are Rich Data Structures
{
  first_name: 'Paul',
  surname: 'Miller',
  cell: 447557505611,
  city: 'London',
  location: {
    type: 'Point',
    coordinates: [45.123, 47.232]
  },
  Profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Callouts:
● Fields
● Typed field values (for example, Number)
● Fields can contain arrays
● Fields can contain an array of sub-documents
Two Big ‘Rules’
● Consider how the data will be used
● Data that works together lives together
Example > Guitar Collectors
[Figure: ERD]
Terminology
Relational DB → MongoDB
Table → Collection
Column → Field
Row → Document
Example > Guitar Collectors
Relational
Consider your consumers: how will this data be used?
Example > Guitar Collectors
MongoDB - Embedding
Example > Guitar Collectors
Consider how the data will be used
Typical queries from a consumer of this data might be:
● “What guitars does Aimee Doe own?”
● “Show me all the people who own a Gibson Les Paul of any age”
● “How many people within 100 miles of Seattle own guitars made before 1970?”
How Do I Migrate Data?
• Ab Initio
• Talend
• Pentaho
• Informatica
How Do I Migrate Data?
WYOC (Write Your Own Code)
More challenging, but you have ultimate control
How do I do it efficiently?
Image: Julian Lim
An Example
ORDERS
ID  FIRST_NAME  LAST_NAME  SHIPPING_ADDRESS
1   James       Bond       Nassau, Bahamas, US
2   Ernst       Blofeldt   Caracas, Venezuela
ITEMS
ID  ORDER_ID  QTY  DESCRIPTION              PRICE
1   1         1    Aston Martin             120,000
2   1         1    Dinner Jacket            4,000
3   1         3    Champagne Veuve-Cliquot  200
4   2         100  Cat Food                 1
5   2         1    Launch Pad               1,000,000
TRACKING
ORDER_ID  TIMESTAMP            STATUS
1         1985-04-30 09:48:00  ORDERED
2         1985-04-23 01:30:22  ORDERED
2         1985-04-25 08:30:00  SHIPPED
2         1985-05-14 21:37:00  DELIVERED
{
"first_name" : "James",
"last_name" : "Bond",
"address" : "Nassau, Bahamas, US",
"items" : [
{ "qty": 1, "description" : "Aston Martin", "price" : 120000 },
{ "qty": 1, "description" : "Dinner Jacket", "price" : 4000 },
{ "qty": 3, "description" : "Champagne Veuve-Cliquot", "price": 200 }
],
"tracking" : [
{ "timestamp" : "1985-04-30 09:48:00", "status": "ORDERED" }
]
}
Three Common Approaches
Approach #1 – Nested Queries
for x in SELECT * FROM ORDERS
    doc = { "first_name" : x.first_name,
            "last_name" : x.last_name,
            "address" : x.shipping_address,
            "items" : [], "tracking" : [] }
    for y in SELECT * FROM ITEMS WHERE ORDER_ID = x.id
        doc.items.push (y)
    for z in SELECT * FROM TRACKING WHERE ORDER_ID = x.id
        doc.tracking.push (z)
    mongodb.insert (doc)
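The nested-query pseudocode can be tried out end to end with a small runnable sketch. Everything below is an assumption made for illustration: an in-memory SQLite database stands in for the MySQL source, a plain Python list stands in for the MongoDB collection, and the function names `make_source` and `nested_queries` are hypothetical.

```python
import sqlite3


def make_source():
    """Create an in-memory SQLite copy of the three example tables."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets rows be read by column name
    conn.executescript("""
        CREATE TABLE orders (id INTEGER, first_name TEXT, last_name TEXT,
                             shipping_address TEXT);
        CREATE TABLE items (id INTEGER, order_id INTEGER, qty INTEGER,
                            description TEXT, price INTEGER);
        CREATE TABLE tracking (order_id INTEGER, timestamp TEXT, status TEXT);
        INSERT INTO orders VALUES
            (1, 'James', 'Bond', 'Nassau, Bahamas, US'),
            (2, 'Ernst', 'Blofeldt', 'Caracas, Venezuela');
        INSERT INTO items VALUES
            (1, 1, 1, 'Aston Martin', 120000),
            (2, 1, 1, 'Dinner Jacket', 4000),
            (3, 1, 3, 'Champagne Veuve-Cliquot', 200),
            (4, 2, 100, 'Cat Food', 1),
            (5, 2, 1, 'Launch Pad', 1000000);
        INSERT INTO tracking VALUES
            (1, '1985-04-30 09:48:00', 'ORDERED'),
            (2, '1985-04-23 01:30:22', 'ORDERED'),
            (2, '1985-04-25 08:30:00', 'SHIPPED'),
            (2, '1985-05-14 21:37:00', 'DELIVERED');
    """)
    return conn


def nested_queries(conn):
    """Approach #1: run two extra source queries for every order."""
    collection = []  # stand-in for the target MongoDB collection
    queries = 1      # the outer ORDERS query
    for order in conn.execute("SELECT * FROM orders ORDER BY id"):
        doc = {"first_name": order["first_name"],
               "last_name": order["last_name"],
               "address": order["shipping_address"],
               "items": [], "tracking": []}
        for item in conn.execute(
                "SELECT qty, description, price FROM items "
                "WHERE order_id = ? ORDER BY id", (order["id"],)):
            doc["items"].append(dict(item))
        for state in conn.execute(
                "SELECT timestamp, status FROM tracking "
                "WHERE order_id = ? ORDER BY timestamp", (order["id"],)):
            doc["tracking"].append(dict(state))
        queries += 2
        collection.append(doc)  # one write per finished document
    return collection, queries


docs, query_count = nested_queries(make_source())
print(query_count, docs[0]["items"][0]["description"])
```

In this sketch the query counter grows by two for every order, which is the cost that the remaining approaches try to avoid.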
Fan-In & Fan-Out
Number of Database Operations per MongoDB Document
Into the ETL job (source reads): 1 per order + 2
Out of the ETL job (MongoDB writes): 1
Benchmark Results
Time (min)
Nested Queries: 14.5
Test setup:
• 1 million orders
• 10 million line items
• 3 million tracking states
• MySQL (local) to MongoDB (local)
• Python
Approach #2 – Build Documents In DB
for x in SELECT * FROM ORDERS
    doc = { "_id" : x.id,
            "first_name" : x.first_name,
            "last_name" : x.last_name,
            "address" : x.shipping_address,
            "items" : [], "tracking" : [] }
    mongodb.insert (doc)
for y in SELECT * FROM ITEMS
    mongodb.update ({"_id" : y.order_id},
                    {"$push" : {"items" : y}})
for z in SELECT * FROM TRACKING
    mongodb.update ({"_id" : z.order_id},
                    {"$push" : {"tracking" : z}})
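A rough, runnable imitation of this approach is sketched below. It makes several assumptions: plain tuples stand in for the source query results, a dict keyed by `_id` stands in for the MongoDB collection, and the hypothetical helper `push` only mimics what a `$push` update does to one document. The sketch counts target-side operations so the cost is visible.

```python
# Source rows as plain tuples (stand-ins for SELECT * results).
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
# (id, order_id, qty, description, price)
ITEMS = [(1, 1, 1, "Aston Martin", 120000),
         (2, 1, 1, "Dinner Jacket", 4000),
         (3, 1, 3, "Champagne Veuve-Cliquot", 200),
         (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
# (order_id, timestamp, status)
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"),
            (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"),
            (2, "1985-05-14 21:37:00", "DELIVERED")]


def push(collection, doc_id, field, value):
    """Mimic a $push update: append value to an array field of one document."""
    collection[doc_id][field].append(value)


def build_in_db(orders, items, tracking):
    """Approach #2: insert skeleton documents, then grow them with updates."""
    collection = {}  # stand-in for the MongoDB collection, keyed by _id
    operations = 0   # number of writes sent to the target
    for order_id, first, last, address in orders:
        collection[order_id] = {"_id": order_id, "first_name": first,
                                "last_name": last, "address": address,
                                "items": [], "tracking": []}
        operations += 1
    for _, order_id, qty, description, price in items:
        push(collection, order_id, "items",
             {"qty": qty, "description": description, "price": price})
        operations += 1
    for order_id, timestamp, status in tracking:
        push(collection, order_id, "tracking",
             {"timestamp": timestamp, "status": status})
        operations += 1
    return collection, operations


collection, operations = build_in_db(ORDERS, ITEMS, TRACKING)
print(operations)
```

The skeleton document has to carry the order ID as `_id`, otherwise the later updates have nothing to match on.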
Fan-In & Fan-Out
Number of Database Operations per MongoDB Document
Into the ETL job (source reads): 3 per order
Out of the ETL job (MongoDB writes): 1 + i + t
i = # of associated line items
t = # of associated tracking rows
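To get a feel for what 1 + i + t means in practice, the benchmark data set quoted earlier (1 million orders, 10 million line items, 3 million tracking states) can be plugged into the formula. The function name below is hypothetical, and the calculation assumes one insert per order plus one update per child row, as in the pseudocode.

```python
def build_in_db_operations(orders, items, tracking):
    """MongoDB operations for Approach #2: one insert per order
    plus one $push update per line item and per tracking row."""
    total = orders + items + tracking
    return {"total": total, "per_document": total / orders}


counts = build_in_db_operations(orders=1_000_000,
                                items=10_000_000,
                                tracking=3_000_000)
print(counts)  # {'total': 14000000, 'per_document': 14.0}
```

On average that is 14 writes per finished document, against a single insert when the document is assembled before it is written.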
Benchmark Results
Time (min)
Nested Queries: 14.5
Build in DB: 95.9
Approach #3 – Load It All Into Memory
db_items = SELECT * FROM ITEMS
db_tracking = SELECT * FROM TRACKING
for x in SELECT * FROM ORDERS
    doc = { "first_name" : x.first_name,
            "last_name" : x.last_name,
            "address" : x.shipping_address,
            "items" : [], "tracking" : [] }
    doc.items.pushAll (db_items.getAll(x.id))
    doc.tracking.pushAll (db_tracking.getAll(x.id))
    mongodb.insert (doc)
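One way to realise the `getAll` lookup is to group the child rows by order ID once, before the ORDERS loop starts. The sketch below assumes plain tuples in place of query results and a Python list in place of the MongoDB collection; `index_by_order` and `lookup_from_memory` are hypothetical names.

```python
from collections import defaultdict

ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
# (id, order_id, qty, description, price)
ITEMS = [(1, 1, 1, "Aston Martin", 120000),
         (2, 1, 1, "Dinner Jacket", 4000),
         (3, 1, 3, "Champagne Veuve-Cliquot", 200),
         (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
# (order_id, timestamp, status)
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"),
            (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"),
            (2, "1985-05-14 21:37:00", "DELIVERED")]


def index_by_order(rows, order_id_position):
    """Group rows into lists keyed by their order ID column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[order_id_position]].append(row)
    return grouped


def lookup_from_memory(orders, items, tracking):
    """Approach #3: hold all child rows in memory, then assemble documents."""
    items_by_order = index_by_order(items, 1)
    tracking_by_order = index_by_order(tracking, 0)
    collection = []  # stand-in for the target MongoDB collection
    for order_id, first, last, address in orders:
        collection.append({
            "first_name": first,
            "last_name": last,
            "address": address,
            "items": [{"qty": qty, "description": desc, "price": price}
                      for _, _, qty, desc, price
                      in items_by_order.get(order_id, [])],
            "tracking": [{"timestamp": ts, "status": status}
                         for _, ts, status
                         in tracking_by_order.get(order_id, [])],
        })
    return collection


docs = lookup_from_memory(ORDERS, ITEMS, TRACKING)
print(len(docs), len(docs[0]["items"]), len(docs[1]["tracking"]))  # 2 3 3
```

The trade-off is memory: both child tables must fit in the ETL process at once.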
Fan-In & Fan-Out
Number of Database Operations per MongoDB Document
Into the ETL job (source reads): 3 per order
Out of the ETL job (MongoDB writes): 1
Benchmark Results
Time (min)
Nested Queries: 14.5
Build in DB: 95.9
Lookup from Memory: 8.5
Co-Iteration
The walk-through steps through ORDERS, ITEMS and TRACKING (the same tables as above) together, building one document at a time.
Order 1, after reading its ORDERS row:
{
  "first_name" : "James",
  "last_name" : "Bond",
  "address" : "Nassau, Bahamas, US"
}
Order 1, after reading its three ITEMS rows and its one TRACKING row:
{
  "first_name" : "James",
  "last_name" : "Bond",
  "address" : "Nassau, Bahamas, US",
  "items" : [
    { ..., "description" : "Aston Martin", ... },
    { ..., "description" : "Dinner Jacket", ... },
    { ..., "description" : "Champagne...", ... }
  ],
  "tracking" : [
    { ... "1985-04-30 09:48:00", ... "ORDERED" }
  ]
}
Order 2, after reading its ORDERS row:
{
  "first_name" : "Ernst",
  "last_name" : "Blofeldt",
  "address" : "Caracas, Venezuela"
}
Order 2, after reading its two ITEMS rows and its three TRACKING rows:
{
  "first_name" : "Ernst",
  "last_name" : "Blofeldt",
  "address" : "Caracas, Venezuela",
  "items" : [
    { ..., "description" : "Cat Food", ... },
    { ..., "description" : "Launch Pad", ... }
  ],
  "tracking" : [
    { ... "1985-04-23 01:30:22", ... "ORDERED" },
    { ... "1985-04-25 08:30:00", ... "SHIPPED" },
    { ... "1985-05-14 21:37:00", ... "DELIVERED" }
  ]
}
Done!
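The walk-through can be read as a merge of three sorted streams. The sketch below is one possible reading, not the speaker's code. It assumes that all three inputs arrive sorted by order ID (as an ORDER BY on the source queries would produce) and that every child row belongs to an order that is present; plain tuples stand in for database cursors, and `co_iterate` is a hypothetical name.

```python
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
# (order_id, qty, description, price), sorted by order_id
ITEMS = [(1, 1, "Aston Martin", 120000),
         (1, 1, "Dinner Jacket", 4000),
         (1, 3, "Champagne Veuve-Cliquot", 200),
         (2, 100, "Cat Food", 1),
         (2, 1, "Launch Pad", 1000000)]
# (order_id, timestamp, status), sorted by order_id
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"),
            (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"),
            (2, "1985-05-14 21:37:00", "DELIVERED")]


def co_iterate(orders, items, tracking):
    """Advance three sorted streams in step and yield one document per order.

    Only the current row of each child stream is held in memory.
    """
    item_rows, tracking_rows = iter(items), iter(tracking)
    item = next(item_rows, None)
    state = next(tracking_rows, None)
    for order_id, first, last, address in orders:
        doc = {"first_name": first, "last_name": last, "address": address,
               "items": [], "tracking": []}
        # Consume every item row that belongs to the current order.
        while item is not None and item[0] == order_id:
            doc["items"].append({"qty": item[1], "description": item[2],
                                 "price": item[3]})
            item = next(item_rows, None)
        # Consume every tracking row that belongs to the current order.
        while state is not None and state[0] == order_id:
            doc["tracking"].append({"timestamp": state[1],
                                    "status": state[2]})
            state = next(tracking_rows, None)
        yield doc


docs = list(co_iterate(ORDERS, ITEMS, TRACKING))
print([len(d["items"]) for d in docs], [len(d["tracking"]) for d in docs])
# [3, 2] [1, 3]
```

Compared with holding both child tables in memory, this version needs only one row per stream at a time, at the price of requiring sorted input.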
Fan-In & Fan-Out
Number of Database Operations per MongoDB Document
Into the ETL job (source reads): 3 per order
Out of the ETL job (MongoDB writes): 1
Benchmark Results
Time (min)
Nested Queries: 14.5
Build in DB: 95.9
Lookup from Memory: 8.5
Co-Iteration: 8.1
Oh, And One More Thing …
Threading & Batching
[Figure: Batch Size, Threads, Throughput]
Fan-In & Fan-Out
Number of Database Operations per MongoDB Document
Into the ETL job (source reads): 3 per order
Out of the ETL job (MongoDB writes): 1 for every 1000 orders
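Batching and threading can be combined in a few lines. The sketch below is a minimal illustration under stated assumptions: `chunks` and `load` are hypothetical helpers, and `fake_insert_many` is a thread-safe stand-in for a real bulk write call (for example PyMongo's `insert_many`), so the example runs without a database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` elements."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load(documents, insert_many, batch_size=1000, threads=4):
    """Send documents to `insert_many` in batches from a pool of threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        written = list(pool.map(insert_many, chunks(documents, batch_size)))
    return sum(written)


# Stand-in target: a list guarded by a lock instead of a real collection.
store, store_lock = [], threading.Lock()


def fake_insert_many(batch):
    """Pretend bulk insert: store the batch and report how many were written."""
    with store_lock:
        store.extend(batch)
    return len(batch)


documents = ({"_id": n} for n in range(2500))
total = load(documents, fake_insert_many, batch_size=1000, threads=4)
print(total, len(store))  # 2500 2500
```

With a batch size of 1000, the 2500 documents above reach the target in three calls instead of 2500. Suitable values for batch size and thread count depend on the environment and are best found by measuring.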
Benchmark Results
Time (min)
Approach            Simple  Batch = 1000
Nested Queries      14.5    9.1
Build in DB         95.9    36.2
Lookup from Memory  8.5     4
Co-Iteration        8.1     3.9
Summary
● Remember the two big rules regarding MongoDB schema design
○ Consider how the data will be used
○ Data that works together lives together
● Consider the common approaches to MongoDB schema migration
● Prototype, prototype, prototype
● Don’t forget to utilise batching and threading
What Now?
● university.mongodb.com
● Ask The Experts
● Sydney MongoDB User Group
THANK YOU
FOR JOINING!

More Related Content

What's hot

Webinar: Strongly Typed Languages and Flexible Schemas
Webinar: Strongly Typed Languages and Flexible SchemasWebinar: Strongly Typed Languages and Flexible Schemas
Webinar: Strongly Typed Languages and Flexible SchemasMongoDB
 
Strongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible SchemasStrongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible SchemasNorberto Leite
 
Doing More with MongoDB Aggregation
Doing More with MongoDB AggregationDoing More with MongoDB Aggregation
Doing More with MongoDB AggregationMongoDB
 
MongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB PerformanceMongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB PerformanceMongoDB
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolatorMichael Limansky
 
Introduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWIntroduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWAnkur Raina
 
Back to Basics Webinar 3: Schema Design Thinking in Documents
 Back to Basics Webinar 3: Schema Design Thinking in Documents Back to Basics Webinar 3: Schema Design Thinking in Documents
Back to Basics Webinar 3: Schema Design Thinking in DocumentsMongoDB
 
Webinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsWebinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsMongoDB
 
Dev Jumpstart: Schema Design Best Practices
Dev Jumpstart: Schema Design Best PracticesDev Jumpstart: Schema Design Best Practices
Dev Jumpstart: Schema Design Best PracticesMongoDB
 
MongoDB - Back to Basics - La tua prima Applicazione
MongoDB - Back to Basics - La tua prima ApplicazioneMongoDB - Back to Basics - La tua prima Applicazione
MongoDB - Back to Basics - La tua prima ApplicazioneMassimo Brignoli
 
Back to Basics Webinar 2: Your First MongoDB Application
Back to Basics Webinar 2: Your First MongoDB ApplicationBack to Basics Webinar 2: Your First MongoDB Application
Back to Basics Webinar 2: Your First MongoDB ApplicationMongoDB
 
Back to Basics Webinar 3 - Thinking in Documents
Back to Basics Webinar 3 - Thinking in DocumentsBack to Basics Webinar 3 - Thinking in Documents
Back to Basics Webinar 3 - Thinking in DocumentsJoe Drumgoole
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMike Friedman
 
Back to Basics Webinar 1 - Introduction to NoSQL
Back to Basics Webinar 1 - Introduction to NoSQLBack to Basics Webinar 1 - Introduction to NoSQL
Back to Basics Webinar 1 - Introduction to NoSQLJoe Drumgoole
 
Mongo DB schema design patterns
Mongo DB schema design patternsMongo DB schema design patterns
Mongo DB schema design patternsjoergreichert
 
Building Your First App with MongoDB
Building Your First App with MongoDBBuilding Your First App with MongoDB
Building Your First App with MongoDBMongoDB
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDBrogerbodamer
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseMike Dirolf
 

What's hot (20)

Indexing
IndexingIndexing
Indexing
 
Webinar: Strongly Typed Languages and Flexible Schemas
Webinar: Strongly Typed Languages and Flexible SchemasWebinar: Strongly Typed Languages and Flexible Schemas
Webinar: Strongly Typed Languages and Flexible Schemas
 
Strongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible SchemasStrongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible Schemas
 
Doing More with MongoDB Aggregation
Doing More with MongoDB AggregationDoing More with MongoDB Aggregation
Doing More with MongoDB Aggregation
 
MongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB PerformanceMongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB Performance
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolator
 
Introduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWIntroduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUW
 
Back to Basics Webinar 3: Schema Design Thinking in Documents
 Back to Basics Webinar 3: Schema Design Thinking in Documents Back to Basics Webinar 3: Schema Design Thinking in Documents
ETL for Pros: Getting Data Into MongoDB

  • 1. #MDBlocal ETL for Pros: Getting Data Into MongoDB Sam Harley, Senior Solutions Architect
  • 2. Agenda
       ● Discuss the MongoDB document model
       ● Discuss how data can be migrated to MongoDB
       ● Discuss important MongoDB schema design considerations
       ● Talk through common approaches to MongoDB schema migration
       ● Outline how threading and batching can be leveraged for ETL purposes
  • 3. Introduction
       ● At some point, most applications need to batch-load large amounts of data
         ○ Billions of rows
         ○ Huge initial load
         ○ Daily updates (CDC)
       ● This is typically facilitated using an ETL platform
  • 5. The Document Model (diagram labels: MongoDB, RDBMS)
       {
         first_name: 'Paul',
         surname: 'Miller',
         city: 'London',
         location: {
           type: 'Point',
           coordinates: [45.123, 47.232]
         },
         cars: [
           { model: 'Bentley', year: 1973, value: 100000, … },
           { model: 'Rolls Royce', year: 1965, value: 330000, … }
         ]
       }
  • 6. Documents are Rich Data Structures
       {
         first_name: 'Paul',
         surname: 'Miller',
         cell: 447557505611,
         city: 'London',
         location: {
           type: 'Point',
           coordinates: [45.123, 47.232]
         },
         Profession: ['banking', 'finance', 'trader'],
         cars: [
           { model: 'Bentley', year: 1973, value: 100000, … },
           { model: 'Rolls Royce', year: 1965, value: 330000, … }
         ]
       }
       Callouts on the slide:
       ● Fields
       ● Typed field values
       ● Number
       ● Fields can contain arrays
       ● Fields can contain an array of sub-documents
  • 7. Two Big ‘Rules’
       ● Consider how the data will be used
       ● Data that works together lives together
  • 8. Example > Guitar Collectors: ERD
       Terminology (Relational DB → MongoDB):
       ● Table → Collection
       ● Column → Field
       ● Row → Document
  • 9. Example > Guitar Collectors: Relational
       Consider your consumers: how will this data be used?
  • 10. Example > Guitar Collectors: MongoDB - Embedding
  • 12. Example > Guitar Collectors: Consider how the data will be used
        Typical queries from a consumer of this data might be:
        ● “What guitars does Aimee Doe own?”
        ● “Show me all the people who own a Gibson Les Paul of any age”
        ● “How many people within 100 miles of Seattle own guitars made before 1970?”
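The first two queries on this slide can be sketched against embedded documents in plain Python. This is only an illustration under stated assumptions: the slide that shows the embedded schema is an image, so the field names (`name`, `guitars`, `make`, `model`, `year`), the sample collectors other than Aimee Doe, and the helper functions are all hypothetical. The two filter dictionaries show how the same questions might be phrased in MongoDB query syntax (`$elemMatch` is a standard query operator); they are not executed here.

```python
# Hypothetical embedded documents: one document per collector, with the
# guitars stored as an array of sub-documents. All names and values are
# made up for illustration.
collectors = [
    {"name": "Aimee Doe", "guitars": [
        {"make": "Gibson", "model": "Les Paul", "year": 1959},
        {"make": "Fender", "model": "Stratocaster", "year": 1972},
    ]},
    {"name": "Sam Roe", "guitars": [
        {"make": "Gibson", "model": "Les Paul", "year": 2004},
    ]},
    {"name": "Lee Poe", "guitars": [
        {"make": "Martin", "model": "D-28", "year": 1968},
    ]},
]


def guitars_owned_by(docs, name):
    """'What guitars does <name> own?' is a single-document lookup."""
    for doc in docs:
        if doc["name"] == name:
            return doc["guitars"]
    return []


def owners_of(docs, make, model):
    """'Who owns a <make> <model> of any age?' matches inside the array."""
    return [
        doc["name"]
        for doc in docs
        if any(g["make"] == make and g["model"] == model for g in doc["guitars"])
    ]


# The same two questions as MongoDB filter documents (assumed field names).
OWNER_FILTER = {"name": "Aimee Doe"}
LES_PAUL_FILTER = {"guitars": {"$elemMatch": {"make": "Gibson", "model": "Les Paul"}}}

aimee_guitars = guitars_owned_by(collectors, "Aimee Doe")
les_paul_owners = owners_of(collectors, "Gibson", "Les Paul")
```

Because each collector's guitars live inside the collector's document, both questions are answered from one collection without a join. The third query on the slide (distance from Seattle) would additionally need a geospatial field and index and is left out of this sketch.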
  • 13. How Do I Migrate Data? ● Ab Initio ● Talend ● Pentaho ● Informatica
  • 14. How Do I Migrate Data? WYOC (Write Your Own Code): more challenging, but you have ultimate control
  • 15. #MDBlocal How do I do it efficiently? Image: Julian Lim
  • 17. ORDERS
        ID  FIRST_NAME  LAST_NAME  SHIPPING_ADDRESS
        1   James       Bond       Nassau, Bahamas, US
        2   Ernst       Blofeldt   Caracas, Venezuela
      ITEMS
        ID  ORDER_ID  QTY  DESCRIPTION              PRICE
        1   1         1    Aston Martin             120,000
        2   1         1    Dinner Jacket            4,000
        3   1         3    Champagne Veuve-Cliquot  200
        4   2         100  Cat Food                 1
        5   2         1    Launch Pad               1,000,000
      TRACKING
        ORDER_ID  TIMESTAMP            STATUS
        1         1985-04-30 09:48:00  ORDERED
        2         1985-04-23 01:30:22  ORDERED
        2         1985-04-25 08:30:00  SHIPPED
        2         1985-05-14 21:37:00  DELIVERED
  • 20. {
        "first_name" : "James",
        "last_name" : "Bond",
        "address" : "Nassau, Bahamas, US",
        "items" : [
          { "qty": 1, "description" : "Aston Martin", "price" : 120000 },
          { "qty": 1, "description" : "Dinner Jacket", "price" : 4000 },
          { "qty": 3, "description" : "Champagne Veuve-Cliquot", "price": 200 }
        ],
        "tracking" : [
          { "timestamp" : "1985-04-30 09:48:00", "status": "ORDERED" }
        ]
      }
  • 26. Approach #1 – Nested Queries
      for x in SELECT * FROM ORDERS
        doc = { "first_name" : x.first_name, "last_name" : x.last_name,
                "address" : x.shipping_address, "items" : [], "tracking" : [] }
        for y in SELECT * FROM ITEMS WHERE ORDER_ID = x.id
          doc.items.push (y)
        for z in SELECT * FROM TRACKING WHERE ORDER_ID = x.id
          doc.tracking.push (z)
        mongodb.insert (doc)
  • 32. Fan-In & Fan-Out: Number of Database Operations per MongoDB Document. Source queries into the ETL job: 1, plus 2 per order. MongoDB operations out of the ETL job: 1 per order.
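A runnable sketch of the nested-queries approach follows. It is not the code used for the talk's benchmark: an in-memory SQLite database stands in for the relational source, a plain Python list stands in for the MongoDB collection, and the helper names `make_source` and `nested_queries` are made up for illustration. It counts source queries so the fan-in of one query plus two per order is visible.

```python
import sqlite3


def make_source():
    """Build an in-memory SQLite copy of the ORDERS/ITEMS/TRACKING tables."""
    db = sqlite3.connect(":memory:")
    db.row_factory = sqlite3.Row  # lets rows be read by column name
    db.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, first_name TEXT,
                             last_name TEXT, shipping_address TEXT);
        CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER,
                            qty INTEGER, description TEXT, price INTEGER);
        CREATE TABLE tracking (order_id INTEGER, timestamp TEXT, status TEXT);
        INSERT INTO orders VALUES
            (1, 'James', 'Bond', 'Nassau, Bahamas, US'),
            (2, 'Ernst', 'Blofeldt', 'Caracas, Venezuela');
        INSERT INTO items VALUES
            (1, 1, 1, 'Aston Martin', 120000),
            (2, 1, 1, 'Dinner Jacket', 4000),
            (3, 1, 3, 'Champagne Veuve-Cliquot', 200),
            (4, 2, 100, 'Cat Food', 1),
            (5, 2, 1, 'Launch Pad', 1000000);
        INSERT INTO tracking VALUES
            (1, '1985-04-30 09:48:00', 'ORDERED'),
            (2, '1985-04-23 01:30:22', 'ORDERED'),
            (2, '1985-04-25 08:30:00', 'SHIPPED'),
            (2, '1985-05-14 21:37:00', 'DELIVERED');
    """)
    return db


def nested_queries(db, target):
    """Build one document per order, issuing two child queries per order."""
    source_queries = 1  # the outer SELECT on orders
    for order in db.execute("SELECT * FROM orders ORDER BY id"):
        items = db.execute(
            "SELECT qty, description, price FROM items "
            "WHERE order_id = ? ORDER BY id", (order["id"],))
        tracking = db.execute(
            "SELECT timestamp, status FROM tracking "
            "WHERE order_id = ? ORDER BY timestamp", (order["id"],))
        source_queries += 2
        target.append({  # stand-in for one insert into MongoDB
            "first_name": order["first_name"],
            "last_name": order["last_name"],
            "address": order["shipping_address"],
            "items": [dict(row) for row in items],
            "tracking": [dict(row) for row in tracking],
        })
    return source_queries


collection = []  # stand-in for the target MongoDB collection
query_count = nested_queries(make_source(), collection)
```

For the two sample orders this issues five source queries and produces two documents. The number of round trips to the source grows with the number of orders, which is the cost this approach pays for its simplicity.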
  • 33. Benchmark Results, Time (min): Nested Queries 14.5. Test setup: ● 1 million orders ● 10 million line items ● 3 million tracking states ● MySQL (local) to MongoDB (local) ● Python
  • 34. Approach #2 – Build Documents In DB
      for x in SELECT * FROM ORDERS
        doc = { "_id" : x.id, "first_name" : x.first_name, "last_name" : x.last_name,
                "address" : x.shipping_address, "items" : [], "tracking" : [] }
        mongodb.insert (doc)
      for y in SELECT * FROM ITEMS
        mongodb.update ({"_id" : y.order_id}, {"$push" : {"items" : y}})
      for z in SELECT * FROM TRACKING
        mongodb.update ({"_id" : z.order_id}, {"$push" : {"tracking" : z}})
  • 40. Fan-In & Fan-Out: Number of Database Operations per MongoDB Document. Source queries into the ETL job: 3 in total. MongoDB operations out of the ETL job: 1 + i + t per order, where i = # of associated line items and t = # of associated tracking rows.
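The 1 + i + t write count can be checked with a small sketch. The `FakeCollection` class below is a hypothetical dict-backed stand-in for a MongoDB collection that supports only what this example needs (an insert and a `$push` update) and counts every call; the hard-coded tuples stand in for the three `SELECT *` result sets. With a real driver such as PyMongo the corresponding calls would be `insert_one` and `update_one`.

```python
# Rows as they would come back from the three SELECT * queries.
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
ITEMS = [(1, 1, 1, "Aston Martin", 120000),      # (id, order_id, qty,
         (2, 1, 1, "Dinner Jacket", 4000),       #  description, price)
         (3, 1, 3, "Champagne Veuve-Cliquot", 200),
         (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"),
            (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"),
            (2, "1985-05-14 21:37:00", "DELIVERED")]


class FakeCollection:
    """Dict-backed stand-in for a MongoDB collection; counts operations."""

    def __init__(self):
        self.docs = {}
        self.operations = 0

    def insert_one(self, doc):
        self.operations += 1
        self.docs[doc["_id"]] = doc

    def update_one(self, query, update):
        # Only {"_id": ...} filters and $push updates are supported here.
        self.operations += 1
        doc = self.docs[query["_id"]]
        for field, value in update["$push"].items():
            doc[field].append(value)


target = FakeCollection()

# Pass 1: one skeleton document per order, keyed by the order id.
for order_id, first, last, address in ORDERS:
    target.insert_one({"_id": order_id, "first_name": first,
                       "last_name": last, "address": address,
                       "items": [], "tracking": []})

# Pass 2 and 3: one update per child row.
for _item_id, order_id, qty, description, price in ITEMS:
    target.update_one({"_id": order_id}, {"$push": {"items": {
        "qty": qty, "description": description, "price": price}}})
for order_id, timestamp, status in TRACKING:
    target.update_one({"_id": order_id}, {"$push": {"tracking": {
        "timestamp": timestamp, "status": status}}})
```

The sample data needs 11 target operations (2 inserts, 5 item updates, 4 tracking updates) against only 3 source queries. The work has moved from the source database to the target, one small write at a time.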
  • 42. Approach #3 – Load It All Into Memory
      db_items = SELECT * FROM ITEMS
      db_tracking = SELECT * FROM TRACKING
      for x in SELECT * FROM ORDERS
        doc = { "first_name" : x.first_name, "last_name" : x.last_name,
                "address" : x.shipping_address, "items" : [], "tracking" : [] }
        doc.items.pushAll (db_items.getAll(x.id))
        doc.tracking.pushAll (db_tracking.getAll(x.id))
        mongodb.insert (doc)
  • 47. Fan-In & Fan-Out: Number of Database Operations per MongoDB Document. Source queries into the ETL job: 3 in total. MongoDB operations out of the ETL job: 1 per order.
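One way to realise the in-memory lookup is to group the child rows by order id in dictionaries before walking the orders. The sketch below assumes that is what `getAll` does in the pseudocode; the hard-coded tuples stand in for the three result sets and the `docs` list stands in for the MongoDB collection.

```python
from collections import defaultdict

# Rows as they would come back from the three SELECT * queries.
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
ITEMS = [(1, 1, 1, "Aston Martin", 120000),      # (id, order_id, qty,
         (2, 1, 1, "Dinner Jacket", 4000),       #  description, price)
         (3, 1, 3, "Champagne Veuve-Cliquot", 200),
         (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"),
            (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"),
            (2, "1985-05-14 21:37:00", "DELIVERED")]

# Group every child row under its order id: the whole of ITEMS and
# TRACKING is held in memory at once.
items_by_order = defaultdict(list)
for _item_id, order_id, qty, description, price in ITEMS:
    items_by_order[order_id].append(
        {"qty": qty, "description": description, "price": price})

tracking_by_order = defaultdict(list)
for order_id, timestamp, status in TRACKING:
    tracking_by_order[order_id].append(
        {"timestamp": timestamp, "status": status})

docs = []  # stand-in for the target MongoDB collection
for order_id, first, last, address in ORDERS:
    docs.append({  # stand-in for one insert into MongoDB
        "first_name": first,
        "last_name": last,
        "address": address,
        # .get avoids creating empty entries for orders with no children
        "items": items_by_order.get(order_id, []),
        "tracking": tracking_by_order.get(order_id, []),
    })
```

Each lookup is a dictionary access rather than a database query, but the approach only works while both child tables fit in the ETL process's memory.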
  • 50–58. Walking the ORDERS, ITEMS and TRACKING tables from slide 17 together, the document for ORDER_ID 1 is built one source row at a time: the order fields first, then each item, then each tracking row.
      { "first_name" : "James", "last_name" : "Bond", "address" : "Nassau, Bahamas, US",
        "items" : [ { ..., "description" : "Aston Martin", ... },
                    { ..., "description" : "Dinner Jacket", ... },
                    { ..., "description" : "Champagne...", ... } ],
        "tracking" : [ { ... "1985-04-30 09:48:00", ... "ORDERED" } ] }
  • 59–65. The document for ORDER_ID 2 is built the same way.
      { "first_name" : "Ernst", "last_name" : "Blofeldt", "address" : "Caracas, Venezuela",
        "items" : [ { ..., "description" : "Cat Food", ... },
                    { ..., "description" : "Launch Pad", ... } ],
        "tracking" : [ { ... "1985-04-23 01:30:22", ... "ORDERED" },
                       { ... "1985-04-25 08:30:00", ... "SHIPPED" },
                       { ... "1985-05-14 21:37:00", ... "DELIVERED" } ] }
  • 66. Done!
  • 67. Fan-In & Fan-Out: Number of Database Operations per MongoDB Document. Source queries into the ETL job: 3 in total. MongoDB operations out of the ETL job: 1 per order.
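The row-by-row walkthrough above appears to be the approach the benchmark slides label Co-Iteration. A minimal sketch of that idea follows, assuming all three result sets are sorted by order id (for example with `ORDER BY` on the source) so they can be advanced in step with only the current order's rows held in memory. The function names `take_matching` and `co_iterate` are made up for illustration, and the tuples stand in for database cursors.

```python
# Rows as three cursors would yield them, each sorted by order id.
ORDERS = [(1, "James", "Bond", "Nassau, Bahamas, US"),
          (2, "Ernst", "Blofeldt", "Caracas, Venezuela")]
ITEMS = [(1, 1, 1, "Aston Martin", 120000),      # (id, order_id, qty,
         (2, 1, 1, "Dinner Jacket", 4000),       #  description, price)
         (3, 1, 3, "Champagne Veuve-Cliquot", 200),
         (4, 2, 100, "Cat Food", 1),
         (5, 2, 1, "Launch Pad", 1000000)]
TRACKING = [(1, "1985-04-30 09:48:00", "ORDERED"),
            (2, "1985-04-23 01:30:22", "ORDERED"),
            (2, "1985-04-25 08:30:00", "SHIPPED"),
            (2, "1985-05-14 21:37:00", "DELIVERED")]


def take_matching(rows, current, order_id, key):
    """Consume rows belonging to order_id from a sorted iterator.

    Returns the matching rows and the first row not yet consumed.
    """
    # Skip child rows whose order id sorts before the current order.
    while current is not None and key(current) < order_id:
        current = next(rows, None)
    matched = []
    while current is not None and key(current) == order_id:
        matched.append(current)
        current = next(rows, None)
    return matched, current


def co_iterate(orders, items, tracking):
    """Yield one document per order by advancing three sorted streams."""
    items, tracking = iter(items), iter(tracking)
    item, track = next(items, None), next(tracking, None)
    for order_id, first, last, address in orders:
        own_items, item = take_matching(items, item, order_id, lambda r: r[1])
        own_tracking, track = take_matching(
            tracking, track, order_id, lambda r: r[0])
        yield {
            "first_name": first,
            "last_name": last,
            "address": address,
            "items": [{"qty": q, "description": d, "price": p}
                      for _id, _oid, q, d, p in own_items],
            "tracking": [{"timestamp": ts, "status": st}
                         for _oid, ts, st in own_tracking],
        }


docs = list(co_iterate(ORDERS, ITEMS, TRACKING))
```

Unlike the in-memory lookup, memory use here depends on the largest single order rather than on the size of the child tables, at the cost of requiring sorted input from the source.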
  • 68. Benchmark Results, Time (min): ● Nested Queries 14.5 ● Build in DB 95.9 ● Lookup from Memory 8.5 ● Co-Iteration 8.1
  • 69. #MDBlocal Oh, And One More Thing …
  • 71. Fan-In & Fan-Out: Number of Database Operations per MongoDB Document. Source queries into the ETL job: 3 in total. MongoDB operations out of the ETL job: 1 for every 1000 orders.
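Batching the writes can be sketched as below. The `CountingCollection` class is a hypothetical stand-in that records how many times `insert_many` is called; PyMongo's `Collection.insert_many` accepts a list of documents and an `ordered` flag in the same way. The helper names `batched` and `load` and the generated documents are assumptions for illustration.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


class CountingCollection:
    """Stand-in for a MongoDB collection; counts insert_many calls."""

    def __init__(self):
        self.docs = []
        self.calls = 0

    def insert_many(self, documents, ordered=True):
        self.calls += 1
        self.docs.extend(documents)


def load(documents, collection, batch_size=1000):
    """Write documents in batches instead of one insert per document."""
    for chunk in batched(documents, batch_size):
        # ordered=False does not require the inserts to be applied in
        # sequence, which suits a bulk load of independent documents.
        collection.insert_many(chunk, ordered=False)


target = CountingCollection()
# Any of the document-building approaches could feed this generator.
load(({"order": n} for n in range(2500)), target, batch_size=1000)
```

Here 2,500 documents reach the target in 3 calls rather than 2,500. Because `load` takes any iterable, it can consume a generator of documents without materialising them all first.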
  • 72. Benchmark Results, Time (min), Simple / Batch = 1000: ● Nested Queries 14.5 / 9.1 ● Build in DB 95.9 / 36.2 ● Lookup from Memory 8.5 / 4 ● Co-Iteration 8.1 / 3.9
  • 73. #MDBlocal Summary ● Remember the two big rules regarding MongoDB schema design ○ Consider how the data will be used ○ Data that works together lives together ● Consider the common approaches to MongoDB schema migration ● Prototype, prototype, prototype ● Don’t forget to utilise batching and threading
  • 74. What Now? ● university.mongodb.com ● Ask The Experts ● Sydney MongoDB User Group