Introduction to MongoDB

Intro To MongoDB
Buzz Moschetti
Enterprise Architect, MongoDB
buzz.moschetti@mongodb.com
@buzzmoschetti

Who is Talking To You?
• Yes, I use “Buzz” on my business cards
• Former Investment Bank Chief Architect at JPMorganChase and Bear Stearns
• Over 30 years of designing and building systems
• Big and small
• Super-specialized to broadly useful in any vertical
• “Traditional” to completely disruptive
• Advocate of language leverage and strong factoring
• Inventor of perl DBI/DBD
• Not an award winner for PowerPoint
• Still programming – using emacs, of course

Agenda
• What is MongoDB?
• What are some good use cases?
• How do I use it?
• How do I deploy it?

MongoDB: The Leading NoSQL Database
• Document Data Model
• Open-Source
• Fully Featured
• High Performance
• Scalable
{
  name: "John Smith",
  pfxs: ["Dr.", "Mr."],
  address: "10 3rd St.",
  phones: [
    { number: "555-1212", type: "land" },
    { number: "444-1212", type: "mobile" }
  ]
}

MongoDB Enterprise Edition
The best way to run MongoDB: Automated. Supported. Secured.
Features beyond those in the community edition:
• Enterprise-Grade Support
• Commercial License
• Ops Manager or Cloud Manager Premium
• Encrypted & In-Memory Storage Engines
• MongoDB Compass
• BI Connector (SQL Bridge)
• Advanced Security
• Platform Certification
• On-Demand Training

Company Vital Stats
• 500+ employees
• 2000+ customers
• Over $311 million in funding
• Offices in NY and Palo Alto, and across EMEA and APAC

Terminology
RDBMS      MongoDB
Database   Database
Table      Collection
Index      Index
Row        Document
Column     Field
Join       Embedding & Linking & $lookup

The Data Is The Schema
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  thumbnail: BinData(0,"AREhMQ=="),
  publisher: {
    name: "O'Reilly Media",
    founded: 1980,
    locations: ["CA", "NY"]
  }
}

> db.authors.find()
{
  _id: "X12",
  name: { first: "Kristina", last: "Chodorow" },
  personalData: {
    favoritePets: [ "bird", "dog" ],
    awards: [ {name: "Hugo", when: 1983}, {name: "SSFX", when: 1992} ]
  }
}
{
  _id: "Y45",
  name: { first: "Mike", last: "Dirolf" },
  personalData: {
    dob: ISODate("1970-04-05")
  }
}
Treat Your Data More Like Objects

Documents Natively Map to Language
// Java: maps
DBObject query = new BasicDBObject("publisher.founded", 1980);
Map m = (Map) collection.findOne(query);
Date pubDate = (Date) m.get("published_date"); // java.util.Date
List locs = (List) ((Map) m.get("publisher")).get("locations"); // locations is nested under publisher
// JavaScript: objects
m = collection.findOne({"publisher.founded" : 1980});
pubDate = m.published_date; // ISODate
year = pubDate.getUTCFullYear();
# Python: dictionaries
m = coll.find_one({"publisher.founded" : 1980})
pubYear = m["published_date"].year # published_date is a datetime.datetime

Traditional Data Design
• Static, Uniform Scalar Data
• Rectangles
• Low-level, physical representation

Document Data Design
• Flexible, Rich Shapes
• Objects
• High-level, business representation

MongoDB 3.0 Set The Stage…
7x-10x Performance, 50%-80% Less Storage
How: WiredTiger Storage Engine
• Same data model, query language, & ops
• 100% backwards compatible API
• Non-disruptive upgrade
• Storage savings driven by native
compression
• Write performance gains driven by
– Document-level concurrency control
– More efficient use of HW threads
• Much better ability to scale vertically
[Chart: Performance, MongoDB 2.6 vs. MongoDB 3.0]

MongoDB 3.2: Efficient Enterprise MongoDB
• Much better ability to scale vertically
+
• Document Validation Rules
• Encryption at rest
• BI Connector (SQL bridge)
• MongoDB Compass
• New Relic & AppDynamics integration
• Backup snapshots on filesystem
• Advanced Full-text languages
• $lookup (“left outer JOIN”)
= More general-purpose solutions

MongoDB Sweet Spot Use Cases
• Big Data
• Product & Asset Catalogs
• Security & Fraud
• Internet of Things
• Database-as-a-Service
• Mobile Apps
• Customer Data Management
• Single View
• Social & Collaboration
• Content Management
• Complex Data Management
• Embedded / ISV
Example adopters shown on the slide: Intelligence Agencies; Top Investment and Retail Banks (listed three times); Top Global Shipping Company; Top Industrial Equipment Manufacturer; Top Media Company; Cushman & Wakefield

https://www.mongodb.com/download-center

Unpack and Start The Server
$ tar xf mongodb-osx-x86_64-enterprise-3.2.0.tgz
$ mkdir -p ~/mydb/data
$ mongodb-osx-x86_64-enterprise-3.2.0/bin/mongod \
> --dbpath ~/mydb/data \
> --logpath ~/mydb/mongod.log \
> --fork
about to fork child process, waiting until server is
ready for connections.
forked process: 6517
child process started successfully, parent exiting

Verify Operation
$ mongodb-osx-x86_64-enterprise-3.2.0/bin/mongo
MongoDB shell version: 3.2.0
connecting to: 127.0.0.1:27017/test
Server has startup warnings:
2016-01-01T12:44:01.646-0500 I CONTROL [initandlisten]
2016-01-01T12:44:01.646-0500 I CONTROL [initandlisten] ** WARNING:
soft rlimits too low. Number of files is 256, should be at least
1000
MongoDB Enterprise > use mug
switched to db mug
MongoDB Enterprise > db.foo.insert({name:"bob",hd: new ISODate()});
MongoDB Enterprise > db.foo.insert({name:"buzz"});
MongoDB Enterprise > db.foo.insert({pets:["dog","cat"]});
MongoDB Enterprise > db.foo.find();
{ "_id" : ObjectId("5686cef538ea4981e63111dd"), "name" : "bob", "hd"
: ISODate("2016-01-01T19:09:41.442Z") }
{ "_id" : ObjectId("5686…79d5"), "name" : "buzz" }
{ "_id" : ObjectId("5686…79d6"), "pets" : [ "dog", "cat" ] }

The Simple Java App
import com.mongodb.client.*;
import com.mongodb.*;
import java.util.Map;
public class mug1 {
    public static void main(String[] args) {
        try {
            MongoClient mongoClient = new MongoClient();
            MongoDatabase db = mongoClient.getDatabase("mug");
            MongoCollection coll = db.getCollection("foo");
            MongoCursor c = coll.find().iterator();
            while (c.hasNext()) {
                Map doc = (Map) c.next(); // each result is a Document, which implements Map
                System.out.println(doc);
            }
        } catch (Exception e) {
            // ...
        }
    }
}

Compile and Run!
$ curl -o mongo-java-driver-3.0.4.jar \
> https://oss.sonatype.org/content/repositories/releases/org/mongodb/mongo-java-driver/3.0.4/mongo-java-driver-3.0.4.jar
$ javac -cp mongo-java-driver-3.0.4.jar:. mug1.java
$ java -cp mongo-java-driver-3.0.4.jar:. mug1
(logger output)
Document{{_id=5686cef538ea4981e63111dd, name=bob,
hd=Fri Jan 01 14:09:41 EST 2016}}
Document{{_id=5686c71338ea4981e63111d6, name=buzz}}
Document{{_id=5686c71938ea4981e63111d7, pets=[dog, cat]}}

The Same App in python
from pymongo import MongoClient
client = MongoClient()
db = client.mug
coll = db.foo
for c in coll.find():
    print(c)
$ python mug1.py
{u'_id': ObjectId('5686cef538ea4981e63111dd'), u'name': u'bob',
u'hd': datetime.datetime(2016, 1, 1, 19, 9, 41, 442000)}
{u'_id': ObjectId('5686f54b38ea4981e631124c'), u'name': u'buzz'}
{u'_id': ObjectId('5686f55138ea4981e631124d'), u'pets': [u'dog',
u'cat']}

…and, as expected in Perl…
$ perl -MMongoDB -MData::Dumper -e 'my $c =
MongoDB::MongoClient->new()->get_database("mug")
->get_collection("foo")->find(); while($c->has_next()) {
print Dumper($c->next()); }'
$VAR1 = { '_id' => bless( {'value' => '5686cef538ea4981e63111dd'},
                          'MongoDB::OID' ),
          'hd' => bless( {
                    'local_rd_secs' => 68981,
                    'rd_nanosecs' => 442000000, # etc
                  }, 'DateTime' ),
          'name' => 'bob'
        };
$VAR1 = { '_id' => bless( {'value' => '5686c71338ea4981e63111d6'},
                          'MongoDB::OID' ),
          'name' => 'buzz'
        };
$VAR1 = { 'pets' => [ 'dog', 'cat' ],
          '_id' => bless( {'value' => '5686c71938ea4981e63111d7'},
                          'MongoDB::OID' )
        };

Drivers Aplenty
Java
Python
Perl
Ruby
JavaScript
Haskell
…and more
Factory
Community

Document Validation: Stronger Than Schema…?
> db.createCollection("contacts", { "validator":
  { $or: [
    { $and: [ { "vers": 1 },
              { "customer_id": {$type: "string"} }
            ]
    },
    { $and: [ { "vers": 2 },
              { "customer_id": {$type: "string"} },
              { $or: [
                  { "name.f": {$type: "string"},
                    "name.l": {$type: "string"} },
                  { "ssn": {$type: "string"} }
                ]
              }
            ]
    }
  ]}
});
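To make the rule above easier to read, here is a minimal plain-Python sketch of the same decision logic. It does not call MongoDB; the helper names `is_string` and `passes_contacts_rule` are hypothetical, and the equality test on `vers` is only an approximation of how the server compares values.

```python
def is_string(doc, dotted_path):
    """Return True if the dotted path (e.g. "name.f") resolves to a str."""
    value = doc
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return False
        value = value[key]
    return isinstance(value, str)


def passes_contacts_rule(doc):
    """Approximate the validator: accept version-1 or version-2 contact shapes."""
    # Version 1: only a string customer_id is required.
    v1 = doc.get("vers") == 1 and is_string(doc, "customer_id")
    # Version 2: string customer_id plus either a full name or an ssn.
    v2 = (
        doc.get("vers") == 2
        and is_string(doc, "customer_id")
        and (
            (is_string(doc, "name.f") and is_string(doc, "name.l"))
            or is_string(doc, "ssn")
        )
    )
    return v1 or v2
```

Under this sketch, a document with `vers: 3`, or one whose `customer_id` is a number, is rejected, which is how the rule lets two document versions coexist in one collection.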

A Slightly Bigger Example
Relational
Customer ID  First Name  Last Name  City
0            John        Doe        New York
1            Mark        Smith      San Francisco
2            Jay         White      Dallas
3            Meagan      White      London
4            Edward      Daniels    Boston
Phone Number    Type  DNC     Customer ID
1-212-555-1212  home  T       0
1-212-555-1213  home  T       0
1-212-555-1214  cell  F       0
1-212-777-1212  home  T       1
1-212-777-1213  cell  (null)  1
1-212-888-1212  home  F       2
MongoDB
{ vers: 1,
  customer_id : 1,
  name : {
    "f": "Mark",
    "l": "Smith" },
  city : "San Francisco",
  phones: [ {
      number : "1-212-777-1212",
      dnc : true,
      type : "home"
    },
    {
      number : "1-212-777-1213",
      type : "cell"
    }]
}

MongoDB Queries Are Expressive
Find all contacts with at least one home phone or hired after 2014-02-02
SQL:
select A.did, A.lname, A.hiredate, B.type, B.number
from contact A left outer join phones B on (B.did = A.did)
where B.type = 'home' or A.hiredate > '2014-02-02'::date
MongoDB CLI:
db.contacts.find({"$or": [
  {"phones.type": "home"},
  {"hiredate": {"$gt": new ISODate("2014-02-02")}}
]});
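As a rough illustration of what the query above matches, the following plain-Python sketch applies the same "home phone or hired after the cutoff" test to an in-memory list. The sample contacts and the function name `matches` are hypothetical, and no database is involved.

```python
from datetime import datetime

# Hypothetical sample data shaped like the contacts documents.
contacts = [
    {"lname": "Doe", "hiredate": datetime(2013, 5, 1),
     "phones": [{"type": "home", "number": "1-212-555-1212"}]},
    {"lname": "White", "hiredate": datetime(2014, 6, 1),
     "phones": [{"type": "cell", "number": "1-212-888-1212"}]},
    {"lname": "Daniels", "hiredate": datetime(2012, 1, 15), "phones": []},
]


def matches(doc, cutoff):
    """Mirror the $or: any phone of type "home", or hiredate after cutoff."""
    has_home = any(p.get("type") == "home" for p in doc.get("phones", []))
    hiredate = doc.get("hiredate")
    hired_after = hiredate is not None and hiredate > cutoff
    return has_home or hired_after


cutoff = datetime(2014, 2, 2)
selected = [doc["lname"] for doc in contacts if matches(doc, cutoff)]
```

The `any(...)` call reflects how a dotted path such as `phones.type` matches when at least one element of the array satisfies the condition, so no join or unwinding is needed for this question.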

MongoDB Aggregation Is Powerful
Sum the different types of phones and create a list
of the owners if there is more than 1 of that type
> db.contacts.aggregate([
{$unwind: "$phones"}
,{$group: {"_id": "$phones.type", "count": {$sum:1},
"names": {$push: "$name"} }}
,{$match: {"count": {$gt: 1}}}
]);
{ "_id" : "home", "count" : 2, "names" : [
{ "f" : "John", "l" : "Doe" },
{ "f" : "Mark", "l" : "Smith" } ] }
{ "_id" : "cell", "count" : 4, "names" : [
{ "f" : "John", "l" : "Doe" },
{ "f" : "Meagan", "l" : "White" },
{ "f" : "Edward", "l" : "Daniels" },
{ "f" : "Mark", "l" : "Smith" } ] }
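The three pipeline stages above can be read as a loop, a dictionary of groups, and a final filter. The sketch below emulates them in plain Python on hypothetical sample data; `phone_type_summary` is an illustrative name, and this is not how the server executes the pipeline.

```python
from collections import defaultdict

# Hypothetical sample data shaped like the contacts documents.
contacts = [
    {"name": {"f": "John", "l": "Doe"},
     "phones": [{"type": "home"}, {"type": "cell"}]},
    {"name": {"f": "Mark", "l": "Smith"},
     "phones": [{"type": "home"}, {"type": "cell"}]},
    {"name": {"f": "Jay", "l": "White"},
     "phones": [{"type": "work"}]},
]


def phone_type_summary(docs):
    """Emulate $unwind on phones, $group by phone type, then $match count > 1."""
    groups = defaultdict(lambda: {"count": 0, "names": []})
    for doc in docs:
        for phone in doc.get("phones", []):       # $unwind: one step per phone
            group = groups[phone.get("type")]     # $group key: the phone type
            group["count"] += 1                   # {$sum: 1}
            group["names"].append(doc["name"])    # {$push: "$name"}
    # $match: keep only the types that occur more than once
    return {t: g for t, g in groups.items() if g["count"] > 1}


summary = phone_type_summary(contacts)
```

With this sample, the "work" type appears once and is filtered out, while "home" and "cell" each keep a count of 2 and the list of owners.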

$lookup: “Left Outer Join++”
Step 1: Get a sense of the raw material
> db.leases.aggregate([ ]);
{
  "_id" : ObjectId("5642559e0d4f2076a43584fc"),
  "leaseID" : "A5",
  "sku" : "GD652",
  "origDate" : ISODate("2010-01-01T00:00:00Z"),
  "histDate" : ISODate("2010-10-28T00:00:00Z"),
  "monthlyDue" : 10,
  "vers" : 11,
  "delinq" : { "d30" : 10, "d60" : 10, "d90" : 60 },
  "credit" : 0
}
// 66 more ….

Step 2: Group leases by SKU and capture count and max value of 90
day delinquency
> db.leases.aggregate([
{$group: { _id: "$sku", n:{$sum:1},
max90:{$max:"$delinq.d90"} }}
]);
{ "_id" : "AC775", "n" : 27, "max90" : 20 }
{ "_id" : "AB123", "n" : 26, "max90" : 5 }
{ "_id" : "GD652", "n" : 14, "max90" : 80 }

Step 3: Reverse sort and then limit to the top 2
,{$sort: {max90:-1}}
,{$limit: 2}
]);
{ "_id" : "GD652", "n" : 14, "max90" : 80 }
{ "_id" : "AC775", "n" : 27, "max90" : 20 }

Step 4: $lookup to product collection and assign to new field
,{$sort: {max90:-1}}
,{$limit: 2}
,{$lookup: { from: "products", localField: "_id", foreignField:
"productSKU", as:"productData"}}
]);
{ "_id" : "GD652", "n" : 14, "max90" : 80,
"productData" : [
{
"_id" : ObjectId("5642559e0d4f2076a43584b5"),
"productType" : "rigidDumptruck",
"productSKU" : "GD652",
"properties" : {
"model" : "TR45",
"payload" : {
"std" : 45,
"unit" : "UStons"
},

Step 5: Trim excess data away, leaving just type
,{$sort: {max90:-1}}
,{$limit: 2}
,{$lookup: { from: "products", localField: "_id",
foreignField: "productSKU", as:"productData"}}
,{$project: { _id:1, n:1, max90:1, type:
"$productData.productType"}}
]);
{"_id":"GD652","n":14,"max90":80,"type":["rigidDumptruck"]}
{"_id":"AC775","n":27,"max90":20,"type":["loader"]}
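Steps 2 through 5 can be followed end to end in a small plain-Python emulation. This is only a sketch on hypothetical sample data: `top_delinquent_skus` is an illustrative name, and the lists stand in for the leases and products collections.

```python
# Hypothetical sample data; only the fields used by the pipeline are present.
leases = [
    {"sku": "GD652", "delinq": {"d90": 60}},
    {"sku": "GD652", "delinq": {"d90": 80}},
    {"sku": "AC775", "delinq": {"d90": 20}},
    {"sku": "AB123", "delinq": {"d90": 5}},
]
products = [
    {"productSKU": "GD652", "productType": "rigidDumptruck"},
    {"productSKU": "AC775", "productType": "loader"},
]


def top_delinquent_skus(leases, products, limit=2):
    """Emulate $group, $sort, $limit, $lookup and $project from the steps above."""
    groups = {}
    for lease in leases:                                   # $group by sku
        sku = lease["sku"]
        d90 = lease["delinq"]["d90"]
        group = groups.setdefault(sku, {"_id": sku, "n": 0, "max90": d90})
        group["n"] += 1                                    # n: {$sum: 1}
        group["max90"] = max(group["max90"], d90)          # max90: {$max: ...}
    ranked = sorted(groups.values(),
                    key=lambda g: g["max90"], reverse=True)[:limit]  # $sort, $limit
    # $lookup + $project: attach the list of matching product types.
    # A SKU with no matching product gets an empty list (left outer join).
    return [
        {**g, "type": [p["productType"] for p in products
                       if p["productSKU"] == g["_id"]]}
        for g in ranked
    ]


result = top_delinquent_skus(leases, products)
```

The `type` field is a list because a lookup can match any number of foreign documents, which is why the projected output in Step 5 shows one-element arrays.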

MongoDB Ops/Cloud Manager
• Single-click provisioning
• Scaling & upgrades
• Admin tasks
• Monitoring with charts
• Dashboards and alerts on 100+ metrics
• Backup and restore with point-in-time recovery
• Support for sharded clusters

MongoDB High Availability
[Diagram: Application, DRIVER, and a single mongod]

MongoDB High Availability
[Diagram: Application and DRIVER; replica set of one PRIMARY and two secondaries]
The Replica Set
• 1 Primary
• 2 – 48 Secondaries
• Greatest failure isolation: Locally attached storage (spinning or SSD)
• Less failure isolation: SAN, FLASH

Automatic, Seamless Failover
[Diagram: Application and DRIVER; one node marked xxxxxxxx, one PRIMARY, one secondary]

HA and DR Are Isomorphic
[Diagram: Dual Data Center HA/DR Replica Set. Application and DRIVER; one PRIMARY and three secondaries across Data Center 1 and Data Center 2; Arbiter (DC3 or cloud)]
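A replica set can elect a primary only while a majority of its voting members can reach each other, which is why the arbiter in a third location matters. The sketch below shows the arithmetic under an assumed layout of two voting members per data center plus one arbiter; the function names and the layout are illustrative assumptions, and the model ignores partial network partitions.

```python
def has_majority(reachable_voters, total_voters):
    """A strict majority of all voting members must be reachable."""
    return reachable_voters > total_voters // 2


def can_elect_primary(site_voters, failed_site):
    """Check whether the surviving sites still hold a majority of votes.

    site_voters maps a site name to its number of voting members.
    Assumes all surviving members can see each other.
    """
    total = sum(site_voters.values())
    reachable = total - site_voters[failed_site]
    return has_majority(reachable, total)


# Assumed layout: two data-bearing members per data center, arbiter elsewhere.
with_arbiter = {"dc1": 2, "dc2": 2, "dc3_arbiter": 1}
without_arbiter = {"dc1": 2, "dc2": 2}
```

With the arbiter, losing either data center leaves 3 of 5 votes and a primary can still be elected; without it, the survivors hold only 2 of 4 votes and cannot.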

MongoDB Scalability
[Diagram: Application and DRIVER; replica set of one PRIMARY and two secondaries]
What If:
1. Workload bottlenecks network or disk?
2. Data footprint starts to get large (e.g. 5TB)?
3. Regulations demand physical domicile of data in-region?
4. Growth profile uncertain?
5. Budget prohibits buying capacity up-front?

Horizontal Scalability Through Sharding
[Diagram: Application, DRIVER, and mongos in front of Shard 1 (Symbols A-D), Shard 2 (Symbols E-H), … Shard n (Symbols ?-Z); each shard is a replica set of one PRIMARY and two secondaries]
Three Sharding Models:
1. Range
2. Tag
3. Hash
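To contrast the range and hash models, here is a minimal routing sketch in plain Python. The three-shard layout (A-D, E-H, I-Z), the function names, and the use of SHA-256 with a modulo are all assumptions made for illustration; MongoDB's own hashed sharding uses its own hash function and assigns ranges of hash values to chunks rather than taking a modulo.

```python
import bisect
import hashlib

# Assumed layout: shard 0 holds symbols A-D, shard 1 holds E-H, shard 2 holds I-Z.
RANGE_UPPER_BOUNDS = ["D", "H", "Z"]


def range_shard(symbol):
    """Route by the first letter: neighbouring keys land on the same shard."""
    first_letter = symbol[0].upper()
    return bisect.bisect_left(RANGE_UPPER_BOUNDS, first_letter)


def hash_shard(symbol, shard_count=3):
    """Route by a stable hash: keys spread evenly, but ranges are scattered."""
    digest = hashlib.sha256(symbol.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Range routing keeps related keys together, which helps range scans but can concentrate load on one shard; hash routing trades that locality for a more even spread of writes.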

For More Information
Resource               Location
Case Studies           mongodb.com/customers
Presentations          mongodb.com/presentations
Free Online Training   education.mongodb.com
Webinars and Events    mongodb.com/events
Documentation          docs.mongodb.org
MongoDB Downloads      mongodb.com/download
Additional Info        info@mongodb.com
