Webinar: General Technical Overview of MongoDB for Dev Teams

Mark Helmstetter
Senior Solutions Architect, 10gen

Agenda
• A bit of history…
• Introducing MongoDB
• MongoDB CRUD Operations
• Working with Indexes in MongoDB
• Aggregation Framework
• MongoDB Ecosystem

RDBMS Strengths
• Data stored is very compact
• Rigid schemas have led to powerful query capabilities
• Data is optimized for joins and storage
• Robust ecosystem of tools, libraries, integrations
• 40 years old!

It hides what you’re really doing

It makes development hard
[Diagram labels: Application Code, XML Config, DB Schema, Object Relational Mapping, Relational Database]

Enter “Big Data”
• Gartner defines it with 3Vs
• Volume
– Vast amounts of data being collected
• Variety
– Evolving data
– Uncontrolled formats, no single schema
– Unknown at design time
• Velocity
– Inbound data speed
– Fast read/write operations
– Low latency

Mapping Big Data to RDBMS
• Difficult to store uncontrolled data formats
• Scaling via big iron or custom data marts/partitioning schemes
• Schema must be known at design time
• Impedance mismatch with agile development and deployment techniques
• Doesn't map well to native language constructs

Goals
• Scale horizontally over commodity systems
• Incorporate what works for RDBMSs
– Rich data models, ad-hoc queries, full indexes
• Drop what doesn't work well
– Complex schemas, multi-row transactions, complex joins
• Do not homogenize APIs
• Match agile development and deployment workflows

Key Features
• Data represented as documents (JSON)
– Flexible-schema
– Storage/wire format is BSON
• Full CRUD support (Create, Read, Update, Delete)
– Atomic in-place updates
– Ad-hoc queries: Equality, RegEx, Ranges, Geospatial, Text
• Secondary indexes
• Replication – redundancy, failover
• Sharding – partitioning for read/write scalability

Terminology
RDBMS MongoDB
Database Database
Table Collection
Row Document

Document Oriented, Dynamic Schema
{name: "jeff",
eyes: "blue",
height: 72,
boss: "ben"}
{name: "brendan",
aliases: ["el diablo"]}
{name: "ben",
hat: "yes"}
{name: "matt",
pizza: "DiGiorno",
height: 72,
boss: "555.555.1212"}
{name: "will",
eyes: "blue",
birthplace: "NY",
aliases: ["bill", "la ciacco"],
gender: "???",
boss: "ben"}

MongoDB is full featured
MongoDB
{
first_name: 'Paul',
surname: 'Miller',
city: 'London',
location: [45.123,47.232],
cars: [
{ model: 'Bentley',
year: 1973,
value: 100000, … },
{ model: 'Rolls Royce',
year: 1965,
value: 330000, … }
]
}
Rich Queries
• Find Paul’s cars
• Find everybody who owns a car built between 1970 and 1980
Geospatial
• Find all of the car owners in London
Text Search
• Find all the cars described as having leather seats
Aggregation
• What’s the average value of Paul’s car collection?
Map Reduce
• What is the ownership pattern of colors by geography over time? (Is purple trending up in China?)

Developers are more productive
[Diagram labels: Application Code, XML Config, DB Schema, Object Relational Mapping, Relational Database]

MongoDB Scales Better
[Figure: two charts with Price and Scale axes, compared side by side ("Vs.")]

> use blog
> var post = {
author: "markh",
date: new Date(),
title: "My First Blog Post",
body: "MongoDB is an open source document-oriented database system developed and supported by 10gen.",
tags: ["MongoDB"]
}
> db.posts.insert(post)
Create – insert()

> var post = {
"_id" : 1,
"author" : "markh",
"title" : "MetLife builds innovative customer service application using MongoDB",
"body" : "MetLife built a working prototype in two weeks and was live in U.S. call centers in 90 days.",
"date" : ISODate("2013-05-07T00:00:00.000Z"),
"tags" : ["MongoDB", "Database", "Big Data"]
}
> db.posts.update({ _id:1 }, post, { upsert : true })
// upsert option with <query> argument on _id -- same as save()
Upsert

> db.posts.findOne()
{
"_id" : ObjectId("517ed472e14b748a44dc0549"),
"author" : "markh",
"date" : ISODate("2013-05-29T20:13:37.349Z"),
"title" : "My First Blog Post",
"body" : "MongoDB is an open source document-oriented database system developed and supported by 10gen.",
"tags" : ["MongoDB"]
}
// _id is unique but can be anything you like
Read – findOne()

> db.posts.findOne({author:"markh"})
{
"author" : "markh",
"date" : ISODate("2013-05-29T20:13:37.349Z"),
}
Read – findOne()

> db.posts.find({author:"markh"})
{
"author" : "markh",
"date" : ISODate("2013-05-29T20:13:37.349Z"),
}
…
Read – find()

> db.posts.find( { author:"markh" } , { _id:0, author:1 } )
{ "author" : "markh" }
Projections

// Ranges: $lt, $lte, $gt, $gte
> db.posts.find( {
author : "markh",
date : {
$gte : ISODate("2013-01-01T00:00:00.000Z"),
$lt : ISODate("2013-05-10T02:50:27.874Z")
}
})
{ "title" : "MetLife builds innovative customer service application using MongoDB",
"date" : ISODate("2013-05-07T00:00:00Z") }
Range Query Operators
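
Note that the range above is half-open: $gte includes the lower bound and $lt excludes the upper one. The plain-JavaScript sketch below applies the same filter to an in-memory array instead of a MongoDB collection. The inRange helper and the samplePosts array are illustrative assumptions, not part of the shell.

```javascript
// Illustrative only: mimics { date: { $gte: from, $lt: to } } on an in-memory array.
function inRange(docs, field, from, to) {
  // Date objects compare by their millisecond timestamps.
  return docs.filter((doc) => doc[field] >= from && doc[field] < to);
}

// Hypothetical sample data modeled on the two posts from the earlier slides.
const samplePosts = [
  { title: "My First Blog Post", date: new Date("2013-05-29T20:13:37.349Z") },
  { title: "MetLife builds innovative customer service application using MongoDB",
    date: new Date("2013-05-07T00:00:00.000Z") },
];

const matched = inRange(
  samplePosts,
  "date",
  new Date("2013-01-01T00:00:00.000Z"),
  new Date("2013-05-10T02:50:27.874Z")
);
console.log(matched.map((doc) => doc.title));
```

Only the MetLife post falls inside the window, which matches the shell output on the slide.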

// Set: $in, $all, $nin
> db.posts.find( {
author : "markh",
tags : { $all : [ "MongoDB", "Database", "Big Data" ]}
}, { title:1 })
{
"_id" : 1,
"title" : "MetLife builds innovative customer service application using MongoDB"
}
Set Query Operators

// Logical: $or, $and, $not, $nor
> db.posts.find( {
author : "markh",
$or : [
{ title : /first/i },
{ body : /prototype/i }
]
} ).count()
2
Logical Query Operators

> var post = {
author: "markh",
date : ISODate("2013-05-29T20:13:37.349Z"),
title: "MongoDB is the #1 NoSQL Database",
body: "MongoDB is an open source document-oriented database system developed and supported by 10gen.",
tags: ["MongoDB"]
}
> db.posts.update(
{ _id:ObjectId("517ed472e14b748a44dc0549") },
post
)
Update

> db.posts.update(
{ _id: 1},
{ $set: {slug:"mongodb"} }
)
> db.posts.update(
{ _id: 1 },
{ $unset: {slug:1} }
)
> db.posts.update(
{ _id: 1 },
{ $inc: {revision:1} }
)
Update Operators
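
To make the operator semantics concrete, here is a minimal plain-JavaScript sketch of what $set, $unset and $inc do to one document. The applyUpdate helper is hypothetical and is not how MongoDB implements updates: it handles top-level fields only and ignores dotted paths, arrays, and atomicity.

```javascript
// Hypothetical helper: mimics $set, $unset and $inc on top-level fields only.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // $set adds or overwrites the field
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field]; // $unset removes the field
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    // $inc creates the field when it is missing, starting from 0
    result[field] = (result[field] || 0) + amount;
  }
  return result;
}

// Replay the three updates from the slide against an in-memory document.
let post = { _id: 1, author: "markh" };
post = applyUpdate(post, { $set: { slug: "mongodb" } });
post = applyUpdate(post, { $unset: { slug: 1 } });
post = applyUpdate(post, { $inc: { revision: 1 } });
console.log(post);
```

After the three steps the slug field is gone again and revision has been created with the value 1, while the untouched fields are preserved.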

// Array Update Operators
// $set, $unset
// $push, $pop, $pull, $pullAll, $addToSet
> comment = {
userid: "fred",
date: new Date(),
text: "I totally agree!"
}
> db.posts.update( { _id: 1 }, {
$push: {comments: comment}
});
Array Update Operators

book = {
_id: 123456789,
title: "MongoDB: The Definitive Guide",
available: 1,
checkout: [ { by: "joe", date: ISODate("2012-10-15") } ]
}
> db.books.findAndModify ( {
query: { _id: 123456789, available: { $gt: 0 } },
update: {$inc: { available: -1 },
$push: { checkout: { by: "abc", date: new Date() } } }
} )
findAndModify
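
The logic of this checkout can be sketched in plain JavaScript. In MongoDB the match and the modification happen as one atomic operation on the server; the single-threaded sketch below shows only the decision logic. The checkOut function is a hypothetical stand-in, and it returns the pre-update document because that is findAndModify's default.

```javascript
// Illustrative stand-in for the findAndModify call above, on a plain object.
// Returns the document as it was before the change, or null when the
// query condition (available > 0) does not match.
function checkOut(book, user) {
  if (!(book.available > 0)) {
    return null;
  }
  const before = { ...book, checkout: [...book.checkout] };
  book.available -= 1;                                // $inc: { available: -1 }
  book.checkout.push({ by: user, date: new Date() }); // $push: { checkout: ... }
  return before;
}

const book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  available: 1,
  checkout: [{ by: "joe", date: new Date("2012-10-15") }],
};

const first = checkOut(book, "abc");  // succeeds, one copy was available
const second = checkOut(book, "xyz"); // no copies left, nothing is modified
console.log(first.available, book.available, second);
```

Putting the availability check into the query is what prevents two readers from checking out the last copy at the same time.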

// Delete EVERYTHING in the collection
// (newer shells require an explicit query document, so pass an empty one)
> db.posts.remove({})
// Delete based on query criteria
> db.posts.remove( { _id:3 } )
> db.posts.remove( { author:"john" } )
// Only delete one document using "justOne" option
> db.posts.remove( { author:"john" }, true )
Delete

Working with Indexes in MongoDB

Indexes are the single biggest tunable performance factor in MongoDB.
Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.

// Default (unique) index on _id
// create an ascending index on "author"
> db.posts.ensureIndex({author:1})
> db.posts.find({author:"markh"})
Indexing a single value

// Arrays of values (multikey indexes)
tags: ["MongoDB", "Database", "NoSQL"]
> db.posts.ensureIndex({ tags: 1 })
> db.posts.find({tags: "MongoDB"})
Indexing Array Elements
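
Conceptually, a multikey index holds one entry per array element, so a query on a single tag can find every document whose array contains it. The sketch below models that idea with an in-memory Map. buildMultikeyIndex and the sample posts are hypothetical, and MongoDB's real indexes are B-trees rather than hash maps.

```javascript
// Hypothetical in-memory stand-in for a multikey index on an array field.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!(field in doc)) continue; // simplification: skip documents without the field
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of values) {
      // One index entry per array element, each pointing back at the document.
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

// Hypothetical sample data.
const posts = [
  { _id: 1, tags: ["MongoDB", "Database", "NoSQL"] },
  { _id: 2, tags: ["Database"] },
  { _id: 3 },
];

const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("MongoDB"));  // ids of posts tagged "MongoDB"
console.log(tagIndex.get("Database")); // ids of posts tagged "Database"
```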

// Multiple fields (compound key indexes)
> db.posts.ensureIndex({
author: 1,
date: -1
})
> db.posts.find( {
author : "markh",
date : {
$gte : ISODate("2013-01-01T00:00:00.000Z"),
$lt : ISODate("2013-05-10T02:50:27.874Z")
}
})
Compound Indexes

// Subdocuments
{
…
comments: [ {
userid: "fred",
date: new Date(),
text: "I totally agree!"
} ]
}
db.posts.ensureIndex( { "comments.userid": 1 } )
Indexing Subdocuments

> db.pubs.insert( [
{name: "Ned Devine's",
loc : { type : "Point", coordinates : [ -77.410018, 38.9516 ] } },
{name: "O'Sullivan's",
loc : { type : "Point", coordinates : [ -77.386329, 38.970754 ] } },
{name: "Old Brogue", … },
{name: "Finnegan's",
loc : { type : "Point", coordinates : [ -77.395275, 38.952734 ] } }
])
> db.pubs.ensureIndex( { loc : "2dsphere" } )
Geospatial Indexes

> db.pubs.find( { loc : { $near :
{ $geometry :
{ type : "Point" ,
coordinates: [ -77.386164, 38.971088 ] } },
$maxDistance : 500
} } )
{ "name" : "O'Sullivan's",
"loc" : { "type" : "Point", "coordinates" : [ -77.386329, 38.970754 ] } }
Geospatial Indexes

> db.pubs.insert( [
{ description: "Irish Pub, serving great Irish beers",
menuItems: ["Bangers & Mash", "Fish & Chips"] },
{ description:"Sports bar and restaurant",
menuItems: [ "Burgers", "Wings", "Pizza"] },
{ description:"Beer joint",
menuItems: [ "Belgian Beers", "Micro Brews", "Cask Ales"] }
])
> db.pubs.ensureIndex( {
description: "text",
menuItems: "text"
} )
Text Indexes

> db.pubs.runCommand( "text", { search: "beer",
project: { _id: 0 } } ).results
[
{"score" : 1.5, "obj" : {
"description" : "Beer joint",
"menuItems" : ["Belgian Beers", "Micro Brews", "Cask Ales"]
} },
{"score" : 0.5833333333333334, "obj" : {
"description" : "Irish Pub, serving great Irish beers",
"menuItems" : ["Bangers & Mash", "Fish & Chips", "Sheperd's Pie"]
} } ]
Text Indexes

// username in users collection must be unique
db.users.ensureIndex( { username: 1 }, { unique: true } )
Uniqueness Constraints

// Only documents with comments.userid will be indexed
db.posts.ensureIndex(
{ "comments.userid": 1 } ,
{ sparse: true }
)
// Allow multiple documents to not have a sku field
db.products.ensureIndex( {sku: 1}, {unique: true, sparse: true} )
Sparse Indexes
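
As a rough illustration of how unique and sparse combine, the plain-JavaScript sketch below scans an in-memory array for a duplicate sku while skipping documents that lack the field. findUniqueSparseViolation and the sample products are hypothetical; MongoDB enforces the constraint inside the index at write time rather than by scanning.

```javascript
// Hypothetical check: returns the first duplicated value of `field`, or null.
// Documents that do not have the field at all are skipped, as in a sparse index.
function findUniqueSparseViolation(docs, field) {
  const seen = new Set();
  for (const doc of docs) {
    if (!(field in doc)) continue; // sparse: not indexed, so never a conflict
    if (seen.has(doc[field])) return doc[field];
    seen.add(doc[field]);
  }
  return null;
}

// Hypothetical sample data: two products have no sku at all.
const products = [
  { _id: 1, sku: "abc" },
  { _id: 2 },
  { _id: 3 },
  { _id: 4, sku: "xyz" },
];

console.log(findUniqueSparseViolation(products, "sku"));
console.log(findUniqueSparseViolation([...products, { _id: 5, sku: "abc" }], "sku"));
```

The first call reports no violation even though two documents lack a sku; the second reports "abc" because an existing sku is repeated.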

Pipeline
• Process a stream of documents
– Original input is a collection
– Final output is a result document
• Series of operators
– Filter or transform data
– Input/output chain
ps ax | grep mongod | head -n 1
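
The Unix pipe analogy can be sketched in plain JavaScript: each stage is a function from an array of documents to a new array, and the pipeline feeds the output of one stage into the next. The match, sortBy, limit and runPipeline names and the sample orders are illustrative assumptions and are not the aggregation framework's API.

```javascript
// Each stage takes an array of documents and returns a new array.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field) => (docs) => [...docs].sort((a, b) => a[field] - b[field]);
const limit = (n) => (docs) => docs.slice(0, n);

// Feed the output of each stage into the next, like a Unix pipe.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Hypothetical sample data.
const orders = [
  { cust_id: "a", price: 30, status: "A" },
  { cust_id: "b", price: 10, status: "A" },
  { cust_id: "c", price: 20, status: "B" },
];

const cheapestActive = runPipeline(orders, [
  match((order) => order.status === "A"), // filter
  sortBy("price"),                        // ascending by price
  limit(1),                               // keep the first document
]);
console.log(cheapestActive);
```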

Pipeline Operators
• $match
• $project
• $group
• $unwind
• $sort
• $limit
• $skip

Aggregation Framework
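SQL Statement:
SELECT COUNT(*) AS count
FROM orders
MongoDB Aggregation Statement:
db.orders.aggregate( [
{ $group: { _id: null,
count: { $sum: 1 } } }
] )
SQL Statement:
SELECT SUM(price) AS total
FROM orders
MongoDB Aggregation Statement:
db.orders.aggregate( [
{ $group: { _id: null,
total: { $sum: "$price" } } }
] )
SQL Statement:
SELECT cust_id, SUM(price) AS total
FROM orders
GROUP BY cust_id
MongoDB Aggregation Statement:
db.orders.aggregate( [
{ $group: { _id: "$cust_id",
total: { $sum: "$price" } } }
] )
SQL Statement:
SELECT cust_id, SUM(price) AS total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
MongoDB Aggregation Statement:
db.orders.aggregate( [
{ $match: { status: 'A' } },
{ $group: { _id: "$cust_id",
total: { $sum: "$price" } } }
] )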

{
title: "The Great Gatsby",
pages: 218,
language: "English"
}
{
title: "War and Peace",
pages: 1440,
language: "Russian"
}
{
title: "Atlas Shrugged",
pages: 1088,
language: "English"
}
Matching Field Values
{ $match: {
language: "Russian"
}}
{
pages: 1440,
language: "Russian"
}

{
pages: 218,
language: "English"
}
{
pages: 1440,
language: "Russian"
}
{
pages: 1088,
language: "English"
}
Matching with Query
Operators
{ $match: {
pages: { $gt: 1000 }
}}
{
pages: 1440,
language: "Russian"
}
{
pages: 1088,
language: "English"
}

{
pages: 218,
language: "English"
}
{
pages: 1440,
language: "Russian"
}
{
pages: 1088,
language: "English"
}
Calculating an Average
{ $group: {
_id: "$language",
avgPages: { $avg:
"$pages" }
}}
{
_id: "Russian",
avgPages: 1440
}
{
_id: "English",
avgPages: 653
}
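
The arithmetic behind this $group stage can be checked with a short plain-JavaScript sketch that groups an in-memory array by language and averages the page counts. averageBy is a hypothetical helper. The output order here follows insertion order, whereas $group does not guarantee any particular order.

```javascript
// Illustrative equivalent of
// { $group: { _id: "$language", avgPages: { $avg: "$pages" } } }.
function averageBy(docs, keyField, valueField) {
  const totals = new Map();
  for (const doc of docs) {
    const entry = totals.get(doc[keyField]) || { sum: 0, count: 0 };
    entry.sum += doc[valueField];
    entry.count += 1;
    totals.set(doc[keyField], entry);
  }
  // Turn each running total into an output document.
  return [...totals].map(([key, { sum, count }]) => ({
    _id: key,
    avgPages: sum / count,
  }));
}

// The three input documents from the slide.
const books = [
  { pages: 218, language: "English" },
  { pages: 1440, language: "Russian" },
  { pages: 1088, language: "English" },
];

const averages = averageBy(books, "language", "pages");
console.log(averages);
```

English averages (218 + 1088) / 2 = 653 and Russian stays at 1440, matching the result documents above.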

{
pages: 218,
language: "English"
}
{
pages: 1440,
language: "Russian"
}
{
pages: 1088,
language: "English"
}
Summing Fields and Counting
{ $group: {
_id: "$language",
numTitles: { $sum: 1 },
sumPages: { $sum: "$pages" }
}}
{
_id: "Russian",
numTitles: 1,
sumPages: 1440
}
{
_id: "English",
numTitles: 2,
sumPages: 1306
}

MongoDB Ecosystem
• Drivers
• Libraries / Frameworks
• Applications
• Monitoring Tools
• Admin Tools
• ETL Tools
• Deployment Tools
• IDEs
• Community
http://docs.mongodb.org/ecosystem/

Global Community
4,000,000+ MongoDB Downloads
50,000+ Online Education Registrants
15,000+ MongoDB User Group Members
14,000+ MongoDB Monitoring Service (MMS) Users
10,000+ Annual MongoDB Days Attendees

MongoDB is the leading NoSQL Database

Use Cases
• Content Management
• Operational Intelligence
• High Volume Data Feeds
• E-Commerce
• User Data Management
http://www.10gen.com/customers

• http://docs.mongodb.org/
Online Manual

• http://education.10gen.com
Free Online MongoDB Training

Drivers
Drivers for most popular programming languages and frameworks, plus community-supported drivers
• Java
• Python
• Perl
• Ruby
• Haskell
• JavaScript
• PowerShell
• Prolog
• Racket

• CakePHP
• CodeIgniter
• Fat-Free
• Lithium
• Symfony 2
• TechMVC
• Vork
• TURBOPY
• Grails
• Pyramid
• Django
Ecosystem – Web Application Frameworks

• Morphia
• Doctrine
• Kohana
• Yii
• MongoRecord
• ActiveMongo
• Comfi
• Mandango
• MongoDB PHP ODM
• Mongoose
• Spring Data MongoDB
• Hibernate OGM
• MongoEngine
• Salat
Ecosystem – ODM

• Drupal
• Locomotive
• MongoPress
• Calipso
• Frameworks
– Web frameworks with CMS capabilities
– v7files / Vermongo
Ecosystem – CMS

• MMS
• Munin MongoDB Plugin
• Nagios
• Ganglia
• Zabbix
• Scout
• Server Density
Ecosystem – Monitoring Tools

• Edda
• Fang of Mongo
• Umongo
• MongoExplorer
• MongoHub
• MongoVision
• MongoVUE
• mViewer
• Opricot
• PHPMoAdmin
• RockMongo
• Genghis
• Meclipse
• Humongous
• MongoDB ODA plugin for BIRT
• Monad Management for MongoDB
Ecosystem – Admin UIs

• HTTP Console / Simple REST Interface (built-in)
• Sleepy Mongoose (Python)
• DrowsyDromedary (Ruby)
• MongoDB Rest (Node.js)
• MongoDB Java REST Server
Ecosystem – HTTP/REST Interfaces

• GridFS
– Thundergrid
– Mongofilesystem
• Deployment / Management
– puppet-mongodb
– chef-mongodb
• MongoDB Pagination
• Morph
• Mongodloid
• SimpleMongoPhp
• Mongo-Queue-PHP
Ecosystem – Miscellanea

• MongoDB is a full-featured, general purpose database
• Flexible document data model provides
– Greater flexibility
– Greater agility
• MongoDB is built for "Big Data"
• Healthy, strong, and growing ecosystem / community
Conclusion

• Presentations / Webinars
– http://www.10gen.com/presentations
• Customer Success Stories
– http://www.10gen.com/customers
• MongoDB Documentation
– http://docs.mongodb.org/
• Community
– https://groups.google.com/group/mongodb-user
– http://stackoverflow.com/questions/tagged/mongodb
Resources

Senior Solutions Architect, 10gen
Mark Helmstetter
Thank You
@helmstetter
