New to MongoDB? This talk will introduce the philosophy and features of MongoDB. We’ll discuss the benefits of the document-based data model that MongoDB offers by walking through how one can build a simple app to store books. We’ll cover inserting, updating, and querying the database of books. This session will jumpstart your knowledge of MongoDB development, providing you with context for the rest of the day's content.
11. MongoDB is a ___________ database
• Document
• Open source
• High performance
• Horizontally scalable
• Full featured
12. Document Database
• Not for .PDF & .DOC files
• A document is essentially an associative array
• Document = JSON object
• Document = PHP Array
• Document = Python Dict
• Document = Ruby Hash
• etc
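To make the "associative array" point concrete, here is a minimal sketch of a document written as a plain JavaScript object. The `book` fields and values are hypothetical, chosen only for illustration; the sketch runs in any JavaScript runtime and does not need a MongoDB server.

```javascript
// A hypothetical "book" document written as a plain JavaScript object.
// All field names and values here are illustrative, not from the talk.
const book = {
  title: "Example Book",
  authors: ["First Author", "Second Author"], // arrays are allowed as values
  publisher: { name: "Example Press", city: "Munich" }, // nested subdocument
  pages: 320,
};

// Fields are looked up by key, like in an associative array.
const city = book.publisher.city;

// The same structure round-trips through JSON text unchanged.
const roundTripped = JSON.parse(JSON.stringify(book));
```

The nested `publisher` object and the `authors` array are the parts that a flat relational row would need extra tables for.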
13. Open Source
• MongoDB is an open source project
• On GitHub
• Licensed under the AGPL
• Started & sponsored by MongoDB Inc (formerly 10gen)
• Commercial licenses available
• Contributions welcome
14. High Performance
• Written in C++
• Extensive use of memory-mapped files, i.e. read-through, write-through memory caching
• Runs nearly everywhere
• Data serialized as BSON (fast parsing)
• Full support for primary & secondary indexes
• Document model = less work
17. Full Featured
• Ad Hoc queries
• Real time aggregation
• Rich query capabilities
• Strongly consistent
• Geospatial features
• Support for most programming languages
• Flexible schema
18. Setting Expectations
• What is MongoDB
• How to develop with MongoDB
• Scale with MongoDB
• Analytics
• MMS
• Sharding
• Setting the correct environment
34. In a MongoDB based app
We start building our app
and let the schema evolve
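A rough sketch of what "let the schema evolve" can look like in practice. The `messages` array below is only an in-memory stand-in for a collection, and every field name is an assumption made for illustration.

```javascript
// In-memory stand-in for a collection; names and fields are hypothetical.
const messages = [];

// First version of the app stores only the basics.
messages.push({ from: "a@example.com", to: "b@example.com", body: "Hello" });

// Later the app starts recording attachments; no migration step is needed,
// the new document simply carries an extra field.
messages.push({
  from: "a@example.com",
  to: "c@example.com",
  body: "See attached",
  attachments: ["notes.txt"],
});

// Readers handle both shapes by treating the missing field as empty.
const attachmentCounts = messages.map((m) => (m.attachments || []).length);
```

The design cost moves into the application: code that reads the collection has to tolerate older documents that lack newer fields.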
35. MongoDB ERD
[Diagram: collections Accounts, Alerts, and Messages. Field labels shown: account, user, password, refresh_rate, uri, text, user, time, retweets, from, to, body, attachments, id, time, account_id, subscribers, channel, rate, period, metrics: [], …]
43. _id
• _id is the primary key in MongoDB
• Automatically indexed
• Automatically created as an ObjectId if not provided
• Any unique immutable value could be used
44. ObjectId
• ObjectId is a special 12 byte value
• Guaranteed to be unique across your cluster
• ObjectId("50804d0bd94ccab2da652599")
  50804d0b = ts (timestamp), 4 bytes
  d94cca = mac (machine id), 3 bytes
  b2da = pid (process id), 2 bytes
  652599 = inc (counter), 3 bytes
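The 4/3/2/3 byte layout can be checked by hand. The sketch below splits the slide's example value into its parts; `parseObjectId` is a hypothetical helper written for this illustration, not a driver API, and it assumes the layout shown on the slide (newer ObjectId specifications replace the machine and process fields with a 5-byte random value).

```javascript
// Split a 24-character ObjectId hex string using the 4/3/2/3 byte layout
// shown on the slide. parseObjectId is a hypothetical helper, not a driver API.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // 4-byte timestamp, seconds since epoch
  return {
    seconds,
    timestamp: new Date(seconds * 1000),
    machine: hex.slice(8, 14), // 3 bytes, kept as hex
    pid: parseInt(hex.slice(14, 18), 16), // 2 bytes
    counter: parseInt(hex.slice(18, 24), 16), // 3 bytes
  };
}

const parts = parseObjectId("50804d0bd94ccab2da652599");
```

Because the timestamp comes first, sorting by such an `_id` roughly follows creation order.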
51. Using Update to Add a Comment
> db.messages.update(
    { "_id" : ObjectId("54523d2d25784427c6fabce1") },
    { $set: { opened:
        { date: ISODate("2012-08-15T22:32:34Z"), user: 'Norberto' }
    } }
  )
>
Set a new field on the document; its value is a subdocument.
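As an in-memory illustration of the effect of `$set` on a single document, consider the sketch below. `applySet` is a hypothetical helper, not how the server implements `$set`, and it only handles top-level field names (no dotted paths).

```javascript
// In-memory illustration of what a top-level $set does to one document.
// applySet is a hypothetical helper; it is not the server's implementation
// and it ignores dotted paths such as "opened.user".
function applySet(doc, fields) {
  return { ...doc, ...fields }; // existing keys are overwritten, new keys added
}

const before = {
  _id: "54523d2d25784427c6fabce1",
  Subject: "Live From MongoDB World",
};

const after = applySet(before, {
  opened: { date: new Date("2012-08-15T22:32:34Z"), user: "Norberto" },
});
```

Fields that are not named in the `$set` argument, such as `Subject` here, are left untouched.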
52. Post with Comment Attached
> db.messages.findOne({"_id" : ObjectId("54523d2d25784427c6fabce1")})
{
    "_id" : ObjectId("54523d2d25784427c6fabce1"),
    "From" : "norberto@mongodb.com",
    "To" : "mongodb-user@googlegroups.com",
    "Date" : ISODate("2012-08-15T22:32:34Z"),
    "body" : {
        "text/plain" : "Hello Munich, nice to see y'all!"
    },
    "Subject" : "Live From MongoDB World",
    "opened" : {"date": ISODate("2012-08-15T22:32:34Z"), "user": "Norberto"}
}
Find the document by its primary key.
62. Legacy Migration
1. Copy existing schema & some data to MongoDB
2. Iterate on the schema design during development:
   measure performance, find bottlenecks, and embed
   1. one-to-one associations first
   2. one-to-many associations next
   3. many-to-many associations last
3. Migrate full dataset to new schema
New software application? Embed by default.
63. Embedding over Referencing
• Embedding is a bit like pre-joined data
– BSON (Binary JSON) document ops are easy for the server
• Embed (following a 90/10 rule of thumb)
– When the “one” or “many” objects are viewed in the context of their parent
– For performance
– For atomicity
• Reference
– When you need more scaling
– For easy consistency with “many to many” associations without duplicated data
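The two shapes can be compared side by side. This is a minimal sketch with hypothetical post and comment documents held in plain JavaScript arrays; with a real database each array would be a collection.

```javascript
// Embedded: the comments live inside the post, so one read returns everything.
const embeddedPost = {
  _id: 1,
  title: "Hello",
  comments: [
    { user: "ann", text: "Nice" },
    { user: "bob", text: "+1" },
  ],
};

// Referenced: comments are separate documents that point back at the post.
const post = { _id: 1, title: "Hello" };
const comments = [
  { _id: 10, post_id: 1, user: "ann", text: "Nice" },
  { _id: 11, post_id: 1, user: "bob", text: "+1" },
  { _id: 12, post_id: 2, user: "cat", text: "Other post" },
];

// With references the application does the "join" itself (a second lookup).
const commentsForPost = comments.filter((c) => c.post_id === post._id);
```

The embedded form matches the "viewed in the context of their parent" case; the referenced form keeps each comment independently addressable at the cost of the extra lookup.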
64. It’s All About Your Application
• Programs + Databases = (Big) Data Applications
• Your schema is the impedance matcher
– Design choices: normalize/denormalize, reference/embed
– Melds programming with MongoDB for best of both
– Flexible for development and change
• Programs × MongoDB = Great Big Data Applications
We are here to help out on questions, suggestions, tips …
Basically, define what you want to store.
This is frightening stuff.
Add several different values to the array example!
For many-to-many associations, eliminate join table using array of references or embedded documents
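A small sketch of the array-of-references approach, using hypothetical books and authors. The `author_ids` field name is an assumption made for this example.

```javascript
// Hypothetical books and authors; each book holds an array of author _ids,
// which replaces the join table a relational schema would need.
const authors = [
  { _id: "a1", name: "Ann" },
  { _id: "a2", name: "Bob" },
];
const books = [
  { _id: "b1", title: "First", author_ids: ["a1", "a2"] },
  { _id: "b2", title: "Second", author_ids: ["a2"] },
];

// Books by one author: match on membership in the array.
const booksByBob = books.filter((b) => b.author_ids.includes("a2"));

// Authors of one book: resolve each reference in turn.
const authorsOfFirst = books[0].author_ids.map(
  (id) => authors.find((a) => a._id === id).name
);
```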
Choose embedding by default, as opposed to referencing.
Referencing is not just the default for relational DBs; there is no other choice.
May you build Great Big Data Applications.
Perhaps you can say inspiring quotes like Ken Thompson, “Play chess with God.”
Ken and I worked on Perceptual Audio Coding, better known as Advanced Audio Coding or AAC as found in the iPod and iPhone.
So I hope that this will inspire you to “Play music with God” to design your killer app.