3. Here’s a definition:
MongoDB (from humongous) is an open-source, high-performance, scalable, general-purpose database. It is
used by organizations of all sizes to power online
applications where low latency and high availability are
critical requirements of the system.
4. CAP theorem: "You can have at most two of these properties for
any shared-data system" (consistency, availability, tolerance to
network partitions). Dr. Eric A. Brewer, 2000
5. Main characteristics
• Document based
• Schemaless
• Open source (on GitHub)
• High performance
• Horizontally scalable
• Full featured
10. Document oriented
{
_id : 1,
nome : 'João',
idade : 25,
genero : 'Masculino'
}
MongoDB stores data as documents in a
binary representation called BSON
(Binary JSON)
Size: up to 16 MB
14. Relational schema design
• Large ERD diagrams
• Create table statements
• ORM to map tables to objects
• Tables just to join tables together
• Lots of revision and alter table statements until we
get it just right
15. In a MongoDB-based app we
start building our app
and let the schema evolve.
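To make "let the schema evolve" concrete, here is a minimal sketch in plain JavaScript. It assumes an in-memory array as a stand-in for a collection (no MongoDB server or driver involved); the names `users` and `cityOf` and the sample values are hypothetical.

```javascript
// In-memory stand-in for a collection: a plain array of documents.
// (Hypothetical sketch; a real app would go through a MongoDB driver.)
const users = [];

// First version of the app stores only a name and an age.
users.push({ _id: 1, name: "marcelo", age: 29 });

// A later version adds a nested address and a list of tags.
// No ALTER TABLE step: the new shape coexists with the old one.
users.push({
  _id: 2,
  name: "joao",
  age: 25,
  address: { city: "Lisbon" },
  tags: ["admin"],
});

// Readers cope with both shapes by treating missing fields as optional.
function cityOf(user) {
  return user.address?.city ?? "unknown";
}

const cities = users.map(cityOf);
console.log(cities);
```

The trade-off is that the application code, not the database, becomes responsible for handling older document shapes.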
21. Inserting a document
> user = {name : 'marcelo', age : 29, gender : 'Male'}
> db.users.insert(user)
>
• No collection creation needed!
22. Querying a document
> db.users.findOne()
{
"_id" : ObjectId("5269d66271de67aa7c3c41b4"),
"name" : "marcelo",
"age" : 29,
"gender" : "Male"
}
• _id is the primary key in MongoDB
• Automatically indexed
• Automatically created as an ObjectId if not provided
• Any unique immutable value could be used
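An automatically generated ObjectId is a 12-byte value whose first 4 bytes hold the creation time in seconds since the Unix epoch. The sketch below decodes that timestamp from the hex string shown in the query result above, in plain JavaScript with no driver; the helper name `objectIdTimestamp` is hypothetical.

```javascript
// Extract the creation time embedded in an ObjectId's hex string.
// The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const createdAt = objectIdTimestamp("5269d66271de67aa7c3c41b4");
console.log(createdAt.toISOString()); // 2013-10-25T02:24:34.000Z
```

Because the timestamp comes first, ObjectIds generated over time sort roughly by creation order.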
40. Replica Sets
• A replica set is a group of mongod instances that host the
same data set
• Replication provides redundancy and increases data
availability
• The primary accepts all write operations from clients (only
one primary allowed)
• Replication can be used to increase read capacity
• Asynchronous replication
• Automatic failover
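As a rough illustration of the points above, the sketch below shows a three-member configuration in the shape that `rs.initiate()` accepts in the mongo shell, plus the majority arithmetic behind automatic failover. The set name and host names are placeholders, and `majority` and `faultTolerance` are hypothetical helpers, not MongoDB functions.

```javascript
// Hypothetical three-member replica set configuration, in the shape
// accepted by rs.initiate() in the mongo shell. Host names are placeholders.
const replicaSetConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    { _id: 2, host: "db3.example.net:27017" },
  ],
};

// A primary is elected by a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of members that can fail while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

const memberCount = replicaSetConfig.members.length;
console.log(majority(memberCount), faultTolerance(memberCount)); // 2 1
```

This is why an odd number of voting members is the usual choice: going from three members to four raises the majority to three without tolerating any extra failure.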
44. Sharding
Issues of scaling:
• High query rates can exhaust the CPU capacity of the server
• Larger data sets exceed the storage capacity of a single
machine
• Working set sizes larger than the system’s RAM stress the I/O
capacity of disk drives
Vertical scaling vs. sharding
45. Sharding
Horizontally Scalable
• Sharding is the process of storing data across multiple machines
• Each shard is an independent database, and collectively, the shards make up a
single logical database
47. Range Based Sharding
• Supports more efficient range queries
• Can result in an uneven distribution of data
• Monotonically increasing shard keys should be avoided, since
all new inserts land on the same shard
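The trade-offs above can be seen in a small simulation. This is a sketch in plain JavaScript under simplifying assumptions: three fixed chunks on a numeric shard key, no chunk splitting or balancing, and hypothetical shard names and helper functions.

```javascript
// Hypothetical chunk ranges on a numeric shard key: [min, max) per shard.
const chunks = [
  { shard: "shardA", min: -Infinity, max: 1000 },
  { shard: "shardB", min: 1000, max: 2000 },
  { shard: "shardC", min: 2000, max: Infinity },
];

// Route a single key to the shard whose range contains it.
function shardFor(key) {
  return chunks.find((c) => key >= c.min && key < c.max).shard;
}

// Shards that must be consulted for a range query lo <= key <= hi.
function shardsForRange(lo, hi) {
  return chunks.filter((c) => lo < c.max && hi >= c.min).map((c) => c.shard);
}

// Count how many inserts land on each shard.
function distribution(keys) {
  const counts = {};
  for (const key of keys) {
    const shard = shardFor(key);
    counts[shard] = (counts[shard] ?? 0) + 1;
  }
  return counts;
}

// Monotonically increasing keys (e.g. a counter) all hit the last chunk.
const monotonicKeys = Array.from({ length: 100 }, (_, i) => 5000 + i);
const hotSpot = distribution(monotonicKeys);

// A narrow range query touches a single shard.
const touched = shardsForRange(1200, 1300);

console.log(hotSpot, touched);
```

In a real cluster the balancer splits and migrates chunks over time, but new inserts with an ever-increasing key still target the chunk holding the highest range.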