This document provides an introduction to MongoDB, a non-relational NoSQL database. It discusses what NoSQL databases are and their benefits compared to SQL databases, such as being more scalable and able to handle large, changing datasets. It then describes key features of MongoDB like high performance, rich querying, and horizontal scalability. The document outlines concepts like document structure, collections, and CRUD operations in MongoDB. It also covers topics such as replication, sharding, and installing MongoDB.
2. What is NoSQL?
A form of database management system that is non-relational.
Such systems are typically schema-less, avoid joins, and are easy to scale.
NoSQL, also read as "Not Only SQL", describes an approach to database design that
implements a key-value store, document store, column store, or graph format for
data. It is an alternative to the Structured Query Language (SQL) databases
that have been prevalent since the 1980s. NoSQL contrasts with databases that adhere to
SQL's relational methods, where data is placed in tables and the schema is
carefully designed before the database is built. NoSQL databases especially
target large sets of distributed data.
3. The Benefits of NoSQL
When compared to relational databases, NoSQL databases are often more scalable and
can provide better performance for large, distributed workloads, and their data model
addresses several issues that the relational model is not designed to address:
• Large volumes of rapidly changing structured, semi-structured, and unstructured data
• Agile sprints, quick schema iteration, and frequent code pushes
• Object-oriented programming that is easy to use and flexible
• Geographically distributed scale-out architecture instead of expensive, monolithic architecture
4. NoSQL vs. SQL Summary
Aspect | SQL Databases | NoSQL Databases
Types | One type (SQL database) with minor variations | Many different types, including key-value stores, document databases, wide-column stores, and graph databases
Development History | Developed in the 1970s to deal with the first wave of data storage applications | Developed in the late 2000s to deal with limitations of SQL databases, especially scalability, multi-structured data, geo-distribution, and agile development sprints
Examples | MySQL, Postgres, Microsoft SQL Server, Oracle Database | MongoDB, Cassandra, HBase, Neo4j
Data Storage Model | Data is stored in tables with multiple columns | Varies by database type; key-value stores, for example, have only two columns ('key' and 'value')
Schemas | Structure and data types are fixed in advance. To store a new kind of data, the whole database must be altered, during which time it may need to be taken offline | Fully dynamic, with some databases enforcing data validation rules
Scaling | Vertical scaling is typical and easy; horizontal scaling is difficult because constraints such as foreign keys (FK) can break when data is split across servers | Horizontal scaling is straightforward
Development Model | Mix of open-source and closed-source | Open-source
Supports Transactions | Yes | In certain circumstances
5. Comparison of RDBMS & MongoDB
RDBMS | MongoDB
Database | Database
Table | Collection
Tuple/Row | Document
Column | Field
Table Join | Embedded Documents
Primary Key | Primary Key (default key _id provided by MongoDB itself)
Database Server and Client
mysqld/oracle | mongod
mysql/sqlplus | mongo
6. What is MongoDB
MongoDB is an open-source document database that provides high
performance, high availability, and automatic scaling.
The advantages of using documents are:
• Documents (i.e. objects) correspond to native data types in many programming languages.
• Embedded documents and arrays reduce the need for expensive joins.
• Dynamic schema supports fluent polymorphism.
Key Features of MongoDB
• High Performance: the embedded data model reduces I/O activity, and index support makes queries faster.
• Rich Query Language: data aggregation, text search, and geospatial queries.
• High Availability: replica sets provide automatic failover and data redundancy.
• Horizontal Scalability: sharding distributes data across machines, with support for zones.
• Support for Multiple Storage Engines: comparable to the storage engine support in MySQL.
9. Document
MongoDB stores data records as BSON documents. BSON is
a binary representation of JSON documents, though it
contains more data types than JSON.
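To see why the extra data types matter, here is a minimal sketch in plain JavaScript (runnable in Node.js, no database involved). The record and its field names are illustrative assumptions. It only demonstrates that JSON text has no date type, so a Date value comes back as a string after a round trip, whereas BSON has a dedicated date type.

```javascript
// Illustrative record; the field names are assumptions for this sketch.
const record = { name: "Alan", birth: new Date("1912-06-23T00:00:00Z") };

// Serialize to JSON text and parse it back.
const roundTripped = JSON.parse(JSON.stringify(record));

// JSON has no date type: the Date was written as an ISO-8601 string.
console.log(record.birth instanceof Date);  // true
console.log(typeof roundTripped.birth);     // "string"
console.log(roundTripped.birth);            // "1912-06-23T00:00:00.000Z"
```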
10. Document Structure
The key decision in designing data models for MongoDB applications revolves around
the structure of documents and how the application represents relationships
between data. There are two tools that allow applications to represent these
relationships: references and embedded documents.
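References (normalized data models): one document stores a link to another document,
such as its _id, and the application resolves the link with a follow-up query.
11. Embedded Data (denormalized data models): related data is stored inside a single
document, so applications can retrieve and manipulate it in a single database operation.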
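The difference between the two tools can be sketched with plain JavaScript objects standing in for documents. The data, the collection names, and the user_id field are hypothetical, and the arrays only imitate collections; in a real application each lookup would be a query to the server.

```javascript
// Embedded (denormalized): the addresses live inside the user document.
const embeddedUser = {
  _id: 1,
  name: "Alan",
  addresses: [{ city: "London" }, { city: "Manchester" }]
};

// Referenced (normalized): addresses are separate documents that point
// back to the user through a user_id field.
const users = [{ _id: 1, name: "Alan" }];
const addresses = [
  { _id: 10, user_id: 1, city: "London" },
  { _id: 11, user_id: 1, city: "Manchester" },
  { _id: 12, user_id: 2, city: "Paris" }
];

// Embedded: one lookup returns the user together with the addresses.
const citiesEmbedded = embeddedUser.addresses.map((a) => a.city);

// Referenced: a first lookup finds the user, then a follow-up lookup
// resolves the references.
const user = users.find((u) => u._id === 1);
const citiesReferenced = addresses
  .filter((a) => a.user_id === user._id)
  .map((a) => a.city);

console.log(citiesEmbedded);    // [ 'London', 'Manchester' ]
console.log(citiesReferenced);  // [ 'London', 'Manchester' ]
```

Both models return the same cities; the referenced model needs the extra lookup but avoids repeating address data inside every document that uses it.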
12. Which One is Better?
• In general, use embedded data models when:
    • you have "contains" relationships between entities.
    • you have one-to-many relationships between entities. In these relationships the "many" or
      child documents always appear with or are viewed in the context of the "one" or parent
      documents.
• In general, use normalized data models:
    • when embedding would result in duplication of data but would not provide sufficient
      read performance advantages to outweigh the implications of the duplication.
    • to represent more complex many-to-many relationships.
    • to model large hierarchical data sets.
Embedded | Normalized
Better performance for read operations | More flexibility than embedding
Single atomic read/write operation | Client-side applications must issue follow-up queries to resolve the references, which requires more round trips to the server
13. Example Document
var mydoc = {
    _id: ObjectId("5099803df3f4948bd2f98391"),
    name: { first: "Alan", last: "Turing" },
    birth: new Date('Jun 23, 1912'),
    death: new Date('Jun 07, 1954'),
    contribs: [ "Turing machine", "Turing test", "Turingery" ],
    views: NumberLong(1250000)
}
The above fields have the following data types:
• _id holds an ObjectId.
• name holds an embedded document that contains the fields first and last.
• birth and death hold values of the Date type.
• contribs holds an array of strings.
• views holds a value of the NumberLong type.
14. Field Names
Documents have the following restrictions on field names:
• The field name _id is reserved for use as a primary key; its value
must be unique in the collection, is immutable, and may be of any
type other than an array.
• The field names cannot start with the dollar sign ($) character.
• The field names cannot contain the dot (.) character.
• The field names cannot contain the null character.
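As a rough illustration, the last three restrictions can be checked on the client with a few string tests. The helper name fieldNameProblems is hypothetical and not part of MongoDB; it reflects only the rules as stated on this slide, and the checks the server actually performs may differ by version.

```javascript
// Returns a list of rule violations for a proposed field name,
// following the restrictions listed on this slide.
function fieldNameProblems(name) {
  const problems = [];
  if (name.startsWith("$")) problems.push("starts with $");
  if (name.includes(".")) problems.push("contains a dot");
  if (name.includes("\0")) problems.push("contains the null character");
  return problems;
}

console.log(fieldNameProblems("status"));    // []
console.log(fieldNameProblems("$price"));    // [ 'starts with $' ]
console.log(fieldNameProblems("size.uom"));  // [ 'contains a dot' ]
```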
Dot Notation
• MongoDB uses the dot notation to access the elements of an array and to
access the fields of an embedded document:
"<array>.<index>" and "<embedded document>.<field>"
Ex. "contribs.2" refers to the third element of the contribs array, because array indexes start at zero.
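How a dotted path is resolved can be sketched in plain JavaScript. The helper getByPath is a hypothetical name for this illustration, not a MongoDB function, and it covers only the two cases named here (array index and embedded field); it does not reproduce everything the server does when matching arrays of embedded documents.

```javascript
// Resolves a dotted path against a plain JavaScript object.
// Each path segment is either a field name or an array index.
function getByPath(doc, path) {
  let current = doc;
  for (const segment of path.split(".")) {
    if (current === null || current === undefined) return undefined;
    current = current[segment];
  }
  return current;
}

const doc = {
  name: { first: "Alan", last: "Turing" },
  contribs: ["Turing machine", "Turing test", "Turingery"]
};

console.log(getByPath(doc, "contribs.2"));   // "Turingery"
console.log(getByPath(doc, "name.last"));    // "Turing"
console.log(getByPath(doc, "name.middle"));  // undefined
```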
15. The _id Field
The _id field has the following behavior and constraints:
• By default, MongoDB creates a unique index on the _id field during the creation of a
collection.
• The _id field is always the first field in the documents. If the server receives a
document that does not have the _id field first, then the server will move the field
to the beginning.
• The _id field may contain values of any BSON data type, other than an array.
ObjectId
ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values
consist of 12 bytes, where the first four bytes are a timestamp that reflects the
ObjectId's creation. Specifically:
• a 4-byte value representing the seconds since the Unix epoch,
• a 3-byte machine identifier,
• a 2-byte process id, and
• a 3-byte counter, starting with a random value.
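The layout can be made concrete by splitting the 24-character hexadecimal form of an ObjectId into its parts. This is a sketch that assumes the 4-3-2-3 byte layout listed above; newer MongoDB releases document a different split of the middle five bytes, so only the timestamp should be relied on in general. The helper name decodeObjectId is hypothetical.

```javascript
// Splits an ObjectId hex string according to the 4-3-2-3 byte layout
// described on this slide (two hex characters per byte).
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hexadecimal string");
  }
  return {
    timestampSeconds: parseInt(hex.slice(0, 8), 16),  // bytes 0-3
    machineId: hex.slice(8, 14),                      // bytes 4-6
    processId: parseInt(hex.slice(14, 18), 16),       // bytes 7-8
    counter: parseInt(hex.slice(18, 24), 16)          // bytes 9-11
  };
}

const parts = decodeObjectId("5099803df3f4948bd2f98391");
console.log(parts.timestampSeconds);  // 1352237117
console.log(parts.machineId);         // "f3f494"

// The timestamp is in seconds since the Unix epoch; Date expects milliseconds.
console.log(new Date(parts.timestampSeconds * 1000).toISOString());
```

Because the timestamp comes first, sorting on _id roughly orders documents by creation time.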
16. Key Points of Schema Design in MongoDB
• Design your schema according to user requirements.
• Combine objects into one document if you will use them together; otherwise,
separate them (but make sure there is no need for joins).
• Duplicate data, within limits, because disk space is cheap compared to
compute time.
• Do joins on write, not on read.
• Optimize your schema for the most frequent use cases.
• Do complex aggregation in the schema.
17. Install MongoDB on Ubuntu
Import the public key used by the package management system.
The Ubuntu package management tools (i.e. dpkg and apt) ensure package
consistency and authenticity by requiring that distributors sign packages with GPG
keys. Issue the following command to import the MongoDB public GPG key:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Create a list file for MongoDB.
Create the /etc/apt/sources.list.d/mongodb-org-3.0.list list file using the
command appropriate for your version of Ubuntu:
echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
Reload local package database
Issue the following command to reload the local package database:
sudo apt-get update
18. Install MongoDB on Ubuntu (continued)
Install the MongoDB packages.
You can install either the latest stable version of MongoDB or a specific version of
MongoDB.
Install the latest stable version of MongoDB:
sudo apt-get install -y mongodb-org
Install a specific release of MongoDB.
To install a specific release, you must specify each component package individually along
with the version number, as in the following example:
sudo apt-get install -y mongodb-org=3.0.15 mongodb-org-server=3.0.15 mongodb-org-shell=3.0.15 mongodb-org-mongos=3.0.15 mongodb-org-tools=3.0.15
19. Database Operation
The use command switches to the named database. If the database does not exist
yet, MongoDB creates it when data is first stored in it.
>use jisl
Note: If the database already exists, use switches to the existing database.
To check the currently selected database, use the command db:
>db
To list the databases, use the command show dbs:
>show dbs
The dropDatabase command drops the current database. It also deletes the
associated data files:
>db.dropDatabase()
20. Collection Operation
To create a collection:
>db.createCollection("user")
To list the collections in the current database:
>show collections
To drop the collection:
>db.user.drop()
21. MongoDB CRUD Operations
Create Operations
Create or insert operations add new documents to a collection. If the collection does not
currently exist, insert operations will create the collection.
Read Operations
Read operations retrieve documents from a collection; that is, they query a collection for
documents.
Update Operations
Update operations modify existing documents in a collection.
Delete Operations
Delete operations remove documents from a collection.
Bulk Write
MongoDB provides the ability to perform write operations in bulk. (Not covered in this presentation.)
23. Read Operations
db.collection.find()
Examples:
$in: status is either "A" or "D"
db.inventory.find( { status: { $in: [ "A", "D" ] } } )
AND: status equals "A" and qty is less than ($lt) 30
db.inventory.find( { status: "A", qty: { $lt: 30 } } )
OR: status equals "A" or qty is less than 30
db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
AND with OR: status equals "A" and either qty is less than 30 or item starts with the character p
db.inventory.find( { status: "A", $or: [ { qty: { $lt: 30 } }, { item: /^p/ } ] } )
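To make the semantics of these filters concrete, here is a small in-memory matcher in plain JavaScript. It is a sketch for illustration only, not how the server evaluates queries: it supports just equality, $lt, $in, $or, and regular expressions on top-level fields, and the sample inventory data and the function names are assumptions.

```javascript
// Returns true when a single condition matches a field value.
function matchesCondition(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition !== null && typeof condition === "object") {
    // Operator document such as { $lt: 30 } or { $in: ["A", "D"] }.
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$lt") return value < operand;
      if (op === "$in") return operand.includes(value);
      throw new Error("operator not supported in this sketch: " + op);
    });
  }
  return value === condition; // plain equality
}

// Returns true when a document matches a filter. Top-level keys are
// combined with AND; $or succeeds when any sub-filter matches.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$or") return condition.some((sub) => matches(doc, sub));
    return matchesCondition(doc[key], condition);
  });
}

const inventory = [
  { item: "journal", status: "A", qty: 25 },
  { item: "notebook", status: "A", qty: 50 },
  { item: "paper", status: "D", qty: 100 },
  { item: "planner", status: "D", qty: 75 },
  { item: "postcard", status: "A", qty: 45 }
];

// Returns the item names of all documents that match the filter.
const findItems = (filter) =>
  inventory.filter((d) => matches(d, filter)).map((d) => d.item);

console.log(findItems({ status: "A", qty: { $lt: 30 } }));
// [ 'journal' ]
console.log(findItems({ $or: [{ status: "A" }, { qty: { $lt: 30 } }] }));
// [ 'journal', 'notebook', 'postcard' ]
console.log(findItems({ status: "A", $or: [{ qty: { $lt: 30 } }, { item: /^p/ }] }));
// [ 'journal', 'postcard' ]
```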
24. SQL SELECT Statements | MongoDB find() Statements
SELECT * FROM people | db.people.find()
SELECT id, user_id, status FROM people | db.people.find( { }, { user_id: 1, status: 1 } )
SELECT user_id, status FROM people | db.people.find( { }, { user_id: 1, status: 1, _id: 0 } )
SELECT * FROM people WHERE status = "A" | db.people.find( { status: "A" } )
SELECT user_id, status FROM people WHERE status = "A" | db.people.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
SELECT * FROM people WHERE status != "A" | db.people.find( { status: { $ne: "A" } } )
SELECT * FROM people WHERE status = "A" AND age = 50 | db.people.find( { status: "A", age: 50 } )
SELECT * FROM people WHERE status = "A" OR age = 50 | db.people.find( { $or: [ { status: "A" } , { age: 50 } ] } )
25. SQL SELECT Statements | MongoDB find() Statements (continued)
SELECT * FROM people WHERE age > 25 | db.people.find( { age: { $gt: 25 } } )
SELECT * FROM people WHERE age < 25 | db.people.find( { age: { $lt: 25 } } )
SELECT * FROM people WHERE age > 25 AND age <= 50 | db.people.find( { age: { $gt: 25, $lte: 50 } } )
SELECT * FROM people WHERE user_id like "%bc%" | db.people.find( { user_id: /bc/ } ) -or- db.people.find( { user_id: { $regex: /bc/ } } )
SELECT * FROM people WHERE user_id like "bc%" | db.people.find( { user_id: /^bc/ } ) -or- db.people.find( { user_id: { $regex: /^bc/ } } )
SELECT * FROM people WHERE status = "A" ORDER BY user_id ASC | db.people.find( { status: "A" } ).sort( { user_id: 1 } )
SELECT * FROM people WHERE status = "A" ORDER BY user_id DESC | db.people.find( { status: "A" } ).sort( { user_id: -1 } )
SELECT COUNT(*) FROM people | db.people.count() -or- db.people.find().count()
SELECT COUNT(user_id) FROM people | db.people.count( { user_id: { $exists: true } } ) -or- db.people.find( { user_id: { $exists: true } } ).count()
SELECT COUNT(*) FROM people WHERE age > 30 | db.people.count( { age: { $gt: 30 } } ) -or- db.people.find( { age: { $gt: 30 } } ).count()
SELECT DISTINCT(status) FROM people | db.people.distinct( "status" )
SELECT * FROM people LIMIT 1 | db.people.findOne() -or- db.people.find().limit(1)
SELECT * FROM people LIMIT 5 SKIP 10 | db.people.find().limit(5).skip(10)
EXPLAIN SELECT * FROM people WHERE status = "A" | db.people.find( { status: "A" } ).explain()
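The sort, skip, and limit rows can be imitated on an in-memory array to see how they combine. This sketch assumes the results are sorted first, then skipped, then limited, which is how the MongoDB manual describes chained cursor methods regardless of the order in which they are written. The sample data and the helper name page are hypothetical.

```javascript
const people = [
  { user_id: "d", status: "A" },
  { user_id: "b", status: "A" },
  { user_id: "e", status: "D" },
  { user_id: "a", status: "A" },
  { user_id: "c", status: "A" }
];

// In-memory stand-in for
//   find({ status }).sort({ user_id: direction }).skip(skip).limit(limit)
function page(docs, { status, direction = 1, skip = 0, limit = Infinity }) {
  return docs
    .filter((d) => d.status === status)                              // WHERE
    .sort((x, y) => direction * x.user_id.localeCompare(y.user_id))  // ORDER BY
    .slice(skip, skip + limit)                                       // SKIP, then LIMIT
    .map((d) => d.user_id);
}

console.log(page(people, { status: "A" }));                     // [ 'a', 'b', 'c', 'd' ]
console.log(page(people, { status: "A", direction: -1 }));      // [ 'd', 'c', 'b', 'a' ]
console.log(page(people, { status: "A", skip: 1, limit: 2 }));  // [ 'b', 'c' ]
```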
26. Update Operations
• db.collection.updateOne()
• db.collection.updateMany()
• db.collection.replaceOne()
Example of db.collection.updateOne():
db.inventory.updateOne(
   { item: "paper" },
   {
     $set: { "size.uom": "cm", status: "P" },
     $currentDate: { lastModified: true }
   }
)
SQL Update Statements | MongoDB updateMany() Statements
UPDATE people SET status = "C" WHERE age > 25 | db.people.updateMany( { age: { $gt: 25 } }, { $set: { status: "C" } } )
UPDATE people SET age = age + 3 WHERE status = "A" | db.people.updateMany( { status: "A" } , { $inc: { age: 3 } } )
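What $set and $inc do to a document can be sketched on plain JavaScript objects. This is an illustration under stated assumptions, not the server's implementation: the helper names are hypothetical, only $set (including dotted paths such as "size.uom") and $inc on top-level fields are handled, and $currentDate is left out.

```javascript
// Sets a value at a dotted path, creating intermediate objects as needed.
function setByPath(doc, path, value) {
  const segments = path.split(".");
  let current = doc;
  for (const segment of segments.slice(0, -1)) {
    if (typeof current[segment] !== "object" || current[segment] === null) {
      current[segment] = {};
    }
    current = current[segment];
  }
  current[segments[segments.length - 1]] = value;
}

// Applies a subset of update operators ($set, $inc) to one document in place.
function applyUpdate(doc, update) {
  for (const [path, value] of Object.entries(update.$set || {})) {
    setByPath(doc, path, value);
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // top-level fields only
  }
  return doc;
}

// updateMany-style helper: updates every document that passes the predicate
// and returns how many documents were touched.
function updateManyInMemory(docs, predicate, update) {
  const matched = docs.filter(predicate);
  matched.forEach((d) => applyUpdate(d, update));
  return matched.length;
}

const paper = { item: "paper", size: { h: 8.5, w: 11, uom: "in" }, status: "D" };
applyUpdate(paper, { $set: { "size.uom": "cm", status: "P" } });
console.log(paper.size.uom, paper.status);  // cm P

const people = [
  { user_id: "a", status: "A", age: 30 },
  { user_id: "b", status: "B", age: 40 }
];
const modified = updateManyInMemory(people, (p) => p.status === "A", { $inc: { age: 3 } });
console.log(modified, people[0].age, people[1].age);  // 1 33 40
```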
27. Delete Operations
• db.collection.deleteOne()
• db.collection.deleteMany()
db.collection.deleteOne() deletes the first document that matches the filter:
db.inventory.deleteOne( { status: "D" } )
db.collection.deleteMany() deletes all documents that match the filter:
db.inventory.deleteMany( { status: "D" } )
Note:
Indexes: Delete operations do not drop indexes, even if deleting all documents
from a collection.
SQL Delete Statements | MongoDB deleteMany() Statements
DELETE FROM people WHERE status = "D" | db.people.deleteMany( { status: "D" } )
DELETE FROM people | db.people.deleteMany({})
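The difference between deleting one document and deleting many can be imitated on an array. The function names and sample data below are hypothetical stand-ins for the collection methods, written only to show which documents each call removes.

```javascript
// Removes only the first document that passes the predicate.
function deleteOneInMemory(docs, predicate) {
  const index = docs.findIndex(predicate);
  if (index === -1) return 0;
  docs.splice(index, 1);
  return 1;
}

// Removes every document that passes the predicate.
function deleteManyInMemory(docs, predicate) {
  let deleted = 0;
  // Walk backwards so that removing an element does not shift unvisited ones.
  for (let i = docs.length - 1; i >= 0; i--) {
    if (predicate(docs[i])) {
      docs.splice(i, 1);
      deleted++;
    }
  }
  return deleted;
}

const hasStatusD = (d) => d.status === "D";
const inventory = [
  { item: "paper", status: "D" },
  { item: "journal", status: "A" },
  { item: "planner", status: "D" },
  { item: "postcard", status: "D" }
];

console.log(deleteOneInMemory(inventory, hasStatusD));   // 1
console.log(inventory.map((d) => d.item));               // [ 'journal', 'planner', 'postcard' ]
console.log(deleteManyInMemory(inventory, hasStatusD));  // 2
console.log(inventory.map((d) => d.item));               // [ 'journal' ]
```

Passing a predicate that always returns true plays the role of deleteMany({}), which removes every document.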
28. Thank You
Important reference link:
https://docs.mongodb.com/manual
TODO Next...
MongoDB Integration with Spring (CRUD Application).
MongoDB Administration.
MongoDB Replication.
MongoDB Sharding.
MongoDB Data Models.
MongoDB Constraints and Data Validation.