This presentation covers NoSQL databases and the types of NoSQL database. It also discusses how MongoDB works, MongoDB security, and MongoDB sharding.
1. NoSQL & MongoDB
MD. TARIQUL ISLAM
LINKEDIN.COM/IN/TARIQUL-ISLAM-RONNIE
GITHUB.COM/TARIQULISLAM
2. Definitions
NoSQL stands for “Not SQL” or “Not Only SQL”. NoSQL databases offer features
beyond those of relational databases, and some support SQL-like queries.
NoSQL provides a more flexible data model and does not impose the fixed structure
(table, column, row) of a relational database management system.
It offers more scalability and flexibility.
NoSQL was created to handle massive volumes of new, rapidly
changing data types: structured, semi-structured, unstructured,
and polymorphic data.
NoSQL databases are designed to cope with the scale and agility
challenges that modern applications face.
3. Why NoSQL?
For many workloads, NoSQL databases scale out more easily than relational databases
and can provide better performance.
NoSQL helps programmers improve productivity by offering a database that
better matches an application's needs.
It can improve data access performance through some combination of handling larger
data volumes, reducing latency, and improving throughput.
Its data models map naturally to the objects used in object-oriented programming, which makes them easy to use and flexible.
It uses a geographically distributed scale-out architecture instead of an expensive, monolithic
architecture.
Many NoSQL databases are described as schema-less, which means the database itself
does not enforce a schema.
CAP theorem: a distributed system cannot fully guarantee consistency (C), availability (A), and partition tolerance (P)
at the same time, so managing the trade-off is important. NoSQL databases differ in which of these properties they favor.
4. NoSQL Database Types
There are 4 basic types of NoSQL databases:
In-Memory/Key-Value Databases
Memcached, Redis, Riak, VoltDB
Document-Oriented Databases
Couchbase, CouchDB, MongoDB
Column Store Databases
Apache HBase, Cassandra, Google’s Bigtable
Graph Databases
InfiniteGraph, Neo4j, OrientDB
5. Key-Value Database
A key-value database, also called a key-value
store, is the most flexible type of NoSQL database.
• In a key-value store there is no schema, and the
value is opaque to the database.
• Values are identified and accessed via a key.
• Stored values can be numbers, strings,
counters, JSON, XML, HTML, binaries, images, or
short videos.
• The application has complete control over what is
stored in the value, which is what makes the model so flexible.
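The points above can be sketched with a plain JavaScript Map standing in for a key-value store. This is only an illustration: the key names and values are made up, and a real store such as Redis or Memcached is accessed through its own client API.

```javascript
// A minimal in-memory key-value store; the store never inspects the values.
const store = new Map();

store.set("user:42:name", "Alice");                            // string
store.set("page:home:hits", 1);                                // counter
store.set("user:42:profile", JSON.stringify({ lang: "en" }));  // JSON kept as an opaque string

// Every read and write goes through the key.
store.set("page:home:hits", store.get("page:home:hits") + 1);

// The application, not the store, decides how to interpret a value.
const profile = JSON.parse(store.get("user:42:profile"));
console.log(store.get("page:home:hits"), profile.lang);
```

Because the value is opaque, the store cannot search inside it; any lookup other than by key is left to the application.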
6. Document Database
A Document database uses documents as the
structure for storage and queries.
Instead of the named, typed columns
used in a relational database,
a document stores each value together with
its field name, so the document itself
describes its data.
Each document can have the same or
different structure.
To add additional types of data to a
document database, there is no need to
modify the entire database schema.
Data can simply be added by adding
objects to the database.
Documents are grouped into “collections,”
which serve a similar purpose to a
relational table.
A document database provides a query
mechanism to search collections for
documents with particular attributes.
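A small sketch of these ideas, using a JavaScript array as a stand-in for a collection. The helper findByAttributes and the sample people are hypothetical and only imitate an equality query; they are not a document database API.

```javascript
// A "collection" modeled as an array of documents with differing structure.
const people = [
  { _id: 1, name: "Alice", email: "alice@example.com" },
  { _id: 2, name: "Bob", phones: ["555-0100", "555-0101"] }, // no email, extra field
  { _id: 3, name: "Carol", email: "carol@example.com", age: 41 },
];

// Hypothetical helper: return the documents whose fields equal every entry in `query`.
function findByAttributes(collection, query) {
  return collection.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
}

// Adding a document with a new field needs no schema change.
people.push({ _id: 4, name: "Dave", department: "HR" });

const hrPeople = findByAttributes(people, { department: "HR" });
console.log(hrPeople.map((doc) => doc.name));
```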
7. Column Store Database
A column store database is a type of
database that stores data using a column
oriented model.
Column store databases use a concept
called a keyspace.
A keyspace is somewhat like a schema in
the relational model.
The keyspace contains all the column
families (kind of like tables in the
relational model), which contain rows,
which contain columns.
8. Column Store Database
A column family consists of multiple rows.
Each row can contain a different number
of columns from the other rows.
The columns don’t have to match
the columns in the other rows. They can
have different column names and data
types.
Each column is contained within its row. It
doesn’t span all rows as in a relational
database.
Each column contains a name/value
pair, along with a timestamp.
The example uses Unix (epoch) time
for the timestamp.
9. Column Store Database
The elements of a row in a column family database are:
Row Key: Each row has a unique key, which is a unique identifier for that row.
Column: Each column contains a name, a value, and timestamp.
Name: This is the name of the name/value pair.
Value: This is the value of the name/value pair.
Timestamp: This provides the date and time that the data was inserted. This can
be used to determine the most recent version of data.
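The row layout described above can be sketched as follows. The helpers putColumn and latestValue, the row keys, and the sample values are assumptions made for illustration; real column stores such as Cassandra or HBase expose their own APIs.

```javascript
// One column family: row key -> array of columns, each a name/value/timestamp triple.
const users = new Map();

// Hypothetical helper that appends a column version to a row.
function putColumn(family, rowKey, name, value, timestamp) {
  if (!family.has(rowKey)) family.set(rowKey, []);
  family.get(rowKey).push({ name, value, timestamp });
}

// Return the value with the newest timestamp for a column name, or undefined.
function latestValue(family, rowKey, name) {
  const versions = (family.get(rowKey) || []).filter((c) => c.name === name);
  if (versions.length === 0) return undefined;
  return versions.reduce((a, b) => (b.timestamp > a.timestamp ? b : a)).value;
}

// Rows in the same family may hold different columns (timestamps are Unix epoch seconds).
putColumn(users, "row1", "email", "old@example.com", 1500000000);
putColumn(users, "row1", "email", "new@example.com", 1600000000);
putColumn(users, "row2", "country", "BD", 1550000000);

console.log(latestValue(users, "row1", "email"));
```

The timestamp is what lets the reader pick the most recent version when a column has been written more than once.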
10. Graph Database
Graph databases are based on graph theory. A graph has
nodes, edges, and properties.
Nodes: represent entities such as people. They
are roughly the equivalent of the record, relation,
or row in a relational database, or the document in
a document database.
Edges: also called relationships, are the lines that
connect nodes to other nodes; they represent the
relationship between them.
Edges are the key concept in graph databases,
representing an abstraction that is not directly
implemented in other systems.
Edges are persisted in the database, so they can be
searched, read, and written like nodes.
Properties: pieces of information germane to
nodes.
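As a rough illustration of nodes, edges, and properties, the sketch below keeps a tiny graph in memory. The node ids, the relationship types, and the neighbours helper are invented for the example; a graph database such as Neo4j would be queried through its own query language.

```javascript
// Nodes carry properties; edges connect two node ids and name the relationship.
const nodes = new Map([
  ["alice", { label: "Person", properties: { name: "Alice", age: 30 } }],
  ["bob", { label: "Person", properties: { name: "Bob", age: 25 } }],
  ["acme", { label: "Company", properties: { name: "Acme" } }],
]);

const edges = [
  { from: "alice", to: "bob", type: "KNOWS" },
  { from: "alice", to: "acme", type: "WORKS_AT" },
];

// Follow outgoing edges of one type from a node and return the neighbours' names.
function neighbours(nodeId, type) {
  return edges
    .filter((e) => e.from === nodeId && e.type === type)
    .map((e) => nodes.get(e.to).properties.name);
}

console.log(neighbours("alice", "KNOWS"));
console.log(neighbours("alice", "WORKS_AT"));
```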
11. Install MongoDB on Windows
Download MongoDB from this link (https://www.mongodb.com/download-center)
Create the folder C:\data\db
12. Terminal Usage for MongoDB server
Create the MongoDB server service (open PowerShell as Administrator)
To run MongoDB as a Windows service, use this command
13. Terminal Usage for MongoDB server
To check that the service is installed in Windows, use this
To access the MongoDB client, use this command
14. Basic command for MongoDB
To list the databases from the command line
> show dbs
To switch to a specific database
> use <database-name>
To list the collections in the current database
> show collections
15. Install MongoDB on Linux
Open a terminal and create a folder for MongoDB
cd ~
mkdir -p mongodb-server
cd mongodb-server
Download the MongoDB release for Linux with this command
curl -O https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-3.6.3.tgz
Extract the MongoDB tarball
tar -zxvf mongodb-linux-x86_64-3.6.3.tgz
Rename the extracted folder with
mv mongodb-linux-x86_64-3.6.3 mongodb
16. Install MongoDB on Linux
The MongoDB binaries are in the bin folder inside the mongodb folder:
~/mongodb-server/mongodb/bin
Add the binaries to PATH so that the MongoDB server can be started from any terminal.
Open ~/.bash_profile, ~/.bashrc, or ~/.profile (create the file first if none of them
exists) and add this line to the end of the file
export PATH=<mongodb-install-directory>/bin:$PATH
MongoDB is installed at ~/mongodb-server/mongodb, so
<mongodb-install-directory> = $HOME/mongodb-server/mongodb
So the full PATH entry will be
export PATH=$HOME/mongodb-server/mongodb/bin:$PATH
17. Install MongoDB on Linux
Create the folder for the MongoDB data store (root privileges may be needed to create /data)
~$ mkdir -p /data/db
Close the terminal, open a new one, and start the server with
~$ mongod
To open the mongo client, use this command
~$ mongo
18. MongoDB Administration
Create a user and assign predefined administration roles
To connect to MongoDB, run the command below
> mongo (opens the mongo client against localhost)
To connect to a remote server, the command is
> mongo <hostname>:<port-number>/<database-name> -u <dbuser> -p <dbpassword>
I created a sample database at mLab and connected to the remote server with the
command
> mongo ds045795.mlab.com:45795/hrms -u <dbuser> -p <dbpassword>
20. MongoDB Administration
Example of creating a user for a database
First, I create a normal admin for the database.
I created the normal admin using the predefined
‘readWrite’ and ‘dbAdmin’ roles. There are
other predefined roles; see the
built-in roles section:
https://docs.mongodb.com/manual/reference/built-in-roles/#built-in-roles
21. MongoDB Administration
Create an advanced user for MongoDB
CIDR: Classless Inter-Domain Routing
The operation gives rsAdmin the following roles:
the ‘clusterAdmin’ and ‘readAnyDatabase’ roles on the admin
database
the ‘readWrite’ role on the selected database
22. About DATA Model for MongoDB
Embedded Data Model
With MongoDB, we can embed related data in a single structure or document. This
removes much of the need to join tables. This model is known as the “denormalized” model.
The embedded model provides better
performance for read operations, as
well as the ability to request and
retrieve related data in a single
database operation.
Embedded data models make it
possible to update related data in a
single atomic write operation.
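A minimal sketch of the embedded model, assuming a hypothetical patrons collection imitated with a JavaScript Map (all names and values are made up). It shows why one read returns the related data and why one write can change it.

```javascript
// Hypothetical "patrons" collection, modeled as a Map keyed by _id.
const patrons = new Map();
patrons.set("joe", {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake Street", city: "Faketon" }, // embedded sub-document
});

// One lookup returns the patron and the embedded address together; no join is needed.
const patron = patrons.get("joe");
console.log(patron.name, "lives in", patron.address.city);

// A change to the name and the address touches one document only.
// In MongoDB, a write to a single document is atomic.
patrons.set("joe", {
  ...patron,
  name: "Joseph Bookreader",
  address: { ...patron.address, city: "Newtown" },
});
console.log(patrons.get("joe"));
```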
23. About Data Model for MongoDB
Normalized Data Model
Normalized data models describe relationships using references between documents.
References provide more flexibility than embedding, but normalized data models can require
more round trips to the server.
MongoDB does not support joins in the way
a relational database does (the $lookup
aggregation stage is the exception). In
MongoDB some data is denormalized,
or stored with related data
in documents, to remove the need for
joins. However, in some cases it makes
sense to store related information in
separate documents, typically in
different collections or databases.
24. About Data model for MongoDB
MongoDB supports two ways to store references between documents:
Manual references
You save the _id field of one document in another document as a
reference.
DBRefs
These are references from one document to another using the value of the first
document’s _id field, its collection name, and, optionally, its database name.
{ "$ref" : <value>, "$id" : <value>, "$db" : <value> }
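The two styles can be sketched in plain JavaScript. The publishers and books data, the publisher_id field name, and the resolvePublisher helper are assumptions for illustration; against a real server, resolving a manual reference means issuing a second query.

```javascript
// Two "collections"; each book stores the publisher's _id as a manual reference.
const publishers = new Map([
  ["pub1", { _id: "pub1", name: "Example Press" }],
]);
const books = [
  { _id: 1, title: "Book A", publisher_id: "pub1" },
  { _id: 2, title: "Book B", publisher_id: "pub1" },
];

// Resolving a manual reference is a second lookup done by the application.
function resolvePublisher(book) {
  return publishers.get(book.publisher_id);
}

// The same link written in DBRef form: collection name, _id value, optional database name.
const dbRef = { $ref: "publishers", $id: "pub1", $db: "library" };

console.log(books.map((b) => `${b.title}: ${resolvePublisher(b).name}`));
console.log(dbRef.$ref, dbRef.$id);
```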
25. About Data Model for MongoDB
One-to-one and one-to-many relationships in the embedded model:
One to One
One to Many
26. About Data Model for MongoDB
One-to-many relationship with the reference model
Main Document / Reference Document
27. CRUD Operation In MongoDB
Insert operations on a MongoDB collection
Three functions are available for inserting into a collection:
db.<Collection-name>.insertOne()
db.<Collection-name>.insertMany()
db.<Collection-name>.insert() (supports both single and multiple
inserts)
Example for insertOne / Example for insertMany
28. CRUD Operation in MongoDB
Find queries in MongoDB
Basic find query
db.<Collection-name>.find({}) [e.g. db.inventory.find({ _id: '2342' })]
Equivalent SQL query
SELECT * FROM inventory WHERE _id = '2342'
AND condition in a query
db.<Collection-name>.find({ key1: value1, key2: value2 })
[e.g. db.student.find({ name: 'akah', age: { $lt: 30 } })]
Equivalent SQL query
SELECT * FROM student WHERE name = 'akah' AND age < 30
29. CRUD Operation in MongoDB
OR condition in a query
db.<Collection-name>.find({ $or: [ ] })
[e.g. db.student.find({ $or: [{ status: 'active' }, { age: { $lt: 30 } }] })]
Equivalent SQL query
SELECT * FROM student WHERE status = 'active' OR age < 30
IN condition in a query
db.<Collection-name>.find({ <attribute>: { $in: [ ] } })
[e.g. db.student.find({ age: { $in: [30, 31] } })]
Equivalent SQL query
SELECT * FROM student WHERE age IN (30, 31)
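To make the meaning of these filters concrete, here is a deliberately simplified matcher written in plain JavaScript. It handles only equality, $lt, $in, and $or, runs against an in-memory array, and is not how MongoDB evaluates queries; the matches function and the sample students are hypothetical.

```javascript
// Simplified matcher covering only equality, $lt, $in, and $or.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    const value = doc[key];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      return Object.entries(cond).every(([op, arg]) => {
        if (op === "$lt") return value < arg;
        if (op === "$in") return arg.includes(value);
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === cond; // several keys in one filter are combined with AND
  });
}

const students = [
  { name: "akah", age: 25, status: "active" },
  { name: "bina", age: 31, status: "inactive" },
  { name: "chad", age: 40, status: "active" },
];

const andResult = students.filter((s) => matches(s, { name: "akah", age: { $lt: 30 } }));
const orResult = students.filter((s) => matches(s, { $or: [{ status: "active" }, { age: { $lt: 30 } }] }));
const inResult = students.filter((s) => matches(s, { age: { $in: [30, 31] } }));
console.log(andResult.length, orResult.length, inResult.length); // prints: 1 2 1
```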
30. CRUD operation in MongoDB
Queries on embedded documents
The queries will be
db.inventory.find( { "instock.qty": { $gt: 10, $lte: 20 } } )
db.inventory.find( { "instock.qty": 5, "instock.warehouse": "A" } )
31. CRUD operation in MongoDB
Three functions are used to update documents:
db.<collection-name>.updateOne(<filter>, <update>, <options>)
db.<collection-name>.updateMany(<filter>, <update>, <options>)
db.<collection-name>.replaceOne(<filter>, <replacement>, <options>)
The replacement document passed to replaceOne can
have different fields from the original document. In the replacement
document, you can omit the _id field since the _id
field is immutable; however, if you do include the
_id field, it must have the same value as the
current value.
db.<collection-name>.updateOne and
db.<collection-name>.updateMany are similar: the first
method updates one document and the second
method updates multiple documents.
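The difference between updating fields and replacing a document can be sketched as below. The helpers applySet and applyReplace are hypothetical and only imitate, on plain objects, the effect of an update that uses the $set operator and of a replacement; they are not the driver or shell API.

```javascript
// Imitates an update with $set: listed fields change, all other fields are kept.
function applySet(doc, fields) {
  return { ...doc, ...fields };
}

// Imitates a replacement: only _id survives from the original document.
function applyReplace(doc, replacement) {
  if ("_id" in replacement && replacement._id !== doc._id) {
    throw new Error("_id is immutable");
  }
  return { ...replacement, _id: doc._id };
}

const original = { _id: 1, item: "bag", qty: 10, tags: ["red"] };
const updated = applySet(original, { qty: 15 });
const replaced = applyReplace(original, { item: "bag", price: 20 });

console.log(updated);  // still has item and tags, qty is now 15
console.log(replaced); // has item, price, and _id; qty and tags are gone
```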
32. CRUD operation in MongoDB
Delete documents from a collection
There are two functions for deleting documents:
db.<Collection-name>.deleteOne()
[e.g. db.inventory.deleteOne({ _id: '1234' })]
db.<Collection-name>.deleteMany()
[e.g. db.inventory.deleteMany({ item: 'bag' })]
34. Handling JOIN for MongoDB
To perform a left outer join to another collection in the same database and pull in its
matching documents, use the $lookup stage in a MongoDB aggregation query.
35. Handling JOIN for MongoDB
orders
inventory
Assume orders and inventory have a one-to-many relationship, where each order has an item.
Query with $lookup / Result of the query
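Since the original screenshots are not available, the following sketch uses made-up orders and inventory documents. The lookupStage object shows the shape of a $lookup stage as it could be passed to db.orders.aggregate([...]); the lookup function is a hypothetical plain JavaScript imitation of the stage's left outer join behaviour, not MongoDB's implementation.

```javascript
const orders = [
  { _id: 1, item: "almonds", quantity: 2 },
  { _id: 2, item: "pecans", quantity: 1 },
  { _id: 3, item: "walnuts", quantity: 5 }, // no matching inventory document
];
const inventory = [
  { _id: 10, sku: "almonds", instock: 120 },
  { _id: 11, sku: "pecans", instock: 70 },
];

// Shape of the aggregation stage (field names here are assumptions for this example).
const lookupStage = {
  $lookup: { from: "inventory", localField: "item", foreignField: "sku", as: "inventory_docs" },
};

// Left outer join: every local document is kept, with an array of matches attached.
function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joined = lookup(orders, inventory, lookupStage.$lookup);
console.log(JSON.stringify(joined, null, 2));
```

An order without a matching inventory document still appears in the result, with an empty array in the joined field.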
36. Modify value to Embedded Model Object
The $push operator appends a specified value to an array.
{ $push: { <field1>: <value1>, ... } }
The $pull operator removes from an array all values that equal a given value or match a given condition. $pullAll removes all
instances of each listed value from an array.
{ $pull: { <field1>: <value | condition>, ... } }
For a single value / For multiple values
Remove an item with $pull
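The effect of the three operators on an array can be imitated with plain JavaScript. The helper names push, pull, and pullAll and the sample scores are hypothetical; they illustrate the semantics only and do not call MongoDB.

```javascript
// Hypothetical helpers; each returns a new array and leaves the input untouched.
const push = (array, value) => [...array, value];
const pull = (array, condition) => array.filter((item) => !condition(item));
const pullAll = (array, values) => array.filter((item) => !values.includes(item));

const scores = [0, 2, 5, 5, 1, 0];

console.log(push(scores, 9));             // [0, 2, 5, 5, 1, 0, 9]
console.log(pull(scores, (s) => s >= 5)); // [0, 2, 1, 0]
console.log(pullAll(scores, [0, 5]));     // [2, 1]
```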
37. MongoDB Replication
Replication provides redundancy and increases data availability. With multiple copies of
data on different database servers, replication provides a level of fault tolerance against
the loss of a single database server.
MongoDB replication
A replica set is a group of mongod instances that maintain the same data set. A replica set
contains several data-bearing nodes and optionally one arbiter node. Of the data-bearing
nodes, one and only one member is deemed the primary node, while the other nodes are
deemed secondary nodes.
The primary node receives all write operations and is the only member capable of confirming
writes with { w: "majority" } write concern. It records every change to its data set in its operation log (oplog).
38. MongoDB Replication
The secondaries replicate the primary’s oplog and apply the operations to their data sets
such that the secondaries’ data sets reflect the primary’s data set. If the primary is
unavailable, an eligible secondary will hold an election to elect itself the new primary.
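An election needs votes from a majority of the voting members, a detail the slides do not spell out. The arithmetic can be sketched as follows; the function names are made up for the example.

```javascript
// Smallest number of votes that is more than half of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be lost while a majority can still be formed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(`members=${n} majority=${majority(n)} faultTolerance=${faultTolerance(n)}`);
}
```

Going from three to four members does not raise the fault tolerance, which is one reason replica sets are usually given an odd number of voting members.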
39. MongoDB Sharding
Sharding is a method for distributing data across multiple machines. MongoDB uses
sharding to support deployments with very large data sets and high throughput
operations.
Why sharding is needed
Database systems with large data sets or high-throughput applications can challenge
the capacity of a single server.
High query rates can exhaust the CPU capacity of the server.
Working set sizes larger than the system’s RAM stress the I/O capacity of disk drives.
To handle the workload of large data sets and high-throughput applications, there are two
methods of scaling: vertical scaling and horizontal scaling.
40. MongoDB Sharding
Vertical scaling involves increasing the capacity of a single server, such as using a more
powerful CPU, adding more RAM, or increasing the amount of storage space. Limitations in
available technology may restrict a single machine from being sufficiently powerful for a given
workload.
41. MongoDB Sharding
Horizontal Scaling involves dividing the system dataset and load over multiple servers, adding
additional servers to increase capacity as required. While the overall speed or capacity of a
single machine may not be high, each machine handles a subset of the overall workload,
potentially providing better efficiency than a single high-speed high-capacity server.
43. MongoDB Sharding
A MongoDB sharded cluster consists of the following components:
Shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a
replica set.
Mongos: The mongos acts as a query router, providing an interface between client applications
and the sharded cluster.
Config servers: Config servers store metadata and configuration settings for the cluster. Config
servers must be deployed as a replica set (CSRS).
Shard Key:
To distribute the documents in a collection, MongoDB partitions the collection using the shard key.
The shard key consists of an immutable field or fields that exist in every document in the target collection.
You choose the shard key when sharding a collection. In the MongoDB 3.6 release used in this presentation, the choice of shard key cannot be changed after sharding; later releases relax this restriction.
A sharded collection can have only one shard key. See Shard Key Specification.
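How a router could use the shard key can be sketched as below, assuming ranged sharding on a hypothetical numeric userId key. The chunk boundaries, shard names, and routeByShardKey function are invented for illustration; in a real cluster the range metadata lives on the config servers and mongos does the routing.

```javascript
// Hypothetical range metadata: each chunk covers [min, max) of the shard key.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Router role: pick the shard whose chunk range contains the shard key value.
function routeByShardKey(shardKeyValue) {
  const chunk = chunks.find((c) => shardKeyValue >= c.min && shardKeyValue < c.max);
  return chunk.shard;
}

const docs = [{ userId: 42 }, { userId: 1000 }, { userId: 73000 }];
for (const doc of docs) {
  console.log(doc.userId, "->", routeByShardKey(doc.userId));
}
```

Because routing depends on the key, every document needs a value for the shard key field, which matches the requirement stated above.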
45. MongoDB Security
SCRAM
Salted Challenge Response Authentication Mechanism (SCRAM) is the default authentication mechanism for
MongoDB. SCRAM is based on the IETF RFC 5802 standard that defines best practices for implementation of
challenge-response mechanisms for authenticating users with passwords.
Using SCRAM, MongoDB verifies the supplied user credentials against the user’s name, password and authentication
database. The authentication database is the database where the user was created, and together with the user’s name,
serves to identify the user.
MongoDB’s implementation of SCRAM uses the SHA-1 hashing function.
SCRAM Advantages (over the older MONGODB-CR mechanism)
A tunable work factor (iterationCount),
Per-user random salts rather than server-wide salts,
A cryptographically stronger hash function (SHA-1 rather than MD5), and
Authentication of the server to the client as well as the client to the server.
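The per-user salt and the tunable iteration count can be seen in the key derivation defined by RFC 5802. The sketch below follows the RFC using Node.js's built-in crypto module; the function name, password, and iteration count are assumptions, and any password preprocessing MongoDB applies before this step is not modelled here.

```javascript
const crypto = require("crypto");

// Derive the values a server stores for one user, following RFC 5802 with SHA-1.
function deriveScramCredentials(password, salt, iterationCount) {
  // SaltedPassword := Hi(password, salt, i), i.e. PBKDF2 with HMAC-SHA-1 and a 20-byte output.
  const saltedPassword = crypto.pbkdf2Sync(password, salt, iterationCount, 20, "sha1");
  const clientKey = crypto.createHmac("sha1", saltedPassword).update("Client Key").digest();
  const storedKey = crypto.createHash("sha1").update(clientKey).digest();
  const serverKey = crypto.createHmac("sha1", saltedPassword).update("Server Key").digest();
  return { storedKey, serverKey };
}

const salt = crypto.randomBytes(16); // per-user random salt
const creds = deriveScramCredentials("s3cret", salt, 10000);
console.log(creds.storedKey.toString("base64"), creds.serverKey.toString("base64"));
```

The server keeps the stored key and server key rather than the password itself, and raising the iteration count makes each password guess more expensive for an attacker.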
46. MongoDB Security
MongoDB supports x.509 certificate authentication for client authentication and internal
authentication of the members of replica sets and sharded clusters.
x.509 certificate authentication requires a secure TLS/SSL connection.
To authenticate to servers, clients can use x.509 certificates instead of usernames and passwords.
Client Certificate Requirements
A single Certificate Authority (CA) must issue the certificates for both the client and the server.
Client certificates must contain the following fields:
keyUsage = digitalSignature
extendedKeyUsage = clientAuth
Each unique MongoDB user must have a unique certificate.