Topic: NoSQL Database - MongoDB
Presenter: Rajesh Kumar
Sr. Data Architect - Big Data Analytics & Information Management
Agenda:
• What is NoSQL, why NoSQL
• The different types of NoSQL databases & data model approaches
• Detailed overview of one of the most popular NoSQL databases: MongoDB
• Model: document-oriented database
• JSON
• CRUD operations
• Model data in MongoDB
• Data model design considerations
• Indexing
• Sharding
• Replication
• Use cases
• Reference Architecture
• Insurance Conceptual Data Model
The relational database has served well, but..
 The relational database has been excellent, but the world of data is changing rapidly. The
amount of data created each year is almost doubling, a kind of data explosion. And
this data is not simply transactional, structured data. It includes new types of data
generated from web logs, documents, clickstreams, devices, sensors and other IoT sources.
 Traditional RDBMS systems were not designed to handle the volume, variety and velocity
of this semi-structured and unstructured data, produced in such enormous quantity.
Traditional RDBMSs can't easily provide the scalability, performance and flexibility needed for modern
distributed data storage and processing.
MongoDB
A document-based database
MongoDB - A NoSQL DB
What is NoSQL - Not Only SQL?
 Non-relational,
 distributed,
 schema-free,
 flexible,
 horizontally scalable,
 open source,
 simple API
Why NoSQL?
 Support for distributed platforms in the age of big data
 Ability to deal effectively with all kinds of data formats: images, docs, streaming, text, web, geospatial,
sensor, machine, real-time operational
 Scalability and performance (low latency and faster data access)
 Rapid scale-out: scale out as much as the business needs to support more users and growing data
 24x7 data availability and global deployment
 Data to support next-gen high-performance apps
 Real-time reporting and analytics (predictive analytics, machine learning) beyond what data
warehouses support
 Lower data management cost
Types of NoSQL Databases
 Key/value store - Memcached, DynamoDB
 Column store - Cassandra, HBase
 Document store - MongoDB, CouchDB, DynamoDB
 Graph store - Neo4j
 Multi-model databases - DynamoDB, CouchDB
MongoDB is a document-oriented database
Data structure is composed of key/value pairs in JSON format
What is MongoDB?
An open-source, document-oriented NoSQL database that provides high
performance, automatic scaling and a flexible schema design.
MongoDB fulfills both traditional and new requirements
NoSQL but fully featured
A quick recap of MongoDB characteristics
 Distributed, document-oriented NoSQL database
 MongoDB stores data as JSON-style documents, represented internally as BSON
 Dynamic and flexible schema
 Horizontal scaling, easy to scale
 Supports a rich query language
 Supports CRUD for read and write operations
 Supports text search and geospatial queries
 Efficient text and geospatial indexes
 Very strong sharding and replication
 _id: a special key assigned to each document
 _id is unique across a collection
A record in MongoDB is a document, which is a data structure composed of
field (key) and value pairs. The values of fields may include other nested
documents, arrays, and arrays of documents.
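As a rough illustration of the structure just described, the sketch below writes one document as a plain JavaScript object. The field names (name, address, phones, policies) are hypothetical and chosen only for this example; they do not come from the original slides.

```javascript
// A hypothetical customer document: field names are illustrative only.
const customer = {
  _id: 1,                                        // unique key within the collection
  name: "Jane Doe",                              // simple field/value pair
  address: { city: "New York", zip: "10001" },   // nested (embedded) document
  phones: ["555-0100", "555-0101"],              // array of values
  policies: [                                    // array of documents
    { type: "auto", premium: 1200 },
    { type: "home", premium: 800 }
  ]
};

// Nested values are reached with ordinary property access.
const city = customer.address.city;
const policyTypes = customer.policies.map(p => p.type);
console.log(city, policyTypes);
```

The same shape, written as JSON text, is what a document store such as MongoDB would hold as a single record.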
MongoDB Data Model
 MongoDB stores documents in JSON (actually BSON)
 JSON - short for JavaScript Object Notation
 BSON is a binary serialization of JSON-like documents
 A JSON object is a set of key-value ("key" : "value") pairs enclosed in curly braces { }
 Document creation is free from schema: no structure, data type or size needs to be predefined.
You can create as many fields as you require dynamically.
 Data types supported by JSON/BSON in MongoDB: strings, numbers (integer, long, double), objects,
arrays, Boolean (true/false), null, date, timestamp.
 Other constructs in MongoDB are databases, collections, documents and fields
MongoDB data model core concepts
 Database - In MongoDB, a database is a physical container for collections, each of which holds
documents.
 Collection - A collection is a container of documents; the documents in it need not share a structure.
 Document - A document is a group of fields in key/value pairs, free from schema, table and column; a
document can hold any type of data.
 Think of collections and documents as tables and rows in an RDBMS
 Hierarchical
 A document can reference another document
 A document can contain embedded documents, arrays, and arrays of documents
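The difference between referencing and embedding can be sketched with plain JavaScript objects. This is only an in-memory model, not MongoDB code; the collection and field names (customers, customerId, shipTo, items) are assumptions made up for the illustration.

```javascript
// In-memory stand-ins for two collections; all names are hypothetical.
const customers = [{ _id: 1, name: "Jane Doe" }];

// Embedding: the shipping address lives inside the order document itself.
// Referencing: the order stores only the customer's _id.
const order = {
  _id: 100,
  customerId: 1,                                          // reference to customers._id
  shipTo: { city: "New York", zip: "10001" },             // embedded document
  items: [{ sku: "A1", qty: 2 }, { sku: "B7", qty: 1 }]   // array of documents
};

// Resolving a reference takes a second lookup; embedded data needs none.
function resolveCustomer(orderDoc, customerDocs) {
  return customerDocs.find(c => c._id === orderDoc.customerId) || null;
}

const owner = resolveCustomer(order, customers);
console.log(owner.name, order.shipTo.city);
```

Embedded data comes back with the parent document in one read, while a reference keeps shared data in one place at the cost of an extra lookup.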
Collection and Document
MongoDB Data Model - A Document Store Model
Not PDF, Word, CSV or HTML.
Documents are nested structures created using JavaScript Object Notation (JSON). Think of a document as
a record. In the example below, let's see what a document looks like in MongoDB.
MongoDB document types are
MongoDB system components
 COMPONENTS
 mongod - The database process.
 mongo - The database shell (interactive JavaScript): the command-line shell for interacting directly
with the database.
 mongos - Sharding router
 UTILITIES
 mongostat - Shows performance statistics
 mongofiles - Utility for putting and getting files from MongoDB GridFS
 mongoimport - Imports into MongoDB from JSON or CSV
 mongoexport - Exports a single collection (JSON, CSV)
Basic Mongo Shell commands
MongoDB stores documents in collections. If a collection does not exist, MongoDB creates the collection
when you first store data for that collection.
 Select/create database: use customerdb
 >db tells you the current database
 List databases:
>show dbs
local 0.78125GB
test 0.23012GB
customerdb
myDB
 Create collection:
db.createCollection("products")
 List collections already created:
>show collections
Data Manipulation: Create & Read operations
Frequently used data manipulation methods
 The createCollection() method
db.createCollection(name, options)
 The drop() method
MongoDB's db.collection.drop() is used to drop a collection from the database.
 Rename a collection:
>db.collection.renameCollection("NewColName")
>db.customer.renameCollection("Customer")
 The insert() method
>db.COLLECTION_NAME.insert(document)
 Query documents using the find() method
>db.COLLECTION_NAME.find()
 The update() method
>db.COLLECTION_NAME.update(SELECTION_CRITERIA, UPDATED_DATA)
>db.col.update({"title": "MongoDB"}, {$set: {"title": "MongoDB Definitive Guide"}})
 The remove() method
>db.col.remove({"title": "MongoDB"})
 The sort() method
>db.COLLECTION_NAME.find().sort({KEY: 1})
Sort order is given as 1 or -1: 1 for ascending order and -1 for descending order.
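The 1 / -1 convention can be mimicked in plain JavaScript to see what each direction does. The sketch below is an in-memory model only, not the MongoDB implementation; the helper name sortByKey and the sample documents are made up for the illustration.

```javascript
// Plain-JavaScript model of sort({KEY: 1}) / sort({KEY: -1}) on one key.
// This is not MongoDB code; it only mirrors the ordering rule.
function sortByKey(docs, key, direction) {
  // Copy first so the input array is left untouched.
  return [...docs].sort((a, b) => {
    if (a[key] < b[key]) return -1 * direction;
    if (a[key] > b[key]) return 1 * direction;
    return 0;
  });
}

const books = [
  { title: "MongoDB", pages: 300 },
  { title: "Big Data", pages: 150 },
  { title: "NoSQL", pages: 220 }
];

const ascending = sortByKey(books, "pages", 1).map(d => d.pages);
const descending = sortByKey(books, "pages", -1).map(d => d.pages);
console.log(ascending, descending);
```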
Basic DB operations in a complex document
 Find operation
 Querying in embedded objects
 Comparison operators
 Querying in arrays of documents
 Indexing on embedded documents
 Indexing on multiple keys
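Querying an embedded object with a dotted path, combined with a comparison operator, can be sketched in plain JavaScript. The helpers getPath and matches are hypothetical and cover only equality, $gte and $lt; real MongoDB query matching handles many more cases (arrays, other operators, type ordering).

```javascript
// Walk a dotted path such as "address.city" through nested objects.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, part) => (value == null ? undefined : value[part]),
    doc
  );
}

// Minimal matcher: supports equality and the $gte / $lt comparison operators.
function matches(doc, query) {
  return Object.entries(query).every(([path, condition]) => {
    const value = getPath(doc, path);
    if (condition !== null && typeof condition === "object") {
      if ("$gte" in condition && !(value >= condition.$gte)) return false;
      if ("$lt" in condition && !(value < condition.$lt)) return false;
      return true;
    }
    return value === condition;
  });
}

const employees = [
  { name: "Asha", age: 31, address: { city: "Pune" } },
  { name: "Ben", age: 24, address: { city: "Pune" } },
  { name: "Chen", age: 40, address: { city: "Delhi" } }
];

const result = employees
  .filter(d => matches(d, { "address.city": "Pune", age: { $gte: 25 } }))
  .map(d => d.name);
console.log(result);
```

The query object has the same shape one would pass to find() in the shell, which is why the dotted key is quoted.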
Model Your Data
Terminology:
Example Schema.
Model Data in MongoDB: Model your data the way it is used.
Let's model some more data ..
Some schema design considerations
 What is the priority
 High consistency
 High read performance
 High write performance
 ODS application
 Real time
 How does the application access and manipulate data
 Data access paths and types of queries
 Read versus write ratio
 Analytics (aggregation, video, images, machine, geospatial data)
Indexes - Indexes are special data structures that store a subset of your data in an efficient
way for easier and faster access to the data
 MongoDB stores indexes in a B-tree format, which allows efficient traversal of the index content
 Proper index selection is important in MongoDB and lets the DB run optimally; improper indexing
may cause a lot of issues in read-write operations and in data distribution across a sharded
cluster
 Index types:
 _id
 Simple
 Compound
 Multi-key
 Full text
 Geospatial
 Hashed
Index continued..
 The _id index: It is created automatically, is immutable and can't be removed.
 This is the same as a primary key in an RDBMS.
 Default value is a 12-byte ObjectId
 4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter (newer MongoDB releases replace the machine and process ids with a 5-byte random value)
 Simple index: A simple index is an index on a single key
 Compound index: A compound index is created over two or more fields in a document
 Multi-key index: A multi-key index is an index created on a field that contains an array
 Full-text search index: This is an index over a text-based field, similar to how Google indexes web
pages, e.g. find all tweets that mention auto insurance within 30 days, or search for Big Data in a blog post
or in all the tweets of the last 30 days.
 Geospatial index: This index supports efficient queries on geospatial coordinate data. It is
used when you need to query location-based spatial data. This index is a really great feature
because location-based data is among the most valuable data collected today, for location-based
customer targeting and location-based product analysis, e.g. find all customers that live within 50 miles of
NY.
 Hashed index: Used mainly in hash-based sharding; it allows a more randomized data
distribution across shards
 Create index syntax (ensureIndex is the older name; MongoDB 3.0 and later use createIndex):
db.employee.ensureIndex({"email": 1}, {"unique": true})
db.employee.ensureIndex({"age": 1}, {"sparse": true})
db.employee.find({age: {$gte: 25}})
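The 12-byte ObjectId layout listed on this slide can be made concrete by splitting a 24-character hex string into its parts. The sketch below follows the slide's layout (timestamp, machine id, process id, counter); the function name parseObjectId is hypothetical, and the sample value is just an example ObjectId string. In newer MongoDB releases the five middle bytes are a random value, so only the timestamp and counter fields carry over.

```javascript
// Split a 24-character hex ObjectId string into the fields listed on the slide.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters (12 bytes)");
  }
  return {
    timestamp: parseInt(hex.slice(0, 8), 16),    // bytes 0-3: seconds since epoch
    machineId: hex.slice(8, 14),                 // bytes 4-6
    processId: parseInt(hex.slice(14, 18), 16),  // bytes 7-8
    counter: parseInt(hex.slice(18, 24), 16)     // bytes 9-11
  };
}

const parts = parseObjectId("507f1f77bcf86cd799439011");
const createdAt = new Date(parts.timestamp * 1000);  // seconds -> milliseconds
console.log(parts, createdAt.toISOString());
```

Because the leading bytes are a timestamp, sorting on _id roughly follows insertion time.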
Index continued..
 Index properties:
 TTL index - TTL indexes are special indexes that MongoDB can use to automatically remove documents from a
collection after a certain amount of time
 Sparse index - The sparse property of an index ensures that the index only contains entries for documents that have the
indexed field. The index skips documents that do not have the indexed field.
 Unique index - Enforces uniqueness of the indexed field's values.
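The sparse and unique properties can be pictured with a small in-memory model. This is a sketch under simplifying assumptions (a Map keyed by field value instead of a B-tree), and buildIndex plus the sample documents are hypothetical names invented for the example.

```javascript
// Build an in-memory "index" on one field: a Map from field value to documents.
// sparse: skip documents that lack the field. unique: reject duplicate values.
function buildIndex(docs, field, { sparse = false, unique = false } = {}) {
  const index = new Map();
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue;           // sparse: no entry at all
    const key = hasField ? doc[field] : null;    // missing field indexed as null
    if (unique && index.has(key)) {
      throw new Error(`duplicate key for ${field}: ${key}`);
    }
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

const staff = [
  { name: "Asha", email: "asha@example.com", age: 31 },
  { name: "Ben", email: "ben@example.com" },            // no age field
  { name: "Chen", email: "chen@example.com", age: 40 }
];

const ageIndex = buildIndex(staff, "age", { sparse: true });
const emailIndex = buildIndex(staff, "email", { unique: true });
console.log(ageIndex.size, emailIndex.size);
```

In this model the sparse age index has no entry for the document without an age, and a second document with an already indexed email would be rejected by the unique index.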
 Text search index:
MongoDB provides text indexes to support text search queries on text content. To perform text search queries, you
must have a text index on your collection. A collection can only have one text search index, but that index can cover
multiple fields.
Creating a text search index over the "title" and "content" fields:
db.blogpost.ensureIndex( { title: "text", content: "text" } )
Use the $text query operator to perform text searches
on a collection with a text index.
$text performs a logical OR over all the terms in the search string.
For example, we can use the following queries to find the terms MongoDB and BigData in the blogpost collection.
db.blogpost.find( { $text: { $search: "MongoDB" } } )
db.blogpost.find( { $text: { $search: "BigData" } } )
Deleting a text index: To delete an existing text index, first find the name of the index using the following query:
>db.blogpost.getIndexes()
Now you can drop the text index: >db.blogpost.dropIndex("title_text_content_text")
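The logical-OR behaviour of a multi-term search string can be sketched in plain JavaScript. This is a simplified model, not how MongoDB implements $text (which also applies stemming, stop words and scoring); tokenize, textSearch and the sample posts are hypothetical.

```javascript
// Lower-case a string and split it into word tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// A document matches if ANY search term appears in ANY of the given fields.
function textSearch(docs, fields, search) {
  const terms = new Set(tokenize(search));
  return docs.filter(doc =>
    fields.some(field =>
      tokenize(String(doc[field] ?? "")).some(word => terms.has(word))
    )
  );
}

const posts = [
  { title: "Intro to MongoDB", content: "Documents and collections" },
  { title: "Scaling out", content: "BigData needs sharding" },
  { title: "Cooking", content: "Pasta recipes" }
];

const hits = textSearch(posts, ["title", "content"], "MongoDB BigData")
  .map(p => p.title);
console.log(hits);
```

Under this model a single search string holding both terms returns every post that mentions either one, which is what a single $text query with "MongoDB BigData" would be expected to do.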
Text indexes to support text search analytics - by example
MongoDB Sharding
 Sharding is a method for storing data across multiple machines in a clustered computing
environment. MongoDB uses sharding to support deployments with very large data
sets and high-throughput operations.
 Purpose of sharding
 When a database system grows very large, the capacity of a single server machine can be
challenged by increased workload and many concurrent users demanding high throughput.
After a certain level, you can't keep scaling vertically by adding more CPU, RAM and
storage; vertical scaling has limitations.
 In contrast, sharding works by horizontal scaling: it divides the data set and distributes the data
over multiple shard servers. Each shard works as an independent database, and
collectively all the shards make up a single logical database.
 Sharding reduces the amount of data that each server needs to store. When data grows you
can add more shards to the cluster, and each shard stores less data as the cluster
grows.
 For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard
might hold only 256GB of data. If there are 40 shards, then each shard might hold only 25GB
of data.
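The arithmetic behind the 1 terabyte example is simple division, assuming 1 TB = 1024 GB and a perfectly even spread of data across shards (real clusters are rarely this balanced). The function name perShardGB is made up for the sketch.

```javascript
// Average data per shard, assuming data is spread perfectly evenly.
function perShardGB(totalGB, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  return totalGB / shardCount;
}

const totalGB = 1024;                       // 1 TB expressed in GB
const withFour = perShardGB(totalGB, 4);    // 256 GB per shard
const withForty = perShardGB(totalGB, 40);  // 25.6 GB per shard
console.log(withFour, withForty);
```

The 40-shard case works out to 25.6 GB, which the slide rounds to 25GB.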
Shards in MongoDB Architecture
Replication
The primary accepts all write operations. The secondaries then replicate the oplog
and apply it to their data sets.
Replication continued..
Replica set members
A replica set in MongoDB is a group of mongod processes that provide redundancy
and high availability. The members of a replica set are:
Primary - It receives all write operations and records them in its oplog.
Secondary - Secondary members replicate operations from the primary to maintain
an identical copy of the data set, so the set can recover from failure.
Note: The minimum recommended configuration for a replica set is a primary, a
secondary, and an arbiter. Most deployments, however, keep three members that store
data: a primary and two secondary members.
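The primary/secondary flow described above can be pictured with a toy in-memory model: the primary appends each write to an ordered log, and a secondary replays whatever it has not yet applied. This is only a sketch of the idea; the class names and the oplog entry format are assumptions, and it ignores elections, arbiters and networking entirely.

```javascript
// Toy model: the primary appends every write to an ordered oplog;
// each secondary replays the entries it has not applied yet.
class Primary {
  constructor() {
    this.data = new Map();
    this.oplog = [];
  }
  write(key, value) {
    this.data.set(key, value);
    this.oplog.push({ op: "set", key, value });  // record the operation
  }
}

class Secondary {
  constructor() {
    this.data = new Map();
    this.applied = 0;  // number of oplog entries already replayed
  }
  syncFrom(primary) {
    for (const entry of primary.oplog.slice(this.applied)) {
      this.data.set(entry.key, entry.value);
      this.applied += 1;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.write("policy:1", "auto");
primary.write("policy:2", "home");
secondary.syncFrom(primary);
primary.write("policy:1", "auto-renewed");  // secondary lags until next sync
const lagging = secondary.data.get("policy:1");
secondary.syncFrom(primary);
const caughtUp = secondary.data.get("policy:1");
console.log(lagging, caughtUp);
```

The gap between lagging and caughtUp is a small picture of replication lag: a secondary is only as current as the last oplog entry it has applied.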
Use cases - Types of workload suitable for NoSQL
 Mobile app development
 Internet of things
 Digital advertisement
 Streaming applications
 Web applications
 Social applications
 Gaming
 Content management
 Customer personalization
 Recommendation engines
 360-degree view of customer,
business, product
 Fraud detection
 Real-time analytics
MongoDB support for programming languages
Other cool stuff
 Sharding
 Aggregation and map/reduce
 Storage engine - WiredTiger
 Capped collections
 GridFS
 Text and geospatial indexes
 Use of Python and JavaScript scripting for complex data handling
 Replication
That’s it
Thank you!
Email me: Rajesh-29.kumar-29@cognizant.com
Follow me on Twitter: @rajesh14k

[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

MongoDB NoSQL database a deep dive -MyWhitePaper

  • 1. Topic: NoSQL Database – MongoDB
      Presenter: Rajesh Kumar, Sr. Data Architect, Big Data Analytics & Information Management
      Agenda:
      • What is NoSQL, why NoSQL
      • The different types of NoSQL databases and data model approaches
      • Detailed overview of one of the most popular NoSQL databases: MongoDB
      • Model: document-oriented database
      • JSON
      • CRUD operations
      • Modeling data in MongoDB
      • Data model design considerations
      • Indexing
      • Sharding
      • Replication
      • Use cases
      • Reference architecture
      • Insurance conceptual data model
  • 2. Relational databases have served well, but...
      • The relational database has been excellent, but the world of data is changing rapidly. The amount of data created each year is almost doubling, a kind of data explosion. This data is not simply transactional, structured data; it includes new types of data generated from web logs, documents, clickstreams, devices, sensors and other IoT sources.
      • Traditional RDBMS systems were not designed to handle the volume, variety and velocity of this semi-structured and unstructured data. They cannot provide the scalability, performance and flexibility needed for modern distributed data storage and processing.
  • 3. MongoDB: a document-based database
  • 5. What is NoSQL (Not Only SQL)?
      • Non-relational
      • Distributed
      • Schema-free
      • Flexible
      • Horizontally scalable
      • Open source
      • Simple API
  • 6. Why NoSQL?
      • Support for distributed platforms in the age of big data
      • Ability to deal effectively with all kinds of data formats: images, documents, streaming, text, web, geospatial, sensor, machine and real-time operational data
      • Scalability and performance (low latency and faster data access)
      • Rapid scale: scale out as far as the business needs to support more users and growing data
      • 24x7 data availability and global deployment
      • Data to support next-generation, high-performance apps
      • Real-time reporting and analytics (predictive analytics, machine learning) beyond what a data warehouse supports
      • Lower data management cost
  • 7. Types of NoSQL databases
      • Key/value store: Memcached, DynamoDB
      • Column store: Cassandra, HBase
      • Document store: MongoDB, CouchDB, DynamoDB
      • Graph store: Neo4j
      • Multi-model databases: DynamoDB, CouchDB
      MongoDB is a document-oriented database. Its data structure is composed of key/value pairs in JSON format.
  • 8. What is MongoDB? An open-source, document-oriented NoSQL database that provides high performance, automatic scaling and flexible schema design.
  • 9. MongoDB fulfills both traditional and new requirements
  • 10. NoSQL but fully featured
  • 11. A quick recap of MongoDB characteristics
      • Distributed, document-oriented NoSQL database
      • MongoDB stores data as JSON documents, represented internally as BSON
      • Dynamic and flexible schema
      • Horizontal scaling; easy to scale
      • Supports a rich query language
      • Supports CRUD for read and write operations
      • Supports text search and geospatial queries
      • Efficient text and geospatial indexes
      • Very strong sharding and replication
      • _id: a special key assigned to each document
      • _id is unique across a collection
  • 12. A record in MongoDB is a document, which is a data structure composed of field (key) and value pairs. The values of fields may include other nested documents, arrays, and arrays of documents.
  • 13. MongoDB data model
      • MongoDB stores documents in JSON (actually BSON)
      • JSON is short for JavaScript Object Notation
      • BSON is a binary serialization of JSON objects
      • A JSON object is a set of key/value pairs ("key" : "value") enclosed in curly braces { }
      • Document creation is free from schema: no structure, data type or size has to be predefined. You can create as many fields as you need dynamically.
      • Data types supported by JSON/BSON in MongoDB: strings, numbers (integer, long, double), objects, arrays, Boolean (true/false), null, date, timestamp
      • Other constructs in MongoDB are databases, collections, documents and fields
  • 14. MongoDB data model core concepts
      • Database: in MongoDB a database is a physical container for collections, each of which holds documents.
      • Collection: a collection is a container of documents; the documents in it can have any shape.
      • Document: a document is a group of fields stored as key/value pairs, free from a fixed schema, table or column definition; a document can hold any type of data.
      • Think of collections and documents as the tables and rows of an RDBMS
      • Hierarchical
      • A document can reference other documents
      • A document can contain embedded documents, arrays, and arrays of documents
  • 16. MongoDB data model: a document store model. Documents here are not PDF, Word, CSV or HTML files; they are nested structures created using JavaScript Object Notation (JSON). Think of a document as a record. The slide's example shows what a document looks like in MongoDB.
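
The slide's own example was an image and is not in this transcript. As a stand-in, here is a minimal sketch of a nested document written as a plain JavaScript object. The customer, its field names and its values are all hypothetical and only illustrate embedded documents, arrays, and arrays of documents.

```javascript
// Hypothetical customer document; every field name and value is illustrative.
const customer = {
  _id: "cust-1001",                              // unique key within the collection
  name: "Jane Doe",
  age: 34,
  active: true,
  address: { city: "New York", zip: "10001" },   // embedded document
  phones: ["212-555-0100", "212-555-0101"],      // array of values
  policies: [                                    // array of documents
    { type: "auto", premium: 1200 },
    { type: "home", premium: 900 }
  ]
};

// The structure serialises to JSON text and back without loss.
const json = JSON.stringify(customer);
const roundTripped = JSON.parse(json);

// Nested fields are reached with ordinary property access.
const totalPremium = customer.policies.reduce((sum, p) => sum + p.premium, 0);
console.log(roundTripped.address.city, totalPremium);
```

Embedding the address and policies inside the customer keeps everything that is read together in one record, which is the "model your data the way it is used" idea the deck returns to later.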
  • 17. MongoDB document types
  • 18. MongoDB system components
      Components:
      • mongod: the database process
      • mongo: the database shell (interactive JavaScript); the command-line shell for interacting directly with the database
      • mongos: the sharding router
      Utilities:
      • mongostat: shows performance statistics
      • mongofiles: utility for putting files into and getting files from MongoDB GridFS
      • mongoimport: imports into MongoDB from JSON or CSV
      • mongoexport: exports a single collection (JSON, CSV)
  • 19. Basic mongo shell commands
      MongoDB stores documents in collections. If a collection does not exist, MongoDB creates it when you first store data in it.
      • Select/create a database: > use customerdb
      • Show the current database: > db
      • List databases: > show dbs
        local 0.78125GB
        test 0.23012GB
        customerdb
        myDB
      • Create a collection: > db.createCollection("products")
      • List the collections already created: > show collections
  • 20. Data manipulation: create and read operations
  • 21. Data manipulation: frequently used methods
      • The createCollection() method: > db.createCollection(name, options)
      • The drop() method: db.collection.drop() drops a collection from the database.
      • Rename a collection: > db.collection.renameCollection("NewColName"), for example > db.customer.renameCollection("Customer")
      • The insert() method: > db.COLLECTION_NAME.insert(document)
      • Query documents with the find() method: > db.COLLECTION_NAME.find()
      • The update() method: > db.COLLECTION_NAME.update(SELECTION_CRITERIA, UPDATED_DATA), for example > db.col.update({"title": "MongoDB"}, {$set: {"title": "MongoDB Definitive Guide"}})
      • The remove() method: > db.col.remove({"title": "MongoDB"})
      • The sort() method: > db.COLLECTION_NAME.find().sort({KEY: 1}). The sort order is given as 1 or -1: 1 for ascending order, -1 for descending order.
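
To make the meaning of these methods concrete without a running server, here is a rough in-memory sketch in plain JavaScript. It is not MongoDB's implementation: `find`, `sortBy`, `updateSet` and `remove` are hypothetical helpers that imitate exact-match criteria, the 1 / -1 sort direction, a `$set`-style update of the first match, and removal by criteria on an ordinary array.

```javascript
// Illustrative in-memory "collection"; not a real MongoDB collection.
const books = [
  { title: "MongoDB", pages: 200 },
  { title: "Cassandra", pages: 350 },
  { title: "Neo4j", pages: 120 }
];

// True when every key in criteria equals the same key in the document.
function matches(doc, criteria) {
  return Object.keys(criteria).every((key) => doc[key] === criteria[key]);
}

// Imitates find(criteria): an empty criteria object matches everything.
function find(docs, criteria = {}) {
  return docs.filter((doc) => matches(doc, criteria));
}

// Imitates sort({ key: direction }): 1 is ascending, -1 is descending.
function sortBy(docs, key, direction) {
  return [...docs].sort(
    (a, b) => (a[key] < b[key] ? -1 : a[key] > b[key] ? 1 : 0) * direction
  );
}

// Imitates update(criteria, { $set: fields }) on the first matching document.
function updateSet(docs, criteria, fields) {
  const target = docs.find((doc) => matches(doc, criteria));
  if (target) Object.assign(target, fields);
  return target !== undefined;
}

// Imitates remove(criteria): returns the documents that are left.
function remove(docs, criteria) {
  return docs.filter((doc) => !matches(doc, criteria));
}

const byPagesDesc = sortBy(find(books), "pages", -1).map((b) => b.title);
const wasUpdated = updateSet(books, { title: "MongoDB" }, { title: "MongoDB Definitive Guide" });
const remaining = remove(books, { title: "Neo4j" }).map((b) => b.title);
console.log(byPagesDesc, wasUpdated, remaining);
```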
  • 22. Basic DB operations on a complex document
      • Find operation
      • Querying in an embedded object
      • Comparison operators
      • Querying in arrays of documents
      • Indexing on an embedded document
      • Indexing on multiple keys
  • 24. Example schema. Model data in MongoDB: model your data the way it is used.
  • 25. Let's model some more data...
  • 26. Some schema design considerations
      • What is the priority?
        • High consistency
        • High read performance
        • High write performance
        • ODS application
        • Real time
      • How does the application access and manipulate data?
        • Data access paths and types of queries
        • Read versus write ratio
        • Analytics (aggregation, video, images, machine and geospatial data)
  • 27. Indexes: indexes are special data structures that store a subset of your data in an efficient way, for easier and faster access to the data.
      • MongoDB stores indexes in a B-tree format, which allows efficient traversal of the index content.
      • Proper index selection is important for MongoDB to run optimally; improper indexing can cause problems in read/write operations and in data distribution across a sharded cluster.
      • Index types:
        • _id
        • Simple
        • Compound
        • Multi-key
        • Full text
        • Geospatial
        • Hashed
  • 28. Indexes, continued
      • The _id index: it is created automatically, is immutable and can't be removed. It plays the same role as a primary key in an RDBMS. The default value is a 12-byte ObjectId: a 4-byte timestamp, a 3-byte machine id, a 2-byte process id and a 3-byte counter (newer MongoDB versions replace the machine and process ids with a 5-byte random value).
      • Simple index: an index on a single key.
      • Compound index: an index created over two or more fields in a document.
      • Multi-key index: an index created on a field that contains an array.
      • Full-text search index: an index over a text-based field, similar to how Google indexes web pages. For example, find all tweets from the last 30 days that mention auto insurance, or search for "Big Data" in blog posts.
      • Geospatial index: supports efficient queries on geospatial coordinate data. It is used when you need to query location-based spatial data. Location data is among the most valuable data collected today, for location-based customer targeting and product analysis. For example, find all customers who live within 50 miles of New York.
      • Hashed index: used mainly in hash-based sharding; allows a more randomized distribution of data across shards.
      • Create index syntax (ensureIndex is the older name; current MongoDB versions use createIndex):
        db.employee.ensureIndex({"email": 1}, {"unique": true})
        db.employee.ensureIndex({"age": 1}, {"sparse": true})
        db.employee.find({age: {$gte: 25}})
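
The 12-byte ObjectId layout described above can be unpacked by hand. The sketch below is plain JavaScript and does not use the MongoDB driver; `describeObjectId` is a hypothetical helper name, and the sample id is only an illustrative 24-character hex string. It reads the leading 4 bytes as a Unix timestamp in seconds and the trailing 3 bytes as the counter, and leaves the middle 5 bytes uninterpreted because their meaning depends on the MongoDB version.

```javascript
// Splits a 24-character hex ObjectId string into its three parts.
function describeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);    // bytes 0-3: seconds since the Unix epoch
  const middle = hex.slice(8, 18);                  // bytes 4-8: machine/process ids or random value
  const counter = parseInt(hex.slice(18, 24), 16);  // bytes 9-11: incrementing counter
  return { seconds, createdAt: new Date(seconds * 1000), middle, counter };
}

const info = describeObjectId("507f1f77bcf86cd799439011");
console.log(info.createdAt.toISOString(), info.middle, info.counter);
```

Because the timestamp comes first, ObjectIds generated later generally sort after earlier ones, which is one reason the default _id works well as a key.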
  • 29. Indexes, continued
      • Index properties:
        • TTL index: TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time.
        • Sparse index: the sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index skips documents that do not have the indexed field.
        • Unique index: enforces uniqueness of the indexed field.
      • Text search index: MongoDB provides text indexes to support text search queries on text content. To perform text search queries, you must have a text index on your collection. A collection can have only one text search index, but that index can cover multiple fields.
        Creating a text search index over the "title" and "content" fields:
        db.blogpost.ensureIndex( { title: "text", content: "text" } )
        Use the $text query operator to perform text searches on a collection with a text index. $text performs a logical OR over all the terms in the search string. For example, the following queries find the terms MongoDB and BigData in the blog posts:
        db.blogpost.find( { $text: { $search: "MongoDB" } } )
        db.blogpost.find( { $text: { $search: "BigData" } } )
      • Deleting a text index: first find the name of the index with the following query:
        > db.blogpost.getIndexes()
        Then drop the text index by name:
        > db.blogpost.dropIndex("title_text_content_text")
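
The "logical OR" behaviour of a multi-term search string can be pictured with a small in-memory sketch. This is plain JavaScript, not MongoDB's text search: the real feature also applies stemming, stop words and relevance scoring, none of which is modelled here. `textSearch` and the sample posts are hypothetical.

```javascript
// Returns the documents in which ANY search term appears as a whole word
// in one of the given fields (case-insensitive). A simplification only.
function textSearch(docs, fields, search) {
  const terms = search.toLowerCase().split(/\s+/).filter(Boolean);
  return docs.filter((doc) => {
    const words = fields.flatMap((field) =>
      String(doc[field] || "").toLowerCase().split(/\W+/)
    );
    return terms.some((term) => words.includes(term));
  });
}

// Illustrative blog posts.
const posts = [
  { title: "Intro to MongoDB", content: "A document store" },
  { title: "Hadoop notes", content: "BigData at scale" },
  { title: "Cooking", content: "Pasta recipes" }
];

const hits = textSearch(posts, ["title", "content"], "MongoDB BigData").map((p) => p.title);
console.log(hits);
```

A search string with two terms returns posts that mention either one, which is why the first two posts match and the third does not.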
  • 30. Text indexes to support text search analytics, by example
  • 31. MongoDB sharding
      • Sharding is a method for storing data across multiple machines in a clustered computing environment. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.
      • Purpose of sharding:
        • When a database system grows very large, the capacity of a single server can be strained by the increased workload and by large numbers of concurrent users demanding high throughput. Beyond a certain point you can't keep scaling vertically by adding more CPU, RAM and storage; vertical scaling has limits.
        • In contrast, sharding scales horizontally: it divides the data set and distributes the data over multiple shard servers. Each shard works as an independent database, and collectively the shards make up a single logical database.
        • Sharding reduces the amount of data that each server needs to store. When data grows you can add more shards to the cluster, and each shard stores less data as the cluster grows.
        • For example, if a database has a 1-terabyte data set and there are 4 shards, each shard might hold only 256GB of data. With 40 shards, each shard might hold only about 25GB.
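
The arithmetic in the example, and the idea of spreading documents over shards by hashing a key, can be sketched in a few lines of plain JavaScript. This is an illustration under simplifying assumptions, not how MongoDB routes data: MongoDB assigns ranges of shard-key (or hashed shard-key) values to chunks rather than taking a hash modulo the shard count, and the FNV-1a hash used here is a stand-in. `shardFor`, `perShardGB` and the key names are hypothetical.

```javascript
// 32-bit FNV-1a hash of a string; a stand-in for a real hashed shard key.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;  // keep the value as an unsigned 32-bit integer
  }
  return hash;
}

// Simplified routing: the hash of the key picks one of shardCount shards.
function shardFor(key, shardCount) {
  return fnv1a(String(key)) % shardCount;
}

// Average data per shard if the data set is spread evenly.
function perShardGB(totalGB, shardCount) {
  return totalGB / shardCount;
}

// Route 1000 illustrative customer keys to 4 shards and count them.
const counts = new Array(4).fill(0);
for (let id = 0; id < 1000; id++) {
  counts[shardFor("customer-" + id, 4)] += 1;
}

console.log(counts, perShardGB(1024, 4), perShardGB(1024, 40));
```

With 1 TB taken as 1024 GB, four shards hold 256 GB each and forty shards hold 25.6 GB each, which matches the slide's rounded figures.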
  • 32. Shards in the MongoDB architecture
  • 33. Replication: the primary accepts all write operations. The secondaries then replicate the primary's oplog and apply its operations to their own data sets.
  • 34. Replication, continued
      Replica set members: a replica set in MongoDB is a group of mongod processes that provide redundancy and high availability. The members of a replica set are:
      • Primary: receives all write operations and records each operation in its oplog.
      • Secondary: replicates operations from the primary to maintain an identical copy of the data set, so the system can recover from failure.
      Note: the minimum recommended configuration for a replica set is a primary, a secondary and an arbiter. Most deployments keep three members that store data: a primary and two secondaries.
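
The primary/secondary flow can be pictured as a log that is written once and replayed elsewhere. The sketch below is a toy model in plain JavaScript, not MongoDB's replication protocol: it has no elections, arbiters or network, and the `Primary` and `Secondary` classes and the oplog entry shape are hypothetical.

```javascript
// Toy primary: applies each write locally and appends it to an operation log.
class Primary {
  constructor() {
    this.data = new Map();
    this.oplog = [];
  }
  set(key, value) {
    this.data.set(key, value);
    this.oplog.push({ ts: this.oplog.length + 1, op: "set", key, value });
  }
  delete(key) {
    this.data.delete(key);
    this.oplog.push({ ts: this.oplog.length + 1, op: "delete", key });
  }
}

// Toy secondary: replays oplog entries it has not applied yet, in order.
class Secondary {
  constructor() {
    this.data = new Map();
    this.lastApplied = 0;
  }
  syncFrom(primary) {
    for (const entry of primary.oplog) {
      if (entry.ts <= this.lastApplied) continue;  // already applied
      if (entry.op === "set") this.data.set(entry.key, entry.value);
      else this.data.delete(entry.key);
      this.lastApplied = entry.ts;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.set("policy-1", "auto");
primary.set("policy-2", "home");
secondary.syncFrom(primary);
primary.delete("policy-1");
secondary.syncFrom(primary);  // picks up only the new entry

console.log([...secondary.data.entries()], secondary.lastApplied);
```

Tracking the last applied entry lets a secondary that was offline catch up by replaying only what it missed, which is the property that keeps its copy identical to the primary's.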
  • 35. Use cases: types of workload suited to NoSQL
      • Mobile app development
      • Internet of things
      • Digital advertising
      • Streaming applications
      • Web applications
      • Social applications
      • Content management
      • Customer personalization
      • Recommendation engines
      • 360-degree view of customer, business and product
      • Fraud detection
      • Real-time analytics
      • Gaming
  • 36. MongoDB support for programming languages
  • 37.
  • 38. Other cool stuff
      • Sharding
      • Aggregation and map/reduce
      • Storage engine: WiredTiger
      • Capped collections
      • GridFS
      • Text and geospatial indexes
      • Use of scripting languages such as Python and JavaScript for complex data handling
      • Replication
  • 39. That’s it. Thank you! Email me: Rajesh-29.kumar-29@cognizant.com. Follow me on Twitter: @rajesh14k