2. MongoDB
2014-03-13 Enteros, Inc.
Overview
Before going deep into performance optimization, make sure that MongoDB is the
right choice for your project: it is a non-relational, document-oriented
database.
Map-Reduce
Map-reduce is a data processing paradigm for condensing large volumes of
data into useful aggregated results. For map-reduce operations, MongoDB
provides the mapReduce database command.
Consider the map-reduce operation on the next slide:
Performance Optimization
Update to MongoDB 2.4 or later: it uses the V8 JavaScript engine and adds
features such as security enhancements, text search (beta), and hashed
indexes. The switch to V8 improves concurrency by permitting multiple JavaScript
operations to run at the same time.
In this map-reduce operation, MongoDB applies the map phase to each input
document (i.e. the documents in the collection that match the query
condition). The map function emits key-value pairs. For those keys that have
multiple values, MongoDB applies the reduce phase, which collects and
condenses the aggregated data. MongoDB then stores the results in a
collection. Optionally, the output of the reduce function may pass through a
finalize function to further condense or process the results of the aggregation.
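The phases described above can be sketched in plain Python, without a database. This is only an in-memory illustration of the query, map, reduce, and finalize flow, not MongoDB's mapReduce command; the sample documents, the field names (cust_id, amount, status), and the helper map_reduce are hypothetical.

```python
from collections import defaultdict

# Hypothetical input documents; the field names are assumptions for illustration.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]


def map_reduce(docs, query, map_fn, reduce_fn, finalize_fn=None):
    """In-memory imitation of the query -> map -> reduce -> finalize flow."""
    emitted = defaultdict(list)
    for doc in docs:
        if query(doc):                      # only matching documents are mapped
            for key, value in map_fn(doc):  # the map phase emits key-value pairs
                emitted[key].append(value)

    results = {}
    for key, values in emitted.items():
        # The reduce phase is applied only to keys that have multiple values.
        result = reduce_fn(key, values) if len(values) > 1 else values[0]
        if finalize_fn is not None:         # optional post-processing step
            result = finalize_fn(key, result)
        results[key] = result
    return results


totals = map_reduce(
    orders,
    query=lambda d: d["status"] == "A",
    map_fn=lambda d: [(d["cust_id"], d["amount"])],
    reduce_fn=lambda key, values: sum(values),
)
print(totals)
```

Here the result is a plain dictionary; in MongoDB the results would be stored in an output collection instead.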
1. Sharding
Sharding is a method for storing data across multiple machines. MongoDB
uses sharding to support deployments with very large data sets and high
throughput operations.
Shard keys should satisfy the following:
• “Distributable”: the worst case for a shard key is an auto-incremented
value. It causes “hot shard” behavior, where all writes are routed to a
single shard, which becomes the bottleneck. An ideal shard key should be
as random as possible.
• Ideally, the shard key is the primary field used in your queries.
• An easily divisible shard key makes it easy for MongoDB to distribute
content among the shards. Shard keys that have a limited number of
possible values can result in chunks that are “unsplittable.”
• Unique fields in your collection should be part of the shard key.
See the MongoDB documentation on shard keys for details.
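The “hot shard” effect of an auto-incremented key can be illustrated with a small simulation. This is a simplified model, not MongoDB's actual chunk placement: the chunk boundaries, the shard count, and the helpers shard_by_range and shard_by_hash are assumptions, and the hashed variant uses a plain modulo instead of ranges of hash values.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Assumed chunk boundaries left over from earlier data: shard i holds keys in
# [i * 1000, (i + 1) * 1000), and the last shard also holds everything above.
UPPER_BOUNDS = [1000, 2000, 3000]


def shard_by_range(key):
    """Range-based placement: pick the chunk whose range contains the key."""
    return bisect.bisect_right(UPPER_BOUNDS, key)


def shard_by_hash(key):
    """Hash-based placement (simplified to a modulo over an MD5 digest)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


new_ids = range(3000, 4000)  # auto-incremented keys of 1000 new writes

range_load = Counter(shard_by_range(k) for k in new_ids)
hash_load = Counter(shard_by_hash(k) for k in new_ids)

print("ranged key:", dict(range_load))                 # every write hits one shard
print("hashed key:", dict(sorted(hash_load.items())))  # writes spread over shards
```

With the ranged, ever-increasing key, all new writes land on the last shard; hashing the same key spreads them across all shards, at the cost of less efficient range queries on that field.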
2. Balancing
Bear in mind that moving chunks from one shard to another is a very
expensive operation (adding new shards may significantly slow down
performance).
As a helpful option, you can stop the balancer during “prime time”.
3. Disk Input Output operations
In most cases the hardware bottleneck will be the HDD (not CPU or RAM),
especially if you have several shards. As the data grows, the number of I/O
operations increases rapidly, so fast disks matter even more when you use
sharding. Also keep monitoring free disk space.
4. Locks
MongoDB uses a readers-writer lock that allows concurrent read access to a
database but gives exclusive access to a single write operation.
When a read lock exists, many read operations may use this lock.
However, when a write lock exists, a single write operation holds the lock
exclusively, and no other read or write operations may share the lock.
Locks are “writer greedy,” which means writes have preference over reads.
When both a read and write are waiting for a lock, MongoDB grants the lock
to the write.
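The “writer greedy” behavior described above can be sketched as a generic writer-preferring readers-writer lock. This is an illustration of the locking policy only, not MongoDB's internal implementation; the class WriterPreferringRWLock and the demo threads are hypothetical.

```python
import threading
import time


class WriterPreferringRWLock:
    """Many concurrent readers or one exclusive writer; waiting writers go first."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer_active = False
        self._writers_waiting = 0

    def acquire_read(self):
        with self._cond:
            # New readers wait while a writer holds or is waiting for the lock.
            while self._writer_active or self._writers_waiting > 0:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._writers_waiting += 1
            # A writer needs exclusive access: no readers and no other writer.
            while self._writer_active or self._readers > 0:
                self._cond.wait()
            self._writers_waiting -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()


events = []
lock = WriterPreferringRWLock()


def writer():
    lock.acquire_write()
    events.append("write")
    lock.release_write()


def reader():
    lock.acquire_read()
    events.append("read")
    lock.release_read()


lock.acquire_read()                    # an existing reader holds the lock
w = threading.Thread(target=writer)
w.start()
while lock._writers_waiting == 0:      # wait until the writer is queued
    time.sleep(0.001)
r = threading.Thread(target=reader)    # this reader arrives after the writer
r.start()
lock.release_read()                    # both are waiting; the writer goes first
w.join()
r.join()
print(events)  # ['write', 'read']
```

The second reader could have shared the lock with the first one, but because a write is already waiting it is held back until the write completes.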
5. Fast Writes
Use Capped Collections for Fast Writes
Capped Collections are circular, fixed-size collections that keep documents
well-ordered, even without the use of an index. This means that capped
collections can receive very high-speed writes and sequential reads.
These collections are particularly useful for keeping log files but are not
limited to that purpose. Use capped collections where appropriate.
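The circular, fixed-size behavior can be pictured with a small in-memory stand-in. This is a toy model, not a real capped collection: the class CappedLog is hypothetical, and it is capped by document count only, whereas a real capped collection is created with a maximum size in bytes.

```python
from collections import deque


class CappedLog:
    """Toy stand-in for a capped collection, capped by document count."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        # O(1) append; once the cap is reached the oldest document is dropped.
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order without any index.
        return list(self._docs)


log = CappedLog(max_docs=3)
for i in range(5):
    log.insert({"seq": i})

print(log.find())  # [{'seq': 2}, {'seq': 3}, {'seq': 4}]
```

Because the oldest entries are overwritten automatically, no separate cleanup job is needed, which is what makes this pattern attractive for logs.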
6. Fast Reads
Use Natural Order for Fast Reads. To return documents in the order they
exist on disk, sort the query results using the $natural operator. On a
capped collection, this also returns the documents in the order in which they
were written.
Natural order does not use indexes but can be fast for operations when you
want to select the first or last items on disk.
7. Query Performance
Read up on query performance, paying particular attention to indexes and
compound indexes.
9. The size of Database
As you may know, MongoDB stores a document such as
{ UserFirstAndLastName: "Mikita Manko",
LinkToUsersFacebookPage: "https://www.facebook.com/mikita.manko"
}
“as-is”. This means that the field names “UserFirstAndLastName” and
“LinkToUsersFacebookPage” are stored with every document and take up space.
By using the “name shortening” technique you can minimize memory usage
(you can get rid of something like 30-40% of unnecessary data):
{ FL: "Mikita Manko",
BFL: "https://www.facebook.com/mikita.manko"
}
Obviously, this requires a “mapper” in your code: you map the shortened,
unreadable names from the database to long ones so that your code can keep
using readable field names.
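A minimal sketch of such a mapper, assuming a fixed dictionary of field names. The helpers to_storage and from_storage are hypothetical, and the JSON lengths printed at the end are only a rough proxy for the size of the stored documents.

```python
import json

# Hypothetical mapping between readable names used in code and short stored names.
LONG_TO_SHORT = {
    "UserFirstAndLastName": "FL",
    "LinkToUsersFacebookPage": "BFL",
}
SHORT_TO_LONG = {short: long for long, short in LONG_TO_SHORT.items()}


def to_storage(doc):
    """Rename readable fields to their short names before writing."""
    return {LONG_TO_SHORT.get(key, key): value for key, value in doc.items()}


def from_storage(doc):
    """Restore the readable field names after reading."""
    return {SHORT_TO_LONG.get(key, key): value for key, value in doc.items()}


user = {
    "UserFirstAndLastName": "Mikita Manko",
    "LinkToUsersFacebookPage": "https://www.facebook.com/mikita.manko",
}

stored = to_storage(user)
restored = from_storage(stored)

print(stored)
print(len(json.dumps(user)), "->", len(json.dumps(stored)), "characters as JSON")
```

Fields that are not in the mapping pass through unchanged, so the mapper can be introduced gradually; the actual saving depends on how long the field names are relative to the values.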
C. Updates
The most obvious point is to stay on the cutting edge of the technology:
investigate and install the latest updates.
Upbeat High Load Capture
Database Root Cause and Spike Analysis for multi-tiered applications
Enteros UpBeat High Load Capture is a software framework for database problem root cause analysis of
Oracle, DB2, SQL Server, MySQL, Sybase and MongoDB database-centric multi-tiered applications. The High
Load Capture user interface visually correlates performance and system load metrics across multiple IT
production infrastructure layers. With second-by-second granularity of data analysis, High Load Capture
makes it possible to analyze even the most transient database performance spikes.
Features
• Multi-threaded, high-precision performance collection engine
• Extensible, dynamically configurable, centrally controlled collection agents
• Comprehensive library of collector agents
• Cross-tier correlation
• Safe, secure agent communication
• Load-sensitive collection controller
Supported Infrastructure, Database, Application server, OS monitoring
Database Server OS:
Linux, Sun Solaris, HP/UX, AIX, Windows Server
Client OS:
Windows, Linux
Database:
Oracle, Microsoft SQL, IBM DB2, MySQL, Sybase, MongoDB
Application Server:
Oracle (BEA) WebLogic, Oracle OAS, JBOSS, IBM WAS
Enteros, Inc
http://www.enteros.com
Enteros is an innovative software company specializing in
Performance Management and Load Testing Software for
Production Databases - RDBMS and NOSQL/Big Data
Enteros solutions enable IT professionals to identify
and remediate performance problems in business-
critical databases with unprecedented speed, accuracy
and scope.
Kevin Batt; kevin.batt@enteros.com
408-207-8408