- MongoDB is a general purpose database that uses documents rather than tables, making data storage more flexible. It can scale horizontally using sharding to distribute data across multiple servers.
- Choosing the right shard key is important for write and query performance. The shard key should distribute data evenly across shards and avoid scatter-gather queries that retrieve data from multiple servers.
- MongoDB uses replication for high availability so data is copied to secondary servers. Monitoring tools check for replication lag and assess if data is being copied over quickly enough.
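To illustrate the shard key point above, here is a minimal toy sketch of how a monotonically increasing key behaves under range-based routing compared with hash-style routing. The shard count, key range, and helper names (`rangeShard`, `hashShard`, `distribution`) are assumptions made for illustration, and modulo stands in for a real hash function; this is not how MongoDB assigns chunks internally.

```javascript
// Toy model of routing inserts to shards. Everything here is illustrative:
// real MongoDB splits data into chunks and rebalances them, which this ignores.
const SHARD_COUNT = 3;

// Range-based routing: the key space [0, maxKey) is cut into equal contiguous ranges.
function rangeShard(key, maxKey) {
  return Math.min(SHARD_COUNT - 1, Math.floor((key / maxKey) * SHARD_COUNT));
}

// Hash-style routing: modulo stands in for a real hash function (assumption).
function hashShard(key) {
  return key % SHARD_COUNT;
}

// Count how many of the given keys land on each shard.
function distribution(keys, route) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const key of keys) counts[route(key)] += 1;
  return counts;
}

// The 300 most recent inserts of an ever-increasing key (e.g. a counter or timestamp).
const recentKeys = Array.from({ length: 300 }, (_, i) => 9700 + i);

const byRange = distribution(recentKeys, (k) => rangeShard(k, 10000));
const byHash = distribution(recentKeys, hashShard);

console.log('range:', byRange); // every recent write lands on the last shard
console.log('hash: ', byHash);  // recent writes are spread over all shards
```

The trade-off is that a hash-style key spreads writes but scatters neighbouring key values, so a range query on that key is more likely to become a scatter-gather query across shards.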
7. @tgrall tug@mongodb.com
It runs on expensive hardware
“Clients can also opt to run zEC12 without a raised
datacenter floor -- a first for high-end IBM mainframes.”
IBM Press Release, 28 Aug 2012
8.
This was a problem for Google
250,000+ MBPs == 4.1 miles
2010 Search Index Size: 100,000,000 GB
New data added per day: 100,000+ GB
Databases they could use: 0
11.
MongoDB Vision
To provide the best database for how we build and run
apps today
Build
• New and complex data
• Flexible
• New languages
• Faster development
Run
• Big Data scalability
• Real-time
• Commodity hardware
• Cloud
13.
Full Featured
Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980
Geospatial
• Find all of the car owners within 5 km of Trafalgar Sq.
Text Search
• Find all the cars described as having leather seats
Aggregation
• Calculate the average value of Paul’s car collection
Map Reduce
• What is the ownership pattern of colors by geography over time? (Is purple trending up in China?)
{ first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: {
    type: "Point",
    coordinates: [-0.128, 51.507]
  },
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
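The example questions on this slide could be expressed roughly as follows. This is a sketch, not the presenter's code: the filters are written as plain objects so they can be inspected without a server, the collection name `people` is an assumption, and the variable names are made up for illustration. The text search and map-reduce examples are left out because the fields they would need are elided in the sample document.

```javascript
// Find Paul's cars: match on first_name and return only the cars array.
const paulsCars = {
  filter: { first_name: 'Paul' },
  projection: { cars: 1, _id: 0 },
};

// Find everybody in London with a car built between 1970 and 1980.
// $elemMatch makes both bounds apply to the same element of the cars array.
const londonSeventies = {
  city: 'London',
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Find car owners within 5 km of Trafalgar Sq.
// $near on GeoJSON points needs a 2dsphere index on `location`.
const nearTrafalgar = {
  location: {
    $near: {
      $geometry: { type: 'Point', coordinates: [-0.128, 51.507] },
      $maxDistance: 5000, // metres
    },
  },
};

// Calculate the average value of Paul's car collection.
const avgValuePipeline = [
  { $match: { first_name: 'Paul' } },
  { $unwind: '$cars' },
  { $group: { _id: '$_id', avgValue: { $avg: '$cars.value' } } },
];

// In the mongo shell these would be used along the lines of
// (assuming a collection called `people`):
//   db.people.find(paulsCars.filter, paulsCars.projection)
//   db.people.find(londonSeventies)
//   db.people.createIndex({ location: '2dsphere' })
//   db.people.find(nearTrafalgar)
//   db.people.aggregate(avgValuePipeline)
```

With the two cars shown in the sample document, only the 1973 Bentley falls inside the 1970 to 1980 range, which is enough for Paul to match the second filter.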
16.
MongoDB Use Cases
Big Data
Product & Asset Catalogs
Security & Fraud
Internet of Things
Database-as-a-Service
Mobile Apps
Customer Data Management
Data Hub
Social & Collaboration
Content Management
Intelligence Agencies
Top Investment and Retail Banks
Top US Retailer
Top Global Shipping Company
Top Industrial Equipment Manufacturer
Top Media Company
Top Investment and Retail Banks
27.
Monitor: What to look for?
Locks: Avoid long-running operations that could slow down the database
Page Faults: Check that the working set stays in RAM, to reduce I/O operations
Number of Connections: Fewer connections reduce RAM consumption
Replication Lag: Time to copy the data to secondaries
OpLog Size: Size of the replication queue
Chunk Distribution: Data should be balanced across all the nodes
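As a rough illustration of the replication lag metric, the sketch below computes the lag of each secondary from a replica set status document. It assumes the input has the `name`, `stateStr`, and `optimeDate` fields that `rs.status()` reports for each member; the function name `replicationLagSeconds`, the host names, and the sample timestamps are made up for this example.

```javascript
// Compute how far each secondary is behind the primary, in seconds.
// `members` is assumed to look like the `members` array of rs.status().
function replicationLagSeconds(members) {
  const primary = members.find((m) => m.stateStr === 'PRIMARY');
  if (!primary) return null; // no primary elected, so lag is undefined

  const lags = {};
  for (const m of members) {
    if (m.stateStr !== 'SECONDARY') continue;
    // Subtracting two Date objects yields milliseconds.
    lags[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
  }
  return lags;
}

// Hypothetical sample data: db2 is two seconds behind, db3 is caught up.
const sampleMembers = [
  { name: 'db1:27017', stateStr: 'PRIMARY', optimeDate: new Date('2014-01-01T12:00:10Z') },
  { name: 'db2:27017', stateStr: 'SECONDARY', optimeDate: new Date('2014-01-01T12:00:08Z') },
  { name: 'db3:27017', stateStr: 'SECONDARY', optimeDate: new Date('2014-01-01T12:00:10Z') },
];

const lags = replicationLagSeconds(sampleMembers);
console.log(lags);
```

In the mongo shell the same function could be fed `rs.status().members`. A lag that keeps growing suggests the secondaries cannot apply operations as fast as the primary receives them, which is also when the OpLog size starts to matter.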
35.
Conclusion
• MongoDB is a general purpose database
• Document Design
• Create Index and check the explain plans
• MongoDB is a distributed database
• Choose the shard key wisely
• MongoDB uses replication
• Check OpLog and Replication Lag
• Monitor all these with tools