MongoDB

MongoDB (for Java Developers)
Anthony Slabinck

Who am I?
• Internship at Provikmo
• 3 years 6 months
• Competitive cyclist

What is MongoDB?
• The leading NoSQL database (http://db-engines.com/en/)
• Open source
• Non-relational JSON document store
• BSON (Binary JSON)
• Dynamic schema
• Agile
• Scalable through replication and sharding

The leading NoSQL database
• LinkedIn Job Skills
• Google Search
• Indeed.com Trends

MongoDB relative to relational databases

By use case
• Single View
• Internet of Things
• Mobile
• Real-Time Analytics
• Personalization
• Content Management
• Catalog

From relational databases to MongoDB
{
  first_name: "Anthony",
  surname: "Slabinck",
  city: "Bruges",
  location: [45.123, 47.232],
  cars: [
    { model: "Bentley", year: 1973, value: 100000 },
    { model: "Rolls Royce", year: 1965, value: 330000 }
  ]
}

MongoDB CRUD Operations
• Documents
• Collections
• Read operations
• Write operations - insert
• Write operations - update
• Write operations - remove

Installation
• Download MongoDB from http://www.mongodb.org/downloads
• Download the Java Driver (Maven)
• mongod
  • Daemon process
• mongo
  • Interactive JavaScript shell interface
• Robomongo
  • Cross-platform management tool

Getting started with MongoDB
Demo

Data Models
• Flexible schema
  • Collections do not enforce document structure
• Consider how applications will use your database
• No foreign keys, no joins
• Relationships between data
  • Embedded documents
  • References
• Documents require a unique _id field that acts as a primary key

Data Models - Embedded Data Models
• Denormalized
• Better read performance
• Single atomic write operation
• Document growth
• Dot notation
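
Dot notation can be illustrated without a database. The sketch below is plain JavaScript, not MongoDB code: `getByPath` is a hypothetical helper that walks an embedded document along a dotted path, which is roughly what a query key such as "address.city" asks the server to do.

```javascript
// Hypothetical helper: resolve a dot-notation path inside an embedded document.
// Returns undefined when any step of the path is missing.
function getByPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Sample document, modelled on the one-to-one example in this deck.
const person = {
  _id: "infasla",
  name: "Anthony Slabinck",
  address: { street: "123 Fake Street", city: "Faketon", state: "MA", zip: "12345" }
};

console.log(getByPath(person, "address.city"));    // Faketon
console.log(getByPath(person, "address.country")); // undefined
```

Because the address lives inside the person document, one read returns both, which is the read-performance argument for embedding.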

Data Model
• One-to-One Relationship
{
  _id: "infasla",
  name: "Anthony Slabinck",
  address: {
    street: "123 Fake Street",
    city: "Faketon",
    state: "MA",
    zip: "12345"
  }
}

Data Model
• One-to-Many Relationship
{
  _id: "infasla",
  name: "Anthony Slabinck",
  addresses: [
    { street: "123 Fake Street",
      city: "Faketon",
      state: "MA",
      zip: "12345" },
    { street: "1 Other Street",
      city: "Boston",
      state: "MA",
      zip: "12345" }
  ]
}

Data Model - References
• Normalized
• Avoids duplication of data
• Represents complex many-to-many relationships
• Requires follow-up queries

Data Model - References
• One-to-Many Relationship
{
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: 1980,
  location: "CA"
}
{
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}
{
  _id: 234567890,
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English",
  publisher_id: "oreilly"
}
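
The follow-up query that references require can be sketched in plain JavaScript. The arrays below stand in for two collections, and the function names are hypothetical; against a real server each function would be a separate query.

```javascript
// In-memory stand-ins for a publishers collection and a books collection.
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" }
];
const books = [
  { _id: 123456789, title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: 234567890, title: "50 Tips and Tricks for MongoDB Developer", publisher_id: "oreilly" }
];

// First query: fetch the book.
function findBookByTitle(title) {
  return books.find((b) => b.title === title);
}

// Follow-up query: resolve the reference stored in publisher_id.
function findPublisherById(id) {
  return publishers.find((p) => p._id === id);
}

// Reverse direction: all books that reference one publisher.
function findBooksByPublisher(id) {
  return books.filter((b) => b.publisher_id === id);
}

const book = findBookByTitle("MongoDB: The Definitive Guide");
const publisher = findPublisherById(book.publisher_id);
console.log(publisher.name);                        // O'Reilly Media
console.log(findBooksByPublisher("oreilly").length); // 2
```

The publisher is stored once, so updating it touches one document, at the cost of a second lookup on every read.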

Model Tree Structures
• Parent references
• Child references
• Array of Ancestors
• Materialized Paths
db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
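
With parent references, each document knows only its direct parent, so finding all ancestors means walking upward one step at a time. A minimal sketch in plain JavaScript, using the same six categories held in an array rather than a collection (`ancestorsOf` and `childrenOf` are hypothetical helper names):

```javascript
// The category documents from the inserts above, held in memory.
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "dbm", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Languages", parent: "Programming" },
  { _id: "Programming", parent: "Books" },
  { _id: "Books", parent: null }
];
const byId = new Map(categories.map((c) => [c._id, c]));

// Walk parent references upward until the root (parent: null) is reached.
function ancestorsOf(id) {
  const path = [];
  let node = byId.get(id);
  while (node && node.parent !== null) {
    path.push(node.parent);
    node = byId.get(node.parent);
  }
  return path;
}

// Direct children are every document whose parent field matches.
function childrenOf(id) {
  return categories.filter((c) => c.parent === id).map((c) => c._id);
}

console.log(ancestorsOf("MongoDB"));  // [ 'Databases', 'Programming', 'Books' ]
console.log(childrenOf("Databases")); // [ 'MongoDB', 'dbm' ]
```

Each step of the upward walk would be one query against a real collection, which is the trade-off the array-of-ancestors and materialized-path patterns are meant to avoid.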

GridFS
• Stores files that exceed the BSON document size limit of 16 MB
• Divides a file into parts, or chunks, and stores each chunk as a separate document
• Two collections
  • File chunks
  • File metadata
• Reassembles chunks as needed
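
The chunking idea can be sketched in plain JavaScript on Node.js. This is not the driver's GridFS API: `splitIntoChunks` and `reassemble` are hypothetical helpers, and the 8-byte chunk size is chosen only to keep the example small. The field names `files_id`, `n`, `data`, `length`, and `chunkSize` follow the GridFS document layout.

```javascript
// Split a file into chunk documents, each tagged with its sequence number n.
function splitIntoChunks(filesId, data, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({ files_id: filesId, n, data: data.subarray(offset, offset + chunkSize) });
  }
  return chunks;
}

// Reassemble: order the chunks by n, then concatenate their payloads.
function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((c) => c.data));
}

const file = Buffer.from("hello gridfs, this is a small file");

// Metadata document (files collection) and chunk documents (chunks collection).
const fileDoc = { _id: "file-1", length: file.length, chunkSize: 8 };
const chunkDocs = splitIntoChunks(fileDoc._id, file, fileDoc.chunkSize);

console.log(chunkDocs.length);                    // 5
console.log(reassemble(chunkDocs).equals(file));  // true
```

Keeping the sequence number on every chunk is what lets a reader fetch only the chunks covering the byte range it needs.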

Capped Collections
• Fixed-size collections
• Insert and retrieve documents based on insertion order
• Automatically removes the oldest documents to make room for new ones
• Ideal for logging
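
The behaviour resembles a fixed-size queue. A minimal sketch in plain JavaScript, assuming the cap is expressed as a document count for simplicity (a real capped collection is bounded by size in bytes); the `CappedCollection` class is hypothetical:

```javascript
// Hypothetical in-memory model of a capped collection.
class CappedCollection {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  // Append in insertion order; evict the oldest document once the cap is exceeded.
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }

  // Return documents in insertion order.
  find() {
    return [...this.docs];
  }
}

const log = new CappedCollection(3);
for (const msg of ["a", "b", "c", "d"]) {
  log.insert({ msg });
}
console.log(log.find().map((d) => d.msg)); // [ 'b', 'c', 'd' ]
```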

Aggregation
• Operations that process data records and return computed results
• Simplifies application code
• Limits resource requirements
• Aggregation modalities
  • Aggregation pipelines
  • Map-Reduce
  • Single purpose aggregation operations

Aggregation - Aggregation pipelines
• Stages
• Preferred method
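
The stage idea can be sketched in plain JavaScript: each stage takes the documents produced by the previous one and returns a new list. The stage builders below (`match`, `groupSum`, `sortBy`) and the `orders` data are hypothetical stand-ins, loosely modelled on a match, group, and sort sequence; they are not the server's operators.

```javascript
// Sample documents (hypothetical orders collection).
const orders = [
  { cust_id: "A", status: "done", amount: 50 },
  { cust_id: "A", status: "done", amount: 25 },
  { cust_id: "B", status: "done", amount: 40 },
  { cust_id: "B", status: "open", amount: 99 }
];

// Stage builders: each returns a function from documents to documents.
const match = (predicate) => (docs) => docs.filter(predicate);

const groupSum = (keyField, sumField) => (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d[keyField], (totals.get(d[keyField]) || 0) + d[sumField]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));

// Run the stages in order, feeding each stage the previous stage's output.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const result = runPipeline(orders, [
  match((d) => d.status === "done"),
  groupSum("cust_id", "amount"),
  sortBy("total", -1)
]);
console.log(result); // [ { _id: 'A', total: 75 }, { _id: 'B', total: 40 } ]
```

Placing the filter first means later stages process fewer documents, which is the usual ordering advice for pipelines.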

Aggregation - Map-Reduce
• Two phases
• JavaScript functions
• Less efficient and more complex than the aggregation pipeline
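
The two phases can be sketched in plain JavaScript. The `mapReduce` function here is a hypothetical in-memory model, not the server command: it passes the document and an `emit` callback as arguments, whereas the server binds the document to `this`. The data reuses the two book documents from the references example.

```javascript
// Hypothetical in-memory map-reduce: map emits (key, value) pairs,
// reduce folds all values emitted for one key into a single result.
function mapReduce(docs, mapFn, reduceFn) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  for (const doc of docs) mapFn(doc, emit);   // phase 1: map

  const results = {};
  for (const [key, values] of emitted) {      // phase 2: reduce
    results[key] = reduceFn(key, values);
  }
  return results;
}

const books = [
  { title: "MongoDB: The Definitive Guide", author: ["Kristina Chodorow", "Mike Dirolf"], pages: 216 },
  { title: "50 Tips and Tricks for MongoDB Developer", author: "Kristina Chodorow", pages: 68 }
];

// Total pages per author; author may be a single string or an array.
const pagesPerAuthor = mapReduce(
  books,
  (doc, emit) => [].concat(doc.author).forEach((a) => emit(a, doc.pages)),
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);
console.log(pagesPerAuthor); // { 'Kristina Chodorow': 284, 'Mike Dirolf': 216 }
```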

Aggregation - Single purpose aggregation operations
• Simple
• Count
• Distinct
• Grouping

Indexes
• Efficient execution of queries
• Data structure
• Stores the value of a specific field or set of fields, ordered by the value of the field
• Create indexes that support your common and user-facing queries
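
Why ordering helps can be shown with a small sketch in plain JavaScript. A sorted array with binary search stands in for the server's B-tree here; `buildIndex`, `lookup`, and the sample `people` data are hypothetical.

```javascript
// Sample documents (hypothetical people collection).
const people = [
  { _id: 1, name: "Ann", zipcode: "8000" },
  { _id: 2, name: "Bob", zipcode: "9000" },
  { _id: 3, name: "Cid", zipcode: "1000" },
  { _id: 4, name: "Dee", zipcode: "8000" }
];

// Build an index: (field value, _id) entries ordered by field value.
function buildIndex(docs, field) {
  return docs
    .map((d) => ({ key: d[field], _id: d._id }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Binary search for the first entry with the key, then scan equal keys.
function lookup(index, key) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < key) lo = mid + 1;
    else hi = mid;
  }
  const ids = [];
  for (let i = lo; i < index.length && index[i].key === key; i++) {
    ids.push(index[i]._id);
  }
  return ids;
}

const zipIndex = buildIndex(people, "zipcode");
console.log(lookup(zipIndex, "8000")); // the _id values 1 and 4
console.log(lookup(zipIndex, "5000")); // []
```

Without the index, the same query would have to inspect every document; with it, the search narrows to the matching entries in a logarithmic number of steps.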

Indexes - Types
• Default _id
• Single Field
• Compound Index
• Multikey Index
• Geospatial Index
• Text Indexes
• Hashed Indexes

Indexes - Properties
• Unique Indexes
• Sparse Indexes
• TTL Indexes

Indexes - Creation
• db.people.ensureIndex( { zipcode: 1 } )
• db.people.ensureIndex( { zipcode: 1 }, { background: true } )
• db.people.ensureIndex( { zipcode: 1 }, { background: true, sparse: true } )
• db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } )

Replication
• What?
  • Synchronizing data across multiple servers
• Purpose?
  • Provides redundancy and increases data availability

Replication - Replica set
• A group of mongod instances that host the same data set
• Primary receives all write operations
• Primary logs all changes in its oplog
• Secondaries apply operations from the primary
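
The oplog mechanism can be modelled in a few lines of plain JavaScript. This is a deliberately simplified sketch: the `Member`, `Primary`, and `Secondary` classes and the operation shape are hypothetical, and elections, network transport, and failure handling are all left out.

```javascript
// Every member holds a copy of the data and knows how to apply one operation.
class Member {
  constructor() {
    this.data = new Map();
    this.applied = 0; // number of oplog entries applied so far
  }
  apply(op) {
    if (op.type === "set") this.data.set(op.key, op.value);
    else if (op.type === "delete") this.data.delete(op.key);
  }
}

// The primary applies each write and records it in its oplog.
class Primary extends Member {
  constructor() {
    super();
    this.oplog = [];
  }
  write(op) {
    this.apply(op);
    this.oplog.push(op);
  }
}

// A secondary replays oplog entries it has not applied yet, in order.
class Secondary extends Member {
  syncFrom(primary) {
    while (this.applied < primary.oplog.length) {
      this.apply(primary.oplog[this.applied]);
      this.applied++;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.write({ type: "set", key: "x", value: 1 });
primary.write({ type: "set", key: "y", value: 2 });
primary.write({ type: "delete", key: "x" });

secondary.syncFrom(primary);
console.log([...secondary.data]); // [ [ 'y', 2 ] ]
```

Because the secondary tracks how far it has read, a later `syncFrom` call only replays the entries written since the previous sync.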

Replication - Replica set
• Arbiter
  • Does not maintain a data set
  • Only exists to vote

Replication - Replica set
• Automatic failover

Replication - Replica set
• Additional features:
  • Read preference
  • Priority
  • Hidden members
  • Delayed members

Sharding
• What?
  • Storing data across multiple machines
• When?
  • High query rates exhaust the CPU capacity of the server
  • Larger data sets exceed the storage capacity of a single machine
  • Working set sizes larger than the system’s RAM stress the I/O capacity of disk drives

Sharding - Vertical scaling (scale up)
• Adds more CPU and storage
[Chart: price versus scale]

Sharding - Horizontal scaling (scale out)
• Distributes the data
[Chart: price versus scale]

Sharding - Sharded cluster
• Shards store the data
• Query Routers interface with client applications and direct operations
• Config servers store the cluster’s metadata

Sharding - Data partitioning
• Collection level
• Shard key
  • Indexed field or an indexed compound field that exists in every document
• Chunks
• Range based partitioning
• Hash based partitioning
• Automatic balancing
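
The difference between the two partitioning schemes can be sketched in plain JavaScript. Everything here is an illustrative assumption: the chunk boundaries, the shard names, and the FNV-1a hash, which stands in for the server's own hash function only to show the principle.

```javascript
// Range based: chunks cover contiguous, non-overlapping key ranges.
const rangeChunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 2000, shard: "shardB" },
  { min: 2000, max: Infinity, shard: "shardC" }
];

function shardByRange(key) {
  // min is inclusive, max is exclusive.
  return rangeChunks.find((c) => key >= c.min && key < c.max).shard;
}

// 32-bit FNV-1a, used here only as a simple deterministic hash.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hash based: the hash of the key, not the key itself, decides the shard.
function shardByHash(key, shards) {
  return shards[fnv1a(String(key)) % shards.length];
}

const shards = ["shardA", "shardB", "shardC"];
for (const key of [998, 999, 1000, 1001]) {
  console.log(key, shardByRange(key), shardByHash(key, shards));
}
```

Range partitioning keeps neighbouring keys together, which suits range queries but can concentrate sequential inserts on one shard; hashing spreads neighbouring keys apart at the cost of scattering range queries across shards.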

MongoDB at scale - Metrics
• Cluster scale
  • Distributing across 100+ nodes in multiple data centers
• Performance scale
  • 100K+ database reads and writes per second while maintaining strict SLAs
• Data scale
  • Storing 1B+ documents in the database

Lower TCO - compared with a relational database
• Dev/Ops savings
  • Ease of use
  • Fast, iterative development
• Hardware savings
  • Commodity hardware
  • Scale out
• Software/Support savings
  • No upfront licence

POJO Mappers
• Morphia
• Spring Data MongoDB
• Hibernate OGM

Resources
• http://docs.mongodb.org/manual/
• https://university.mongodb.com/
  • M101J: MongoDB for Java Developers
  • M102: MongoDB for DBAs

Building an App with MongoDB
Demo
