2. MONGODB WORKSHOP
{
meetup: "NYC Open Data",
presenters: ["Kannan Sankaran", "Roman Kubiak"],
host: "Vivian is awesome, THANK YOU",
location: "ThoughtWorks is awesome, THANK YOU",
audience: "You guys are awesome, THANK YOU"
}
3. OUR TOPICS
OVERVIEW OF DATABASES
WHAT IS MONGODB?
MONGODB, NOSQL, AND RELATIONAL DATABASES
A PEEK AT MONGODB COMMANDS
SHARDING AND REPLICATION IN MONGODB
FUTURE OF MONGODB AND US
DEMO
WORKSHOP
9. DATABASES AND THEIR GROWTH
1970s: RELATIONAL DATABASES (RDBMS) CREATED; STRUCTURED QUERY LANGUAGE (SQL) CREATED
1980s: RDBMS CONTINUE TO BE POPULAR; CLIENT/SERVER MODEL
1990s: INTERNET ARRIVES
2000s: INTERNET GROWS; NoSQL DATABASES EMERGE
2007: MONGODB CREATED
17. A DOCUMENT IS LIKE A ROW…
{
_id: ObjectID("12AB34CD56EF"),
name: "Ed Brown",
orderDate: "2-1-2014"
}
18. …BUT IT IS MORE FLEXIBLE
{
_id: ObjectID("12AB34CD56EF"),
name: "Ed Brown",
orderDate: "2-1-2014",
payments:
{
car: "100.50",
hotel: "200"
},
tags: ["shirt", "tie"]
}
THAT LOOKS LIKE A DOCUMENT
WITHIN ANOTHER DOCUMENT!
WHAT IS THIS? MULTIPLE VALUES
WITHIN A COLUMN?
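The two callouts above can be tried out with a plain JavaScript object. This is only a sketch: the `_id` is written as a plain string so the snippet runs without MongoDB, and the variable names are made up for illustration.

```javascript
// A plain JavaScript object shaped like the order document on the slide.
// Assumption: _id is a plain string here; in MongoDB it would be an ObjectId.
const order = {
  _id: "12AB34CD56EF",
  name: "Ed Brown",
  orderDate: "2-1-2014",
  payments: { car: "100.50", hotel: "200" }, // a document within a document
  tags: ["shirt", "tie"]                      // multiple values in one field
};

// Fields of the embedded document are reached with dot notation.
const hotelPayment = order.payments.hotel;

// An array field holds several values under one name.
const tagCount = order.tags.length;
const hasTie = order.tags.includes("tie");

// The amounts are stored as strings, so convert them before adding.
const totalPaid = Object.values(order.payments)
  .map(Number)
  .reduce((sum, amount) => sum + amount, 0);

console.log(hotelPayment, tagCount, hasTie, totalPaid);
```

Storing the amounts as numbers instead of strings would avoid the conversion step.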
19. HOW LARGE CAN THIS DOCUMENT BE?
{
_id: ObjectID("12AB34CD56EF"),
name: "Ed Brown",
orderDate: "2-1-2014",
payments:
{
car: "100.50",
hotel: "200"
}
…
…
…
}
UP TO 16 MB
LEO TOLSTOY'S 1,225-PAGE BOOK
WAR AND PEACE CAN FIT IN
1 DOCUMENT, AS IT IS
ONLY AROUND 3 MB.
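A rough way to sanity-check a document against the 16 MB limit, sketched in Node. The function names are hypothetical, and the size is only an estimate: it measures the JSON text, not the BSON encoding MongoDB actually stores.

```javascript
// MongoDB's limit for a single document: 16 megabytes.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Rough estimate only: UTF-8 length of the JSON text, not the true BSON size.
function approximateSizeInBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

function fitsInOneDocument(doc) {
  return approximateSizeInBytes(doc) <= MAX_DOCUMENT_BYTES;
}

// Hypothetical stand-in for a long novel: about 3 MB of text in one field.
const book = { title: "War and Peace", text: "x".repeat(3 * 1024 * 1024) };

console.log(approximateSizeInBytes(book), fitsInOneDocument(book));
```

Files larger than the limit are what GridFS is for.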
27. MONGODB FEATURES
EASY TO LEARN
DYNAMIC QUERY LANGUAGE
- SEARCH BY FIELDS, REGULAR EXPRESSIONS
- USER-DEFINED JAVASCRIPT FUNCTIONS
- AGGREGATION, INCLUDING MAP/REDUCE
INDEXING – SINGLE, COMPOUND, GEOSPATIAL
REPLICATION
LOAD BALANCING USING SHARDING
GRIDFS TO STORE FILES
30. DATABASE MANAGEMENT SYSTEMS
1970s: BERKELEY INGRES, ORACLE
1980s-1990s: INFORMIX, DB2, SYBASE, SQL SERVER, MS ACCESS, POSTGRESQL, MYSQL
2000s: NETEZZA, GREENPLUM, VERTICA, MARIADB
2007: MONGODB
MOST SYSTEMS USE SOME FLAVOR OF SQL
33. IN THE LATE 90s/EARLY 2000s…
DOT COM BUBBLE
DOT COM BUST
WEB SERVICES
SOCIAL NETWORKS
GOOGLE, AMAZON
COMPUTER OWNERS/USERS
WEBSITE DATA COLLECTION
DATABASE SIZES
35. SCALE UP
BIGGER MACHINE
- MORE DISK SPACE
- MORE RAM
- MORE PROCESSORS
- MORE EXPENSIVE
- SINGLE POINT OF FAILURE
- HARDWARE HAS LIMITS!
SCALE OUT
SMALLER MACHINES
- LESS DISK SPACE
- LESS RAM
- FEWER PROCESSORS
- LESS EXPENSIVE
- NO SINGLE POINT OF FAILURE
- HIGHER RELIABILITY DESPITE FAILURE OF INDIVIDUAL MACHINES
39. A JOIN QUERY IN MYSQL
TABLES: WP_POSTS, WP_COMMENTS
SELECT p.post_author,
p.post_date,
c.comment_author,
c.comment_date
FROM wp_posts AS p
INNER JOIN wp_comments AS c
ON p.ID = c.comment_post_ID
WHERE p.ID = 1;
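For contrast, a document database can keep the comments inside the post itself, so no join is needed. The sketch below is plain JavaScript. The embedded `comments` array and all field values are assumptions for illustration, not the WordPress schema.

```javascript
// Hypothetical post document with its comments embedded as an array.
const post = {
  _id: 1,
  post_author: "admin",
  post_date: "2014-02-01",
  comments: [
    { comment_author: "Ann", comment_date: "2014-02-02" },
    { comment_author: "Bob", comment_date: "2014-02-03" }
  ]
};

// Produce the same rows the INNER JOIN would return: one row per comment.
function joinRows(postDoc) {
  return postDoc.comments.map((c) => ({
    post_author: postDoc.post_author,
    post_date: postDoc.post_date,
    comment_author: c.comment_author,
    comment_date: c.comment_date
  }));
}

const rows = joinRows(post);
console.log(rows);
```

As with an inner join, a post without comments yields no rows.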
60. WHEN NOT TO USE MONGODB
IF TRANSACTIONS ARE A MUST
IF JOINS ARE ABSOLUTELY NECESSARY
SOFTWARE PRODUCTS LIKE WORDPRESS
THAT ALREADY HAVE TONS OF SUPPORT
FOR RELATIONAL DATABASES
61. FOR MONGODB vs MYSQL
ARGUMENTS, WATCH…
Source: http://www.youtube.com/watch?v=b2F-DItXtZs
77. READ SPECIFIC FIELDS IN DOCUMENT
SQL
SELECT id, Name FROM ParksNYC
MONGODB
db.ParksNYC.find(
{ },
{
_id: 1, Name: 1
}
)
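What the second argument to find() does can be sketched in memory. The `project` helper and the sample park are hypothetical. The sketch covers only inclusion projections and mirrors one MongoDB rule: `_id` is returned unless the projection sets `_id: 0`.

```javascript
// Return a copy of doc containing only the requested fields.
// _id is included by default and dropped only when the projection says _id: 0.
function project(doc, projection) {
  const result = {};
  if (projection._id !== 0 && "_id" in doc) {
    result._id = doc._id;
  }
  for (const [field, include] of Object.entries(projection)) {
    if (field !== "_id" && include === 1 && field in doc) {
      result[field] = doc[field];
    }
  }
  return result;
}

// Hypothetical park document; the values are made up for illustration.
const park = { _id: 7, Name: "Forest Park", Courts: 14, Accessible: "Y" };

console.log(project(park, { _id: 1, Name: 1 }));
console.log(project(park, { _id: 0, Name: 1 }));
```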
78. READ DOCUMENTS WITH RANGE CRITERIA
SQL
SELECT id, Name FROM ParksNYC
WHERE Courts > 5
AND Courts <= 8
MONGODB
db.ParksNYC.find(
{
Courts: { $gt: 5, $lte: 8}
},
{ _id: 1, Name: 1 }
)
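The meaning of the range operators can be sketched with an in-memory filter. This is not how the server evaluates queries. The `matchesRange` helper and the park data are assumptions for illustration.

```javascript
// The comparison operators used on the slide, plus their counterparts.
const operators = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound
};

// True when doc[field] satisfies every operator in the condition object.
function matchesRange(doc, field, condition) {
  return Object.entries(condition).every(([op, bound]) => {
    if (!(op in operators)) throw new Error("Unsupported operator: " + op);
    return operators[op](doc[field], bound);
  });
}

// Hypothetical parks.
const parks = [
  { Name: "Alpha Park", Courts: 5 },
  { Name: "Beta Park", Courts: 6 },
  { Name: "Gamma Park", Courts: 8 },
  { Name: "Delta Park", Courts: 9 }
];

const selected = parks.filter((p) => matchesRange(p, "Courts", { $gt: 5, $lte: 8 }));
console.log(selected.map((p) => p.Name));
```

The park with exactly 5 courts is excluded ($gt is strict) and the one with exactly 8 is included ($lte is inclusive).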
79. READ DOCUMENTS THAT START WITH
A LETTER (REGULAR EXPRESSION)
SQL
SELECT id, Name FROM ParksNYC
WHERE Name LIKE 'F%'
MONGODB
db.ParksNYC.find(
{
Name: /^F/
},
{ _id: 1, Name: 1 }
)
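To see how a LIKE pattern relates to a regular expression, here is a small hypothetical converter. It handles only the `%` and `_` wildcards and is case-sensitive, like /^F/. Whether LIKE itself is case-sensitive depends on the database and its collation.

```javascript
// Convert a SQL LIKE pattern to a JavaScript RegExp.
// % matches any run of characters, _ matches exactly one character.
function likeToRegExp(pattern) {
  const body = Array.from(pattern)
    .map((ch) => {
      if (ch === "%") return ".*";
      if (ch === "_") return ".";
      // Escape characters that have a special meaning in regular expressions.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  // Anchor both ends because LIKE must match the whole value.
  return new RegExp("^" + body + "$", "s");
}

const startsWithF = likeToRegExp("F%");
console.log(startsWithF.test("Forest Park"), startsWithF.test("Alley Pond Park"));
```

For a prefix search the simpler /^F/ matches the same values, and MongoDB can use an index on the field for such an anchored prefix.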
80. UPDATE FIELD IN DOCUMENT
SQL
UPDATE ParksNYC
SET VisitDate = '1/1/2014'
MONGODB
db.ParksNYC.update(
{ },
{
$set: { VisitDate: "1/1/2014" }
},
{ multi: true}
)
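Newer MongoDB shells and drivers also offer updateMany(), which updates every matching document without the multi option. The sketch below wraps that call in a hypothetical helper. The in-memory `fakeCollection` is a stand-in so the snippet runs without a server; it handles only an empty filter and `$set`.

```javascript
// Set VisitDate on every document using updateMany instead of update + multi.
// `collection` is any object with an updateMany(filter, update) method,
// for example db.ParksNYC in a current MongoDB shell.
function setVisitDateOnAll(collection, visitDate) {
  return collection.updateMany({}, { $set: { VisitDate: visitDate } });
}

// Hypothetical in-memory stand-in for a collection (not a MongoDB API).
const fakeCollection = {
  docs: [{ Name: "Alpha Park" }, { Name: "Beta Park" }],
  updateMany(filter, update) {
    for (const doc of this.docs) Object.assign(doc, update.$set);
    return { matchedCount: this.docs.length, modifiedCount: this.docs.length };
  }
};

const result = setVisitDateOnAll(fakeCollection, "1/1/2014");
console.log(result.modifiedCount, fakeCollection.docs);
```

In a shell session the call would be setVisitDateOnAll(db.ParksNYC, "1/1/2014").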
82. GROUP BY AND SUM
SQL
SELECT COUNT(Name) AS Parks_Number,
SUM(Courts) AS Courts_Number
FROM ParksNYC
GROUP BY Accessible
MONGODB
db.ParksNYC.aggregate([
{ $group :
{
_id : "$Accessible",
Parks_Number : { $sum : 1 },
Courts_Number : { $sum : "$Courts" }
}
}
])
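What the $group stage computes can be sketched in plain JavaScript. This is an in-memory illustration, not the server's implementation. The function name and the sample parks are made up.

```javascript
// Group documents by Accessible, counting parks and summing courts.
function groupByAccessible(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc.Accessible;
    if (!groups.has(key)) {
      groups.set(key, { _id: key, Parks_Number: 0, Courts_Number: 0 });
    }
    const group = groups.get(key);
    group.Parks_Number += 1;           // same role as { $sum : 1 }
    group.Courts_Number += doc.Courts; // same role as { $sum : "$Courts" }
  }
  return Array.from(groups.values());
}

// Hypothetical data.
const sampleParks = [
  { Name: "Alpha Park", Courts: 4, Accessible: "Y" },
  { Name: "Beta Park", Courts: 6, Accessible: "N" },
  { Name: "Gamma Park", Courts: 10, Accessible: "Y" }
];

const summary = groupByAccessible(sampleParks);
console.log(summary);
```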
88. SHARDING STEPS
1. ENABLE SHARDING ON DATABASE.
2. PICK A SHARD KEY FROM THE COLLECTION.
MAKE SURE THE KEY IS
- INDEXED
- SUFFICIENTLY UNIQUE SO IT WILL HAVE
A VARIETY OF UNIQUE VALUES.
3. SIT BACK AND RELAX. MONGODB WILL
AUTOMATICALLY DO THE SHARDING.
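The steps above map onto the shell helpers sh.enableSharding() and sh.shardCollection(). The wrapper function below is hypothetical, and the database, collection, and key names in the usage line are assumptions. It takes `sh` and `db` as parameters so it can also be exercised outside a shell. Older shells used ensureIndex() where this sketch calls createIndex().

```javascript
// Steps 1 and 2 from the slide, written against the shell helpers `sh` and `db`.
function shardCollectionByKey(sh, db, dbName, collName, keyField) {
  const key = { [keyField]: 1 };
  // Step 1: enable sharding on the database.
  sh.enableSharding(dbName);
  // Step 2: make sure the shard key is indexed, then shard the collection.
  db.getSiblingDB(dbName).getCollection(collName).createIndex(key);
  sh.shardCollection(dbName + "." + collName, key);
  // Step 3: nothing to call; MongoDB distributes the chunks on its own.
}

// In a shell connected to mongos (names are assumptions):
// shardCollectionByKey(sh, db, "test", "ParksNYC", "Name");
```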
91. BREAKING THE RANGE INTO CHUNKS
CLIENT -> MONGOS -> SHARDS
SHARD0000 (MONGOD): $minKey - Abba1234, LambW - RobF, RobG - TimA
SHARD0001 (MONGOD): CarlZ - FrankT, TimB - $maxKey
SHARD0002 (MONGOD): Abba1235 - CarlW, FrankY - JackA, JackB - LambV
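How a router could pick the shard for a given key can be sketched as a lookup in a sorted chunk table. This is a simplified illustration, not the mongos implementation. The chunk-to-shard assignment is one reading of the diagram, the empty string stands in for $minKey, and each chunk is assumed to run up to the next chunk's lower bound.

```javascript
// Hypothetical chunk table, sorted by lower bound.
// "" stands in for $minKey; the last chunk extends to $maxKey.
const chunks = [
  { min: "",         shard: "shard0000" },
  { min: "Abba1235", shard: "shard0002" },
  { min: "CarlZ",    shard: "shard0001" },
  { min: "FrankY",   shard: "shard0002" },
  { min: "JackB",    shard: "shard0002" },
  { min: "LambW",    shard: "shard0000" },
  { min: "RobG",     shard: "shard0000" },
  { min: "TimB",     shard: "shard0001" }
];

// Binary search for the last chunk whose lower bound is <= key.
function routeToShard(key) {
  let low = 0;
  let high = chunks.length - 1;
  while (low < high) {
    const mid = Math.ceil((low + high) / 2);
    if (chunks[mid].min <= key) low = mid;
    else high = mid - 1;
  }
  return chunks[low].shard;
}

console.log(routeToShard("Abba1234"), routeToShard("Dave"), routeToShard("Kim"));
```

Because the table is sorted, the lookup needs only a few comparisons no matter how many chunks there are.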
92. BENEFITS OF SHARDING
1. INCREASES AVAILABLE MEMORY.
2. REDUCES LOAD ON THE SERVER.
3. INCREASES HARD DISK SPACE.
4. LOCATION-BASED SHARD KEYS CAN PUT DATA
CLOSE TO THE USERS AND KEEP RELATED DATA
TOGETHER.
104. AND THANK YOU TO EVERYONE WHO HELPED US
DR. BILL HOWE, UNIVERSITY OF WASHINGTON
JASON CHEN, MONGODB RECRUITER
KRISTINA CHODOROW (DEFINITIVE GUIDE AUTHOR)
FRANCESCA KRIHELY (MONGODB COMMUNITY MANAGER)
DR. MARKUS SCHMIDBERGER, RMONGODB
JOHANNES BRANDSTETTER, MONGOSOUP
(THE FIRST EUROPEAN PARTNER OF MONGODB TO PROVIDE
MONGODB AS A SERVICE)
DR. RAMNATH VAIDYANATHAN, RCHARTS
105. REFERENCES
MongoDB
http://www.mongodb.org
Book: MongoDB, The Definitive Guide – Kristina Chodorow
Book: NoSQL Distilled – Pramod J. Sadalage and Martin Fowler
NoSQL
http://en.wikipedia.org/wiki/NoSQL
MongoDB Use Cases
http://www.mongodb.com/use-cases
First NoSQL Meetup Notes
http://developer.yahoo.com/blogs/ydn/notes-nosql-meetup7663.html
Billion dollar club
http://graphics.wsj.com/billion-dollar-club/
Photos from Google