MongoDB at MapMyFitness

Route & Elevation data example
(Lost on the way to MongoSeattle)

Implementation Patterns

• Standard Datastore - 3 member replica set (small to medium implementations)

• Big Data implementation - sharded cluster (TB+)

• Buffering Layer - high memory (load all data and index files into RAM)

• Write Heavy - utilize sharding to optimize for writes

• Read Heavy - 3+n replica set configuration for rapid read scaling (up to 12 nodes)

Implementation Patterns

• In the cloud, tune the instance type to the mongo implementation

• On iron, plan carefully and dedicate servers completely to mongo to avoid memory map contention

• For DR, spin up a delayed, hidden replica node (preferably in a different datacenter)

• Aggregation framework can be used in myriad ways, including bridging the gap to SQL data warehousing via ETL.

• Automate install patterns for rapid development, prototyping, and infrastructure scaling.
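
The delayed, hidden DR node mentioned above could be described by a replica set member document along these lines. This is a minimal sketch, not the deck's actual configuration: the host name, member `_id`, and one-hour delay are hypothetical, and the delay field is called `slaveDelay` in MongoDB releases of this era (newer releases rename it `secondaryDelaySecs`).

```javascript
// Build a member document for a delayed, hidden DR node.
// The id, host, and delay passed in below are illustrative assumptions.
function delayedHiddenMember(id, host, delaySeconds) {
  return {
    _id: id,
    host: host,
    priority: 0,              // a delayed member must never be elected primary
    hidden: true,             // keep it invisible to application reads
    slaveDelay: delaySeconds  // how far behind the primary it stays
  };
}

const drMember = delayedHiddenMember(4, "dr-db01.example.com:27017", 3600);
console.log(JSON.stringify(drMember));
```

In the mongo shell, a document like `drMember` can be handed to `rs.add()`. The delay window is what gives you time to stop the node before an accidental drop or bad write replicates to it.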

Operational Automation
( example of automated mongodb install via puppet )

Replica Set Expansion

• MongoDB is “replication made elegant”

• Ridiculously simple to add additional members

• Be sure to run InitialSync from a secondary!

rs.add({ "host" : "livetrack_db09", "initialSync" : { "state" : 2 } })

• Both rs.add() and rs.remove() can be scripted and connected to monitoring systems for autoscaling
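
One way to script membership changes is to compute the next replica set configuration as plain data and let the automation submit it. The sketch below assumes a config document shaped like the one `rs.conf()` returns. The set name `livetrack` and every host except `livetrack_db09` are hypothetical.

```javascript
// Pure helpers that compute the next replica set config document.
// They do not talk to MongoDB; a script would pass the result to rs.reconfig().
function addMember(config, host) {
  // New member ids must be unique, so use one past the current maximum.
  const maxId = config.members.reduce((m, x) => Math.max(m, x._id), -1);
  return {
    ...config,
    version: config.version + 1,
    members: [...config.members, { _id: maxId + 1, host: host }]
  };
}

function removeMember(config, host) {
  return {
    ...config,
    version: config.version + 1,
    members: config.members.filter((m) => m.host !== host)
  };
}

// Hypothetical starting config.
const current = {
  _id: "livetrack",
  version: 3,
  members: [
    { _id: 0, host: "livetrack_db01:27017" },
    { _id: 1, host: "livetrack_db02:27017" },
    { _id: 2, host: "livetrack_db03:27017" }
  ]
};

const grown = addMember(current, "livetrack_db09:27017");
const shrunk = removeMember(grown, "livetrack_db02:27017");
console.log(grown.members.length, shrunk.members.length);
```

Keeping the calculation separate from the call that applies it makes the autoscaling logic easy to test before a monitoring trigger is allowed to run it.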

Monitoring and Introspection

• MMS, 10gen's cloud-based monitoring service (best available)

• Supported by Zabbix, Nagios, Munin, Server Density, etc.

• mongostat, mongotop, REST interface, database profiler

• Monitoring system triggers can initiate node additions, removals, service restarts, etc.

• In addition to service-level monitoring, use more advanced tests to check for and alert on query latency spikes
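
A latency-spike test of the kind described above can be as simple as comparing the newest timing against a recent baseline. The following is a minimal sketch under stated assumptions: the sample values and the 3x threshold are made up, and how the timings are collected (for example by timing a known query from the monitoring host) is left to the monitoring system.

```javascript
// Median of a list of numbers (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when the newest sample exceeds `factor` times the median
// of all earlier samples. With fewer than two samples there is no baseline.
function isLatencySpike(samplesMs, factor) {
  if (samplesMs.length < 2) return false;
  const history = samplesMs.slice(0, -1);
  const latest = samplesMs[samplesMs.length - 1];
  return latest > factor * median(history);
}

// Hypothetical query timings in milliseconds.
const steady = [12, 11, 13, 12, 14];
const spiking = [12, 11, 13, 12, 95];
console.log(isLatencySpike(steady, 3), isLatencySpike(spiking, 3));
```

Using the median rather than the mean keeps one earlier outlier from raising the baseline and hiding the next spike.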

10gen's MMS
(the one-stop shop for mongodb metrics)

Mongo in Zabbix
( Mikoomi Plugins: http://code.google.com/p/mikoomi )

mongostat
( Very useful for real-time troubleshooting )

Operational Automation
( example of automated mongodb restart action )

Security Considerations

• MongoDB provides authentication support and basic permissions

• Auth is turned off by default to allow for optimal performance

• Always run databases in a trusted network environment

• Lock down host based firewalls to limit access to required clients

• Automate iptables with puppet or chef, in EC2 use security groups

Network Security Automation

## Puppet Pattern for Mongodb network security

class iptables::public {

  # Allow return traffic for connections that are already established
  iptables::add_rule { '001 MongoDB established':
    rule => '-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT',
  }

  # Allow MongoDB (port 27017) from any source on the eth1 interface
  iptables::add_rule { '002 MongoDB':
    rule => '-A RH-Firewall-1-INPUT -i eth1 -p tcp -m tcp --dport 27017 -j ACCEPT',
  }

  # On eth0, allow MongoDB only from the listed source networks
  iptables::add_rule { '003 MongoDB MMF Phase II Network':
    rule => '-A RH-Firewall-1-INPUT -i eth0 -s 172.16.16.0/20 -p tcp -m tcp --dport 27017 -j ACCEPT',
  }

  iptables::add_rule { '004 MongoDB MMF Cloud Network':
    rule => '-A RH-Firewall-1-INPUT -i eth0 -s 10.178.52.0/24 -p tcp -m tcp --dport 27017 -j ACCEPT',
  }

}

Security Considerations

• Use the rule of least-privilege to allow access to environments

• Data sensitivity should determine the extent of security measures

• For non-sensitive data, good network security can be sufficient

• In open environments, be sure experience matches access level

• Lack of granular perms allows for full admin access, use discretion

Maintenance

• Far less maintenance required than traditional RDBMS systems

• Regularly perform query profile analysis and index auditing

• Rebuild databases to reclaim space lost due to fragmentation

• Automate checks of log files for known red-flags

• Regularly review data throughput rate, storage growth rate, and overall business growth graphs to inform capacity planning.

• For HA testing, periodically step-down the primary to force failover
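
The capacity-planning review above comes down to simple arithmetic: remaining space divided by growth rate. A minimal sketch with made-up numbers (a 2000 GB volume, 1400 GB used, growing 5 GB per day):

```javascript
// Days of storage headroom left at the current growth rate.
// Returns Infinity when the data set is not growing.
function daysUntilFull(capacityGb, usedGb, growthGbPerDay) {
  if (growthGbPerDay <= 0) return Infinity;
  return Math.max(0, (capacityGb - usedGb) / growthGbPerDay);
}

// Hypothetical figures read off the growth graphs.
const headroomDays = daysUntilFull(2000, 1400, 5);
console.log(headroomDays); // (2000 - 1400) / 5 = 120
```

Tracking this figure over time shows whether growth is accelerating, which is the signal to add shards or storage before the headroom runs out.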

Indexing Patterns or “Know Your App”

• Proper indexing critical to performance at scale
  (monitor slow queries to catch non-performant requests)

• MongoDB is ultimately flexible, being schemaless
  (mongo gives you enough rope to hang yourself, choose wisely)

• Avoid un-indexed queries at all costs
  (it's the quickest way to crater your app... consider --notablescan)

• Onus on DevOps to match application to indexes
  (know your query profile, never assume)

• Shoot for 'covered queries' wherever possible
  (answer can be obtained from indexes only)
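
To make the covered-query idea concrete, the sketch below checks whether a query shape can be answered from a single index. It is a simplified model, not MongoDB's planner: it ignores multikey indexes, embedded documents, and sharding, and the field names (`user_id`, `created`, `distance`) are hypothetical.

```javascript
// indexKeys:   index definition, e.g. { user_id: 1, created: -1 }
// queryFields: field names used in the query filter
// projection:  inclusion projection, e.g. { created: 1, _id: 0 }
function isCovered(indexKeys, queryFields, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const projected = Object.keys(projection).filter((f) => projection[f] === 1);

  // Without an inclusion projection the whole document is returned.
  if (projected.length === 0) return false;

  // _id comes back unless it is explicitly excluded, so it must be
  // excluded or be part of the index.
  const idReturned = projection._id !== 0;
  if (idReturned && !indexed.has("_id")) return false;

  // Every filtered and every returned field must live in the index.
  return [...queryFields, ...projected].every((f) => indexed.has(f));
}

const index = { user_id: 1, created: -1 };
const covered = isCovered(index, ["user_id"], { created: 1, _id: 0 });
const leaksId = isCovered(index, ["user_id"], { created: 1 });
const extraField = isCovered(index, ["user_id"], { created: 1, distance: 1, _id: 0 });
console.log(covered, leaksId, extraField);
```

The second case is the common trap: forgetting `_id: 0` in the projection forces a document fetch even though every other field is in the index.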

Capped Collections

• Use standard capped collections for retaining a fixed amount of data. Uses a FIFO strategy for pruning.
  (based on data size, not number of rows)

• TTL Collections (2.2) age out data based on a retention time configuration.
  (great for data retention requirements of all types)

Gotcha!

Explicitly create the capped collection before any data is put into the system to avoid auto-creation of collection
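
The two retention styles above take different arguments: a capped collection is sized in bytes at creation time, while a TTL collection is an ordinary collection with an index on a date field plus `expireAfterSeconds`. The helpers below only build those argument documents. The 512 MB size, the `created_at` field, and the 30-day retention are illustrative assumptions.

```javascript
// Options for db.createCollection(name, options).
// Capped collections are limited by size in bytes, not by document count.
function cappedOptions(sizeMegabytes) {
  return { capped: true, size: sizeMegabytes * 1024 * 1024 };
}

// Key pattern and options for a TTL index on a single date field.
function ttlIndex(dateField, retentionDays) {
  return {
    keys: { [dateField]: 1 },
    options: { expireAfterSeconds: retentionDays * 24 * 60 * 60 }
  };
}

const capped = cappedOptions(512);
const ttl = ttlIndex("created_at", 30);
console.log(JSON.stringify(capped), JSON.stringify(ttl));
```

In the mongo shell, `capped` would be passed to `db.createCollection()` before the first insert, which is what avoids the auto-creation gotcha, and `ttl.keys` with `ttl.options` would be passed to `ensureIndex()` (`createIndex()` in later shells).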

Lessons Learned

• Mongo 2.2 upgrade of a database containing a capped collection created in 1.8.4. This severely impacted replication (RC: no "_id" index, FIX: add "_id" index)

• Never start mongo when a mount point is missing or incorrectly configured. Mongo may decide to take matters into its own hands and resync itself with the replica set. Make sure your devops and your hosting provider admins are aware of this

• Some drivers that use connection pooling can freak the freaky freak when the primary member changes (older pymongo). Kicking the application can fix, also: upgrade drivers

• High locked % is a big red-flag, and can be caused by a large number of simultaneous dml actions (high insert rate, high update rate). Consider this in the design phase.

• Be wary of automation that can change the state of a node during maintenance mode. Disable automation agents for reduced risk during critical administrative operations (filesystem maint, etc)
