9/14/12	
  
Route & Elevation data example
   (Lost on the way to MongoSeattle)
Implementation Patterns

• Standard Datastore - 3 member replica set
  (small to medium implementations)
• Big Data implementation - sharded cluster (TB+)
• Buffering Layer - high memory
  (load all data and index files into RAM)
• Write Heavy - utilize sharding to optimize for writes
• Read Heavy - 3+n replica set configuration for rapid read scaling
  (up to 12 nodes)
  
Implementation Patterns

• In the cloud, tune the instance type to the mongo implementation
• On iron, plan carefully and dedicate servers completely to mongo to avoid memory map contention
• For DR, spin up a delayed, hidden replica node (preferably in a different datacenter)
• Aggregation framework can be used in myriad ways, including bridging the gap to SQL data warehousing via ETL.
• Automate install patterns for rapid development, prototyping, and infrastructure scaling.
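
The delayed, hidden DR node from the list above can be sketched as a replica set member document. This is a minimal sketch, assuming the 2.x-era member fields `priority`, `hidden`, and `slaveDelay` (later releases renamed the delay field); the helper name, host name, and one-hour delay are hypothetical.

```javascript
// Build a member document for a delayed, hidden DR node.
// Assumption: 2.x-era field names (priority, hidden, slaveDelay).
function delayedHiddenMember(id, host, delaySeconds) {
  return {
    _id: id,
    host: host,
    priority: 0,              // never eligible to become primary
    hidden: true,             // not advertised to client drivers
    slaveDelay: delaySeconds  // applies operations this many seconds late
  };
}

// Hypothetical host in a second datacenter, kept one hour behind.
var drMember = delayedHiddenMember(9, "dr-dc2-db01:27017", 3600);
```

In a mongo shell, a document like this would typically be appended to `cfg.members` from `rs.conf()` and applied with `rs.reconfig(cfg)`.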
  
Operational Automation
( example of automated mongodb install via puppet )
Replica Set Expansion


• MongoDB is “replication made elegant”
• Ridiculously simple to add additional members
• Be sure to run InitialSync from a secondary!

  rs.add({ "host" : "livetrack_db09", "initialSync" : { "state" : 2 } })

• Both rs.add() and rs.remove() can be scripted and connected to monitoring systems for autoscaling
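
Scripting member changes for autoscaling comes down to editing the replica set config document. The sketch below is one possible approach, not the shell's own implementation: two pure helpers (hypothetical names) that take a config shaped like the output of `rs.conf()` and return a modified copy. The set name and the hosts other than `livetrack_db09` are made up for illustration.

```javascript
// Copy a replica set config, swap in a new member list, bump the version.
function copyConfigWithMembers(cfg, members) {
  var out = {};
  Object.keys(cfg).forEach(function (key) { out[key] = cfg[key]; });
  out.members = members;
  out.version = cfg.version + 1;
  return out;
}

// Return a copy of cfg with one more member, using the next free _id.
function withAddedMember(cfg, host) {
  var maxId = -1;
  cfg.members.forEach(function (m) {
    if (m._id > maxId) maxId = m._id;
  });
  return copyConfigWithMembers(
    cfg, cfg.members.concat([{ _id: maxId + 1, host: host }]));
}

// Return a copy of cfg without the named host.
function withRemovedMember(cfg, host) {
  return copyConfigWithMembers(
    cfg, cfg.members.filter(function (m) { return m.host !== host; }));
}

// Hypothetical starting config.
var cfg = {
  _id: "livetrack",
  version: 3,
  members: [
    { _id: 0, host: "livetrack_db07:27017" },
    { _id: 1, host: "livetrack_db08:27017" }
  ]
};
var grown = withAddedMember(cfg, "livetrack_db09:27017");
var shrunk = withRemovedMember(grown, "livetrack_db07:27017");
```

A monitoring trigger could compute the new config this way and hand it to `rs.reconfig()` on the primary. Keeping the helpers pure makes them easy to test without a live replica set.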
  
Monitoring and Introspection

• MMS, 10gen's cloud-based monitoring service (best available)
• Supported by Zabbix, Nagios, Munin, Server Density, etc.
• mongostat, mongotop, REST interface, database profiler
• Monitoring system triggers can initiate node additions, removals, service restarts, etc.
• In addition to service-level monitoring, use more advanced tests to check for and alert on query latency spikes
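
A latency-spike test like the one in the last bullet can be as simple as comparing a high percentile of recent query timings against a threshold. This is a minimal sketch under that assumption; the function names, the 95th percentile, and the sample values are illustrative choices, not something the slides specify.

```javascript
// Nearest-rank percentile (p in 0..100) of an array of numbers.
function percentile(samples, p) {
  var sorted = samples.slice().sort(function (a, b) { return a - b; });
  var rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// True when the 95th percentile of recent latencies exceeds the threshold.
function latencySpike(samplesMs, thresholdMs) {
  if (samplesMs.length === 0) return false;
  return percentile(samplesMs, 95) > thresholdMs;
}

// Hypothetical timings (ms) from a periodic probe query.
var quiet = [5, 6, 7, 8, 9];
var spiky = [5, 6, 7, 8, 500];
```

The samples would come from timing a representative probe query on a schedule, and the boolean would feed whatever alerting system is already in place.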
  
     	
  
     	
  
10gen's MMS
(the one-stop shop for mongodb metrics)
Mongo in Zabbix
( Mikoomi Plugins: http://code.google.com/p/mikoomi )
mongostat
( Very useful for real-time troubleshooting )
Operational Automation
( example of automated mongodb restart action )
Security Considerations

• MongoDB provides authentication support and basic permissions
• Auth is turned off by default to allow for optimal performance
• Always run databases in a trusted network environment
• Lock down host based firewalls to limit access to required clients
• Automate iptables with puppet or chef, in EC2 use security groups
  
Network Security Automation

## Puppet Pattern for Mongodb network security


class iptables::public {

      iptables::add_rule { '001 MongoDB established':
          rule => '-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT'
      }

      iptables::add_rule { '002 MongoDB':
          rule => '-A RH-Firewall-1-INPUT -i eth1 -p tcp -m tcp --dport 27017 -j ACCEPT'
      }

      iptables::add_rule { '003 MongoDB MMF Phase II Network':
          rule => '-A RH-Firewall-1-INPUT -i eth0 -s 172.16.16.0/20 -p tcp -m tcp --dport 27017 -j ACCEPT'
      }

      iptables::add_rule { '004 MongoDB MMF Cloud Network':
          rule => '-A RH-Firewall-1-INPUT -i eth0 -s 10.178.52.0/24 -p tcp -m tcp --dport 27017 -j ACCEPT'
      }

  }
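
The rules in the Puppet class above all follow one template, so they can also be generated from a list of allowed networks. The helper below is a hypothetical sketch that only builds the rule string; it reuses the chain name, port, interfaces, and networks shown in the slide.

```javascript
// Build an iptables ACCEPT rule for MongoDB traffic on one interface,
// optionally restricted to a source network.
function mongoAcceptRule(iface, sourceCidr, port) {
  var parts = ["-A RH-Firewall-1-INPUT", "-i " + iface];
  if (sourceCidr) parts.push("-s " + sourceCidr);
  parts.push("-p tcp -m tcp --dport " + port + " -j ACCEPT");
  return parts.join(" ");
}

// The same networks as rules 003 and 004 above.
var rules = ["172.16.16.0/20", "10.178.52.0/24"].map(function (cidr) {
  return mongoAcceptRule("eth0", cidr, 27017);
});
```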
Security Considerations

• Use the rule of least-privilege to allow access to environments
• Data sensitivity should determine the extent of security measures
• For non-sensitive data, good network security can be sufficient
• In open environments, be sure experience matches access level
• Lack of granular perms allows for full admin access, use discretion
  
Maintenance

• Far less maintenance required than traditional RDBMS systems
• Regularly perform query profile analysis and index auditing
• Rebuild databases to reclaim space lost due to fragmentation
• Automate checks of log files for known red-flags
• Regularly review data throughput rate, storage growth rate, and overall business growth graphs to inform capacity planning.
• For HA testing, periodically step-down the primary to force failover
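
An automated log check for red flags can start as a plain pattern scan. The sketch below assumes the log has already been read into a string. The patterns and sample lines are hypothetical placeholders, not a list of real mongod messages, and should be replaced with whatever has caused trouble in your own environment.

```javascript
// Hypothetical red-flag patterns; tune these to your own logs.
var RED_FLAGS = [
  /assert/i,
  /too many open/i,
  /rollback/i
];

// Return every line of logText that matches at least one pattern.
function findRedFlags(logText, patterns) {
  return logText.split("\n").filter(function (line) {
    return patterns.some(function (re) { return re.test(line); });
  });
}

// Made-up sample lines for illustration.
var sampleLog = "a ok\nb rollback started\nc fine";
var hits = findRedFlags(sampleLog, RED_FLAGS);
```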
  
Indexing Patterns or “Know Your App”

• Proper indexing critical to performance at scale
  (monitor slow queries to catch non-performant requests)
• MongoDB is ultimately flexible, being schemaless
  (mongo gives you enough rope to hang yourself, choose wisely)
• Avoid un-indexed queries at all costs
  (it's the quickest way to crater your app... consider --notablescan)
• Onus on DevOps to match application to indexes
  (know your query profile, never assume)
• Shoot for 'covered queries' wherever possible
  (answer can be obtained from indexes only)
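
To make the "covered query" idea concrete: a query is covered when every field it filters on and every field it returns is part of a single index, which usually means excluding `_id` in the projection. The check below is a deliberately simplified sketch with hypothetical names and fields. The real query planner has more rules (multikey indexes and embedded fields, for example) that it ignores.

```javascript
// Simplified test: can this query be answered from the index alone?
function isCoveredQuery(indexSpec, query, projection) {
  var indexed = Object.keys(indexSpec);
  function inIndex(field) { return indexed.indexOf(field) !== -1; }

  // Fields explicitly included by the projection.
  var returned = Object.keys(projection).filter(function (f) {
    return projection[f] === 1;
  });
  // Without an inclusion projection, whole documents come back.
  if (returned.length === 0) return false;
  // _id is returned by default unless explicitly excluded.
  if (projection._id !== 0 && returned.indexOf("_id") === -1) {
    returned.push("_id");
  }
  return Object.keys(query).every(inIndex) && returned.every(inIndex);
}

// Hypothetical index and query on a routes collection.
var index = { user_id: 1, ts: 1 };
var covered = isCoveredQuery(index, { user_id: 42 }, { _id: 0, user_id: 1, ts: 1 });
var notCovered = isCoveredQuery(index, { user_id: 42 }, { user_id: 1, ts: 1 });
```

The second call is not covered only because `_id` is returned by default and is not in the index, which is an easy detail to miss.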
  
Capped Collections


• Use standard capped collections for retaining a fixed amount of data. Uses a FIFO strategy for pruning.
  (based on data size, not number of rows)
• TTL Collections (2.2) age out data based on a retention time configuration.
  (great for data retention requirements of all types)

  Gotcha!
  Explicitly create the capped collection before any data is put into the system, to avoid auto-creation of a regular (uncapped) collection
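
Both retention styles come down to a small options document. The sketch below builds those documents with hypothetical helper names. The shell calls in the comments use `db.createCollection` and the 2.2-era `ensureIndex`, and the collection and field names are made up.

```javascript
// Options for a capped collection, sized in megabytes.
function cappedOptions(sizeMegabytes) {
  return { capped: true, size: sizeMegabytes * 1024 * 1024 };
}

// Options for a TTL index, with retention given in days.
function ttlIndexOptions(retentionDays) {
  return { expireAfterSeconds: retentionDays * 24 * 60 * 60 };
}

// In a mongo shell (hypothetical collection and field names):
//   db.createCollection("route_events", cappedOptions(512));
//   db.sessions.ensureIndex({ createdAt: 1 }, ttlIndexOptions(30));
var capped = cappedOptions(512);
var ttl = ttlIndexOptions(30);
```

Running the `createCollection` call as part of provisioning, before any application writes, is what avoids the gotcha above.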
  
Lessons Learned

• Mongo 2.2 upgrade containing a capped collection created in 1.8.4. This severely impacted replication (RC: no "_id" index, FIX: add "_id" index)
• Never start mongo when a mount point is missing or incorrectly configured. Mongo may decide to take matters into its own hands and resync itself with the replica set. Make sure your devops and your hosting provider admins are aware of this
• Some drivers that use connection pooling can freak the freaky freak when the primary member changes (older pymongo). Kicking the application can fix, also: upgrade drivers
• High locked % is a big red-flag, and can be caused by a large number of simultaneous dml actions (high insert rate, high update rate). Consider this in the design phase.
• Be wary of automation that can change the state of a node during maintenance mode. Disable automation agents for reduced risk during critical administrative operations (filesystem maint, etc.)
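
The mount-point lesson suggests a guard before mongo is started. This is a minimal sketch, assuming a Linux host where the text of `/proc/mounts` has already been read into a string. The function name, devices, and the `/data/db` path are hypothetical.

```javascript
// True if mountPoint appears as a mount target in /proc/mounts-style text
// (whitespace-separated fields: device, mount point, filesystem type, ...).
function isMounted(procMountsText, mountPoint) {
  return procMountsText.split("\n").some(function (line) {
    var fields = line.trim().split(/\s+/);
    return fields.length >= 2 && fields[1] === mountPoint;
  });
}

// Made-up mount table for illustration.
var mounts = "/dev/sda1 / ext4 rw 0 0\n/dev/sdb1 /data/db xfs rw,noatime 0 0";
var safeToStart = isMounted(mounts, "/data/db");
```

A start script or automation agent could refuse to launch mongod when the check fails, rather than letting it start against an empty directory and resync.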
  
MongoDB at MapMyFitness