MongoDB vs MySQL
 A DevOps point of view.




Pierre Baillet <oct@fotopedia.com> @octplane
Mathieu Poumeyrol <kali@fotopedia.com>
Summary
«The question is, which is to be master»       Humpty Dumpty




 Who we are
 Context and constraints
 High availability and day-to-day operations
 Scalability
Who we are

Fotopedia (and not photopedia or fotolia)
Created in 2006, Paris based
Around 20 people, some former Apple employees
Pictures for humanity, a cross between Flickr and
Wikipedia
What we do
Website http://www.fotopedia.com (100% free website,
become member and show us your best photos)
2011 Crunchies award for Best Tablet Application
Some statistics

 150 million photo views
 1 MySQL database
 4 MongoDB ‘clusters’ (spread on 4 servers)
 Around 500GB of structured data
Context and Constraints
«The night can only get a thousand times worse»   Romeo & Juliet




 24/7 Website and web-services
 Continuous deployment
 Current Infrastructure
 What we expect from our NoSQL DBMS and our
 compromise
24/7

Several million users around the world. Between 300
and several thousand connected at any one time.
Using either the website or one of the 7 available iOS
applications. Business Critical
When the website is down at the application level,
everything starts to fail gradually
We cannot stop the website completely. Ever.
Overall activity on the main HTTP entry point
Continuous deployment
Git-based development flow with several active
branches
development branch deployed every Wednesday
an average of 3 minor hot-fixes every workday
agile: any developer can push their hot-fixes to production,
at any time
We cannot easily schedule migrations. They should be
as transparent as possible.
Our infrastructure


 Software stack
 Monitoring tools
 Hosting platform
Software Stack
RoR Website
Multiple open-source components used in the web stack:
HAProxy, Nginx, Unicorn, mongo-resque, ...
Some well-known data stores:
  MySQL used to manage what was the core of our
  data
  MongoDB in production since September 2009, now
  managing more than 70% of our data
Monitoring tools

 Munin, Nagios
 Custom log feeder built around MongoDB (cf. the
 SlideShare presentation "mongodb as a log collector")
 MongoDB is also used to store slow transactions,
 exceptions and profiling traces for later inspection
Hosting Platform: 100% AWS

 Instances are not highly reliable
   but they are both abundant and disposable
 Disk is abundant and disposable too
 Use AWS RDS for MySQL hosting. Cheap and easy to
 set up, but the failover process (DNS-based) is very shaky.
 We cannot rely too much on the hardware
What we expect from our NoSQL DBMS and our
compromise

 No downtime:
   High availability
   No migration cost
 Easy to deploy, redeploy, replicate, reconfigure
 Quietly losing seconds of writes is preferable to
   weekly minutes-long maintenance periods
   minutes-long unscheduled downtime and manual
   failover in case of hardware failure
High Availability,
Day to Day operations
«Down in the cellar, they say there is no such thing as a lowly job»   Le Poinçonneur des Lilas




  Development environment
  Operations cycle
  Fit for the DevOps
Dev’ Cycle

Data locality
Data migration
  Alter table
  Index creation
Data backup/restoration
Dev’ Data Locality

 In MongoDB, a collection will typically replace 2 or 3
 SQL tables
 The physical proximity, locality, enables faster, simpler
 and more complete data retrieval from the application
 point of view. Fewer requests, more data.
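A minimal sketch of the locality argument, using plain Python dictionaries rather than a real database. The photo record, its field names and the three-table split are hypothetical and only illustrate how one collection can stand in for several joined tables.

```python
# Hypothetical relational layout: one photo spread over three tables.
photos = {1: {"title": "Eiffel Tower", "author": "alice"}}
photo_tags = [
    {"photo_id": 1, "tag": "paris"},
    {"photo_id": 1, "tag": "tower"},
]
photo_comments = [{"photo_id": 1, "text": "Great shot"}]


def load_photo_relational(photo_id):
    """Assemble one photo from three separate lookups (one per table)."""
    photo = dict(photos[photo_id])
    photo["tags"] = [row["tag"] for row in photo_tags if row["photo_id"] == photo_id]
    photo["comments"] = [
        row["text"] for row in photo_comments if row["photo_id"] == photo_id
    ]
    return photo


# Hypothetical document layout: the same data stored together in one record.
photo_documents = {
    1: {
        "title": "Eiffel Tower",
        "author": "alice",
        "tags": ["paris", "tower"],
        "comments": ["Great shot"],
    }
}


def load_photo_document(photo_id):
    """A single lookup returns the photo together with its tags and comments."""
    return photo_documents[photo_id]
```

Both functions return the same structure; the document version needs one lookup instead of three.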
Dev’ Data Migration: ALTER

ALTER TABLE is nightmarish
  leads to various forms of model abuse:
    reuse of fields
    flag fields (binary encoded), blob fields (json/xml
    encoded), ...
MongoDB solution: free form data storage, extensible.
Defensive strategy

 Application code aware of possible inconsistencies:
   gracefully failing view layer
   self-healing data access layer
   routine data checking and fixing batch
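One way to picture a self-healing data access layer is sketched below. It is an assumption-laden illustration, not Fotopedia's code: the default values, the legacy comma-separated `tags` format and the in-memory `store` dictionary are all hypothetical.

```python
# Hypothetical defaults for fields that older documents may lack.
DEFAULTS = {"title": "(untitled)", "tags": [], "views": 0}


def heal(document):
    """Return (healed_copy, changed) without mutating the input document."""
    healed = dict(document)
    changed = False
    for field, default in DEFAULTS.items():
        if field not in healed:
            # Copy list defaults so documents never share one list object.
            healed[field] = list(default) if isinstance(default, list) else default
            changed = True
    # Hypothetical legacy format: tags stored as one comma-separated string.
    if isinstance(healed["tags"], str):
        healed["tags"] = [t.strip() for t in healed["tags"].split(",") if t.strip()]
        changed = True
    return healed, changed


def fetch(store, key):
    """Read a document, repair it if needed, and write the repair back."""
    healed, changed = heal(store[key])
    if changed:
        store[key] = healed
    return healed
```

The view layer can then rely on every field being present, while old documents get fixed lazily as they are read.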
Dev’ Data Migration: INDICES


 Index creation takes a table-wide lock in MySQL,
 which renders part of the cluster unavailable
 MongoDB solution: background index creation slows
 access a tiny bit, but does not lock!
Dev’ Backup/Restore

MongoDB's ability to dump a db/collection empowers
developers
Possible to restore part of the production dataset
simply on a development box
Back up MongoDB collection by collection to S3, recover on
a dev platform in a matter of minutes
Ops Cycle

MongoDB, small is beautiful
Cornerstone: the Replica Set
High availability
Backup and data import/export
Hardware migration
Ops, MongoDB, small and beautiful


 Young software, relatively compact (around 150,000 lines of
 C++ code)
 Builds out of the box on modern distributions
 Distribution packages built by 10gen
 Drivers for most popular languages are also provided
 and maintained by 10gen staff (although quality varies).
Ops, Replica Set
A set of machines sharing the same data
Only one Primary, several Secondaries
All writes go to the Primary, then replicate to the Secondaries.
Reads can be routed to the Primary or a Secondary at the
application's choice
Combined with AWS, Replica Sets are very
powerful. MongoDB loves the Cloud!
Ops, Master/Slave reloaded
Client libraries are replica set aware
  connect to any node(s); the configuration and current
  layout are discovered
Database semantics are preserved
Incredibly easy to set up
  Priority between nodes can be dynamically changed
  It’s possible to prevent a node from ever becoming
  master (slow-disk server used as a «hot backup»)
Ops, High Availability Strengths
 A Primary step-down can be triggered, leading to the election
 of a new Primary.
 A new Primary is picked when the Primary becomes
 unreachable
 Clients will transparently connect to the new Primary
 A MongoDB Arbiter ensures split brain will not happen
 Config servers contain the sharding information: 1 or 3
 config servers with an internal failover mechanism
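The role of the Arbiter can be illustrated with a small majority check. This is a simplified model, assuming that a Primary can only be elected by a strict majority of voting members and that an Arbiter votes without holding data; the function name is hypothetical.

```python
def can_elect_primary(reachable_voters, total_voters):
    """A side of a network partition may elect a Primary only if it can
    reach a strict majority of all voting members."""
    return reachable_voters > total_voters // 2


# Two data nodes plus one Arbiter: 3 voters in total.
# A partition isolates one data node from the other data node and the Arbiter.
side_with_arbiter = can_elect_primary(reachable_voters=2, total_voters=3)
isolated_node = can_elect_primary(reachable_voters=1, total_voters=3)

# Two data nodes and no Arbiter: neither side holds a majority after a split.
no_arbiter_side = can_elect_primary(reachable_voters=1, total_voters=2)
```

With the Arbiter, exactly one side of the split can elect a Primary; without it, a two-node set that loses its link cannot elect one at all.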
Ops, High Availability compromise




 Switch over will take 20 to 25 seconds
   Some queries in the interval may fail
   Some writes may reach a split primary
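Application code can absorb part of the switch-over window by retrying. The sketch below is one possible approach, not a driver feature: `PrimaryUnavailable` is a hypothetical stand-in for whatever error the driver raises while no Primary is reachable, and the attempt count and delay are assumptions chosen to cover roughly 25 seconds.

```python
import time


class PrimaryUnavailable(Exception):
    """Hypothetical stand-in for the driver error raised during a failover."""


def with_failover_retry(operation, attempts=6, delay=5.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` calls have failed.

    Waits `delay` seconds between attempts; the last error is re-raised
    once the attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except PrimaryUnavailable as error:
            last_error = error
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error
```

Retrying is safest for reads and idempotent writes; a write that reached the old Primary just before it stepped down may otherwise be applied twice or lost.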
Ops, Backup and exports

Stop the secondary and do whatever you need done.
Easy to backup a single collection or a whole database


As a matter of fact, we just dumbly «mongodump»
every collection of interest separately.
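The per-collection dump can be scripted along these lines. This is a sketch, not the authors' script: the host, database, collection names and output directory are hypothetical, and only the `--host`, `--db`, `--collection` and `--out` options of `mongodump` are used.

```python
import subprocess


def mongodump_command(host, database, collection, out_dir):
    """Build the argument list for dumping one collection with mongodump."""
    return [
        "mongodump",
        "--host", host,
        "--db", database,
        "--collection", collection,
        "--out", out_dir,
    ]


def dump_collections(host, database, collections, out_dir, run=subprocess.run):
    """Dump each collection of interest separately, one mongodump call each."""
    for collection in collections:
        run(mongodump_command(host, database, collection, out_dir), check=True)
```

Pointing `host` at a Secondary keeps the dump load away from the Primary; `run` is injectable so the command list can be inspected without a MongoDB installation.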
Ops, Hardware migration

Optionally, «preload» the new node with a FS or block-level
snapshot
Add the brand new node to the replica set
Wait for synchronization
Change the replica set (RS) configuration to make your new server Primary
Remove the old hardware
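The steps above can be sketched as successive edits to a replica set configuration document. The field names (`_id`, `version`, `members`, `host`, `priority`) mirror that document; the host names and helper functions are hypothetical, and sending each configuration to a live cluster (and waiting for the sync) is deliberately left out.

```python
import copy


def add_member(config, host):
    """Return a new config with `host` added as a non-electable member."""
    new_config = copy.deepcopy(config)
    next_id = max(member["_id"] for member in new_config["members"]) + 1
    # priority 0: the node replicates data but cannot become Primary yet.
    new_config["members"].append({"_id": next_id, "host": host, "priority": 0})
    new_config["version"] += 1
    return new_config


def prefer_member(config, host):
    """Return a new config in which `host` has the highest priority."""
    new_config = copy.deepcopy(config)
    top = max(member.get("priority", 1) for member in new_config["members"])
    for member in new_config["members"]:
        if member["host"] == host:
            member["priority"] = top + 1
    new_config["version"] += 1
    return new_config


def remove_member(config, host):
    """Return a new config without `host` (the retired hardware)."""
    new_config = copy.deepcopy(config)
    new_config["members"] = [
        member for member in new_config["members"] if member["host"] != host
    ]
    new_config["version"] += 1
    return new_config
```

Adding the new node with priority 0 first is a design choice of this sketch: it keeps the node from being elected before it has caught up.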
Fit for the DevOps

 In the modern sense of DevOps, MongoDB provides
 the Agility and Ease of use required
 It provides working tools for developers
 And is much more comfortable than MySQL in daily
 usage
 Truly a DevOps-friendly tool.
Scalability
«Hang on to the paintbrush, I'm taking away the shell.»   Overheard @fotopedia




 Cloud Limitations
 Sharding and Replica Set
 Performance
    Reading
    Writing
 Storage
 Scalable from the ground up
Cloud Limitations
 Virtual Hardware
   Neighbors can eat all your I/O
   No precise control and overview of this situation
   Largest VM cannot compare to largest Metal
 A hardware issue means zero notice before instance
 retirement (bare metal has the same issue, though). Need to be
 flexible
 Scaling-out is the way to scale on the Cloud
Sharding

Use a business key to partition your data
Each shard is typically a replica set
Access is provided via the MongoS servers
Configuration is stored and managed in the Config
servers
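A simplified picture of routing by business key: the key space is cut into ranges, and each range belongs to one shard. The chunk boundaries and shard names below are hypothetical, and a real cluster keeps this map in the config servers rather than in application code.

```python
import bisect

# Hypothetical chunk map: sorted lower bounds of each key range and its owner.
CHUNK_LOWER_BOUNDS = ["", "h", "p"]
CHUNK_OWNERS = ["shard-a", "shard-b", "shard-c"]


def shard_for(key):
    """Return the shard owning the range that contains `key`."""
    # bisect_right finds the first lower bound greater than the key;
    # the chunk just before it is the one that contains the key.
    index = bisect.bisect_right(CHUNK_LOWER_BOUNDS, key) - 1
    return CHUNK_OWNERS[index]
```

Here keys below "h" land on shard-a, keys from "h" up to (but excluding) "p" on shard-b, and the rest on shard-c.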
Reading
Without Sharding
  Reading is performed on a master by default to
  preserve read-your-own-writes. Can be
  programmatically allowed on a slave.
  To scale up reads, add Replica Set nodes
With Sharding
  Reading is performed in parallel across data nodes
  To scale up reads: ensure most queries will reach
  only one single shard
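Why single-shard queries matter can be shown with a toy router. Everything here is an assumption made for illustration: the shard key name, the shard list and the CRC-based placement are hypothetical and do not reflect how MongoS actually places data.

```python
import zlib

SHARD_KEY = "user_id"                       # hypothetical business key
SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def shards_touched(query):
    """Return the shards a router would need to contact for `query`."""
    if SHARD_KEY in query:
        # Stable placement for illustration only; real placement differs.
        index = zlib.crc32(str(query[SHARD_KEY]).encode("utf-8")) % len(SHARDS)
        return [SHARDS[index]]
    # No shard key in the query: every shard has to be asked.
    return list(SHARDS)
```

A query that includes the shard key costs one shard some work; a query without it costs every shard some work, so adding shards does not make it cheaper.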
Writing


 Writes are always performed on the Primary node, so
 adding replica set members does not help.
 Sharding distributes the writes across the cluster
Storage

Replica Sets and shards can be spread over as many
servers as needed.
To get more space
  scale up by migrating your Replica Set to bigger
  hardware
  scale out by sharding the existing collection
Scalable from the ground up
MongoDB scales as soon as you need it to.
No complex configuration for replication
Beautiful ability to handle replica sets and shards out of
the box
MongoS / Config Server / Shards allow more complex
setups
Cloud Friendly
Questions ?




  Oh, and btw, we hire in Paris !

MongoDB vs Mysql. A devops point of view

  • 1. MongoDB vs MySQL A DevOps point of view. Pierre Baillet <oct@fotopedia.com> @octplane Mathieu Poumeyrol <kali@fotopedia.com>
  • 2. Summary «The question is, which is to be master» Humpty Dumpty Who we are Context and constraints High availability and day-to-day operations Scalability
  • 3. Who we are Fotopedia (and not photopedia or fotolia) Created in 2006, Paris based around 20 people, some Apple ex-employees Pictures for humanity, cross-breed between flickr and wikipedia
  • 4. What we do: Website http://www.fotopedia.com (100% free website; become a member and show us your best photos). 2011 Crunchies award for Best Tablet Application.
  • 5. Some statistics: 150 million photo views; 1 MySQL database; 4 MongoDB ‘clusters’ (spread over 4 servers); around 500 GB of structured data.
  • 6. Context and constraints: «The night can only get a thousand times worse» (Roméo & Juliette). 24/7 website and web services; Continuous deployment; Current infrastructure; What we expect from our NoSQL DBMS, and our compromise.
  • 7. 24/7: Several million users around the world, with between 300 and several thousand connected at once, using either the website or one of the 7 available iOS applications. Business critical. When the website is down at the application level, everything starts to fail gradually. We cannot stop the website completely. Ever.
  • 8. Overall activity on the main HTTP entry point
  • 9. Continuous deployment: Git-based development flow with several active branches. The development branch is deployed every Wednesday. An average of 3 minor hot-fixes every workday. Agile: any developer can push their hot-fixes to production at any time. We cannot easily schedule migrations; they should be as transparent as possible.
  • 10. Our infrastructure: Software stack; Monitoring tools; Hosting platform.
  • 11. Software stack: RoR website. Multiple open-source tools used in the web stack: HAProxy, Nginx, Unicorn, mongo-resque, ... Some well-known NoSQL tools: MySQL, used to manage what was the core of our data; MongoDB, in production since September 2009, now managing more than 70% of our data.
  • 12. Monitoring tools: Munin, Nagios. Custom log feeder built around MongoDB (cf. the SlideShare presentation "mongodb as a log collector"). MongoDB is also used to store slow transactions, exceptions and profiling traces for later inspection.
  • 13. Hosting platform: 100% AWS. Instances are not highly reliable, but they are both abundant and disposable. Disk is abundant and disposable too. We use AWS RDS for MySQL hosting: cheap and easy to set up, but with a very shaky (DNS-based) failover process. We cannot rely too much on the hardware.
  • 14. What we expect from our NoSQL DBMS, and our compromise: No downtime (high availability). No migration cost. Easy to deploy, redeploy, replicate, reconfigure. Quietly losing seconds of writes is preferable to: weekly minutes-long maintenance periods; minutes-long unscheduled downtime and manual failover in case of hardware failure.
  • 15. High availability, day-to-day operations: «Down in the cellar, they say there are no foolish trades» (Le Poinçonneur des Lilas). Development environment; Operations cycle; Fit for the DevOps.
  • 16. Dev’ cycle: Data locality; Data migration (ALTER TABLE, index creation); Data backup/restoration.
  • 17. Dev’ data locality: In MongoDB, a collection will typically replace 2 or 3 SQL tables. The physical proximity (locality) enables faster, simpler and more complete data retrieval from the application's point of view. Fewer requests, more data.
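The locality argument of slide 17 can be illustrated with a minimal sketch in plain Python, with no database involved. The photo, tag and comment layout below is hypothetical (the slides do not describe Fotopedia's schema); it only contrasts assembling one object from three tables with reading a single embedded document.

```python
# Hypothetical normalized layout: three "tables" held as lists of rows.
photos = [{"id": 1, "title": "Eiffel Tower"}]
tags = [{"photo_id": 1, "tag": "paris"}, {"photo_id": 1, "tag": "tower"}]
comments = [{"photo_id": 1, "author": "kali", "text": "Nice shot"}]


def load_photo_relational(photo_id):
    """Assemble one photo from three tables: one lookup plus two filters."""
    photo = next(p for p in photos if p["id"] == photo_id)
    return {
        "_id": photo["id"],
        "title": photo["title"],
        "tags": [t["tag"] for t in tags if t["photo_id"] == photo_id],
        "comments": [
            {"author": c["author"], "text": c["text"]}
            for c in comments
            if c["photo_id"] == photo_id
        ],
    }


# The same data stored as one document in a single (hypothetical) collection.
photo_collection = {
    1: {
        "_id": 1,
        "title": "Eiffel Tower",
        "tags": ["paris", "tower"],
        "comments": [{"author": "kali", "text": "Nice shot"}],
    }
}


def load_photo_document(photo_id):
    """One lookup returns the photo with its tags and comments embedded."""
    return photo_collection[photo_id]
```

Both functions return the same structure; the document version gets there with a single lookup, which is the "fewer requests, more data" point.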
  • 18. Dev’ data migration: ALTER. ALTER TABLE is nightmarish and leads to various forms of model-abusing strategies: reuse of fields, flag fields (binary encoded), blob fields (JSON/XML encoded), ... MongoDB solution: free-form, extensible data storage.
  • 19. Defensive strategy: Application code is aware of possible inconsistencies: a gracefully failing view layer; a self-healing data access layer; a routine data checking and fixing batch.
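As an illustration of the "self-healing data access layer" on slide 19, here is a minimal sketch. The field names (`title`, `tags`) and the legacy comma-separated tag format are assumptions made up for the example, not something the slides describe.

```python
def heal_photo(doc):
    """Return (healed_doc, changed) for a hypothetical photo document."""
    healed = dict(doc)
    # Older documents may lack fields that were added later: supply defaults.
    healed.setdefault("tags", [])
    healed.setdefault("title", "Untitled")
    # Assumed legacy format: tags stored as one comma-separated string.
    if isinstance(healed["tags"], str):
        healed["tags"] = [t.strip() for t in healed["tags"].split(",") if t.strip()]
    # The caller can write the document back only when something changed.
    return healed, healed != doc
```

A data access layer could call such a function on every read and persist the result when `changed` is true, so documents converge to the current shape without a scheduled migration.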
  • 20. Dev’ data migration: INDICES. Index creation leads to a table-wide lock in MySQL, which renders part of the cluster unavailable. MongoDB solution: background index creation, which slows access a tiny bit but does not lock!
  • 21. Dev’ backup/restore: MongoDB's ability to dump a database or collection empowers developers. It is possible to restore part of the production dataset simply on a development box. Back up MongoDB by collection to S3, and recover on a dev’ platform in a matter of minutes.
  • 22. Ops cycle: MongoDB, small is beautiful; Cornerstone: the Replica Set; High availability; Backup and data import/export; Hardware migration.
  • 23. Ops, MongoDB, small and beautiful: Young software, relatively compact (around 150,000 lines of C++ code). Builds out of the box on modern distributions. Distro packages are made by 10gen. Drivers for most popular languages are also provided and maintained by 10gen staff (although quality varies).
  • 24. Ops, Replica Set: A set of machines sharing the same data. Only one Primary, several Secondaries. All writes go to the Primary and are then replicated to the Secondaries. Reads can be routed to the Primary or a Secondary, at the application's choice. Combined with AWS, Replica Sets are very powerful. MongoDB loves the Cloud!
  • 25. Ops, Master/Slave reloaded: Client libraries are replica-set aware: connect to any node(s), and the configuration and current layout are discovered. Database semantics are preserved. Incredibly easy to set up. Priority between nodes can be changed dynamically. It is possible to prevent a node from ever becoming master (e.g. a slow-disk server used as a «hot backup»).
  • 26. Ops, High Availability strengths: A Primary step-down can be triggered, leading to the election of a new Primary. A new Primary is picked when the Primary becomes unreachable. Clients will transparently connect to the new Primary. A MongoDB Arbiter ensures split brain will not happen. The Config Server contains the sharding information: 1 or 3 config servers with an internal failover mechanism.
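Why an arbiter helps against split brain (slide 26) can be sketched with a simplified majority rule. This is a toy model of the voting idea, not MongoDB's actual election protocol: a side of a network partition may elect a Primary only if it can see a strict majority of all voting members.

```python
def can_elect_primary(visible_votes, total_votes):
    """Return True if this side of a partition holds a strict majority."""
    return visible_votes > total_votes // 2


# Two data nodes plus one arbiter: three votes in total.
side_with_arbiter = can_elect_primary(visible_votes=2, total_votes=3)
isolated_node = can_elect_primary(visible_votes=1, total_votes=3)

# Two data nodes and no arbiter: a 1/1 split leaves neither side a majority.
half_of_two = can_elect_primary(visible_votes=1, total_votes=2)
```

With three votes, at most one side of any partition can reach a majority, so two Primaries cannot be elected at the same time. With only two votes, a split leaves no side able to elect, which is why an odd number of voters is the usual setup.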
  • 27. Ops, High Availability compromise: Switch-over will take 20 to 25 seconds. Some queries in the interval may crash. Some writes may reach a split primary.
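One way an application might ride out the 20 to 25 second switch-over from slide 27 is to retry failed operations for about that long. The sketch below is generic Python; `TransientDatabaseError` is a hypothetical stand-in for whatever connection or "not primary" error a real driver raises, and the attempt count and delay are assumptions chosen to cover roughly 25 seconds.

```python
import time


class TransientDatabaseError(Exception):
    """Hypothetical stand-in for a driver's connection/not-primary error."""


def with_failover_retry(operation, attempts=6, delay=5.0, sleep=time.sleep):
    """Run operation, retrying while an election may be in progress.

    With the defaults, up to 5 pauses of 5 seconds separate the 6 attempts.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientDatabaseError:
            if attempt == attempts - 1:
                raise  # The election window has passed: give up.
            sleep(delay)
```

Retrying is safest for reads and idempotent writes; a non-idempotent write that failed mid-flight may already have been applied, so blindly repeating it could duplicate data.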
  • 28. Ops, backup and exports: Stop the secondary and do whatever you need done. Easy to back up a single collection or a whole database. As a matter of fact, we just dumbly «mongodump» every collection of interest separately.
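The per-collection backup described on slide 28 could be scripted along these lines. The sketch only builds the `mongodump` command lines (using its `--host`, `--db`, `--collection` and `--out` options) and does not run them; the host, database and collection names are hypothetical.

```python
def mongodump_commands(host, database, collections, out_dir):
    """Build one mongodump argument list per collection of interest."""
    return [
        ["mongodump", "--host", host, "--db", database,
         "--collection", name, "--out", out_dir]
        for name in collections
    ]


# Hypothetical names, for illustration only.
commands = mongodump_commands(
    host="secondary.example.internal",
    database="exampledb",
    collections=["photos", "users"],
    out_dir="/backups/today",
)
```

Each argument list can be passed to `subprocess.run(cmd, check=True)`; pointing `host` at a secondary keeps the dump load off the Primary.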
  • 29. Ops, hardware migration: Optionally, «preload» the new node with a filesystem or block-level snapshot. Add the brand-new node to the replica set. Wait for synchronization. Change the replica set rules to make your new server Primary. Remove the old hardware.
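The migration steps on slide 29 amount to a series of edits to the replica set configuration document. The sketch below manipulates such a document as a plain Python dict (`_id`, `version`, and `members` with `host` and `priority`); applying each result to a live cluster would go through a reconfiguration command such as `rs.reconfig` in the mongo shell, which is not shown. Host names are hypothetical, and adding the new node with priority 0 until it has synchronized is an assumption of this sketch, not something the slide states.

```python
import copy


def add_member(config, host):
    """Return a new config with host added as a not-yet-electable member."""
    new = copy.deepcopy(config)
    next_id = max(m["_id"] for m in new["members"]) + 1
    new["members"].append({"_id": next_id, "host": host, "priority": 0})
    new["version"] += 1
    return new


def set_priority(config, host, priority):
    """Return a new config with the election priority of host changed."""
    new = copy.deepcopy(config)
    for member in new["members"]:
        if member["host"] == host:
            member["priority"] = priority
    new["version"] += 1
    return new


def remove_member(config, host):
    """Return a new config without host."""
    new = copy.deepcopy(config)
    new["members"] = [m for m in new["members"] if m["host"] != host]
    new["version"] += 1
    return new
```

A migration would chain them: `add_member`, wait for the new node to catch up, `set_priority` to favour the new node, then `remove_member` for the old hardware. Each function returns a fresh document with an incremented `version` and leaves its input untouched.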
  • 30. Fit for the DevOps: In the modern sense of DevOps, MongoDB provides the agility and ease of use required. It provides working tools for developers and is much more comfortable than MySQL in daily use. Truly a DevOps-friendly tool.
  • 31. Scalability: «Hang on to the paintbrush, I'm removing the shell.» (overheard @fotopedia). Cloud limitations; Sharding and Replica Set; Performance: reading, writing, storage; Scalable from the ground up.
  • 32. Cloud limitations: Virtual hardware: neighbors can eat all your I/O, with no precise control over or overview of the situation. The largest VM cannot compare to the largest metal. A hardware issue means zero notice before instance retirement (metal has the same issue, though). Need to be flexible. Scaling out is the way to scale on the Cloud.
  • 33. Sharding: Use a business key to partition your data. Each shard is typically a replica set. Access is provided via the MongoS servers. Configuration is stored and managed in the Config servers.
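Partitioning on a business key, as slide 33 describes, can be pictured as range-based routing: the key space is cut at a few split points and each range is owned by one shard. The sketch below is a toy router in plain Python; the split points, the shard names and the choice of a user name as the key are all hypothetical, and a real cluster keeps this mapping in the Config servers rather than in application code.

```python
import bisect

# Hypothetical split points on a business key (here, a user name).
# Keys below "h" go to the first shard, "h" up to (not including) "p" to the
# second, and everything from "p" onwards to the third.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["rs-shard-0", "rs-shard-1", "rs-shard-2"]


def shard_for(key):
    """Return the name of the shard owning the range that contains key."""
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key)]
```

A query that includes the key can be sent to the single shard returned by `shard_for`; a query without it has to be sent to every shard.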
  • 34. Reading. Without sharding: reading is performed on the master by default to preserve read-your-own-writes; it can be programmatically allowed on a slave. To scale up reads, add Replica Set nodes. With sharding: reading is performed in parallel across data nodes. To scale up reads, ensure most queries reach only a single shard.
  • 35. Writing: Writes are always performed on the Primary node, so a replica set does not help. Sharding distributes the writes across the cluster.
  • 36. Storage: Replica Sets and shards can be moved onto as many servers as needed. To get more space: scale up by migrating your Replica Set to bigger hardware; scale out by sharding the existing collection.
  • 37. Scalable from the ground up: MongoDB is scalable as soon as you need it to be. No complex configuration for replication. Beautiful ability to handle replica sets and shards out of the box. MongoS / Config Server / Shards allow more complex setups. Cloud friendly.
  • 38. Questions? Oh, and by the way, we are hiring in Paris!
