NoSQL Now 2011
Why Wordnik went Non-Relational
Tony Tam, @fehguy
What this Talk is About
- 5 key reasons why Wordnik migrated to a non-relational database
- Process for selection and migration
- Optimizations and tips from living survivors of the battlefield
Why Should You Care?
- MongoDB user for almost 2 years
- Lessons learned, analysis, benefits from process
- We migrated from MySQL to MongoDB with no downtime
- We have interesting/challenging data needs, likely relevant to you
More on Wordnik
- World’s fastest updating English dictionary
- Based on input of text up to 8k words/second
- Word Graph as basis to our analysis
- Synchronous & asynchronous processing
- Tens of billions of documents in NR storage
- 20M daily REST API calls, billions served
- Powered by the Swagger OSS API framework (swagger.wordnik.com)
Architectural History
- 2008: Wordnik was born as a LAMP stack on AWS EC2
- 2009: Introduced public REST API, powered wordnik.com, partner APIs
- 2009: Drank the NoSQL Kool-Aid
- 2010: Scala
- 2011: Micro SOA
Non-relational by Necessity
- Moved to NR because of the “4S”
  - Speed
  - Stability
  - Scaling
  - Simplicity
- But… MySQL can go a LONG way
  - Takes the right team, right reasons (+ patience)
  - NR offerings simply too compelling to focus on scaling MySQL
Wordnik’s 5 Whys for NoSQL
Why #1: Speed bumps with MySQL
- Inserting data fast (50k recs/second) caused MySQL mayhem
  - Maintaining indexes largely to blame
  - Operations for consistency unnecessary but “cannot be turned off”
- Devised twisted schemes to avoid client blocking
  - Aka the “master/slave tango”
Why #2: Retrieval Complexity
- Objects typically mapped to tables
- Object hierarchy always => inner + outer joins
- Lots of static data, so why join?
  - “Noun” is not getting renamed in my code’s lifetime!
  - Logic like this is probably in application logic
- Since storage is cheap, I’ll choose speed
Why #2: Retrieval Complexity
- One definition = 10+ joins
- 50 requests per second!
Why #2: Retrieval Complexity
- Embed objects in rows “sort of works”
  - Filtering gets really nasty
  - Native XML in MySQL? If a full table-scan is OK…
- OK, then cache it!
  - Layers of caching introduced layers of complexity
  - Stale data/corruption
  - Object versionitis
  - Cache stampedes
Why #3: Object Modeling
- Object models being compromised for sake of persistence
  - This is backwards!
  - Extra abstraction for the wrong reason
- OK, then performance suffers
  - In-application joins across objects
  - “Who ran the fetch all query against production?!” (any sysadmin)
  - “My zillionth ORM layer that only I understand” (and can maintain)
Why #4: Scaling
- Needed “cloud friendly storage”
- Easy up, easy down!
  - Startup: Sync your data, and announce to clients when ready for business
  - Shutdown: Announce your departure and leave
- Adding MySQL instances was a dance
  - Snapshot + bin files
    mysql> change master to MASTER_HOST='db1', MASTER_USER='xxx', MASTER_PASSWORD='xxx', MASTER_LOG_FILE='master-relay.000431', MASTER_LOG_POS=1035435402;
Why #4: Scaling
- What about those VMs?
  - So convenient! But… they kind of suck
  - Can the database succeed on a VM?
- VM performance:
  - Memory, CPU or I/O: pick only one
  - Can your database really reduce CPU or disk I/O with lots of RAM?
Why #5: Big Picture
- BI tools use relational constraints for discovery
  - Is this the right reason for them?
  - Can we work around this?
  - Let’s have a BI tool revolution, too!
- True service architecture makes relational constraints impractical/impossible
- Distributed sharding makes relational constraints impractical/impossible
Why #5: Big Picture
- Is your app smarter than your database?
  - The logic line is probably blurry!
- What does count(*) really mean when you add 5k records/sec?
  - Maybe eventual consistency is not so bad…
- 2PC? Do some reading and decide!
  - http://eaipatterns.com/docs/IEEE_Software_Design_2PC.pdf
Ok, I’m in!
- I thought deciding was easy!?
  - Many quickly maturing products
  - Divergent features tackle different needs
- Wordnik spent 8 weeks researching and testing NoSQL solutions
  - This is a long time! (for a startup)
  - Wrote ODM classes and migrated our data
- Surprise! There were surprises
  - Be prepared to compromise
Choice Made, Now What?
- We went with MongoDB ***
  - Fastest to implement
  - Most reliable
  - Best community
- Why?
  - Why #1: Fast loading/retrieval
  - Why #2: Fast ODM (50 tps => 1000 tps!)
  - Why #3: Document models === object models
  - Why #4: MMF (memory-mapped files) => kernel-managed memory + RS (replica sets)
  - Why #5: It’s 2011, is there no progress?
More on Why MongoDB
- Testing, testing, testing
  - Used our migration tools to load test
  - Read from MySQL, write to MongoDB
  - We loaded 5+ billion documents, many times over
- In the end, one server could…
  - Insert 100k records/sec sustained
  - Read 250k records/sec sustained
  - Support concurrent loading/reading
Migration & Testing
- Iterated ODM mapping multiple times
- Some issues
  - Type safety
    cur.next.get("iWasAnIntOnce").asInstanceOf[Long]
  - Dates as Strings
    obj.put("a_date", "2011-12-31") != obj.put("a_date", new Date("2011-12-31"))
  - Storage size
    obj.put("very_long_field_name", true) >> obj.put("vsfn", true)
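The storage-size point can be made concrete with some rough arithmetic. BSON stores every field name as a NUL-terminated string inside each document, so a long key is paid for once per document. The sketch below is only an estimate: the document count is a hypothetical figure borrowed from the "5+ billion documents" load test, and padding, indexes, and any compression are ignored.

```javascript
// Rough estimate of how much space a field name alone consumes.
// BSON stores each key as a NUL-terminated UTF-8 string in every document.
function keyBytes(name) {
  return Buffer.byteLength(name, "utf8") + 1; // +1 for the terminating NUL
}

// Hypothetical collection size (assumption, taken from the load-test figure).
const documentCount = 5e9;

const longKey = keyBytes("very_long_field_name"); // 21 bytes per document
const shortKey = keyBytes("vsfn");                // 5 bytes per document
const savedBytes = (longKey - shortKey) * documentCount;

console.log(`saved per document: ${longKey - shortKey} bytes`);
console.log(`saved overall: ${savedBytes / 1e9} GB`);
```

Under these assumptions a single renamed boolean field saves 16 bytes per document, or about 80 GB across the collection, which is why short field names were worth the loss of readability at this scale.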
Migration & Testing
- Expect data model iterations
- Wordnik migrated table to Mongo collection “as-is”
  - Easier to migrate, test
  - _id field used same MySQL PK
- Auto increment?
  - Used MySQL to “check-out” sequences
  - One row per mongo collection
  - Run out of sequences => get more
  - Need exclusive locks here!
Migration & Testing
- Sequence generator in-process
  SequenceGenerator.checkout("doc_metadata,100")
- Sequence generator as web service
  - Centralized UID management
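A minimal sketch of the check-out idea, assuming a block-allocation scheme: each process reserves a range of ids from a central counter and hands them out locally until the range is used up. This is not Wordnik's implementation. The `CounterStore` class is a hypothetical in-memory stand-in for the MySQL row per collection, and the `reserve` and `nextId` names are invented for illustration.

```javascript
// In-memory stand-in for the central table holding one counter per collection.
// In the setup described in the slides, this step would need an exclusive lock.
class CounterStore {
  constructor() {
    this.next = new Map(); // collection name -> next unallocated id
  }
  // Reserve `count` consecutive ids and return the first one.
  reserve(name, count) {
    const first = this.next.get(name) ?? 1;
    this.next.set(name, first + count);
    return first;
  }
}

// Hands out ids locally and only returns to the store when a block runs out.
class SequenceGenerator {
  constructor(store, blockSize) {
    this.store = store;
    this.blockSize = blockSize;
    this.blocks = new Map(); // collection name -> { next, end }
  }
  nextId(name) {
    let block = this.blocks.get(name);
    if (!block || block.next >= block.end) {
      const first = this.store.reserve(name, this.blockSize);
      block = { next: first, end: first + this.blockSize };
      this.blocks.set(name, block);
    }
    return block.next++;
  }
}

const store = new CounterStore();
const genA = new SequenceGenerator(store, 100);
const genB = new SequenceGenerator(store, 100);

console.log(genA.nextId("doc_metadata")); // 1
console.log(genB.nextId("doc_metadata")); // 101
console.log(genA.nextId("doc_metadata")); // 2
```

Two generators never overlap because each owns its reserved range. The trade-off is that ids are unique but not strictly ordered across processes, and unused ids in a block are lost on restart.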
Migration & Testing
- Expect data access pattern iterations
  - So much more flexibility!
- Reach into objects
  > db.dictionary_entry.find({"hdr.sr":"cmu"})
- Access to a whole object tree at query time
- Overwrite a whole object at once… when desired
  - Not always! This clobbers the whole record
    > db.foo.save({_id:18727353,foo:"bar"})
  - Update a single field:
    > db.foo.update({_id:18727353},{$set:{foo:"bar"}})
Flip the Switch
- Migrate production with zero downtime
  - We temporarily halted loading data
  - Added a switch to flip between MySQL/MongoDB
  - Instrument, monitor, flip it, analyze, flip back
- Profiling your code is key
  - What is slow?
  - Build this in your app from day 1
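One way to picture "instrument, flip it, analyze, flip back" is a timing wrapper around each storage call plus a runtime flag that picks the backend. The sketch below is illustrative only: `findInMySql`, `findInMongo`, and the `config.useMongoDb` flag are hypothetical stand-ins, and the wrapper only handles synchronous functions.

```javascript
// Wrap a function so that each call records its wall-clock duration.
function instrument(label, fn, stats) {
  return (...args) => {
    const start = process.hrtime.bigint();
    try {
      return fn(...args);
    } finally {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      const entry = stats.get(label) ?? { calls: 0, totalMs: 0 };
      entry.calls += 1;
      entry.totalMs += elapsedMs;
      stats.set(label, entry);
    }
  };
}

// Hypothetical stand-ins for the two storage backends.
const findInMySql = (id) => ({ id, source: "mysql" });
const findInMongo = (id) => ({ id, source: "mongodb" });

const stats = new Map();
const config = { useMongoDb: false }; // the "switch", read on every call

const backends = {
  mysql: instrument("mysql.find", findInMySql, stats),
  mongodb: instrument("mongodb.find", findInMongo, stats),
};
const find = (id) => (config.useMongoDb ? backends.mongodb : backends.mysql)(id);

find(1);                   // served by the MySQL stand-in
config.useMongoDb = true;  // flip it
find(1);                   // served by the MongoDB stand-in
config.useMongoDb = false; // flip back

for (const [label, { calls, totalMs }] of stats) {
  console.log(`${label}: ${calls} call(s), ${totalMs.toFixed(3)} ms total`);
}
```

Because the flag is checked per call, flipping it needs no restart, and the per-label totals give a like-for-like comparison of the two backends under the same traffic.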
Flip the Switch
Flip the Switch
- Storage selected at runtime
  val h = shouldUseMongoDb match {
    case true => new MongoDbSentenceDAO
    case _ => new MySQLDbSentenceDAO
  }
  h.find(...)
- Hot-swappable storage via configuration
- It worked!
Then What?
- Watch our deployment, many iterations to mapping layer
  - Settled on in-house, type-safe mapper
  - https://github.com/fehguy/mongodb-benchmark-tools
- Some gotchas (of course)
  - Locking issues on long-running updates (more in a minute)
- We want more of this!
  - Migrated shared files to Mongo GridFS
  - Easy-IT
Performance + Optimization
- Loading data is fast!
  - Fixed collection padding, similarly-sized records
  - Tail of collection is always in memory
  - Append faster than MySQL in every case tested
- But… random access started getting slow
  - Indexes in RAM? Yes
  - Data in RAM? No, > 2TB per server
  - Limited by disk I/O / seek performance
  - EC2 + EBS for storage?
Performance + Optimization
- Moved to physical data center
  - DAS & 72GB RAM => great uncached performance
- Good move? Depends on use case
  - If “access anything anytime”, not many options
  - You want to support this?
Performance + Optimization
- Inserts are fast, how about updates?
  - Well… update => find object, update it, save
  - Lock acquired at “find”, released after “save”
  - If hitting disk, lock time could be large
- Easy answer, pre-fetch on update
- Oh, and NEVER do “update all records” against a large collection
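The pre-fetch idea can be sketched as "read first, then write": a plain read pages the document into memory, so the update that follows holds its lock only briefly. The sketch assumes a collection object exposing `findOne` and `updateOne`, as in the MongoDB Node.js driver. `FakeCollection` is a hypothetical in-memory stand-in so the example runs without a database, and `prefetchThenUpdate` is an invented helper name.

```javascript
// Pre-fetch on update: read the document first so it is paged into memory,
// then issue the update, so the write does not wait on disk while locked.
async function prefetchThenUpdate(collection, id, fields) {
  await collection.findOne({ _id: id });                      // warm-up read, result unused
  return collection.updateOne({ _id: id }, { $set: fields }); // short, targeted write
}

// Minimal in-memory stand-in (assumption, not a real driver).
// It only understands lookups by _id and $set updates.
class FakeCollection {
  constructor(docs) {
    this.docs = new Map(docs.map((d) => [d._id, d]));
    this.calls = []; // records the order of operations
  }
  async findOne(filter) {
    this.calls.push("findOne");
    return this.docs.get(filter._id) ?? null;
  }
  async updateOne(filter, update) {
    this.calls.push("updateOne");
    const doc = this.docs.get(filter._id);
    if (!doc) return { matchedCount: 0 };
    Object.assign(doc, update.$set);
    return { matchedCount: 1 };
  }
}

const demo = new FakeCollection([{ _id: 18727353, foo: "old", other: 1 }]);
prefetchThenUpdate(demo, 18727353, { foo: "bar" }).then((result) => {
  console.log(result.matchedCount, demo.calls, demo.docs.get(18727353));
});
```

The extra read costs one round trip, so the pattern only pays off when the target document is likely to be on disk rather than already in memory.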
Performance + Optimization
- Indexes
  - Can't always keep index in RAM. MMF "does its thing"
  - Right-balanced b-tree keeps necessary index hot
  - Indexes hit disk => mute your pager
More Mongo, Please!
- We modeled our word graph in mongo
  - 80M Edges
  - 80mS edge fetch
What’s next
- Liberate our models
  - Stop worrying about how to store them (for the most part)
- New features almost always NR
- Some MySQL left
  - Less on each release

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Dernier (20)

DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

Why Wordnik went non-relational

  • 1. NoSQL Now 2011: Why Wordnik went Non-Relational. Tony Tam, @fehguy
  • 2. What this Talk is About 5 Key reasons why Wordnik migrated into a Non-Relational database Process for selection, migration Optimizations and tips from living survivors of the battlefield
  • 3. Why Should You Care? MongoDB user for almost 2 years Lessons learned, analysis, benefits from process We migrated from MySQL to MongoDB with no downtime We have interesting/challenging data needs, likely relevant to you
  • 4. More on Wordnik World’s fastest updating English dictionary Based on input of text up to 8k words/second Word Graph as basis to our analysis Synchronous & asynchronous processing 10’s of Billions of documents in NR storage 20M daily REST API calls, billions served Powered by Swagger OSS API framework swagger.wordnik.com Powered API
  • 5. Architectural History. 2008: Wordnik was born as a LAMP AWS EC2 stack; 2009: Introduced public REST API, powered wordnik.com, partner APIs; 2009: drank NoSQL Kool-Aid; 2010: Scala; 2011: Micro SOA
  • 6. Non-relational by Necessity Moved to NR because of “4S” Speed Stability Scaling Simplicity But… MySQL can go a LONG way Takes right team, right reasons (+ patience) NR offerings simply too compelling to focus on scaling MySQL
  • 7. Wordnik’s 5 Whys for NoSQL
  • 8. Why #1: Speed bumps with MySQL Inserting data fast (50k recs/second) caused MySQL mayhem Maintaining indexes largely to blame Operations for consistency unnecessary but "cannot be turned off” Devised twisted schemes to avoid client blocking Aka the “master/slave tango”
  • 9. Why #2: Retrieval Complexity Objects typically mapped to tables Object Hierarchy always => inner + outer joins Lots of static data, so why join? “Noun” is not getting renamed in my code’s lifetime! Logic like this is probably in application logic Since storage is cheap I’ll choose speed
  • 10. Why #2: Retrieval Complexity One definition = 10+ joins 50 requests per second!
  • 11. Why #2: Retrieval Complexity Embed objects in rows “sort of works” Filtering gets really nasty Native XML in MySQL? If a full table-scan is OK… OK, then cache it! Layers of caching introduced layers of complexity Stale data/corruption Object versionitis Cache stampedes
  • 12. Why #3: Object Modeling Object models being compromised for sake of persistence This is backwards! Extra abstraction for the wrong reason OK, then performance suffers In-application joins across objects “Who ran the fetch all query against production?!” –any sysadmin “My zillionth ORM layer that only I understand” (and can maintain)
  • 13. Why #4: Scaling Needed "cloud friendly storage" Easy up, easy down! Startup: Sync your data, and announce to clients when ready for business Shutdown: Announce your departure and leave Adding MySQL instances was a dance: snapshot + bin files, then: mysql> change master to MASTER_HOST='db1', MASTER_USER='xxx', MASTER_PASSWORD='xxx', MASTER_LOG_FILE='master-relay.000431', MASTER_LOG_POS=1035435402;
  • 14. Why #4: Scaling What about those VMs? So convenient! But… they kind of suck Can the database succeed on a VM? VM Performance: Memory, CPU or I/O—Pick only one Can your database really reduce CPU or disk I/O with lots of RAM?
  • 15. Why #5: Big Picture BI tools use relational constraints for discovery Is this the right reason for them? Can we work around this? Let’s have a BI tool revolution, too! True service architecture makes relational constraints impractical/impossible Distributed sharding makes relational constraints impractical/impossible
  • 16. Why #5: Big Picture Is your app smarter than your database? The logic line is probably blurry! What does count(*) really mean when you add 5k records/sec? Maybe eventual consistency is not so bad… 2PC? Do some reading and decide! http://eaipatterns.com/docs/IEEE_Software_Design_2PC.pdf
  • 17. Ok, I’m in! I thought deciding was easy!? Many quickly maturing products Divergent features tackle different needs Wordnik spent 8 weeks researching and testing NoSQL solutions This is a long time! (for a startup) Wrote ODM classes and migrated our data Surprise! There were surprises Be prepared to compromise
  • 18. Choice Made, Now What? We went with MongoDB *** Fastest to implement Most reliable Best community Why? Why #1: Fast loading/retrieval Why #2: Fast ODM (50 tps => 1000 tps!) Why #3: Document Models === Object models Why #4: MMF => Kernel-managed memory + RS Why #5: It’s 2011, is there no progress?
  • 19. More on Why MongoDB Testing, testing, testing Used our migration tools to load test Read from MySQL, write to MongoDB We loaded 5+ billion documents, many times over In the end, one server could… Insert 100k records/sec sustained Read 250k records/sec sustained Support concurrent loading/reading
  • 20. Migration & Testing Iterated ODM mapping multiple times. Some issues. Type safety: cur.next.get("iWasAnIntOnce").asInstanceOf[Long]. Dates as Strings: obj.put("a_date", "2011-12-31") != obj.put("a_date", new Date("2011-12-31")). Storage size: obj.put("very_long_field_name", true) >> obj.put("vsfn", true)
  • 21. Migration & Testing Expect data model iterations Wordnik migrated table to Mongo collection "as-is” Easier to migrate, test _id field used same MySQL PK Auto Increment? Used MySQL to “check-out” sequences One row per mongo collection Run out of sequences => get more Need exclusive locks here!
  • 22. Migration & Testing Sequence generator in-process SequenceGenerator.checkout("doc_metadata,100") Sequence generator as web service Centralized UID management
  • 23. Migration & Testing Expect data access pattern iterations. So much more flexibility! Reach into objects: > db.dictionary_entry.find({"hdr.sr":"cmu"}) Access to a whole object tree at query time. Overwrite a whole object at once… when desired. Not always! This clobbers the whole record: > db.foo.save({_id:18727353,foo:"bar"}) Update a single field: > db.foo.update({_id:18727353},{$set:{foo:"bar"}})
  • 24. Flip the Switch Migrate production with zero downtime We temporarily halted loading data Added a switch to flip between MySQL/MongoDB Instrument, monitor, flip it, analyze, flip back Profiling your code is key What is slow? Build this in your app from day 1
  • 26. Flip the Switch Storage selected at runtime: val h = shouldUseMongoDb match { case true => new MongoDbSentenceDAO; case _ => new MySQLDbSentenceDAO }; h.find(...) Hot-swappable storage via configuration It worked!
  • 27. Then What? Watch our deployment, many iterations to mapping layer Settled on in-house, type-safe mapper https://github.com/fehguy/mongodb-benchmark-tools Some gotchas (of course) Locking issues on long-running updates (more in a minute) We want more of this! Migrated shared files to Mongo GridFS Easy-IT
  • 28. Performance + Optimization Loading data is fast! Fixed collection padding, similarly-sized records Tail of collection is always in memory Append faster than MySQL in every case tested But... random access started getting slow Indexes in RAM? Yes Data in RAM? No, > 2TB per server Limited by disk I/O /seek performance EC2 + EBS for storage?
  • 29. Performance + Optimization Moved to physical data center DAS & 72GB RAM => great uncached performance Good move? Depends on use case If “access anything anytime”, not many options You want to support this?
  • 30. Performance + Optimization Inserts are fast, how about updates? Well… update => find object, update it, save Lock acquired at “find”, released after “save” If hitting disk, lock time could be large Easy answer, pre-fetch on update Oh, and NEVER do “update all records” against a large collection
  • 31. Performance + Optimization Indexes Can't always keep index in RAM. MMF "does its thing" Right-balanced b-tree keeps necessary index hot Indexes hit disk => mute your pager
  • 35. What’s next Liberate our models: stop worrying about how to store them (for the most part) New features almost always NR Some MySQL left Less on each release
  • 36. Questions? See more about Wordnik APIs http://developer.wordnik.com Migrating from MySQL to MongoDB http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik Maintaining your MongoDB Installation http://www.slideshare.net/fehguy/mongo-sv-tony-tam Swagger API Framework http://swagger.wordnik.com Mapping Benchmark https://github.com/fehguy/mongodb-benchmark-tools Wordnik OSS Tools https://github.com/wordnik/wordnik-oss
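
Slides 21 and 22 describe checking out blocks of ids from a central sequence source so that each process can assign `_id` values locally. The following is a minimal sketch of that idea, not Wordnik's implementation: `SequenceStore`, `InMemorySequenceStore`, and the constructor parameters are hypothetical names, and the in-memory lock only stands in for the exclusive lock that the slide says the MySQL row needed.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the central sequence table (one counter per collection).
trait SequenceStore {
  // Atomically reserves `count` ids and returns the first id of the block.
  def reserve(name: String, count: Int): Long
}

// In-memory store for illustration; `synchronized` plays the role of the exclusive lock.
final class InMemorySequenceStore extends SequenceStore {
  private val counters = mutable.Map.empty[String, Long]

  def reserve(name: String, count: Int): Long = synchronized {
    val first = counters.getOrElse(name, 1L)
    counters(name) = first + count
    first
  }
}

// Hands out ids from a locally held block and checks out a new block when it runs dry.
final class SequenceGenerator(store: SequenceStore, name: String, blockSize: Int) {
  private var next = 0L
  private var end = 0L // exclusive upper bound of the current block

  def nextId(): Long = synchronized {
    if (next >= end) {
      next = store.reserve(name, blockSize)
      end = next + blockSize
    }
    val id = next
    next += 1
    id
  }
}
```

Two generators that share one store never hand out the same id, at the cost of gaps when a process exits with part of its block unused.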

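The runtime switch on slide 26 can be fleshed out roughly as follows. This is a sketch under assumptions: the slide shows only the two DAO class names and a `find(...)` call, so the `SentenceDAO` trait, its `backend` field, the stubbed `find` signature, and the `SentenceStorage` holder are all hypothetical.

```scala
// Hypothetical DAO interface; the slide only shows the class names and find(...).
trait SentenceDAO {
  def backend: String
  def find(id: Long): Option[String]
}

// Stub implementations: real ones would query MongoDB and MySQL respectively.
class MongoDbSentenceDAO extends SentenceDAO {
  val backend = "mongodb"
  def find(id: Long): Option[String] = None
}

class MySQLDbSentenceDAO extends SentenceDAO {
  val backend = "mysql"
  def find(id: Long): Option[String] = None
}

object SentenceStorage {
  // Assumed to be set from configuration; @volatile so a flip is visible to all threads.
  @volatile var shouldUseMongoDb: Boolean = false

  // Resolved on every call, so flipping the flag takes effect without a restart.
  def dao: SentenceDAO =
    if (shouldUseMongoDb) new MongoDbSentenceDAO else new MySQLDbSentenceDAO
}
```

Resolving the DAO per call is what makes "flip it, analyze, flip back" cheap: callers only ever see the `SentenceDAO` interface.
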
Editor's notes

  1. Moving to a json-based mapper, 10k/second. Moving to direct mapping, 35k/second
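
The type-safety issue from slide 20 is one thing a direct mapper has to handle: a field written as an `Int` comes back boxed as `java.lang.Integer`, and `asInstanceOf[Long]` on it throws a `ClassCastException`. A minimal sketch of a tolerant read follows; `FieldReader` and `asLong` are hypothetical names, not part of the mapper linked in the slides.

```scala
object FieldReader {
  // Reads a numeric field as Long regardless of which boxed numeric type was stored.
  // Fractional values are truncated by longValue; non-numeric values and null yield None.
  def asLong(value: Any): Option[Long] = value match {
    case n: java.lang.Number => Some(n.longValue)
    case _                   => None
  }
}
```

Returning `Option[Long]` pushes the "was this ever a number?" decision to the caller instead of failing at read time.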