Migrating from
     PostgreSQL to MySQL
      ... without downtime

Matthew Graham
Percona Live NYC 2011
Global marketplace for
         buying and selling handmade goods

      Total                               April 2011
•   8 million members              •   $38.7 million goods sold

•   800k active shops              •   2 million items sold

•   1 billion page views / month   •   1.8 million new items listed

                                   •   390k new members joined
Global marketplace for
         buying and selling handmade goods

      Total                               April 2011
•   8 million members              •   $38.7 million = $900 / min

•   800k active shops              •   2 million items sold

•   1 billion page views / month   •   1.8 million new items listed

                                   •   390k new members joined
Downtime = -$
[Chart: Gross Merchandise Sales by year, 2006 to 2010; y-axis from $0m to $320m]
Reasons To Migrate

  • Horizontal Scaling
  • Reduce Types of Databases
  • Licensing Costs
  • Functional Partitioning
  • Schema Refactor
Deciding to Switch
   expected quality
 - transition
 -----------------
 = net quality

 net quality > actual quality
Source   Migration   Target
Why?
Foundational Principles

 • Migrate One Table at a Time
 • Progressive Ramp Up
 • Data Duplication During Transition
5 Steps Per Table
  1. Create Target Tables
  2. Tee Writes
  3. Backfill
  4. Read from Target Tables
  5. Wrap Up
Create the Target
Merge into the Target
Separate the Target
Source Only


Source                   Target




           Application
            Writes
Teed Writes


Source                   Target




           Application
            Writes
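A minimal sketch of teed writes, in Python rather than the deck's own language. DictStore and save_user are hypothetical names, the in-memory stores stand in for the source and target databases, and the choice to log a failed target write instead of failing the request is an assumption, not something the slides state.

```python
class DictStore:
    """Hypothetical in-memory stand-in for one database table keyed by id."""

    def __init__(self):
        self.rows = {}

    def save(self, row):
        self.rows[row["id"]] = dict(row)


def save_user(row, source, target, tee_enabled=True):
    """Write to the source first, then mirror the same write to the target."""
    source.save(row)
    if tee_enabled:
        try:
            target.save(row)
        except Exception as exc:
            # Assumption: the source is still the system of record, so a failed
            # target write is logged and left for backfill to repair.
            print(f"tee write failed for id={row['id']}: {exc}")


source, target = DictStore(), DictStore()
save_user({"id": 1, "name": "Adam"}, source, target)
print(source.rows == target.rows)
```

Keeping the tee behind a flag means it can be turned off without a code change if the target misbehaves.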
Generating IDs
        Source                   Target
ID         Name           ID       Name
1           Adam           1        Adam

2            Bill          2         Bill

3          Charlie         3        David

4           David          4       Charlie

     Ticket Server: http://bit.ly/dbtickets
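The table above shows what can go wrong when each database assigns its own IDs: Charlie and David end up swapped. A hedged sketch of the alternative, where one ID is issued centrally and reused for both writes. TicketServer here is a hypothetical in-process counter, not the service described at the linked page, and the two dicts stand in for the source and target tables.

```python
import itertools
import threading


class TicketServer:
    """Hypothetical stand-in for a central service that hands out unique IDs."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:  # one caller at a time, so no ID is issued twice
            return next(self._counter)


tickets = TicketServer()
source_rows, target_rows = {}, {}


def create_user(name):
    """Fetch one ID, then use that same ID for both the source and target rows."""
    user_id = tickets.next_id()
    source_rows[user_id] = name
    target_rows[user_id] = name
    return user_id


for name in ["Adam", "Bill", "Charlie", "David"]:
    create_user(name)
print(source_rows == target_rows)
```

Because the ID is chosen before either insert, the order in which the two databases apply the writes no longer affects which row gets which ID.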
Backfill
Application Code Backfill
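// Copy one user from the source (old) tables into the target (new) tables.
function backfill_user(user_id) {
    old_row = find_old_user(user_id);
    new_row = create_new_user();
    // Reuse the source ID so both databases agree on the primary key.
    new_row.id = old_row.id;
    ...
    store_new_user(new_row);
}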
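A runnable sketch of the same idea in Python, with a small driver that walks an ID range. The column names (firstname, lastname, full_name) and the backfill_range helper are hypothetical, chosen only to show a transform step; dicts stand in for the old and new tables.

```python
def backfill_user(user_id, source_users, target_users):
    """Copy one user into the target, keeping the source ID. True if copied."""
    old_row = source_users.get(user_id)
    if old_row is None:
        return False  # nothing to copy: the ID is a gap or a deleted user
    new_row = {
        "id": old_row["id"],
        # Hypothetical schema refactor: two name columns become one.
        "full_name": old_row["firstname"] + " " + old_row["lastname"],
    }
    target_users[new_row["id"]] = new_row  # overwriting keeps reruns harmless
    return True


def backfill_range(first_id, last_id, source_users, target_users):
    """Backfill every ID in [first_id, last_id]; return how many were copied."""
    return sum(
        backfill_user(user_id, source_users, target_users)
        for user_id in range(first_id, last_id + 1)
    )


source_users = {
    1: {"id": 1, "firstname": "Alan", "lastname": "Alda"},
    3: {"id": 3, "firstname": "Charlie", "lastname": "Chaplin"},
}
target_users = {}
copied = backfill_range(1, 3, source_users, target_users)
print(copied, sorted(target_users))
```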
ETL Backfill

Source                   Target


                            Load
   Extract


             Transform
Don’t Overload
 • Script bulk inserts in batches of 100-2000 rows
INSERT INTO account (id, firstname, lastname)
VALUES
(1, 'Alan', 'Alda'),
(2, 'Barry', 'Bonds'),
(3, 'Charlie', 'Chaplin');
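One way the batching could be scripted, sketched in Python. An in-memory SQLite database stands in for the MySQL target so the example runs anywhere; the helper names, the batch size of 100, and the commit-per-batch policy are assumptions.

```python
import sqlite3


def batches(rows, size):
    """Yield consecutive slices of rows, each at most `size` long."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def bulk_insert(conn, rows, batch_size=100):
    """Run one multi-row INSERT per batch; return the number of statements."""
    statements = 0
    for batch in batches(rows, batch_size):
        placeholders = ", ".join(["(?, ?, ?)"] * len(batch))
        params = [value for row in batch for value in row]
        conn.execute(
            "INSERT INTO account (id, firstname, lastname) VALUES " + placeholders,
            params,
        )
        conn.commit()  # commit per batch so each transaction stays small
        statements += 1
    return statements


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)
rows = [(i, f"first{i}", f"last{i}") for i in range(1, 251)]
statements = bulk_insert(conn, rows, batch_size=100)
total = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(statements, total)
```

The batch size is the knob to tune: larger batches mean fewer round trips, smaller ones mean shorter locks and less load per statement on the target.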
Backfill Speed
Application Code      ETL
•   Easier to Write   •   Faster Run Time
Backfill Extract
Application Code             ETL
•   Easier to Write          •   Faster Run Time

•   Less Likely to Get Out   •   Needs an Extract Source
    Of Sync
Backfill Reruns
Application Code              ETL
•   Easier to Write           •   Faster Run Time

•   Less Likely to Get Out    •   Needs an Extract Source
    Of Sync

•   Handles Duplicates from   •   REPLACE and INSERT
    Multiple Executions           ON DUPLICATE KEY
                                  UPDATE
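A small sketch of why REPLACE makes an ETL load safe to rerun. SQLite is used as a stand-in because it also accepts REPLACE INTO; on MySQL the other option the slide names is INSERT ... ON DUPLICATE KEY UPDATE. The load helper and the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)


def load(conn, rows):
    """Load rows so that an existing primary key is overwritten, not duplicated."""
    conn.executemany(
        "REPLACE INTO account (id, firstname, lastname) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


load(conn, [(1, "Alan", "Alda"), (2, "Barry", "Bonds")])
# Rerun with one changed row: no duplicate-key error, and the newer value wins.
load(conn, [(1, "Alan", "Alda"), (2, "Barry", "Bond")])

count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
lastname = conn.execute("SELECT lastname FROM account WHERE id = 2").fetchone()[0]
print(count, lastname)
```

A plain INSERT would fail on the second load with a primary-key violation, which is the case the slide's "Handles Duplicates" row is about.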
Teed Writes then Backfill
           or
Backfill then Teed Writes
Verification by Diff

$ diff_user 111222333
target user row is missing

$ diff_user 123456789
- source.user.address_state = 'CA'
+ target.user.address_state = 'NY'
Verification by Diff
COMPARING 200 ROWS From: 111222197

User Id: 111222333
target user row is missing

User Id: 111222345
- source.user.address_state = 'CA'
+ target.user.address_state = 'NY'

SUMMARY: total rows with errors: 2/200
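A hedged sketch of how a diff tool producing output like the above might work. This is not the actual diff_user script; diff_row and diff_many are hypothetical names, dicts stand in for the two databases, and the report format only approximates the slides.

```python
def diff_row(table, row_id, source_rows, target_rows):
    """Return report lines for one row; an empty list means the rows match."""
    source_row = source_rows.get(row_id)
    target_row = target_rows.get(row_id)
    if source_row is None:
        return [f"source {table} row is missing"]
    if target_row is None:
        return [f"target {table} row is missing"]
    lines = []
    for column in sorted(source_row.keys() | target_row.keys()):
        source_value = source_row.get(column)
        target_value = target_row.get(column)
        if source_value != target_value:
            lines.append(f"- source.{table}.{column} = {source_value!r}")
            lines.append(f"+ target.{table}.{column} = {target_value!r}")
    return lines


def diff_many(table, ids, source_rows, target_rows):
    """Diff a batch of IDs and finish with a summary line."""
    report, errors = [], 0
    for row_id in ids:
        lines = diff_row(table, row_id, source_rows, target_rows)
        if lines:
            errors += 1
            report.append(f"{table} id: {row_id}")
            report.extend(lines)
    report.append(f"SUMMARY: total rows with errors: {errors}/{len(ids)}")
    return report


source = {1: {"address_state": "CA"}, 2: {"address_state": "CA"},
          3: {"address_state": "TX"}}
target = {2: {"address_state": "NY"}, 3: {"address_state": "TX"}}
report = diff_many("user", [1, 2, 3], source, target)
print("\n".join(report))
```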
Read from Target
Progressive Ramp Up

          100%      0%
Source                       Target



         Application Reads
Progressive Ramp Up

           99%      1%
Source                       Target



         Application Reads
Progressive Ramp Up

           95%      5%
Source                       Target



         Application Reads
Progressive Ramp Up

           75%     25%
Source                       Target



         Application Reads
Progressive Ramp Up

            0%    100%
Source                       Target



         Application Reads
Ramp Up Example
# global configuration
use_new_tables:
    employees: true
    percent: 1

# application code
if (enabled('use_new_tables')) {
    $result = read_from_target();
} else {
    $result = read_from_source();
}
Enabled or Disabled?

• Check cookies if user already assigned
• $enabled = configured threshold > random %
• Store $enabled in a cookie for future requests
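The three bullets above can be sketched as one function. This is an illustration in Python, not the deck's actual implementation: is_enabled, the "ramp_" cookie name, and the dict used as a cookie jar are assumptions, and the `employees` setting from the configuration example is not modeled.

```python
import random


def is_enabled(feature, config, cookies, rng=random):
    """Decide once per visitor whether a feature is on, then remember it."""
    cookie_name = "ramp_" + feature  # hypothetical cookie naming scheme
    # 1. Check cookies: a visitor who was already assigned keeps that answer.
    if cookie_name in cookies:
        return cookies[cookie_name] == "1"
    # 2. Enabled when the configured threshold exceeds a random percentage.
    threshold = config[feature]["percent"]
    enabled = threshold > rng.random() * 100
    # 3. Store the decision so future requests are consistent.
    cookies[cookie_name] = "1" if enabled else "0"
    return enabled


config = {"use_new_tables": {"percent": 1}}
cookies = {}  # stands in for one visitor's cookie jar
first = is_enabled("use_new_tables", config, cookies)
second = is_enabled("use_new_tables", config, cookies)
print(first == second)
```

One consequence of storing the decision itself: a visitor assigned while the percentage was high stays enabled after a ramp down until the cookie expires or is cleared.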
Continuous Deployment

   • Ramp Up / Ramp Down
   • Backfill Fixes
   • Need Code Running on Prod
     to Proceed
   • Makes it Easier
No Foreign Keys
Wrapping Up
Analytical Data
Teed Writes


Source                   Target




           Application
            Writes
Target Only


Source                   Target




           Application
            Writes
Things to Remove

• Code to use old tables
• Code to switch on configuration
• Configuration
• Drop the old tables... eventually
5 Steps Per Table
  1. Create Target Tables
  2. Tee Writes
  3. Backfill
  4. Read from Target Tables
  5. Wrap Up
Foundational Principles

 • Migrate One Table at a Time
 • Progressive Ramp Up
 • Data Duplication During Transition
Questions?
    • Yes, we’re hiring! Specifically MySQL Ops
      http://bit.ly/etsywantsawesome
    • http://codeascraft.etsy.com
    • http://twitter.com/lapsu
Matthew Graham
Percona Live NYC 2011
