Big Tables and You
Keeping DDL operations fast
So You Want To Add a New Column
class AddFoo < ActiveRecord::Migration
  def change
    add_column :foo, :bar, :string
  end
end
ALTER TABLE foo ADD COLUMN bar varchar(255);
What is this doing?
“ALTER TABLE makes a temporary copy of the original
table...waits for other operations that are modifying the
table...incorporates the alteration into the copy, deletes
the original table, and renames the new one. While ALTER
TABLE is executing, the original table is readable…
...writes to the table that begin after the ALTER TABLE
operation begins are stalled until the new table is
ready…”
Where Did Production Go?!
What's wrong with this approach?
● Write operations are stalled, and you’ve just crashed production
● Multiple ALTER statements are applied separately, so each one copies the whole table again and the time to execute grows as T(n*rows)
● Worse with indexes
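One way to limit the repeated-copy cost is to fold several alterations into a single ALTER TABLE statement, which MySQL accepts as a comma-separated list. The sketch below only builds the SQL string; the helper name `combined_alter` and the extra column and index definitions are hypothetical, chosen for illustration.

```ruby
# Hypothetical helper (not part of Rails or any plugin): builds one
# ALTER TABLE statement from several alterations, so the table is
# copied once instead of once per change.
def combined_alter(table, alterations)
  raise ArgumentError, "no alterations given" if alterations.empty?

  "ALTER TABLE #{table} #{alterations.join(', ')};"
end

puts combined_alter("foo", [
  "ADD COLUMN baz varchar(256)",
  "ADD COLUMN qux int",
  "ADD INDEX index_foo_on_bar (bar)"
])
```

This still blocks writes while the copy runs, so it reduces the cost of several changes but does not remove the stall.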
Demo!
ruby> File.open('/tmp/foo', 'w') { |f| (1..10_000_000).each { |r| f.puts(r) } } # 10 million rows
mysql> CREATE DATABASE temp_table_demo;
mysql> USE temp_table_demo;
mysql> CREATE TABLE foo (id int PRIMARY KEY AUTO_INCREMENT, bar VARCHAR(256));
mysql> LOAD DATA INFILE "/tmp/foo" INTO TABLE foo;
Demo! (Continued)
mysql> ALTER TABLE foo ADD COLUMN baz varchar(256);
Query OK, 10000000 rows affected (42.97 sec)
Records: 10000000 Duplicates: 0 Warnings: 0
mysql> SHOW PROCESSLIST;
“State” => “copy to tmp table” for ~90% of the execution time
Rethinking DDL Changes
“ALTER TABLE makes a temporary copy of the original
table...waits for other operations that are modifying the
table...incorporates the alteration into the copy, deletes
the original table, and renames the new one. While ALTER
TABLE is executing, the original table is readable…”
We can do the same steps ourselves: 1) make a temporary copy, 2) incorporate changes, 3) sync, 4) delete, 5) rename
DDL Plan of Attack
CREATE TABLE foo_temp LIKE foo;
ALTER TABLE foo_temp ADD COLUMN baz varchar(256);
INSERT INTO foo_temp (id, bar) SELECT id, bar FROM foo;
# Sync here: check for records modified during the copy
DROP TABLE foo;
RENAME TABLE foo_temp TO foo;
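The same plan can be generated for any table. This is a minimal sketch that only returns the SQL statements in order and executes nothing; `tmp_table_plan` and its parameters are hypothetical names, and the sync step is left out because the slides do not spell it out.

```ruby
# Hypothetical helper: returns the statements of the copy-and-swap
# pattern in the order they would run. It only builds strings.
def tmp_table_plan(table, column_definition, existing_columns)
  tmp  = "#{table}_temp"
  cols = existing_columns.join(", ")
  [
    "CREATE TABLE #{tmp} LIKE #{table};",                 # empty copy of the schema
    "ALTER TABLE #{tmp} ADD COLUMN #{column_definition};", # cheap: the copy has no rows yet
    "INSERT INTO #{tmp} (#{cols}) SELECT #{cols} FROM #{table};",
    "DROP TABLE #{table};",
    "RENAME TABLE #{tmp} TO #{table};"
  ]
end

puts tmp_table_plan("foo", "baz varchar(256)", %w[id bar])
```

A variation not shown in the slides: MySQL's RENAME TABLE can rename several tables in one atomic statement (for example `RENAME TABLE foo TO foo_old, foo_temp TO foo;` followed by `DROP TABLE foo_old;`), which avoids the moment between DROP and RENAME when `foo` does not exist.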
What Changes?
90% of the time in “copy to tmp table”
to
90% of our time in “Sending data” (non-blocking)
This means records can be inserted, updated, and deleted without waiting for a table metadata lock
Enter MySQL Big Table Migration
A Rails plugin that adds methods to
ActiveRecord::Migration to allow columns
and indexes to be added to and removed from
large tables with millions of
rows in MySQL, without leaving processes
seemingly stalled in state "copy
to tmp table".
Example
class AddBazToFoo < ActiveRecord::Migration
  def self.up
    add_column_using_tmp_table :foo, :baz, :string
  end
end
Additional Methods
● add_column_using_tmp_table
● remove_column_using_tmp_table
● rename_column_using_tmp_table
● change_column_using_tmp_table
● add_index_using_tmp_table
● remove_index_using_tmp_table
When Should This Be Used?
A good rule of thumb is any table already in
production
Another rule of thumb is any table with more
than 1 million rows
Not necessary for small or new tables
The “Meat”
def with_tmp_table(table_name)
  say "Creating temporary table #{new_table_name} like #{table_name}..."
  # DDL operations performed on temp table
  say "Inserting into temporary table in batches of #{batch_size}..."
  say "Replacing source table with temporary table..."
  say "Cleaning up, checking for rows created/updated during migration, dropping old table..."
end
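The "batches of batch_size" step can be pictured as one INSERT ... SELECT per primary-key range. The sketch below is an illustration only, not the plugin's code: `batched_copy_statements` is a hypothetical helper, and it assumes an integer primary key named `id` whose minimum and maximum values are already known.

```ruby
# Hypothetical helper: returns one INSERT ... SELECT statement per
# primary-key range, so each batch copies at most batch_size ids.
def batched_copy_statements(source, target, columns, min_id, max_id, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

  cols = columns.join(", ")
  (min_id..max_id).step(batch_size).map do |start_id|
    # Clamp the last batch so it never runs past max_id.
    end_id = [start_id + batch_size - 1, max_id].min
    "INSERT INTO #{target} (#{cols}) SELECT #{cols} FROM #{source} " \
      "WHERE id BETWEEN #{start_id} AND #{end_id};"
  end
end

batched_copy_statements("foo", "foo_temp", %w[id bar], 1, 25, 10).each { |sql| puts sql }
```

Copying in small ranges keeps each statement short, so concurrent writes to the source table are not held up by one long-running copy; rows changed after their range was copied still need the sync step mentioned above.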
Demo!
rails new temp_table_demo
# Gemfile
gem 'mysql_big_table_migration', git: 'git@github.com:thickpaddy/mysql_big_table_migration.git'
Run DDLs with and without temp table pattern
Questions from the Audience
Q&A
