Making schema changes to large tables in MySQL can become a major issue as data grows. This talk outlines a method for altering large MySQL tables with little impact on running production services and replication.
2. mapmyfitness
+ Founded in 2006
+ Fitness tracking: web and mobile apps
+ 17+ million registered users
+ 123 million routes
+ 156 million workouts
+ We use MongoDB, Postgres, and MySQL
3. MySQL at mapmyfitness
+ Master-Master with Slaves
+ The primary database is “mapmyfitness”
+ The MySQL DB has grown from 340 GB to 500+ GB in the last year
+ We have some large tables
+ Routes: 88 GB
+ Workouts: 85 GB
+ User: 17 GB
4. Schema Changes
+ Add/delete a column
+ Add an index
+ Change a column datatype
+ Add/delete a foreign key
+ Move columns around
5. Default MySQL Behavior
+ Create a temporary table with the schema changes
+ Put a table-level write lock on the original table
+ INSERT ... SELECT * from the original table into the temporary table
+ Rename the temporary table to the original table
+ Drop the original table.
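As an illustration (the table and column here are hypothetical, not from the talk), a single statement like this kicks off the entire copy-and-swap sequence above on 5.1/5.5:
-- behind this one statement MySQL builds the temporary copy under a
-- table-level write lock, then renames and drops
ALTER TABLE mapmyfitness.workout ADD COLUMN notes TEXT;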
6. A Real World Example
+ nutrition_foodlog needs a new index
+ 22 million rows
+ File size ~6 GB
+ 10-minute execution time in a dev environment
+ Go for it...during a maintenance window
+ PINGDOM? Page down!!!
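The exact ALTER we ran isn't on the slide, but given the key added to the shadow table later in this talk, it would have looked something like this:
ALTER TABLE mapmyfitness.nutrition_foodlog
    ADD INDEX XIE2nutrition_foodlog (user_id, consume_date, id);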
7. Shadow Table Migration
+ Essentially the same process MySQL uses by default
+ Create a “shadow” table with the new structure/index
+ Create a stored procedure to copy data
+ Create insert, update and delete triggers
+ Set up and run a batch process to apply “fake” updates to every row on the original table
+ Run an atomic rename of the tables
+ Drop the original table, triggers, and stored procedure
8. Create the Shadow Table
+ Create the shadow table with the new index
CREATE TABLE `mapmyfitness`.`nutrition_foodlog_shadow` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`privacy_setting` smallint(6) NOT NULL,
...
KEY `XIE2nutrition_foodlog` (`user_id`, `consume_date`, `id`),
...
) ENGINE=InnoDB AUTO_INCREMENT=38304235 DEFAULT CHARSET=utf8;
9. Create Stored Procedure
DROP PROCEDURE IF EXISTS nutrition_foodlog_update;
delimiter ;;
CREATE PROCEDURE nutrition_foodlog_update(
a_id int(11),
...
a_total_serving_grams decimal(15,7))
BEGIN
INSERT INTO nutrition_foodlog_shadow
SET id = a_id,
privacy_setting = a_privacy_setting,
...
total_serving_grams = a_total_serving_grams
ON DUPLICATE KEY UPDATE
id = a_id,
privacy_setting = a_privacy_setting,
...
total_serving_grams = a_total_serving_grams;
END;
;;
delimiter ;
10. Insert After Trigger
delimiter ;;
CREATE TRIGGER nutrition_foodlog_insert_after_trigger
AFTER INSERT
ON mapmyfitness.nutrition_foodlog
FOR EACH ROW
BEGIN
CALL nutrition_foodlog_update(
new.id,
new.privacy_setting,
new.privacy_limit_list,
new.user_id,
new.meal_type_id,
new.consume_date,
new.serving_count,
new.food_id,
new.updated_date,
new.created_date,
new.serving_id,
new.total_serving_grams);
END;
;;
delimiter ;
11. Update Before Trigger
delimiter ;;
CREATE TRIGGER nutrition_foodlog_update_before_trigger
BEFORE UPDATE
ON mapmyfitness.nutrition_foodlog
FOR EACH ROW
BEGIN
CALL nutrition_foodlog_update(
new.id,
new.privacy_setting,
new.privacy_limit_list,
new.user_id,
new.meal_type_id,
new.consume_date,
new.serving_count,
new.food_id,
new.updated_date,
new.created_date,
new.serving_id,
new.total_serving_grams);
END;
;;
delimiter ;
12. Delete After Trigger
delimiter ;;
CREATE TRIGGER nutrition_foodlog_delete_after_trigger
AFTER DELETE
ON mapmyfitness.nutrition_foodlog
FOR EACH ROW
BEGIN
DELETE FROM mapmyfitness.nutrition_foodlog_shadow
WHERE id = old.id
LIMIT 1;
END;
;;
delimiter ;
13. “Fake” Updates
-- generate one no-op update statement per row
SELECT CONCAT('update mapmyfitness.nutrition_foodlog set id = ',
              id,
              ' where id = ',
              id,
              ' limit 1;') AS seql
  FROM mapmyfitness.nutrition_foodlog
 ORDER BY id;
update mapmyfitness.nutrition_foodlog set id = 2 where id = 2 limit 1;
update mapmyfitness.nutrition_foodlog set id = 4 where id = 4 limit 1;
update mapmyfitness.nutrition_foodlog set id = 6 where id = 6 limit 1;
update mapmyfitness.nutrition_foodlog set id = 8 where id = 8 limit 1;
update mapmyfitness.nutrition_foodlog set id = 12 where id = 12 limit 1;
update mapmyfitness.nutrition_foodlog set id = 14 where id = 14 limit 1;
update mapmyfitness.nutrition_foodlog set id = 16 where id = 16 limit 1;
update mapmyfitness.nutrition_foodlog set id = 20 where id = 20 limit 1;
14. Rename of Tables
-- rename the tables
-- RENAME TABLE table1 to table1_old, table1_shadow to table1;
RENAME TABLE mapmyfitness.nutrition_foodlog
to mapmyfitness.nutrition_foodlog_old,
mapmyfitness.nutrition_foodlog_shadow
to mapmyfitness.nutrition_foodlog;
+ Should be an atomic operation
+ All tables get renamed or none get renamed
+ Run it as one statement
15. Clean-up
+ After confirming all is good with the new table, clear the cruft!
-- drop triggers
DROP TRIGGER IF EXISTS nutrition_foodlog_update_before_trigger;
DROP TRIGGER IF EXISTS nutrition_foodlog_insert_after_trigger;
DROP TRIGGER IF EXISTS nutrition_foodlog_delete_after_trigger;
-- drop the update stored procedure
DROP PROCEDURE IF EXISTS nutrition_foodlog_update;
-- truncate and drop the "old" table
TRUNCATE TABLE mapmyfitness.nutrition_foodlog_old;
DROP TABLE mapmyfitness.nutrition_foodlog_old;
MySQL is where all of our metadata is stored: user profile data, route ids, start and stop points, workout duration, calories, etc. We use a master-master-with-slaves replication topology, but only write to one server. The second master is a hot failover and a member of the read array. The slave read array has 5 additional servers; connections are managed through HAProxy. We split out reads and writes: 97% reads vs. 3% writes, or about 4000 reads per second vs. 150 writes per second.
The mapmyfitness db: in the year and a half that I’ve been working at MMF it has grown from 340 GB to ~500 GB. We add about 30-40k new registered users, 300-400k workouts, and 250-350k routes every day. We have large, denormalized, wide tables: user, workout, route, nutrition_foodlog. Many of these tables have multiple single-column indexes; few have the compound indexes that might give the MySQL optimizer a better query plan to get the results of those 4000 qps back to users faster. These stats are approximate since they come from the information_schema tables.
table_name          rows        size_Mb
route               123754906   88496.3
workout             161760035   85441.8
user                16580651    16681.0
nutrition_foodlog   22167873    5974.5
auth_user           16882244    5368.2
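The per-table stats above came out of information_schema; a query along these lines (my reconstruction, not necessarily the exact one used) produces them:
SELECT table_name,
       table_rows,
       ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_Mb
  FROM information_schema.tables
 WHERE table_schema = 'mapmyfitness'
 ORDER BY (data_length + index_length) DESC;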
This applies to v5.1 and v5.5; v5.6 is a little different. The problem here is the table-level lock: how long does it last, and how badly do other activities on the DB get impacted by the lock for its duration? The second problem is replication: once the ALTER completes on the master, it gets run on a slave and continues downstream, affecting each slave for roughly the same duration.
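One way to watch the pile-up while such an ALTER runs (illustrative; the exact thread state names vary by version):
SHOW FULL PROCESSLIST;
-- on 5.5, writes blocked behind the ALTER show states such as
-- "Waiting for table metadata lock"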
The file size is only 6 GB, so we figured we would just go for it during a maintenance window. First, let’s test in a dev environment: tested...works. In production, that didn’t work. I was running innotop in another window, and as soon as I kicked the operation off the screen turned red and inserts, updates, and deletes to the nutrition_foodlog table started piling up. Well, that was to be expected. Then our nginx and uWSGI latency alerts started firing...PINGDOM! Page down...time to kill the transaction and go back to the drawing board.
Do the same process that MySQL does, but with less impact on running applications and replication. Create the shadow table with the new table structure: added indexes, added/removed columns, FKs, etc. The update stored procedure inserts a row into the shadow table from the original table. The insert trigger fires on insert and calls the update stored procedure to insert the “new” row into the shadow table. The update trigger fires on update and calls the update stored procedure to write the “new” column values to the shadow table. The delete trigger fires on delete and just deletes the row from the shadow table if it exists. The batch process just updates the primary key value to the same value, but that fires the update trigger, which calls the update stored procedure to copy the original row to the shadow table; this is how all of the old rows get moved to the new structure. The atomic rename is basically running two table renames in the same statement so they happen all at once or not at all.
One thing to note: since you are creating a stored procedure to handle these inserts, you can do much more than just copy the data. You can do validation, check for duplicates, split the data into multiple tables, etc.
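As a minimal sketch of that idea (the procedure, table, and guard below are invented for illustration, not part of the talk’s migration):
delimiter ;;
CREATE PROCEDURE copy_with_validation(a_id int(11), a_user_id int(11))
BEGIN
    -- only copy rows that pass a basic sanity check
    IF a_user_id IS NOT NULL AND a_user_id > 0 THEN
        INSERT INTO mapmyfitness.some_table_shadow
        SET id = a_id,
            user_id = a_user_id
        ON DUPLICATE KEY UPDATE
            user_id = a_user_id;
    END IF;
END;
;;
delimiter ;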
These updates just trigger the update stored procedure to copy the data from the original table to the shadow table. You can run them from a script, or put the SQL into a table and select from it...however you feel comfortable running them. I output these to files containing 100k updates per file and ran those with a 3-second sleep between each batch of 100k. That way I could kill/stop the process easily, and it also throttled the update statements to the server so that replication could keep up.
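As an alternative to files of generated statements, the batching could also live in a stored procedure. A sketch (the procedure name, batch size, and sleep are illustrative; this is not how the talk’s migration was actually run):
delimiter ;;
CREATE PROCEDURE nutrition_foodlog_fake_updates(batch_size INT, sleep_secs INT)
BEGIN
    DECLARE last_id INT DEFAULT 0;
    DECLARE max_id INT;
    SELECT MAX(id) INTO max_id FROM mapmyfitness.nutrition_foodlog;
    WHILE last_id < max_id DO
        -- no-op update: fires the update trigger, which calls the stored
        -- procedure to copy the rows into the shadow table
        UPDATE mapmyfitness.nutrition_foodlog
           SET id = id
         WHERE id > last_id AND id <= last_id + batch_size;
        SET last_id = last_id + batch_size;
        -- throttle so replication can keep up
        DO SLEEP(sleep_secs);
    END WHILE;
END;
;;
delimiter ;
-- e.g. CALL nutrition_foodlog_fake_updates(100000, 3);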
If something goes wrong, you can rename the tables again and apply any missing transactions from the binary log.
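A sketch of that rollback rename, assuming the _old copy is still intact and the newly promoted table is the bad one:
RENAME TABLE mapmyfitness.nutrition_foodlog
          to mapmyfitness.nutrition_foodlog_bad,
     mapmyfitness.nutrition_foodlog_old
          to mapmyfitness.nutrition_foodlog;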