1. Migrating from PostgreSQL to MySQL at Cocolog. Naoto Yokoyama (NIFTY Corporation); Garth Webb (Six Apart); Lisa Phillips (Six Apart). Credits: Kenji Hirohama (Sumisho Computer Systems Corp.)
14. Phase 1, 2003/12 ~ (entries: 0.04 million). Before DBP, 10 servers.
Components: Register, TypePad, PostgreSQL, NAS, WEB (static contents published).
15. Phase 2, 2004/12 ~ (entries: 7 million). Before DBP, 50 servers.
Components: Register, TypePad, PostgreSQL, NAS, WEB (static contents published).
Features: Podcast, Portal, Profile, etc.; Rich template, Publish Book, Tel Operator Support.
Feature rollout dates shown: 2004/12 ~, 2005/5 ~.
17. Phase 3, 2006/3 ~ (entries: 12 million). Before DBP, 200 servers.
Components: Register, TypePad, PostgreSQL, NAS, WEB (static contents published), Web-API, memcached.
Features: Podcast, Portal, Profile, etc.; Rich template, Publish Book, Tel Operator Support.
18. Phase 4, 2007/4 ~ (entries: 16 million). Before DBP, 300 servers.
Components: Register, TypePad, PostgreSQL, NAS, WEB (static contents published), Web-API, memcached, Atom, Mobile WEB.
Features: Rich template, Publish Book, Tel Operator Support.
19. Now, 2008/4 ~. After DBP, 150 servers.
Components: Register, TypePad, Multi MySQL, NAS, WEB (static contents published), Web-API, memcached, Atom, Mobile WEB.
Features: Rich template, Publish Book, Tel Operator Support.
22. TypePad Overview (PreDBP)
Clients (via the Internet): Blog Readers, Blog Owners, Mobile Blog Readers.
Web Server: http(80), https(443).
Application Server.
TypeCast Server.
ATOM Server: dedicated server for TypeCast (via ATOM); http(80): atom api.
MEMCACHED: data caching servers to reduce DB load; memcached(11211).
Database (Postgres): postgres(5432).
Storage: static content (HTML, images, etc.); nfs(2049).
Mail Server: smtp(25) / pop(110).
ADMIN (CRON) Server: cron server for periodic asynchronous tasks.
23. Why Partition?
Current setup: all inquiries (access) from every TypePad application server go to one DB (Postgres), which holds both the Non-User Role and the User Role (User0).
After DBP: inquiries (access) are divided among several DBs (MySQL): the Global Role, the Non-User Role, and the User Roles (User1, User2, User3).
24. Server Preparation
Current setup: TypePad with DB (PostgreSQL), holding the Non-User Role and the User Role (User0).
New expanded setup: DB (MySQL) for partitioned data.
Global Role: maintains user mapping and primary key generation.
User Roles (User1, User2, User3): user information is partitioned.
Non-User Role: information that does not need to be partitioned (such as session information).
Asynchronous Job Server:
Job Server + TypePad + Schwartz: server for executing jobs.
Schwartz DB: stores job details.
※ Grey areas are not used in current steps.
25. Global Write: Creating the user map
Explanation
①: For new registrations only, uniquely identifying user data is written to the global DB.
②: This same data continues to be written to the existing DB.
Diagram: the server layout from slide 24, annotated with steps ① and ②. The global DB maintains user mapping and primary key generation.
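The dual write described on this slide can be sketched roughly as follows. This is only an illustration, not the TypePad implementation: it uses in-memory SQLite databases as stand-ins for the MySQL global DB and the existing PostgreSQL DB, and the table names, column names, and the `register_user` function are all hypothetical.

```python
import sqlite3


def open_db(schema: str) -> sqlite3.Connection:
    """Open an in-memory database and create one table (stand-in for a real server)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    return conn


# Hypothetical schemas; the slides do not show the real ones.
global_db = open_db(
    "CREATE TABLE user_map (user_id INTEGER PRIMARY KEY, username TEXT UNIQUE)"
)
legacy_db = open_db(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT UNIQUE, email TEXT)"
)


def register_user(user_id: int, username: str, email: str) -> None:
    """Write a new registration to both databases."""
    # Step 1: uniquely identifying data goes to the global DB.
    global_db.execute("INSERT INTO user_map VALUES (?, ?)", (user_id, username))
    try:
        # Step 2: the same data continues to be written to the existing DB.
        legacy_db.execute(
            "INSERT INTO users VALUES (?, ?, ?)", (user_id, username, email)
        )
    except sqlite3.Error:
        # Assumption: if the existing DB rejects the row, undo the global write
        # so the two stores do not drift apart.
        global_db.rollback()
        raise
    global_db.commit()
    legacy_db.commit()
```

Calling `register_user(1, "alice", "alice@example.com")` leaves one row in each database. Rolling back the global write on failure is a design choice of this sketch; the slides do not say how the real system handles a failed second write.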
26. Global Read: Use the user map to find the user partition
Explanation
①: Migrate existing user data to the global DB.
②: At the start of the request, the application queries the global DB for the location of the user data.
③: The application then talks to this DB for all queries about this user. At this stage the global DB points to the user0 partition in all cases.
Diagram: the server layout from slide 24, annotated with steps ① (migrate existing user data), ② and ③.
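A minimal sketch of the lookup in steps ② and ③, using plain dictionaries in place of real database connections. The names `user_map`, `partitions`, `partition_for`, and `load_user` are assumptions made for illustration; the slides do not name the actual application code.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Stand-in for one database holding user data."""
    name: str
    rows: dict = field(default_factory=dict)  # user_id -> user record


# At this stage every user still lives on the original user0 partition.
user0 = Partition("user0", {1: {"name": "alice"}, 2: {"name": "bob"}})
partitions = {0: user0}

# Stand-in for the global DB user map: user_id -> partition id.
# Step 1 on the slide fills this for all existing users.
user_map = {1: 0, 2: 0}


def partition_for(user_id: int) -> Partition:
    """Step 2: ask the global map where this user's data lives."""
    return partitions[user_map[user_id]]


def load_user(user_id: int) -> dict:
    """Step 3: run all queries for the user against the partition found."""
    return partition_for(user_id).rows[user_id]
```

Because every read goes through `partition_for`, later steps can move a user to another partition by changing only that user's entry in the map.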
27. Move Sequence: Migrating primary key generation
Explanation
①: Postgres sequences (for generating unique primary keys) are migrated to tables on the global DB that act as “pseudo-sequences”.
②: The application requests new primary keys from the global DB rather than the user partition.
Diagram: the server layout from slide 24, annotated with steps ① (migrate sequence management) and ②.
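One way a table can act as a “pseudo-sequence” is sketched below, with an in-memory SQLite database standing in for the global DB. The table layout and the functions `migrate_sequence` and `next_id` are hypothetical; the slides do not show the real schema. On MySQL the same idea is commonly written as a single `UPDATE ... SET id = LAST_INSERT_ID(id + 1)` statement, but whether TypePad does so is not stated in the source.

```python
import sqlite3

# Stand-in for the global DB.
global_db = sqlite3.connect(":memory:")
global_db.execute(
    "CREATE TABLE pseudo_sequence (name TEXT PRIMARY KEY, last_value INTEGER NOT NULL)"
)


def migrate_sequence(name: str, current_value: int) -> None:
    """Step 1: carry over the current value of an existing sequence."""
    global_db.execute(
        "INSERT INTO pseudo_sequence VALUES (?, ?)", (name, current_value)
    )
    global_db.commit()


def next_id(name: str) -> int:
    """Step 2: hand out the next primary key from the global DB."""
    with global_db:  # commits on success, rolls back if an exception is raised
        cur = global_db.execute(
            "UPDATE pseudo_sequence SET last_value = last_value + 1 WHERE name = ?",
            (name,),
        )
        if cur.rowcount != 1:
            raise KeyError(name)
        row = global_db.execute(
            "SELECT last_value FROM pseudo_sequence WHERE name = ?", (name,)
        ).fetchone()
    return row[0]
```

After `migrate_sequence("entry_id", 1000)`, successive calls to `next_id("entry_id")` return 1001, then 1002. Keeping key generation in one place means keys stay unique no matter which partition a row ends up on.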
28. User Data Move: Moving user data to the new user-role partitions
Explanation
①: Existing users that should be migrated by the Job Server are submitted as new Schwartz jobs. User data is then migrated asynchronously.
②: If a comment arrives while the user is being migrated, it is saved in the Schwartz DB to be published later.
③: After being migrated, all user data will exist on the user-role DB partitions.
④: Once all user data is migrated, only non-user data is on Postgres.
Diagram: the server layout from slide 24, annotated with steps ① to ④ (migrating each user's data). The Schwartz DB stores job details and the Job Server executes jobs.
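The interaction between steps ① to ③ can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not TheSchwartz API: the job queue and the deferred-comment queue are plain `deque` objects standing in for the Schwartz DB, and every function name here is hypothetical.

```python
from collections import deque

# Stand-ins for the databases: partition name -> {user_id: list of comments}.
stores = {"user0": {1: ["hello"], 2: ["hi"]}, "user1": {}, "user2": {}}
user_map = {1: "user0", 2: "user0"}  # global DB: user_id -> partition

job_queue = deque()  # migration jobs (stands in for jobs in the Schwartz DB)
deferred = deque()   # comments held back while a user is migrating
migrating = set()    # users whose data is currently being copied


def submit_migration(user_id, target):
    """Step 1: queue a user for asynchronous migration."""
    job_queue.append((user_id, target))


def post_comment(user_id, text):
    """Step 2: hold the comment if the user is mid-migration, else store it."""
    if user_id in migrating:
        deferred.append((user_id, text))
    else:
        stores[user_map[user_id]][user_id].append(text)


def start_next_job():
    """Job server: take the next job and copy the user's data to the target."""
    user_id, target = job_queue.popleft()
    migrating.add(user_id)
    source = user_map[user_id]
    stores[target][user_id] = list(stores[source][user_id])
    return user_id, source, target


def finish_job(user_id, source, target):
    """Step 3: repoint the user map, drop the old copy, publish held comments."""
    user_map[user_id] = target
    del stores[source][user_id]
    migrating.discard(user_id)
    held = list(deferred)
    deferred.clear()
    for uid, text in held:
        if uid == user_id:
            post_comment(uid, text)
        else:
            deferred.append((uid, text))
```

A comment posted between `start_next_job()` and `finish_job(...)` is not lost: it waits in `deferred` and lands on the new partition once the user map has been updated.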
29. New User Partition: New registrations are created on one user role partition
Explanation
①: When new users register, user data is written to a user role partition.
②: Non-user data continues to be served off Postgres.
Diagram: the server layout from slide 24, annotated with steps ① and ②.
30. New User Strategy: Pick a scheme for distributing new users
Explanation
①: When new users register, user data is written to one of the user role partitions, depending on a set distribution method (round robin, random, etc.).
②: Non-user data continues to be served off Postgres.
Diagram: the server layout from slide 24, annotated with steps ① and ②.
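The two distribution methods named on this slide can be sketched as interchangeable chooser functions. The partition names follow the diagram labels (User1 to User3); the functions `round_robin`, `uniform_random`, and `register` are hypothetical, and the slides do not say which method Cocolog actually uses.

```python
import itertools
import random

# Assumption: new users go only to the new partitions, not to user0.
PARTITIONS = ["user1", "user2", "user3"]

user_map = {}  # stand-in for the global DB: user_id -> partition name


def round_robin(partitions):
    """Return a chooser that cycles through the partitions in order."""
    cycle = itertools.cycle(partitions)
    return lambda: next(cycle)


def uniform_random(partitions, rng=None):
    """Return a chooser that picks a partition uniformly at random."""
    if rng is None:
        rng = random.Random()
    return lambda: rng.choice(partitions)


def register(user_id, choose):
    """Assign a new user to a partition and record it in the user map."""
    user_map[user_id] = choose()
    return user_map[user_id]
```

With `choose = round_robin(PARTITIONS)`, four consecutive registrations land on user1, user2, user3, and user1 again. Because the choice is recorded in the user map, the scheme can be changed later without affecting users who are already placed.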
31. Non User Data Move: Migrate data that cannot be partitioned by user
Explanation
①: Migrate the non-user role data left on PostgreSQL to the MySQL side.
Diagram: the server layout from slide 24, annotated with step ① (migrate non-user data). The Non-User Role holds information that does not need to be partitioned (such as session information).
32. Data migration done
Explanation
①: All data access is now done through MySQL.
②: Continue to use The Schwartz for asynchronous jobs.
Diagram: the server layout from slide 24, annotated with steps ① and ②.
33. The New TypePad configuration
Clients (via the Internet): Blog Readers, Blog Owners (management interface), Mobile Blog Readers.
Web Server: http(80), https(443).
Application Server.
TypeCast Server.
ATOM Server: dedicated server for TypeCast (via ATOM); http(80): atom api.
MEMCACHED: data caching servers to reduce DB load; memcached(11211).
Database (MySQL): MySQL(3306).
Storage: static content (HTML, images, etc.); nfs(2049).
Mail Server: smtp(25) / pop(110).
ADMIN (CRON) Server: cron server for periodic asynchronous tasks.
Job Server: TheSchwartz server for running ad-hoc jobs asynchronously.
38. Scale out MySQL servers, After DBP
Diagram labels: PostgreSQL (FibreChannel SAN, DiskArray, … heart beat); TypePad Application; MySQL Role1, MySQL Role2, MySQL Role3.
40. Scale out MySQL servers, After DBP
Diagram labels: PostgreSQL (FibreChannel SAN, DiskArray, … heart beat); TypePad Application; MySQL Role1, MySQL Role2, MySQL Role3, each running mysqld; MySQL BackupRole; three replication (rep) links between the two sets of three mysqld instances; opc.