Migrating from PostgreSQL to MySQL at Cocolog
Cocolog, a blog community run by NIFTY Corporation, migrated its database from PostgreSQL to MySQL to improve scalability. It used a process called Database Partitioning (DBP) to partition user data across multiple MySQL servers for improved performance. First, it set up global and user role databases. Then it migrated user, sequence, and non-user data in stages to the new partitioning scheme while the application accessed both databases. After migration was complete, all data was accessed from the partitioned MySQL databases, improving Cocolog's ability to handle its growing user base.
1. Migrating from PostgreSQL to MySQL at Cocolog
Naoto Yokoyama, NIFTY Corporation
Garth Webb, Six Apart
Lisa Phillips, Six Apart
Credits:
Kenji Hirohama, Sumisho Computer Systems Corp.
2. Agenda
1. What is Cocolog
2. History of Cocolog
3. DBP: Database Partitioning
4. Migration From PostgreSQL to MySQL
4. What is Cocolog
NIFTY Corporation
Established in 1986
A Fujitsu Group Company
NIFTY-Serve (licensed and interconnected with CompuServe)
One of the largest ISPs in Japan
Cocolog
First blog community at a Japanese ISP
Based on TypePad technology by Six Apart
Several hundred million PV/month
History
Dec/02/2003: Cocolog launches for ISP users
Nov/24/2005: Cocolog Free launches as a free service
April/05/2007: Cocolog for mobile phones launches
10. Technology at Cocolog
Core System
Linux 2.4/2.6
Apache 1.3/2.0/2.2 & mod_perl
Perl 5.8+CPAN
PostgreSQL 8.1
MySQL 5.0
memcached/TheSchwartz/cfengine
Eco System
LAMP,LAPP,Ruby+ActiveRecord, Capistrano
Etc...
11. Monitoring
Management Tool
Proprietary in-house development with PostgreSQL, PHP, and Perl
Monitoring points (order of priority)
response time of each post
number of spam comments/trackbacks
number of comments/trackbacks
source IP address of spam
number of entries
number of comments via mobile devices
page views via mobile devices
time of batch completion
amount of API usage
bandwidth usage
DB
Disk I/O
Memory and CPU usage
time of VACUUM analyze
APP
number of active processes
CPU usage
Memory usage
[Diagram: monitored layers Hard, DB, Service, APL]
14. Phase2 2004/12~ (Entry: 7 million)
Before DBP: 50 servers
[Diagram components: TypePad, Register, WEB, NAS, Static contents (Published), PostgreSQL, Podcast, Portal, Profile, etc., Rich template, Publish Book, Tel Operator Support; timeline labels 2004/12~ and 2005/5~]
15. Phase2 - Problems
The system is tightly coupled.
The database server receives access from multiple points.
It is difficult to change the system design and database schema.
16. Phase3 2006/3~ (Entry: 12 million)
Before DBP: 200 servers
[Diagram components: TypePad, Register, WEB, NAS, Static contents (Published), Web-API, memcached, PostgreSQL, Podcast, Portal, Profile, etc., Rich template, Publish Book, Tel Operator Support]
17. Phase4 2007/4~ (Entry: 16 million)
Before DBP: 300 servers
[Diagram components: TypePad, Register, WEB, Static contents (Published), Web-API, memcached, Atom, Mobile, PostgreSQL, Rich template, Publish Book, Tel Operator Support]
20. Steps for Transitioning
• Server Preparation
Hardware and software setup
• Global Write
Write user information to the global DB
• Global Read
Read/write user information on the global DB
• Move Sequence
Table sequences served by global DB
• User Data Move
Move user data to user partitions
• New User Partition
All new users saved directly to user partition 1
• New User Strategy
Decide on a strategy for the new user partition
• Non User Data Move
Move all non-user owned data
21. TypePad Overview (PreDBP)
[Architecture diagram]
Clients (via Internet): Blog Readers, Blog Owners, Mobile Blog Readers
Servers: Web Server, Application Server, TypeCast Server, ATOM Server, Mail Server, ADMIN(CRON) Server
Storage: Static Content (HTML, images, etc.)
Database: Postgres
MEMCACHED: data caching servers to reduce DB load
Dedicated server for TypeCast (via ATOM)
Cron server for periodic asynchronous tasks
Protocols: https(443), http(80), http(80) atom api, memcached(11211), postgres(5432), nfs(2049), smtp(25) / pop(110)
22. Why Partition?
Current setup
All inquiries (access) go to one DB (Postgres)
TypePad: Non-User Role, User Role (User0)
After DBP
Inquiries (access) are divided among several DBs (MySQL)
TypePad: Global Role, Non-User Role, User Role (User1), User Role (User2), User Role (User3)
23. Server Preparation
Current setup
TypePad with DB(PostgreSQL): Non-User Role, User Role (User0)
New expanded setup
DB(MySQL) for partitioned data
Global Role: maintains user mapping and primary key generation
User Role (User1, User2, User3): user information is partitioned
Non-User Role: information that does not need to be partitioned (such as session information)
Asynchronous Job Server
Job Server + TypePad + Schwartz: server for executing jobs
Schwartz DB: stores job details
※Grey areas are not used in current steps
24. Global Write
Creating the user map
Explanation
①: For new registrations only, uniquely identifying user data is written to the global DB
②: This same data continues to be written to the existing DB
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role, and User Role (User1, User2, User3); Asynchronous Job Server (Job Server + TypePad + Schwartz, Schwartz DB)]
※Grey areas are not used in current steps
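The dual write in steps ① and ② can be sketched as follows. This is only an illustration: two in-memory SQLite databases stand in for the existing PostgreSQL DB and the new MySQL global DB, and the table names, column names, and the `register_user` function are hypothetical, not taken from TypePad.

```python
import sqlite3

# Stand-ins (assumption): SQLite replaces both the existing PostgreSQL DB
# and the new MySQL global DB so the sketch runs anywhere.
legacy_db = sqlite3.connect(":memory:")
global_db = sqlite3.connect(":memory:")

legacy_db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT UNIQUE)"
)
global_db.execute(
    "CREATE TABLE user_map ("
    "id INTEGER PRIMARY KEY, login TEXT UNIQUE, partition_id INTEGER NOT NULL)"
)


def register_user(user_id, login):
    """Write a new registration to both databases (hypothetical helper)."""
    # Step 1: uniquely identifying data goes to the global DB.
    # Partition 0 denotes the existing DB, where all users still live.
    with global_db:
        global_db.execute(
            "INSERT INTO user_map (id, login, partition_id) VALUES (?, ?, 0)",
            (user_id, login),
        )
    # Step 2: the same data continues to be written to the existing DB.
    with legacy_db:
        legacy_db.execute(
            "INSERT INTO users (id, login) VALUES (?, ?)", (user_id, login)
        )
```

The two writes are not atomic across databases, so a real deployment would need a way to detect and repair a registration that reached only one side. The sketch leaves that out.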
25. Global Read
Use the user map to find the user partition
Explanation
①: Migrate existing user data to the global DB
②: At the start of the request, the application queries the global DB for the location of user data
③: The application then talks to this DB for all queries about this user. At this stage the global DB points to the user0 partition in all cases.
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role, and User Role (User1, User2, User3); Asynchronous Job Server (Job Server + TypePad + Schwartz, Schwartz DB); arrow ① is labeled "Migrate existing user data"]
※Grey areas are not used in current steps
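Steps ② and ③ amount to a lookup followed by routing. A minimal sketch, assuming a hypothetical `user_map` table on the global DB and a hypothetical `partitions` registry of connections (SQLite stands in for the real databases):

```python
import sqlite3

# Global DB stand-in holding the user map (table name is an assumption).
global_db = sqlite3.connect(":memory:")
global_db.execute(
    "CREATE TABLE user_map ("
    "user_id INTEGER PRIMARY KEY, partition_id INTEGER NOT NULL)"
)

# Hypothetical registry: partition id -> connection.
# At this stage only partition 0 (the user0 partition) is in use.
partitions = {0: sqlite3.connect(":memory:")}
partitions[0].execute("CREATE TABLE entries (user_id INTEGER, title TEXT)")


def partition_for(user_id):
    """Step 2: ask the global DB where this user's data lives."""
    row = global_db.execute(
        "SELECT partition_id FROM user_map WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"user {user_id} is not in the user map")
    return partitions[row[0]]


def entries_for(user_id):
    """Step 3: send every query about this user to that partition."""
    db = partition_for(user_id)
    rows = db.execute(
        "SELECT title FROM entries WHERE user_id = ? ORDER BY title",
        (user_id,),
    )
    return [title for (title,) in rows]
```

Because the application only ever reaches user data through the map, later steps can move a user by changing one row in the global DB.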
26. Move Sequence
Migrating primary key generation
Explanation
①: Postgres sequences (for generating unique primary keys) are migrated to tables on the global DB that act as "pseudo-sequences".
②: The application requests new primary keys from the global DB rather than the user partition.
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role, and User Role (User1, User2, User3); Asynchronous Job Server (Job Server + TypePad + Schwartz, Schwartz DB); arrow ① is labeled "Migrate sequence management"]
※Grey areas are not used in current steps
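A "pseudo-sequence" can be pictured as a one-row-per-sequence table whose counter is read and advanced inside a single transaction. The sketch below uses SQLite as a stand-in, and the table name, column names, and helper functions are hypothetical. The slides do not say how the MySQL tables were built. One idiom documented for MySQL is `UPDATE ... SET id = LAST_INSERT_ID(id + 1)`, but whether Cocolog used it is not stated.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so the
# transaction boundaries below are explicit.
global_db = sqlite3.connect(":memory:", isolation_level=None)
global_db.execute(
    "CREATE TABLE pseudo_sequence ("
    "name TEXT PRIMARY KEY, next_value INTEGER NOT NULL)"
)


def create_sequence(name, start):
    """Seed a pseudo-sequence, e.g. from the old sequence's current value."""
    global_db.execute(
        "INSERT INTO pseudo_sequence (name, next_value) VALUES (?, ?)",
        (name, start),
    )


def next_id(name):
    """Return the next primary key and advance the counter atomically."""
    # BEGIN IMMEDIATE takes the write lock up front, so two callers
    # cannot both read the same value.
    global_db.execute("BEGIN IMMEDIATE")
    try:
        row = global_db.execute(
            "SELECT next_value FROM pseudo_sequence WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        global_db.execute(
            "UPDATE pseudo_sequence SET next_value = next_value + 1 "
            "WHERE name = ?",
            (name,),
        )
        global_db.execute("COMMIT")
        return row[0]
    except Exception:
        global_db.execute("ROLLBACK")
        raise
```

Seeding each pseudo-sequence above the highest key already issued keeps new ids from colliding with rows that are still waiting to be migrated.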
27. User Data Move
Moving user data to the new user-role partitions
Explanation
①: Existing users that should be migrated by Job Server are submitted as new Schwartz jobs. User data is then migrated asynchronously.
②: If a comment arrives while the user is being migrated, it is saved in the Schwartz DB to be published later.
③: After being migrated, all user data will exist on the user-role DB partitions.
④: Once all user data is migrated, only non-user data is on Postgres.
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role, and User Role (User1, User2, User3; user information is partitioned); Asynchronous Job Server with Job Server + TypePad + Schwartz (server for executing jobs) and Schwartz DB (stores job details); arrow ③ is labeled "Migrating each user data"]
※Grey areas are not used in current steps
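The interplay between migration jobs and deferred comments in steps ① to ③ can be sketched with plain in-memory structures. Everything here is a stand-in: dictionaries replace the databases, a `deque` replaces the Schwartz job table, a user's entries and comments share one list, and all function names are hypothetical. It does not use the real TheSchwartz API.

```python
from collections import deque

legacy = {1: ["entry a"], 2: ["entry b"]}  # user_id -> rows on the old DB
partitions = {1: {}, 2: {}}                # partition id -> {user_id: rows}
user_map = {1: 0, 2: 0}                    # user_id -> partition (0 = old DB)
jobs = deque()                             # stand-in for queued Schwartz jobs
migrating = set()                          # users whose move is in progress
deferred = {}                              # user_id -> comments held back


def submit_migration(user_id, target):
    """Step 1: queue a user for asynchronous migration."""
    jobs.append((user_id, target))


def rows_for(user_id):
    """Find a user's rows through the user map."""
    part = user_map[user_id]
    return legacy[user_id] if part == 0 else partitions[part][user_id]


def post_comment(user_id, text):
    """Step 2: hold comments that arrive while the user is being moved."""
    if user_id in migrating:
        deferred.setdefault(user_id, []).append(text)
        return "deferred"
    rows_for(user_id).append(text)
    return "published"


def start_migration(user_id):
    migrating.add(user_id)


def finish_migration(user_id, target):
    """Step 3: move the rows, repoint the map, then publish held comments."""
    partitions[target][user_id] = legacy.pop(user_id)
    user_map[user_id] = target
    migrating.discard(user_id)
    for text in deferred.pop(user_id, []):
        post_comment(user_id, text)


def run_jobs():
    """Worker loop: process every queued migration job."""
    while jobs:
        user_id, target = jobs.popleft()
        start_migration(user_id)
        finish_migration(user_id, target)
```

Splitting the job into `start_migration` and `finish_migration` makes the window in which comments are deferred explicit. In a real worker that window is the duration of the copy.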
28. New User Partition
New registrations are created on one user role partition
Explanation
①: When new users register, user data is written to a user role partition.
②: Non-user data continues to be served off Postgres.
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role, and User Role (User1, User2, User3; user information is partitioned); Asynchronous Job Server (Job Server + TypePad + Schwartz, Schwartz DB)]
※Grey areas are not used in current steps
29. New User Strategy
Pick a scheme for distributing new users
Explanation
①: When new users register, user data is written to one of the user role partitions, depending on a set distribution method (round robin, random, etc.)
②: Non-user data continues to be served off Postgres.
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role, and User Role (User1, User2, User3; user information is partitioned); Asynchronous Job Server (Job Server + TypePad + Schwartz, Schwartz DB)]
※Grey areas are not used in current steps
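The two distribution methods named in step ① (round robin and random) can be sketched as interchangeable "picker" functions. The partition ids, the `user_map` dictionary, and all function names below are assumptions made for illustration.

```python
import itertools
import random

USER_PARTITIONS = [1, 2, 3]  # ids of the user role partitions (assumed)


def round_robin_picker(partitions):
    """Return a function that hands out partitions in rotation."""
    cycle = itertools.cycle(partitions)
    return lambda: next(cycle)


def random_picker(partitions, rng=None):
    """Return a function that picks a partition uniformly at random."""
    rng = rng or random.Random()
    return lambda: rng.choice(partitions)


user_map = {}  # stand-in for the user map kept on the global DB


def register_user(user_id, pick_partition):
    """Assign a new user to a partition and record it in the user map."""
    user_map[user_id] = pick_partition()
    return user_map[user_id]
```

Because the chosen partition is stored in the user map, the strategy can be changed later without touching users who are already placed.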
30. Non User Data Move
Migrate data that cannot be partitioned by user
Explanation
①: Migrate non-user role data left on PostgreSQL to the MySQL side.
[Diagram: TypePad; DB(PostgreSQL) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role (information that does not need to be partitioned, such as session information), and User Role (User1, User2, User3; user information is partitioned); Asynchronous Job Server (Job Server + TypePad + Schwartz, Schwartz DB); arrow ① is labeled "Migrate non-User data"]
※Grey areas are not used in current steps
31. Data migration done
Explanation
①: All data access is now done through MySQL
②: Continue to use The Schwartz for asynchronous jobs
[Diagram: TypePad; DB(Postgres) with Non-User Role and User Role (User0); DB(MySQL) for partitioned data with Global Role (maintains user mapping and primary key generation), Non-User Role (information that does not need to be partitioned, such as session information), and User Role (User1, User2, User3; user information is partitioned); Asynchronous Job Server with Job Server + TypePad + Schwartz (server for executing jobs) and Schwartz DB (stores job details)]
※Grey areas are not used in current steps
32. The New TypePad configuration
[Architecture diagram]
Clients (via Internet): Blog Readers, Blog Owners (management interface), Mobile Blog Readers
Servers: Web Server, Application Server, TypeCast Server, ATOM Server, Mail Server, ADMIN(CRON) Server, Job Server
Storage: Static Content (HTML, images, etc.)
Database: MySQL
MEMCACHED: data caching servers to reduce DB load
Dedicated server for TypeCast (via ATOM)
Cron server for periodic asynchronous tasks
Job Server: TheSchwartz server for running ad-hoc jobs asynchronously
Protocols: https(443), http(80), http(80) atom api, memcached(11211), MySQL(3306), nfs(2049), smtp(25) / pop(110)
34. DB Node Spec History
History of scale up PostgreSQL server, Before DBP
Time    | OS (RedHat)   | CPU (Xeon)               | MEM  | DiskArray
2003/12 | 7.4 (2.4.9)   | 1.8GHz/512k×1            | 1GB  | No
        | ES2.1 (2.4.9) | 3.2GHz/1M×2              | 4GB  | No
        | ES2.1 (2.4.9) | 3.2GHz/1M×2              | 4GB  | Yes
        | AS2.1 (2.4.9) | 3.2GHz/1M×4              | 12GB | Yes
        | AS4 (2.6.9)   | 3.2GHz/1M×4              | 12GB | Yes
2007/11 | AS4 (2.6.9)   | MP3.3GHz/1M×4 〔2Core×4〕 | 16GB | Yes
35. DB DiskArray Spec
[FUJITSU ETERNUS8000]
Best I/O transaction performance in the world
146 GB (15 krpm) × 32 disks with RAID-10
MultiPath FibreChannel 4Gbps
QuickOPC (One Point Copy)
OPC copy functions let you create a duplicate copy
of any data from the original at any chosen time.
http://www.computers.us.fujitsu.com/www/products_storage.shtml?products/storage/fujitsu/e8000/e8000
History of scale up PostgreSQL server, Before DBP
36. Scale out MySQL servers, After DBP
A role configuration
Each role is configured as HA cluster
HA Software: NEC ClusterPro
Shared Storage
37. Scale out MySQL servers, After DBP
[Diagram: TypePad Application; PostgreSQL; MySQL Role1, MySQL Role2, MySQL Role3, …, with heart beat between cluster nodes; FibreChannel SAN; DiskArray]
38. Scale out MySQL servers, After DBP
Backup
Replication w/ Hot backup
39. Scale out MySQL servers, After DBP
[Diagram: TypePad Application; PostgreSQL; MySQL Role1, MySQL Role2, MySQL Role3, …, each running mysqld, with heart beat between cluster nodes; MySQL BackupRole running one mysqld per role, fed by replication (rep); opc copy on the DiskArray; FibreChannel SAN]
40. Troubles with PostgreSQL 7.4 – 8.1
Data size
over 100 GB
40% is index
Severe Data Fragmentation
VACUUM
"VACUUM ANALYZE" causes performance problems
VACUUM takes too long on large amounts of data
dump/restore is the only solution for de-fragmentation
Auto VACUUM
We don't use Auto VACUUM because we are worried about its effect on response latency
41. Troubles with PostgreSQL 7.4 – 8.1
Character set
PostgreSQL accepts out-of-range UTF-8 sequences from Japanese extended character sets and multibyte character sets, which should normally be rejected with an error.
42. “Cleaning” data
Removing characters that fall outside the boundaries of the UTF-8 character set.
Steps
PostgreSQL.dumpALL
Split for Piconv
UTF8 -> UCS2 -> UTF8 & Merge
PostgreSQL.restore
dump -> Split -> UTF8->UCS2->UTF8 -> Merge -> restore
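The UTF8 -> UCS2 -> UTF8 round trip works as a filter because UCS-2 can only hold code points up to U+FFFF, and invalid byte sequences cannot be converted at all. The slides use piconv for this. The Python sketch below only imitates the effect, and the function name `clean_utf8` is hypothetical. Whether the real tool dropped or substituted offending characters is not stated, so dropping them here is an assumption.

```python
def clean_utf8(raw: bytes) -> bytes:
    """Drop bytes that are not valid UTF-8 and characters beyond UCS-2."""
    # Decode, silently discarding byte sequences that are not valid UTF-8.
    text = raw.decode("utf-8", errors="ignore")
    # UCS-2 covers only U+0000..U+FFFF, so anything above is removed.
    bmp_only = "".join(ch for ch in text if ord(ch) <= 0xFFFF)
    # Round trip through UTF-16BE, which matches UCS-2 for this range.
    ucs2 = bmp_only.encode("utf-16-be")
    return ucs2.decode("utf-16-be").encode("utf-8")
```

For a dump of the size mentioned in these slides, the same function could be applied chunk by chunk (for example line by line) to mirror the split and merge steps.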
43. Migration from PostgreSQL to MySQL using TypePad script
Steps
PostgreSQL -> PerlObject & tmp publish
-> MySQL -> PerlObject & last publish
diff tmp & last Object (data check)
diff tmp & last publish (file check)
[Diagram: TypePad reads from PostgreSQL and produces Object tmp and Document tmp, then reads from MySQL and produces Object last and Document last; the Objects are compared in the data check and the Documents in the file check]
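The two checks can be pictured as a field-by-field comparison of the loaded objects plus a comparison of the published output. The actual TypePad script is in Perl and is not shown in the slides. The Python sketch below is an assumed illustration in which objects are dictionaries, published files are byte strings, and all function names are hypothetical.

```python
import hashlib

_MISSING = object()  # marks a field present on one side only


def diff_objects(tmp, last):
    """Data check: return the field names whose values differ."""
    keys = set(tmp) | set(last)
    return sorted(
        k for k in keys if tmp.get(k, _MISSING) != last.get(k, _MISSING)
    )


def digest(published: bytes) -> str:
    """Fingerprint of a published file."""
    return hashlib.sha256(published).hexdigest()


def check_migration(tmp_obj, last_obj, tmp_pub, last_pub):
    """Run both checks for one object loaded before and after the move."""
    return {
        "data_diff": diff_objects(tmp_obj, last_obj),
        "file_match": digest(tmp_pub) == digest(last_pub),
    }
```

An empty `data_diff` together with `file_match` being true means the object loaded from MySQL and its published page are identical to those produced from PostgreSQL.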
44. Troubles with MySQL
convert_tz function
does not support input values outside the range of Unix time
sort order
rows come back in a different order when the query has no "ORDER BY" clause
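The sort order issue goes away once every query that depends on row order states it explicitly. A minimal sketch, using SQLite as a stand-in and a hypothetical `entries` table and `recent_titles` function:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT)"
)
db.executemany(
    "INSERT INTO entries (id, title, created_at) VALUES (?, ?, ?)",
    [(3, "c", "2007-03-01"), (1, "a", "2007-01-01"), (2, "b", "2007-02-01")],
)


def recent_titles(limit):
    """Newest entries first, with the primary key as a tie-breaker."""
    # Without ORDER BY the database may return rows in any order, and that
    # order can change between engines, so it is spelled out here.
    rows = db.execute(
        "SELECT title FROM entries "
        "ORDER BY created_at DESC, id DESC LIMIT ?",
        (limit,),
    )
    return [title for (title,) in rows]
```

Adding a unique column as the last sort key makes the result deterministic even when several rows share the same timestamp.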
46. Consulting by
Sumisho Computer Systems Corp.
System Integrator
first and best partner of MySQL in Japan
since 2003
provides MySQL consulting, support, and training services
HA
Maintenance
online backup
Japanese character support