This document discusses migrating data from MySQL to Amazon Redshift. It describes MySQL and Redshift, and some of the challenges of migrating between the two systems, such as incompatible schemas and manual processes. The proposed solution is to use a cloud data lake with schema-on-read to store JSON event data, which can then be loaded into Redshift, a cloud data warehouse with schema-on-write, providing an automated way to migrate data between different systems and schemas.
4. What’s MySQL?
• Open-source relational database system
• World’s third most widely used RDBMS
• >100 million installations
• Part of the LAMP stack
5. What’s Redshift?
• Massively Parallel Processing (MPP) database
• Cloud-based, pay as you go
• Migrate to Redshift from:
  - On-premises data warehouses
  - Sharded MySQL/PostgreSQL
10. What are a few migration challenges?
REDSHIFT
1. STORAGE?
2. SCHEMA COMPATIBILITY?
3. AUTOMATION?
11. Manually exporting MySQL to Redshift
1. Create a Redshift cluster
2. Export the MySQL data and split it into multiple files
3. Upload the load files to Amazon S3
4. Run a CREATE TABLE command
5. Run a COPY command to load the table
6. Verify that the data was loaded correctly
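The split-and-load steps of the manual process can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the sample rows, the `users` table, the `example-bucket` S3 prefix, and the IAM role ARN are all hypothetical, and the sketch only builds the file contents and the COPY statement without connecting to MySQL, S3, or Redshift.

```python
import csv
import io


def split_rows(rows, num_parts):
    """Distribute rows round-robin into num_parts lists (one per load file)."""
    parts = [[] for _ in range(num_parts)]
    for i, row in enumerate(rows):
        parts[i % num_parts].append(row)
    return parts


def to_csv(rows):
    """Serialize a list of rows to CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def build_copy_sql(table, s3_prefix, iam_role):
    """Build a Redshift COPY statement.

    COPY treats the S3 path as a key prefix, so one statement
    loads every file whose key starts with s3_prefix.
    """
    return (
        f"COPY {table} FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV;"
    )


# Hypothetical rows standing in for a MySQL export.
rows = [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave"), (5, "erin")]

# Step 2: split the export into multiple files.
parts = split_rows(rows, 2)
files = {f"users.csv.part{n}": to_csv(p) for n, p in enumerate(parts)}

# Step 5: one COPY statement covers all uploaded parts (names are assumptions).
copy_sql = build_copy_sql(
    "users",
    "s3://example-bucket/load/users.csv.part",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
```

Splitting the export lets Redshift load the parts in parallel, which is why the manual procedure calls for multiple files rather than one large dump.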
12. What are the solutions?
Scheduling
Cloud storage
Schema on read
13. Solution: Cloud Data Lake + Redshift
JSON Event Data → Cloud Data Lake (schema-on-read) → Cloud Data Warehouse (schema-on-write)
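The difference between schema-on-read and schema-on-write can be illustrated with a short Python sketch. It assumes the data lake holds newline-delimited JSON events with no enforced schema, and that the warehouse table has a fixed set of typed columns; the event fields, `TARGET_SCHEMA`, and `read_with_schema` are hypothetical names chosen for illustration.

```python
import json

# Hypothetical raw events as they might sit in the data lake:
# one JSON object per line, fields vary from event to event.
RAW_EVENTS = """\
{"user_id": 1, "event": "login", "ts": "2016-01-01T00:00:00Z"}
{"user_id": 2, "event": "purchase", "ts": "2016-01-01T00:05:00Z", "amount": 9.99}
{"user_id": "3", "event": "logout"}
"""

# Hypothetical warehouse table definition (schema-on-write):
# column name -> function that casts a raw value to the column type.
TARGET_SCHEMA = {"user_id": int, "event": str, "ts": str, "amount": float}


def read_with_schema(lines, schema):
    """Apply the schema while reading: cast known columns, fill gaps with None."""
    rows = []
    for line in lines.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        row = tuple(
            cast(record[col]) if record.get(col) is not None else None
            for col, cast in schema.items()
        )
        rows.append(row)
    return rows


# Rows now match the fixed column order and types the warehouse expects.
rows = read_with_schema(RAW_EVENTS, TARGET_SCHEMA)
```

The lake accepts the events as they arrive, including the missing `ts` and the string-typed `user_id` in the third event; the schema is imposed only when the data is read out for loading into the warehouse.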