The document describes the steps to migrate a legacy movie ratings application to a serverless architecture using Apache Kafka and KSQL. The legacy application uses a monolithic, database-centric design to calculate average movie ratings. The migration plan involves: 1) Using Kafka Connect to extract rating data to Kafka topics, 2) Setting up a Confluent Cloud Kafka cluster, 3) Replicating data to the cloud cluster, 4) Processing data with KSQL queries, 5) Building microservices powered by KSQL output, and 6) Decommissioning the monolith once migration is complete.
3. Our Movie Rating Service, Legacy Edition
• Users rate movies, ratings go into Kafka
• Monolithic, database-centric application calculates averages
• Serves them to users through a web UI and API
[Diagram labels: Moviegoers, Movie Ratings, Users, Movies, Top Rated Movies, My Favorite Movie]
4. Kafka has served us well
• Decouples event input from processing
• Easily understood abstraction for event processing
• Not exactly a pleasure to operate, but we can’t complain
6. Our Monolith’s Problems
• It will do a bad job managing complexity as our service grows
• The Kafka Consumer code is bespoke
• It is a textbook pre-cloud architecture
• We cannot trivially scale to larger message volumes
8. Our Refactoring Plan
• Capture Users and Movies as Kafka topics
• Migrate all topics to Confluent Cloud using Confluent Replicator
• Refactor monolith to microservices
• Keep web UI nearly untouched
• Never touch the on-prem system until the migration is complete
9. Step One: Fewer Databases
• Use Kafka Connect to extract Users and Movies tables to Kafka topics
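One way to picture this step is as a Kafka Connect source connector definition. The sketch below builds the JSON payload that could be submitted to the Kafka Connect REST API, assuming the Confluent JDBC source connector is the one in use (the talk does not say). The connector name, connection URL, table names, `id` column, and topic prefix are all hypothetical placeholders, and the property names should be checked against the connector version you run.

```python
import json


def jdbc_source_config(name, connection_url, tables, topic_prefix):
    """Build a Kafka Connect payload for a JDBC source connector.

    Every argument value used below is a placeholder for illustration.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": connection_url,
            # Only copy the tables the monolith depends on.
            "table.whitelist": ",".join(tables),
            # "incrementing" polls for rows with a higher id than the last
            # one seen, so it captures inserts but not in-place updates.
            "mode": "incrementing",
            # Assumed primary key column name (hypothetical).
            "incrementing.column.name": "id",
            # Each table lands in a topic named <prefix><table>.
            "topic.prefix": topic_prefix,
        },
    }


payload = jdbc_source_config(
    name="movies-users-source",
    connection_url="jdbc:postgresql://db.example.internal:5432/movies",
    tables=["users", "movies"],
    topic_prefix="legacy-",
)
print(json.dumps(payload, indent=2))
```

With these placeholder values, the tables would be written to topics named `legacy-users` and `legacy-movies`.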
[Diagram labels: Moviegoers, Movie Ratings, Movies, Users, Kafka Connect]
10. Step Two: Spin up a Confluent Cloud cluster
• We want to get out of the business of managing Kafka ourselves
11. Step Three: Deploy Confluent Replicator
• Use Confluent Replicator to copy the Movie Ratings, Movies, and Users topics to the Confluent Cloud cluster
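Replicator itself runs as a Kafka Connect source connector, so this step can also be sketched as a connector definition. The example below is a minimal sketch under assumptions: the bootstrap addresses and topic names are hypothetical, and the security settings that a Confluent Cloud destination requires are deliberately left out.

```python
import json


def replicator_config(name, src_bootstrap, dest_bootstrap, topics):
    """Build a Kafka Connect payload for Confluent Replicator.

    Hostnames and topic names passed in below are placeholders.
    """
    # Replicator copies raw bytes, so keys and values are not re-serialized.
    converter = "io.confluent.connect.replicator.util.ByteArrayConverter"
    return {
        "name": name,
        "config": {
            "connector.class": (
                "io.confluent.connect.replicator.ReplicatorSourceConnector"
            ),
            "src.kafka.bootstrap.servers": src_bootstrap,
            "dest.kafka.bootstrap.servers": dest_bootstrap,
            "topic.whitelist": ",".join(topics),
            "key.converter": converter,
            "value.converter": converter,
            # Credentials and TLS/SASL settings for the cloud cluster are
            # omitted here; a real deployment needs them.
        },
    }


replicator = replicator_config(
    name="onprem-to-cloud",
    src_bootstrap="kafka.onprem.example:9092",
    dest_bootstrap="cluster.cloud.example:9092",
    topics=["ratings", "movies", "users"],
)
print(json.dumps(replicator, indent=2))
```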
[Diagram labels: Movie Ratings, Movies, and Users shown twice, with three Replicator instances between them]
12. Step Four: Convert to KSQL
• Bespoke Consumer code implements non-differentiated functionality
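-- Continuously maintained table of average rating and rating count per title.
CREATE TABLE movie_ratings AS
  SELECT title,
         SUM(rating)/COUNT(rating) AS avg_rating,
         COUNT(rating) AS num_ratings
  FROM ratings
  -- Look up each rating's title in the movies table.
  LEFT OUTER JOIN movies
    ON ratings.movie_id = movies.movie_id
  GROUP BY title;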
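To make the semantics of this query concrete, here is a plain-Python model of the same join and aggregation. It is not KSQL and not the speaker's code; the sample data is invented, and the function processes a finite list where KSQL would update the table continuously as ratings arrive.

```python
from collections import defaultdict


def movie_ratings(ratings, movies):
    """Model of the query: join ratings to titles, then aggregate per title.

    ratings: iterable of (movie_id, rating) pairs
    movies:  dict mapping movie_id -> title
    """
    totals = defaultdict(lambda: [0, 0])  # title -> [sum, count]
    for movie_id, rating in ratings:
        # Left join: a rating for an unknown movie gets title None.
        # How KSQL itself treats a null grouping key is not modelled here.
        title = movies.get(movie_id)
        totals[title][0] += rating
        totals[title][1] += 1
    # Python's "/" always returns a float; in KSQL the result type of
    # SUM(rating)/COUNT(rating) depends on the type of the rating column.
    return {
        title: {"avg_rating": total / count, "num_ratings": count}
        for title, (total, count) in totals.items()
    }


# Invented sample data for illustration.
movies = {1: "Die Hard", 2: "Tree of Life"}
ratings = [(1, 8), (1, 9), (2, 6)]
table = movie_ratings(ratings, movies)
print(table)
```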
13. Step Four: Convert to KSQL
• The rating averaging query
[Diagram labels: Movie Ratings, Movies, Users, KSQL ("magic goes here"), Rated Movies]
14. Step Four: Convert to KSQL
• The user favorite query
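The slides do not show the user favorite query, so the following is only a guess at its intent, written as a plain-Python model rather than KSQL. It assumes a user's favorite is the movie that user rated highest, with ties going to the movie rated first; that definition and the sample data are assumptions, not something the talk states.

```python
def user_favorites(ratings, movies):
    """Return each user's favorite title under the assumed definition.

    ratings: iterable of (user_id, movie_id, rating) triples
    movies:  dict mapping movie_id -> title
    """
    best = {}  # user_id -> (rating, movie_id) of the highest rating so far
    for user_id, movie_id, rating in ratings:
        # Strictly greater: on a tie, the earlier rating stays the favorite.
        if user_id not in best or rating > best[user_id][0]:
            best[user_id] = (rating, movie_id)
    return {user: movies.get(movie_id) for user, (_, movie_id) in best.items()}


# Invented sample data for illustration.
movies = {1: "Die Hard", 2: "Tree of Life"}
ratings = [("ann", 1, 7), ("ann", 2, 9), ("bob", 1, 10), ("ann", 1, 9)]
favorites = user_favorites(ratings, movies)
print(favorites)
```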
[Diagram labels: Movie Ratings, Movies, Users, KSQL ("magic goes here"), Rated Movies, more KSQL ("magic goes here"), User Favorites]
15. Step Five: Extract the rating average service
• Now serve rating averages from KSQL output
• Monolith no longer serves these results
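A service like this can be sketched as a small in-memory view that applies each update from the KSQL output table and answers lookups from it. The class and method names below are hypothetical, and the updates are fed from a list; a real service would read them from the table's backing Kafka topic with a consumer.

```python
class RatingAveragesView:
    """In-memory copy of the rating averages table, keyed by title."""

    def __init__(self):
        self._rows = {}

    def apply(self, title, row):
        """Apply one table update; a row of None deletes the entry."""
        if row is None:
            self._rows.pop(title, None)
        else:
            # A later update for the same title replaces the earlier one.
            self._rows[title] = row

    def get(self, title):
        """Return the latest row for a title, or None if unknown."""
        return self._rows.get(title)

    def top(self, n):
        """Return the n titles with the highest average rating."""
        ranked = sorted(
            self._rows.items(),
            key=lambda item: item[1]["avg_rating"],
            reverse=True,
        )
        return ranked[:n]


# Invented updates for illustration.
view = RatingAveragesView()
view.apply("Die Hard", {"avg_rating": 8.0, "num_ratings": 1})
view.apply("Tree of Life", {"avg_rating": 6.0, "num_ratings": 1})
view.apply("Die Hard", {"avg_rating": 8.5, "num_ratings": 2})
print(view.get("Die Hard"))
print(view.top(1))
```

Because the view holds only the latest row per title, the web UI and API can read from it without touching the monolith's database.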
[Diagram labels: Rating Averages service, Rated Movies]
16. Step Six: Extract the user favorite service
• Now serve user favorites from KSQL output
• Monolith no longer serves these results
[Diagram labels: User Favorites service, User Favorites]
17. Step Seven: Stand down the monolith
• All results are now served from KSQL output
• Monolith no longer serves any results
[Diagram labels: Moviegoers, Movie Ratings, Movies, Users, KSQL ("so much KSQL magic"), Rated Movies, User Favorites, and the Rated Movies and User Favorites services]
18. Step Eight: Stand down Replicator
• All data is in Confluent Cloud now
• Keep Replicator running only for a hybrid on-prem/cloud deployment