If you’re considering -- or planning -- a cloud migration, you may be concerned about risks to your data and your mental health. Migrations at scale are fraught with risk. You absolutely can’t lose data, compromise its integrity, or suffer downtime, so you want to be slow and careful. On the other hand, you’re paying two providers for every day the migration goes on, so you need to move as fast as possible.
Unity Technologies accumulates vast amounts of data. We recently moved our data infrastructure from Amazon Web Services (AWS) to Google Cloud Platform (GCP) as part of a major cloud migration.
To minimize risk and costs, our team used Apache Kafka and Confluent Platform, engaging Confluent Professional Services to help ensure a speedy and seamless migration. Kafka was already serving as the backbone of our data infrastructure, which handles over half a million events per second, and during the migration it also served as the bridge between AWS and GCP.
Join us at this session to learn about the processes and tools used, the challenges faced, and the lessons learned as we moved our operations and petabytes of data from AWS to GCP with zero downtime.
2. About Unity
— Millions of Creators, Billions of Gamers
— Half of Top 1000 Mobile Games
— Products include Analytics, Monetization, Crash and Performance Reporting, Asset Store, Collaborate, and Cloud Build
— Beyond Game Development:
– Automotive, Transportation & Manufacturing
– Film, Animation & Cinematics
– Architecture, Engineering & Construction
3. Apache Kafka at Unity
— In production since Kafka 0.8
— Tens of billions of events every day
— Served as the backbone of a massive cloud migration
5.
“Migrations are the only mechanism to effectively manage technical debt as your company and code grows. If you don't get effective at software and system migrations, you'll end up languishing in technical debt.”
— Will Larson, “An Elegant Puzzle: Systems of Engineering Management”
9. Why are migrations hard?
— Can we stop the world?
– Synchronizing starting state is trying to hit a moving target.
— How about double writes and double reads?
– Too much organizational complexity involved.
— Why solve the same problem twice? (or 100 times)
– Every team will need their own pipeline for their own purposes.
— Are we even speaking the same language?
– Legacy systems might not be compatible with newer cloud-based applications.
16. Event-driven architecture to the rescue
— Can we stop the world?
– Stream processing is the perfect tool to deal with changing state.
— How about double writes and double reads?
– Kafka Connect or CDC solutions can act as the bridge.
— Why solve the same problem twice? (or 100 times)
– Confluent Replicator acts as the single integration pipeline.
— Are we even speaking the same language?
– Kafka client libraries and Kafka Connect make sure every system can be connected.
18. Event-driven architecture to the rescue
— Event-driven architecture enables hybrid or multi-cloud deployments that keep operating during migrations.
— Tools for this: MirrorMaker, uReplicator, Confluent Replicator.
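At their core, all of these replication tools run a consume-and-forward loop between two clusters. The following is a minimal in-memory sketch of that idea, with topics mocked as plain Python lists rather than real Kafka clients; it is an illustration of the pattern, not how MirrorMaker or Replicator is actually implemented.

```python
# Minimal in-memory sketch of the consume-and-forward loop that
# replication tools (MirrorMaker, uReplicator, Confluent Replicator)
# implement at much larger scale. Topics here are mocked as lists;
# a real bridge would use Kafka consumer and producer clients.

def replicate(source_topic, dest_topic, last_replicated_offset):
    """Copy records the destination has not seen yet; return the new checkpoint."""
    for offset in range(last_replicated_offset + 1, len(source_topic)):
        dest_topic.append(source_topic[offset])   # preserves source ordering
    return len(source_topic) - 1

source = ["event-0", "event-1", "event-2"]        # source-side topic (mocked)
dest = []                                         # destination-side topic (mocked)

checkpoint = replicate(source, dest, -1)          # initial sync
source.append("event-3")                          # producers keep writing
checkpoint = replicate(source, dest, checkpoint)  # incremental catch-up

print(dest)  # ['event-0', 'event-1', 'event-2', 'event-3']
```

Because the loop tracks a checkpoint offset, producers can keep writing to the source cluster while the bridge catches up, which is what makes a zero-downtime cutover possible.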
20. Preparing for the migration
— You can’t overdesign
– A migration planned for the happy path will fail
– Try to uncover harder paths and edge cases as early as possible
— Pre-mortem everything
– If you think it can break, it will.
— Minimize one-way doors
– Plan a way to revert as many operations as possible
21. Preparing for the migration
— “Festina lente”
– Make haste slowly. Tooling and documentation built before kicking off the migration act as a force multiplier.
— It’s OK to ask for help
– Get aligned with other internal teams who are subject matter experts on infrastructure, networking, project management…
– Get external help when necessary: professional services, training, etc.
22. Preparing for the migration
— Make sure Kafka is sufficiently resourced on both sides
– More memory is better (at least 32 GB)
– Multiple disks (we had 8; ext4 or XFS)
– Uniform nodes
– Network… read the fallacies of distributed computing
— Use a tool to simplify Data Center Interconnect
— Install the Replicator Monitoring Extension
— Don’t trust the network. Don’t trust ZooKeeper (KIP-500)
23. Preparing for the migration
— Make sure the JVM is properly configured
— Make sure Kafka brokers and producers are configured to minimize the chance of data loss
-Xms6g -Xmx6g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80
unclean.leader.election.enable=false
default.replication.factor=3
min.insync.replicas=2
acks=all
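A quick back-of-the-envelope check shows why these settings minimize data loss. The numbers below mirror the config shown above; this is plain arithmetic, not a Kafka API.

```python
# Rough durability check for the broker/producer settings above.
replication_factor = 3   # default.replication.factor
min_insync = 2           # min.insync.replicas

# With acks=all, a write is acknowledged only once min.insync.replicas
# replicas have it, so an acked record survives this many broker losses:
survivable_failures = min_insync - 1

# A partition keeps accepting writes while at least min.insync.replicas
# replicas are alive, i.e. with this many brokers down:
writable_with_down = replication_factor - min_insync

print(survivable_failures, writable_with_down)  # 1 1
```

In other words, with these settings the cluster tolerates one broker failure without losing acknowledged data and without refusing writes; `unclean.leader.election.enable=false` additionally prevents an out-of-sync replica from becoming leader and silently discarding records.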
24. Replicator
— Prevents cyclic replication of topics
– Enables two Kafka clusters to run in active-active mode, with producing and consuming on both sides
— Timestamp preservation
– Replicator preserves the timestamp of each message from the source cluster on the target cluster
— Consumer offset translation
– Replicator automatically translates offsets using timestamps, so consumers can start consuming data in the destination cluster where they left off in the origin cluster.
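The idea behind timestamp-based offset translation can be sketched in a few lines. Offsets differ between clusters, but because timestamps are preserved, a consumer can resume at the first destination record whose timestamp is at least the timestamp of its last committed source record. This is a hedged illustration of the concept, not Replicator's actual implementation.

```python
# Sketch of timestamp-based offset translation. Offsets are not
# comparable across clusters, but preserved timestamps are, so we
# binary-search the destination partition for the resume position.
import bisect

def translate_offset(dest_timestamps, source_commit_ts):
    """dest_timestamps: record timestamps for one destination partition,
    in ascending offset order (list index == offset)."""
    return bisect.bisect_left(dest_timestamps, source_commit_ts)

# Destination partition timestamps (ms); offsets are the list indices.
dest_ts = [100, 105, 105, 120, 130]

print(translate_offset(dest_ts, 105))  # 1 -> resume at offset 1
print(translate_offset(dest_ts, 121))  # 4 -> resume at offset 4
```

Resuming at the first record with an equal-or-later timestamp may reprocess a few records when timestamps tie, but it never skips data, which is the right trade-off for a migration.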
25. Finalizing the preparation
— By this point:
– If you don’t have a good idea of what the objectives and timelines look like (total migration count, etc.), go back to planning. You can’t hit a target you can’t see.
– If a majority of teams are not willing to prioritize the migration, go back to clarifying the objectives or reframing the discussion.
– If a team cannot run their migration using the documentation and self-serve tooling provided by the migration owner, go back to improving them.
26. Running the migration
— The best migration is one you don’t have to do
– If the preparation steps went well, most simple migrations can be automated by this point
— Track the migration separately from the team’s regular workload
– Separate board, separate meetings, separate reporting, a DRI, etc.
– The only progress that counts is completed migrations
– Report progress as “completed migrations / total migrations”
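The single recommended metric is simple enough to compute anywhere. A tiny sketch, with made-up team names:

```python
# "Completed migrations / total migrations" is the only progress metric.
# Team names below are hypothetical examples.
migrations = {
    "analytics-pipeline": True,   # done
    "crash-reporting": True,      # done
    "asset-store": False,         # in progress
    "cloud-build": False,         # not started
}

completed = sum(migrations.values())
total = len(migrations)
print(f"{completed}/{total} migrations complete ({completed / total:.0%})")
# -> 2/4 migrations complete (50%)
```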
27. Running the migration
— Thresholds and alerts are your friends
– Link pagers with Control Center
— Remember that you’re paying for both sides during the migration
– Migrations are only successful at 100%
— Wrap it up
– In edge cases where automated or self-serve tooling is not enough and the service team can’t prioritize the migration, step in and push it over the finish line
28. Finishing the migration
— Recognize and celebrate
— Be diligent in shutting down the old infrastructure (remember one-way doors)
— Start tracking new tech debt caused by shortcuts taken during the migration
29. Wrap-up
— Plan deeply. Resources and configuration require careful thought.
— Go slow to go fast.
— Automation, self-serve tooling, and documentation will lead to success.
— Replicator enables many use cases.
— Migrations are only successful at 100%.