SlideShare a Scribd company logo
1 of 22
Download to read offline
Live migrating a container:
pros, cons and gotchas
Pavel Emelyanov
Principal engineer @ Virtuozzo
AgendaAgenda
• Why you might want to live migrate a container
• Why (and how) to avoid live migration
• Why is container live migration so complex
2
Migration in a nutshelMigration in a nutshel
• Save state
• Copy state
• Restore from state
3
Why you might want to live migrate a containerWhy you might want to live migrate a container
• Spectacular
• Load balancing
• Updating kernel
– Can avoid live migration, just C/R
• Updaring or replacing hardware
4
Why to avoid live migrationWhy to avoid live migration
5
How to avoid live migrationHow to avoid live migration
• Balance network traffic
• Microservices
• Crash-driven updates
• Planned downtime
6
Making live migration liveMaking live migration live
• State saving, transfering and restoring happens with tasks frozen
• (Big) memory transfer should not be done at that time
• Memory pre-copy
• Memory post-copy
7
Pre-copyPre-copy
• Track memory changes,
copy memory while tasks are running, goto again
• Pros:
– Safe: once migrated, source node can disappear
• Cons:
– Unpredictable: iterations may take long
– Non-guaranteed: “dirty” memory next round may remain big
8
Post-copyPost-copy
• Migrate all but memory, turn on “network swap” on destination
• Pros:
– Predictable: time to migrate can be well estimated
• Cons:
– Unsafe: src node death means death of container on destination
9
Live migration at lengthLive migration at length
• Memory pre-copy (iteratively, optional)
• Freeze + Save state
• Copy state
• Restore from state + Unfreeze and resume
• Memory post-copy (optional)
10
GotchasGotchas
11
VS
Things to work withThings to work with
• VM
– Environment: virtual hardware, paravirt
– CPU
– Memory
• Container
– Environment: cgroups, namespaces
– Processes and other animals
– Memory
12
Memory pre-copyMemory pre-copy
• VM
– All memory at hands
– Plain address space
• Container
– Memory
●
is scatered over the processes
●
can be (or can be not) shared
●
can be (or can be not) mapped to disk files
13
Save stateSave state
• VM
– Hardware state
●
Tree of ~100 objects
●
Fixed amount of data per each
• Container
– State of all objects
●
Graph of up to ~1000 objects
●
All have different amount of data, different reading API
14
Restore from stateRestore from state
• VM
– Copy memory in place, write state into devices
• Container
– Creation of many small objects
– Not all have sane API for creation
●
Creation sequence can be non-trivial
15
Memory post-copyMemory post-copy
• UserfaultFD from Andrea Archangeli
• VM
– Merged into 4.2
• Container
– Non-cooperative work of uffd monitor and client,
need further patching
16
And we also need this, this and this!And we also need this, this and this!
• Check for CPUs compatibility
• Check and load necessary kernel modules (iptables, filesystems)
• Non-shared filesystem should be copied
• Roll-back on source node if something fails in between
– Keep tasks frozen after dump, kill after restore
17
ImplementationImplementation
• CRIU
– Save & restore state
– Memory pre/post copy
• P.Haul
– Checks
– Orchestrate all C/R steps
– Deal with filesystem
18
P.Haul goalsP.Haul goals
• Provide engine for containers live miration using CRIU
• Perform necessary pre-checks (e.g. CPU compatibility)
• Organize memory pre-copy and/or post-copy
• Take care of file-system migration (if needed)
19
Under the hoodUnder the hood
20
CRIU CRIUp.haul p.hauldocker -d docker -d
migrate
src dst
check (CPUs, kernels)
pre-dump
memory
dump
other images
restore
memory
lazy mem
FS
FS copy
done
pre-copypost-copy
kill
freeze
time
More infoMore info
• http://criu.org
• http://criu.org/P.Haul
• criu@openvz.org
• +CriuOrg / @__criu__
• https://github.com/xemul/(criu|p.haul)
21
Thank you!
Pavel Emelyanov
@__criu__
xemul@openvz.org

More Related Content

What's hot

バックアップとリストアの基礎
バックアップとリストアの基礎バックアップとリストアの基礎
バックアップとリストアの基礎
Kazuki Takai
 
What is Virtualization
What is VirtualizationWhat is Virtualization
What is Virtualization
Israel Marcus
 

What's hot (20)

HTTP/3 시대의 웹 성능 최적화 기술 이해하기
HTTP/3 시대의 웹 성능 최적화 기술 이해하기HTTP/3 시대의 웹 성능 최적화 기술 이해하기
HTTP/3 시대의 웹 성능 최적화 기술 이해하기
 
NDC 2018 디지털 신원 확인 101
NDC 2018 디지털 신원 확인 101NDC 2018 디지털 신원 확인 101
NDC 2018 디지털 신원 확인 101
 
1.Introduction to virtualization
1.Introduction to virtualization1.Introduction to virtualization
1.Introduction to virtualization
 
Kubernetesのワーカーノードを自動修復するために必要だったこと
Kubernetesのワーカーノードを自動修復するために必要だったことKubernetesのワーカーノードを自動修復するために必要だったこと
Kubernetesのワーカーノードを自動修復するために必要だったこと
 
(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++
 
バックアップとリストアの基礎
バックアップとリストアの基礎バックアップとリストアの基礎
バックアップとリストアの基礎
 
Hyper v ネットワークの基本
Hyper v ネットワークの基本Hyper v ネットワークの基本
Hyper v ネットワークの基本
 
Converting Scene Data to DOTS – Unite Copenhagen 2019
Converting Scene Data to DOTS – Unite Copenhagen 2019Converting Scene Data to DOTS – Unite Copenhagen 2019
Converting Scene Data to DOTS – Unite Copenhagen 2019
 
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
 
쉽고 강력한 모바일 백엔드 Parse-server
쉽고 강력한 모바일 백엔드 Parse-server쉽고 강력한 모바일 백엔드 Parse-server
쉽고 강력한 모바일 백엔드 Parse-server
 
Virtualization
VirtualizationVirtualization
Virtualization
 
What is Virtualization
What is VirtualizationWhat is Virtualization
What is Virtualization
 
virtualization-vs-containerization-paas
virtualization-vs-containerization-paasvirtualization-vs-containerization-paas
virtualization-vs-containerization-paas
 
OpenStack を 拡張する NetApp Unified Driver の使い方 Vol.001
OpenStack を 拡張する NetApp Unified Driver の使い方 Vol.001OpenStack を 拡張する NetApp Unified Driver の使い方 Vol.001
OpenStack を 拡張する NetApp Unified Driver の使い方 Vol.001
 
프레임레이트 향상을 위한 공간분할 및 오브젝트 컬링 기법
프레임레이트 향상을 위한 공간분할 및 오브젝트 컬링 기법프레임레이트 향상을 위한 공간분할 및 오브젝트 컬링 기법
프레임레이트 향상을 위한 공간분할 및 오브젝트 컬링 기법
 
Microsoft Windows Server 2012 R2 Hyper V server overview
Microsoft Windows Server 2012 R2 Hyper V server overviewMicrosoft Windows Server 2012 R2 Hyper V server overview
Microsoft Windows Server 2012 R2 Hyper V server overview
 
CI/CDって何が良いの?〜言うてるオレもわからんわ〜 #DevKan
CI/CDって何が良いの?〜言うてるオレもわからんわ〜 #DevKanCI/CDって何が良いの?〜言うてるオレもわからんわ〜 #DevKan
CI/CDって何が良いの?〜言うてるオレもわからんわ〜 #DevKan
 
A split screen-viable UI event system - Unite Copenhagen 2019
A split screen-viable UI event system - Unite Copenhagen 2019A split screen-viable UI event system - Unite Copenhagen 2019
A split screen-viable UI event system - Unite Copenhagen 2019
 
What's Coming In CloudStack 4.18
What's Coming In CloudStack 4.18What's Coming In CloudStack 4.18
What's Coming In CloudStack 4.18
 
今さら聞けない人のためのKubernetes超入門
今さら聞けない人のためのKubernetes超入門今さら聞けない人のためのKubernetes超入門
今さら聞けない人のためのKubernetes超入門
 

Viewers also liked

Systems Migration
Systems MigrationSystems Migration
Systems Migration
richchihlee
 

Viewers also liked (7)

Large Scale Migration from WebLogic to JBoss
Large Scale Migration from WebLogic to JBossLarge Scale Migration from WebLogic to JBoss
Large Scale Migration from WebLogic to JBoss
 
Systems Migration
Systems MigrationSystems Migration
Systems Migration
 
T44u 2015, content migration
T44u 2015, content migrationT44u 2015, content migration
T44u 2015, content migration
 
Modular Enterprise Systems - An Introduction
Modular Enterprise Systems - An IntroductionModular Enterprise Systems - An Introduction
Modular Enterprise Systems - An Introduction
 
Seminar - JBoss Migration
Seminar - JBoss MigrationSeminar - JBoss Migration
Seminar - JBoss Migration
 
A Roadmap to Data Migration Success
A Roadmap to Data Migration SuccessA Roadmap to Data Migration Success
A Roadmap to Data Migration Success
 
Preparing a data migration plan: A practical guide
Preparing a data migration plan: A practical guidePreparing a data migration plan: A practical guide
Preparing a data migration plan: A practical guide
 

Similar to Live migrating a container: pros, cons and gotchas

Live migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel EmelyanovLive migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel Emelyanov
OpenVZ
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
Enkitec
 
NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5
UniFabric
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
Enkitec
 
Buiding a better Userspace - The current and future state of QEMU and KVM int...
Buiding a better Userspace - The current and future state of QEMU and KVM int...Buiding a better Userspace - The current and future state of QEMU and KVM int...
Buiding a better Userspace - The current and future state of QEMU and KVM int...
aliguori
 
Overview of sheepdog
Overview of sheepdogOverview of sheepdog
Overview of sheepdog
Liu Yuan
 
Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemaker
kuchinskaya
 

Similar to Live migrating a container: pros, cons and gotchas (20)

Live migrating a container: pros, cons and gotchas -- Pavel Emelyanov
Live migrating a container: pros, cons and gotchas -- Pavel EmelyanovLive migrating a container: pros, cons and gotchas -- Pavel Emelyanov
Live migrating a container: pros, cons and gotchas -- Pavel Emelyanov
 
Live migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel EmelyanovLive migration: pros, cons and gotchas -- Pavel Emelyanov
Live migration: pros, cons and gotchas -- Pavel Emelyanov
 
CRIU: Time and Space Travel for Linux Containers
CRIU: Time and Space Travel for Linux ContainersCRIU: Time and Space Travel for Linux Containers
CRIU: Time and Space Travel for Linux Containers
 
CRIU: time and space travel for Linux containers -- Kir Kolyshkin
CRIU: time and space travel for Linux containers -- Kir KolyshkinCRIU: time and space travel for Linux containers -- Kir Kolyshkin
CRIU: time and space travel for Linux containers -- Kir Kolyshkin
 
Docking postgres
Docking postgresDocking postgres
Docking postgres
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
 
NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5
 
The Proper Care and Feeding of a MySQL Database for Busy Linux Admins -- SCaL...
The Proper Care and Feeding of a MySQL Database for Busy Linux Admins -- SCaL...The Proper Care and Feeding of a MySQL Database for Busy Linux Admins -- SCaL...
The Proper Care and Feeding of a MySQL Database for Busy Linux Admins -- SCaL...
 
Proper Care and Feeding of a MySQL Database for Busy Linux Administrators
Proper Care and Feeding of a MySQL Database for Busy Linux AdministratorsProper Care and Feeding of a MySQL Database for Busy Linux Administrators
Proper Care and Feeding of a MySQL Database for Busy Linux Administrators
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
 
Spil Storage Platform (Erlang) @ EUG-NL
Spil Storage Platform (Erlang) @ EUG-NLSpil Storage Platform (Erlang) @ EUG-NL
Spil Storage Platform (Erlang) @ EUG-NL
 
Buiding a better Userspace - The current and future state of QEMU and KVM int...
Buiding a better Userspace - The current and future state of QEMU and KVM int...Buiding a better Userspace - The current and future state of QEMU and KVM int...
Buiding a better Userspace - The current and future state of QEMU and KVM int...
 
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
 
Overview of sheepdog
Overview of sheepdogOverview of sheepdog
Overview of sheepdog
 
Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemaker
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
Lec 9-os-review
Lec 9-os-reviewLec 9-os-review
Lec 9-os-review
 
oVirt 3.5 Storage Features Overview
oVirt 3.5 Storage Features OverviewoVirt 3.5 Storage Features Overview
oVirt 3.5 Storage Features Overview
 
Magento Imagine 2015 - Aspirin For Your MySQL Headaches
Magento Imagine 2015 - Aspirin For Your MySQL HeadachesMagento Imagine 2015 - Aspirin For Your MySQL Headaches
Magento Imagine 2015 - Aspirin For Your MySQL Headaches
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage Migration
 

More from Docker, Inc.

Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
Docker, Inc.
 

More from Docker, Inc. (20)

Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience Containerize Your Game Server for the Best Multiplayer Experience
Containerize Your Game Server for the Best Multiplayer Experience
 
How to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker BuildHow to Improve Your Image Builds Using Advance Docker Build
How to Improve Your Image Builds Using Advance Docker Build
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
 
Securing Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINXSecuring Your Containerized Applications with NGINX
Securing Your Containerized Applications with NGINX
 
How To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and ComposeHow To Build and Run Node Apps with Docker and Compose
How To Build and Run Node Apps with Docker and Compose
 
Hands-on Helm
Hands-on Helm Hands-on Helm
Hands-on Helm
 
Distributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at SalesforceDistributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at Salesforce
 
The First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker HubThe First 10M Pulls: Building The Official Curl Image for Docker Hub
The First 10M Pulls: Building The Official Curl Image for Docker Hub
 
Monitoring in a Microservices World
Monitoring in a Microservices WorldMonitoring in a Microservices World
Monitoring in a Microservices World
 
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
COVID-19 in Italy: How Docker is Helping the Biggest Italian IT Company Conti...
 
Predicting Space Weather with Docker
Predicting Space Weather with DockerPredicting Space Weather with Docker
Predicting Space Weather with Docker
 
Become a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio CodeBecome a Docker Power User With Microsoft Visual Studio Code
Become a Docker Power User With Microsoft Visual Studio Code
 
How to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container RegistryHow to Use Mirroring and Caching to Optimize your Container Registry
How to Use Mirroring and Caching to Optimize your Container Registry
 
Monolithic to Microservices + Docker = SDLC on Steroids!
Monolithic to Microservices + Docker = SDLC on Steroids!Monolithic to Microservices + Docker = SDLC on Steroids!
Monolithic to Microservices + Docker = SDLC on Steroids!
 
Kubernetes at Datadog Scale
Kubernetes at Datadog ScaleKubernetes at Datadog Scale
Kubernetes at Datadog Scale
 
Labels, Labels, Labels
Labels, Labels, Labels Labels, Labels, Labels
Labels, Labels, Labels
 
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment ModelUsing Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
Using Docker Hub at Scale to Support Micro Focus' Delivery and Deployment Model
 
Build & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWSBuild & Deploy Multi-Container Applications to AWS
Build & Deploy Multi-Container Applications to AWS
 
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration S...
 
Developing with Docker for the Arm Architecture
Developing with Docker for the Arm ArchitectureDeveloping with Docker for the Arm Architecture
Developing with Docker for the Arm Architecture
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Live migrating a container: pros, cons and gotchas

  • 1. Live migrating a container: pros, cons and gotchas Pavel Emelyanov Principal engineer @ Virtuozzo
  • 2. AgendaAgenda • Why you might want to live migrate a container • Why (and how) to avoid live migration • Why is container live migration so complex 2
  • 3. Migration in a nutshelMigration in a nutshel • Save state • Copy state • Restore from state 3
  • 4. Why you might want to live migrate a containerWhy you might want to live migrate a container • Spectacular • Load balancing • Updating kernel – Can avoid live migration, just C/R • Updaring or replacing hardware 4
  • 5. Why to avoid live migrationWhy to avoid live migration 5
  • 6. How to avoid live migrationHow to avoid live migration • Balance network traffic • Microservices • Crash-driven updates • Planned downtime 6
  • 7. Making live migration liveMaking live migration live • State saving, transfering and restoring happens with tasks frozen • (Big) memory transfer should not be done at that time • Memory pre-copy • Memory post-copy 7
  • 8. Pre-copyPre-copy • Track memory changes, copy memory while tasks are running, goto again • Pros: – Safe: once migrated, source node can disappear • Cons: – Unpredictable: iterations may take long – Non-guaranteed: “dirty” memory next round may remain big 8
  • 9. Post-copyPost-copy • Migrate all but memory, turn on “network swap” on destination • Pros: – Predictable: time to migrate can be well estimated • Cons: – Unsafe: src node death means death of container on destination 9
  • 10. Live migration at lengthLive migration at length • Memory pre-copy (iteratively, optional) • Freeze + Save state • Copy state • Restore from state + Unfreeze and resume • Memory post-copy (optional) 10
  • 12. Things to work withThings to work with • VM – Environment: virtual hardware, paravirt – CPU – Memory • Container – Environment: cgroups, namespaces – Processes and other animals – Memory 12
  • 13. Memory pre-copyMemory pre-copy • VM – All memory at hands – Plain address space • Container – Memory ● is scatered over the processes ● can be (or can be not) shared ● can be (or can be not) mapped to disk files 13
  • 14. Save stateSave state • VM – Hardware state ● Tree of ~100 objects ● Fixed amount of data per each • Container – State of all objects ● Graph of up to ~1000 objects ● All have different amount of data, different reading API 14
  • 15. Restore from stateRestore from state • VM – Copy memory in place, write state into devices • Container – Creation of many small objects – Not all have sane API for creation ● Creation sequence can be non-trivial 15
  • 16. Memory post-copyMemory post-copy • UserfaultFD from Andrea Archangeli • VM – Merged into 4.2 • Container – Non-cooperative work of uffd monitor and client, need further patching 16
  • 17. And we also need this, this and this!And we also need this, this and this! • Check for CPUs compatibility • Check and load necessary kernel modules (iptables, filesystems) • Non-shared filesystem should be copied • Roll-back on source node if something fails in between – Keep tasks frozen after dump, kill after restore 17
  • 18. ImplementationImplementation • CRIU – Save & restore state – Memory pre/post copy • P.Haul – Checks – Orchestrate all C/R steps – Deal with filesystem 18
  • 19. P.Haul goalsP.Haul goals • Provide engine for containers live miration using CRIU • Perform necessary pre-checks (e.g. CPU compatibility) • Organize memory pre-copy and/or post-copy • Take care of file-system migration (if needed) 19
  • 20. Under the hoodUnder the hood 20 CRIU CRIUp.haul p.hauldocker -d docker -d migrate src dst check (CPUs, kernels) pre-dump memory dump other images restore memory lazy mem FS FS copy done pre-copypost-copy kill freeze time
  • 21. More infoMore info • http://criu.org • http://criu.org/P.Haul • criu@openvz.org • +CriuOrg / @__criu__ • https://github.com/xemul/(criu|p.haul) 21