SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
The hardest thing
in computer science
Hard things
Docker Caching
Dependency versions
Install dependencies
[ 20 minutes or so ]
Only here copy all sources
Intended behaviour
● No change:
docker is not rebuilt - LIGHTNING FAST!!!!
● Sources change/dependencies not:
only sources are added - QUITE FAST !!!
● Dependencies change:
dependencies installed, sources - LITTLE SLOWER !!
Actual behaviour
same machine - local checkout
● Local docker registry
● Repeated build: 1:06m
● Only sources: 1:30m
● Dependencies: 11m
● Whole build: ~ 30m
CI case
● Always fresh machine
○ no code
○ no registry
● Git clone/checkout
● Build
● Wipeout
Docker registry to the rescue!
Build cache:
● Docker build
● Docker push airflow/airflow:latest
Use cache:
● Docker pull airflow/airflow:latest
● docker build --cache-from ariflow/airflow:latest
Actual behaviour
Docker Hub automated build
● DockerHub docker registry as cache
● Repeated build: 11m
● Only sources: 11m <- Still OK
● Dependencies: ~1h
● Whole build: ~ 2h
Using the cache in Travis CI
● Docker Hub builds are slow
● Travis or Cloud Build use earlier image
with --cache-from
● But only sources change most of the
time
Actual BAD behaviour
Travis CI automated build
● Build on Travis with cache from DockerHub
● Repeated build: 11m
● Only sources: 1 h <-
● Dependencies: 1h
● Whole build: ~ 2h
Problem no 1
Git & permissions
● git clone file creation:
○ local user
○ default user’s group
● file/dir permissions (rwxs)
○ preserves user, group and other rx permissions files & dirs
○ does not store w and by default uses umask when cloning by default
○ core.sharedRepository git-config
■ one of: group(true), all, umask(false), 0xxx
● Umask WTF:
○ file: 644 (DockeHub) vs. 664 (Travis CI)
○ dir: 755 (DockerHub) vs. 775 (Travis CI)
Solution to problem 1
Fix group permissions
Problem no 2
Generated files
● not only .gitignore
● generated files
○ autoapi - documentation
○ build artifacts
○ npm cache
○ .pyc files
○ files created accidentally (wget in source folder anyone?)
● COPY .
● Context calculated based on ALL files
● .dockerignore != .gitignore
● slightly different syntax
Solution to problem 2
Set .dockerignore ** by default
Problem no. 3
● Download & compile ALL dependencies takes time!
Partial solution to problem 3
Find the weakest link
Solution to problem 3
a) build image with wheels
Solution to problem 3
b) Copy directory via multi-stage
Docker builds
Solution 3
c) install using wheels
Thank You!
You can add some info where to follow you,
or add information about
polidea.com/blog

Contenu connexe

Tendances

Inside Sqale's Backend at Sapporo Ruby Kaigi 2012
Inside Sqale's Backend at Sapporo Ruby Kaigi 2012Inside Sqale's Backend at Sapporo Ruby Kaigi 2012
Inside Sqale's Backend at Sapporo Ruby Kaigi 2012
Gosuke Miyashita
 

Tendances (20)

Introduction to Docker Compose
Introduction to Docker ComposeIntroduction to Docker Compose
Introduction to Docker Compose
 
Docker Athens: Docker Engine Evolution & Containerd Use Cases
Docker Athens: Docker Engine Evolution & Containerd Use CasesDocker Athens: Docker Engine Evolution & Containerd Use Cases
Docker Athens: Docker Engine Evolution & Containerd Use Cases
 
RancherOS - The perfect place to run Docker
RancherOS - The perfect place to run DockerRancherOS - The perfect place to run Docker
RancherOS - The perfect place to run Docker
 
Docker for Developers: Dev, Test, Deploy @ BucksCo Devops at MeetMe HQ
Docker for Developers: Dev, Test, Deploy @ BucksCo Devops at MeetMe HQDocker for Developers: Dev, Test, Deploy @ BucksCo Devops at MeetMe HQ
Docker for Developers: Dev, Test, Deploy @ BucksCo Devops at MeetMe HQ
 
Ansible as a better shell script
Ansible as a better shell scriptAnsible as a better shell script
Ansible as a better shell script
 
[Szjug] Docker. Does it matter for java developer?
[Szjug] Docker. Does it matter for java developer?[Szjug] Docker. Does it matter for java developer?
[Szjug] Docker. Does it matter for java developer?
 
Let's Count Bytes! Launching Ruby in 32K of RAM
Let's Count Bytes! Launching Ruby in 32K of RAMLet's Count Bytes! Launching Ruby in 32K of RAM
Let's Count Bytes! Launching Ruby in 32K of RAM
 
Containers: What are they, Really?
Containers: What are they, Really?Containers: What are they, Really?
Containers: What are they, Really?
 
CRI Runtimes Deep-Dive: Who's Running My Pod!?
CRI Runtimes Deep-Dive: Who's Running My Pod!?CRI Runtimes Deep-Dive: Who's Running My Pod!?
CRI Runtimes Deep-Dive: Who's Running My Pod!?
 
Clustering Docker with Docker Swarm on openSUSE
Clustering Docker with Docker Swarm on openSUSEClustering Docker with Docker Swarm on openSUSE
Clustering Docker with Docker Swarm on openSUSE
 
EC2 Storage for Docker 150526b
EC2 Storage for Docker   150526bEC2 Storage for Docker   150526b
EC2 Storage for Docker 150526b
 
CoreOS + Kubernetes @ All Things Open 2015
CoreOS + Kubernetes @ All Things Open 2015CoreOS + Kubernetes @ All Things Open 2015
CoreOS + Kubernetes @ All Things Open 2015
 
It's 2018. Are My Containers Secure Yet!?
It's 2018. Are My Containers Secure Yet!?It's 2018. Are My Containers Secure Yet!?
It's 2018. Are My Containers Secure Yet!?
 
Upstate DevOps - Containers 101 - March 28, 2019
Upstate DevOps - Containers 101 - March 28, 2019Upstate DevOps - Containers 101 - March 28, 2019
Upstate DevOps - Containers 101 - March 28, 2019
 
Docker. Micro services for lazy developers
Docker. Micro services for lazy developersDocker. Micro services for lazy developers
Docker. Micro services for lazy developers
 
Datacenter Airlift - "Docker and the world of “containerized" environments"
Datacenter Airlift - "Docker and the world of “containerized" environments"Datacenter Airlift - "Docker and the world of “containerized" environments"
Datacenter Airlift - "Docker and the world of “containerized" environments"
 
2 docker engine_hands_on
2 docker engine_hands_on2 docker engine_hands_on
2 docker engine_hands_on
 
Inside Sqale's Backend at Sapporo Ruby Kaigi 2012
Inside Sqale's Backend at Sapporo Ruby Kaigi 2012Inside Sqale's Backend at Sapporo Ruby Kaigi 2012
Inside Sqale's Backend at Sapporo Ruby Kaigi 2012
 
Docker tutorial2
Docker tutorial2Docker tutorial2
Docker tutorial2
 
Dockerin10mins
Dockerin10minsDockerin10mins
Dockerin10mins
 

Similaire à Caching in Docker - the hardest thing in computer science

A Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersA Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and Containers
Docker, Inc.
 

Similaire à Caching in Docker - the hardest thing in computer science (20)

Docker presentation
Docker presentationDocker presentation
Docker presentation
 
Introduction to containers
Introduction to containersIntroduction to containers
Introduction to containers
 
Docker4Drupal 2.1 for Development
Docker4Drupal 2.1 for DevelopmentDocker4Drupal 2.1 for Development
Docker4Drupal 2.1 for Development
 
Docker n co
Docker n coDocker n co
Docker n co
 
Powercoders · Docker · Fall 2021.pptx
Powercoders · Docker · Fall 2021.pptxPowercoders · Docker · Fall 2021.pptx
Powercoders · Docker · Fall 2021.pptx
 
Build optimization mechanisms in GitLab and Docker
Build optimization mechanisms in GitLab and DockerBuild optimization mechanisms in GitLab and Docker
Build optimization mechanisms in GitLab and Docker
 
DockerCon EU 2015: Breaking the RPiDocker Challenge
DockerCon EU 2015: Breaking the RPiDocker Challenge DockerCon EU 2015: Breaking the RPiDocker Challenge
DockerCon EU 2015: Breaking the RPiDocker Challenge
 
Docker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los AngelesDocker 0.11 at MaxCDN meetup in Los Angeles
Docker 0.11 at MaxCDN meetup in Los Angeles
 
Talk on PHP Day Uruguay about Docker
Talk on PHP Day Uruguay about DockerTalk on PHP Day Uruguay about Docker
Talk on PHP Day Uruguay about Docker
 
Docker SQL Continuous Integration Flow
Docker SQL Continuous Integration FlowDocker SQL Continuous Integration Flow
Docker SQL Continuous Integration Flow
 
Docker 1 0 1 0 1: a Docker introduction, actualized for the stable release of...
Docker 1 0 1 0 1: a Docker introduction, actualized for the stable release of...Docker 1 0 1 0 1: a Docker introduction, actualized for the stable release of...
Docker 1 0 1 0 1: a Docker introduction, actualized for the stable release of...
 
Perspectives on Docker
Perspectives on DockerPerspectives on Docker
Perspectives on Docker
 
Super powered Drupal development with docker
Super powered Drupal development with dockerSuper powered Drupal development with docker
Super powered Drupal development with docker
 
A Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and ContainersA Gentle Introduction to Docker and Containers
A Gentle Introduction to Docker and Containers
 
Magento Docker Setup.pdf
Magento Docker Setup.pdfMagento Docker Setup.pdf
Magento Docker Setup.pdf
 
Infrastructure = Code
Infrastructure = CodeInfrastructure = Code
Infrastructure = Code
 
Docker 102 - Immutable Infrastructure
Docker 102 - Immutable InfrastructureDocker 102 - Immutable Infrastructure
Docker 102 - Immutable Infrastructure
 
Настройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'aНастройка окружения для кросскомпиляции проектов на основе docker'a
Настройка окружения для кросскомпиляции проектов на основе docker'a
 
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
 
Introduction to Docker at Glidewell Laboratories in Orange County
Introduction to Docker at Glidewell Laboratories in Orange CountyIntroduction to Docker at Glidewell Laboratories in Orange County
Introduction to Docker at Glidewell Laboratories in Orange County
 

Plus de Jarek Potiuk

Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFestManageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Jarek Potiuk
 
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
Jarek Potiuk
 

Plus de Jarek Potiuk (11)

What's Coming in Apache Airflow 2.0 - PyDataWarsaw 2019
What's Coming in Apache Airflow 2.0 - PyDataWarsaw 2019What's Coming in Apache Airflow 2.0 - PyDataWarsaw 2019
What's Coming in Apache Airflow 2.0 - PyDataWarsaw 2019
 
Subtle Differences between Python versions
Subtle Differences between Python versionsSubtle Differences between Python versions
Subtle Differences between Python versions
 
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFestManageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
Manageable Data Pipelines With Airflow (and kubernetes) - GDG DevFest
 
Off time - how to use social media to be more out of social media
Off time - how to use social media to be more out of social mediaOff time - how to use social media to be more out of social media
Off time - how to use social media to be more out of social media
 
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
It's a Breeze to develop Apache Airflow (London Apache Airflow meetup)
 
Berlin Apache Con EU Airflow Workshops
Berlin Apache Con EU Airflow WorkshopsBerlin Apache Con EU Airflow Workshops
Berlin Apache Con EU Airflow Workshops
 
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...Manageable data pipelines with airflow (and kubernetes)   november 27, 11 45 ...
Manageable data pipelines with airflow (and kubernetes) november 27, 11 45 ...
 
Ci for android OS
Ci for android OSCi for android OS
Ci for android OS
 
It's a Breeze to develop Apache Airflow (Apache Con Berlin)
It's a Breeze to develop Apache Airflow (Apache Con Berlin)It's a Breeze to develop Apache Airflow (Apache Con Berlin)
It's a Breeze to develop Apache Airflow (Apache Con Berlin)
 
It's a Breeze to develop Airflow (Cloud Native Warsaw)
It's a Breeze to develop Airflow (Cloud Native Warsaw)It's a Breeze to develop Airflow (Cloud Native Warsaw)
It's a Breeze to develop Airflow (Cloud Native Warsaw)
 
React native introduction (Mobile Warsaw)
React native introduction (Mobile Warsaw)React native introduction (Mobile Warsaw)
React native introduction (Mobile Warsaw)
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Dernier (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Caching in Docker - the hardest thing in computer science

  • 1. The hardest thing in computer science
  • 3. Docker Caching Dependency versions Install dependencies [ 20 minutes or so ] Only here copy all sources
  • 4. Intended behaviour ● No change: docker is not rebuilt - LIGHTNING FAST!!!! ● Sources change/dependencies not: only sources are added - QUITE FAST !!! ● Dependencies change: dependencies installed, sources - LITTLE SLOWER !!
  • 5. Actual behaviour same machine - local checkout ● Local docker registry ● Repeated build: 1:06m ● Only sources: 1:30m ● Dependencies: 11m ● Whole build: ~ 30m
  • 6. CI case ● Always fresh machine ○ no code ○ no registry ● Git clone/checkout ● Build ● Wipeout
  • 7. Docker registry to the rescue! Build cache: ● Docker build ● Docker push airflow/airflow:latest Use cache: ● Docker pull airflow/airflow:latest ● docker build --cache-from ariflow/airflow:latest
  • 8. Actual behaviour Docker Hub automated build ● DockerHub docker registry as cache ● Repeated build: 11m ● Only sources: 11m <- Still OK ● Dependencies: ~1h ● Whole build: ~ 2h
  • 9. Using the cache in Travis CI ● Docker Hub builds are slow ● Travis or Cloud Build use earlier image with --cache-from ● But only sources change most of the time
  • 10.
  • 11. Actual BAD behaviour Travis CI automated build ● Build on Travis with cache from DockerHub ● Repeated build: 11m ● Only sources: 1 h <- ● Dependencies: 1h ● Whole build: ~ 2h
  • 12.
  • 13. Problem no 1 Git & permissions ● git clone file creation: ○ local user ○ default user’s group ● file/dir permissions (rwxs) ○ preserves user, group and other rx permissions files & dirs ○ does not store w and by default uses umask when cloning by default ○ core.sharedRepository git-config ■ one of: group(true), all, umask(false), 0xxx ● Umask WTF: ○ file: 644 (DockeHub) vs. 664 (Travis CI) ○ dir: 755 (DockerHub) vs. 775 (Travis CI)
  • 14. Solution to problem 1 Fix group permissions
  • 15. Problem no 2 Generated files ● not only .gitignore ● generated files ○ autoapi - documentation ○ build artifacts ○ npm cache ○ .pyc files ○ files created accidentally (wget in source folder anyone?) ● COPY . ● Context calculated based on ALL files ● .dockerignore != .gitignore ● slightly different syntax
  • 16. Solution to problem 2 Set .dockerignore ** by default
  • 17. Problem no. 3 ● Download & compile ALL dependencies takes time!
  • 18. Partial solution to problem 3 Find the weakest link
  • 19. Solution to problem 3 a) build image with wheels
  • 20. Solution to problem 3 b) Copy directory via multi-stage Docker builds
  • 21. Solution 3 c) install using wheels
  • 22.
  • 23. Thank You! You can add some info where to follow you, or add information about polidea.com/blog