Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

ISC HPCW talks

247 vues

Publié le

1. Current state of rootless dockerd
2. Rootless buildwith BuildKit
3. OCI Image Spec & Distribution

http://www.qnib.org/2019/06/20/isc2019-hpcw/

Publié dans : Logiciels
  • Soyez le premier à commenter

  • Soyez le premier à aimer ceci

ISC HPCW talks

  1. 1. Copyright©2019 NTT Corp. All Rights Reserved. Akihiro Suda ( @_AkihiroSuda_ ) NTT Software Innovation Center My ISC HPCW Talks 1. Current state of rootless dockerd 2. Rootless build with BuildKit 3. OCI Image Spec & Distribution 5th Annual High Performance Container Workshop, ISC (June 20, 2019)
  2. 2. Copyright©2019 NTT Corp. All Rights Reserved. Akihiro Suda ( @_AkihiroSuda_ ) NTT Software Innovation Center Current state of rootless dockerd 5th Annual High Performance Container Workshop, ISC (June 20, 2019)
  3. 3. 3
 Copyright©2019 NTT Corp. All Rights Reserved. What is rootless dockerd? • Run Docker daemon (and also containers of course) as a non-root user • Don’t confuse with: • sudo • usermod -aG docker penguin • docker run --user • dockerd --userns-remap • Experimentally supported since Docker v19.03 https://get.docker.com/rootless Image: https://xkcd.com/149/
  4. 4. 4
 Copyright©2019 NTT Corp. All Rights Reserved. Why? • For Cloud-Native envs: • To mitigate potential vulnerability of container runtimes and orchestrator • For HPC envs: • To run containers without the risk of breaking other users environments
  5. 5. 5
 Copyright©2019 NTT Corp. All Rights Reserved. How it works: User Namespaces • User namespaces allow non-root users to pretend to be the root • Root-in-UserNS can have “fake” UID 0 and also create other namespaces (MountNS, NetNS..) • Unlike Singularity, NetNS can be unshared • By using either usermode TCP/IP stack (VPNKit, slirp4netns) or SETUID binary (lxc-user-nic)
  6. 6. 6
 Copyright©2019 NTT Corp. All Rights Reserved. System requirements: /etc/{subuid,subgid} • If /etc/subuid contains “1001:100000:65536” • Having 65536 sub-users should be enough for most containers 0 1001 100000 165535 232Host UserNS primary user sub-users start sub-users length 0 1 65536
  7. 7. 7
 Copyright©2019 NTT Corp. All Rights Reserved. Unresolved issues (Contribution wanted!) • Hard to maintain subuid & subgid in LDAP/AD envs • NSS module is being under discussion https://github.com/shadow-maint/shadow/issues/154 • Single-mapping mode w/o subuid & subgid is also under discussion • uses ptrace and xattrs (slow!) • seccomp could be used for acceleration https://github.com/rootless-containers/runrootless
  8. 8. 8
 Copyright©2019 NTT Corp. All Rights Reserved. Unresolved issues (Contribution wanted!) • Lacks cgroup • cgroup2 (unified-mode) supports unprivileged mode but migration may take a few years… or even more • For cgroup1, pam_cgfs could be used instead, but not available in Fedora / RHEL due to a security concern • Kernel / VM / HW may have vulns • Not suitable for real multi-tenancy • gVisor might able to mitigate some of them
  9. 9. Copyright©2019 NTT Corp. All Rights Reserved. Akihiro Suda ( @_AkihiroSuda_ ) NTT Software Innovation Center Rootless build with BuildKit 5th Annual High Performance Container Workshop, ISC (June 20, 2019)
  10. 10. 10
 Copyright©2019 NTT Corp. All Rights Reserved. What is BuildKit? • Next-generation docker build with focus on performance and security • Accurate dependency analysis • Concurrent execution of independent instructions • Support injecting secret files... • Integrated to Docker since v18.06 (export DOCKER_BUILDKIT=1) • Non-Docker standalone BuildKit is also available • Works with Podman and CRI-O as well :P
  11. 11. 11
 Copyright©2019 NTT Corp. All Rights Reserved. Rootless mode • Rootless mode allows building images as a non-root user • Dockerfile RUN instructions are executed as a “fake root” in UserNS (So apt-get/yum works!) • Produces Docker image / OCI image / raw tarball • Compatible with Rootless Docker / Rootless Podman / … whatever • Even works inside a container • Good for distributed CI/CD on Kubernetes • Works with default securityContext configuration (but seccomp and AppArmor needs to be disabled for nesting containers)
  12. 12. 12
 Copyright©2019 NTT Corp. All Rights Reserved. Rootless BuildKit vs kaniko • https://github.com/GoogleContainerTools/kaniko • Kaniko runs as the root but “unprivileged” • No need to disable seccomp and AppArmor because kaniko doesn’t nest containers on the kaniko container itself • Kaniko might be able to mitigate some vuln that Rootless BuildKit cannot mitigate - and vice versa • Rootless BuildKit might be weak against kernel vulns • Kaniko might be weak against runc vulns
  13. 13. Copyright©2019 NTT Corp. All Rights Reserved. Akihiro Suda ( @_AkihiroSuda_ ) NTT Software Innovation Center OCI Image Spec & Distribution 5th Annual High Performance Container Workshop, ISC (June 20, 2019)
  14. 14. 14
 Copyright©2019 NTT Corp. All Rights Reserved. Open Containers Initiative Specifications • OCI Runtime Spec • How to create container from config JSON and rootfs dir • Based on Docker libcontainer (now runc) • OCI Image Spec • How to represent image layers for OCI runtimes • Based on Docker Image Manifest V2, Schema 2 • OCI Distribution Spec • How to distribute OCI images • Based on Docker Registry HTTP API
  15. 15. 15
 Copyright©2019 NTT Corp. All Rights Reserved. Image layout /blobs/sha256/e692418e... /blobs/sha256/b5b2b2c5... /blobs/sha256/61be55a8... /blobs/sha256/3c3a4604... /blobs/sha256/3c3a4604... JSON JSON tar.gz tar.gz tar.gz Manifest • Merkle DAG structure ensures reproducibility of docker pull foo@sha256:e692418e… Container Config AUFS layer archives (for each Dockerfile FROM and RUN) v1.0Manifest list latest
  16. 16. 16
 Copyright©2019 NTT Corp. All Rights Reserved. Image layout latest amd64 /blobs/sha256/e692418e... /blobs/sha256/b5b2b2c5... /blobs/sha256/61be55a8... /blobs/sha256/3c3a4604... /blobs/sha256/3c3a4604... JSON JSON tar.gz tar.gz tar.gz JSON Manifest list Manifest • Supports multi-arch (use BuildKit to build) Container Config latest arm64 AUFS layer archives (for each Dockerfile FROM and RUN)
  17. 17. 17
 Copyright©2019 NTT Corp. All Rights Reserved. Image layout latest Ice Lake /blobs/sha256/e692418e... /blobs/sha256/b5b2b2c5... /blobs/sha256/61be55a8... /blobs/sha256/3c3a4604... /blobs/sha256/3c3a4604... JSON JSON tar.gz tar.gz tar.gz JSON Manifest list Manifest • And even multi-microarchitectures via qnib/metahub • https://metahub.qnib.org Container Config latest Broadwell Tesla M60 AUFS layer archives (for each Dockerfile FROM and RUN)
  18. 18. 18
 Copyright©2019 NTT Corp. All Rights Reserved. Post-OCI image format? • Issues of current OCI v1 • Too coarse deduplication granularity • Containers cannot be started until the entire image is pulled • An alternative: CernVM-FS • Supports file-level deduplication rather than layer-level • Files are lazy-pulled on demand using FUSE • Integrating CernVM-FS to containerd is under discussion https://github.com/containerd/containerd/issues/2943
  19. 19. 19
 Copyright©2019 NTT Corp. All Rights Reserved. Post-OCI image format? • ”OCI v2” https://github.com/openSUSE/umoci/issues/256 • Much finer deduplication granularity • No implementation yet • Container Registry Filesystem https://github.com/google/crfs • Focus on lazy-pulling CI images • IPCS https://github.com/hinshun/ipcs • IPFS integration for containerd

×