Christian Kniep presented this deck at the 2016 HPC Advisory Council Switzerland Conference.
"With Docker v1.9 a new networking system was introduced, which allows multi-host network- ing to work out-of-the-box in any Docker environment. This talk provides an introduction on what Docker networking provides, followed by a demo that spins up a full SLURM cluster across multiple machines. The demo is based on QNIBTerminal, a Consul backed set of Docker Images to spin up a broad set of software stacks."
Watch the video presentation:
http://wp.me/p3RLHQ-f7G
See more talks in the Swiss Conference Video Gallery:
http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter:
http://insidehpc.com/newsletter
4. 1. “Linux Container” / “Docker Ecosystem” in a Nutshell
2. Confusion about Ecosystem / Vision to tackle it
3. Docker -> SWARM -> SLURM
-> BigData
4. Discussion of Opportunities and Problems
4
Agenda
6. Userland (OS)Userland (OS) Userland (OS)
Userland (OS)
Ubuntu:14.04 Ubuntu:15.10 RHEL7.2
Tiny Core Linux
Linux Containers
6
SERVER
HOST KERNEL
HYPERVISOR
KERNEL
SERVICE
Userland (OS)
KERNEL KERNEL
Userland (OS)Userland (OS) Userland (OS)
SERVICE SERVICE
SERVER
HOST KERNEL
SERVICE SERVICE SERVICE
Traditional Virtualisation Containerisation
Containers do not spin up a distinct kernel
all containers & the host share the same
user-lands are independent
they are separated by Kernel Namespaces
7. Containers are ‘grouped processes’
isolated by Kernel Namespaces
resource restrictions applicable through CGroups (disk/netIO)
HOST
container1
7
Kernel Namespaces
bash
ls -l
container2
apache
container3
mysqld
consul consul
PIDNamespaces: Network Mount IPC UTS
container4
slurmd
ssh
consul
8. Container Runtime Daemon
creates/…/removes containers, exposes REST API
handles Namespaces, CGroups, bind-mounts, etc.
IP connectivity by default via ‘host-only’ network bridge
Docker Engine
8
SERVER
eth0
docker0
container1
container2
Docker-Engine
9. Docker Compose
9
Describes stack of container configurations
instead of writing a small bash script…
… it holds the runtime configuration as YAML file.
10. Docker Networking spans networks across engines
KV-store to synchronise (Zookeeper, etcd, Consul)
VXLAN to pass messages along
SERVER0 SERVER1 SERVER<n>
Docker Networking
10
Consul
Docker-Engine
Consul Consul
Docker-Engine Docker-Engine
Consul DC
global
container0 container1 containerN
11. Docker Swarm proxies docker-engines
serves an API endpoint in front of multiple docker-engines
does placement decisions.
SERVER0 SERVER1 SERVER<n>
Docker Swarm
11
Docker-Engine Docker-Engine Docker-Engine
swarm-client swarm-client swarm-client
swarm-master
:2376 :2376 :2376
:2375
container1
-e constraint:node==SERVER0
17. 1. No special distributions
useful for certain use-cases, such as elasticity and green-field
deployment
not so much for an on-premise datacenter w/ legacy in it.
2. Leverage existing processes/resources
install workflow, syslog, monitoring
security (ssh infrastructure), user auth.
3. keep up with docker ecosystem
incorporate new features of engine, swarm, compose
networking, volumes, user-namespaces
17
Vision
19. Hardware (courtesy of )
8x Sun Fire x2250, 2x 4core XEON, 32GB, Mellanox ConnectX-2)
Software
Base installation
CentOS 7.2 base installation (updated from 7-alpha)
Ansible
consul, sensu
docker v1.10, docker-compose
docker SWARM
19
Testbed
30. 1. Where to base images on?
Ubuntu/Fedora: ~200MB
Debian: ~100MB
Alpine Linux: 5MB (musl-libc)
2. Trimm the Images down at all cost?
How about debugging tools? Possibility to run tools on the host
and ‘inspect’ namespaced processes inside of a container.
If PID-sharing arrives, carving out (e.g.) monitoring could be a
thing.
30
Small vs. Big
31. 1. In an ideal world…
a container only runs one process, e.g. the HPC solver.
2. In reality…
MPI want’s to connect to a sshd within the job-peers
monitoring, syslog, service discovery should be present as well.
3. How fast / aggressive to break traditional
approaches?
31
One vs. Many Processes
33. Running OpenFOAM on small scale is cumbersome
manually install OpenFOAM on a workstation
be confident that the installation works correctly
A containerised OpenFOAM installation tackles both
33
Reproducibility / Downscaling
http://qnib.org/immutablehttp://qnib.org/immutable-paper
34. 1. Since the environments are rather dynamic…
how does the containers discover services?
external registry as part of the framework?
discovery service as part of the container stacks?
34
Service Discovery
35. With Docker Swarm it is rather easy
to spin up a Kubernetes or Mesos cluster within Swarm.
35
Orchestration Frameworks
SERVER0 SERVER1 SERVER<n>
Docker-Engine Docker-Engine Docker-Engine
swarm-client swarm-client swarm-client
swarm-master
etcd
kubelet
scheduler apiserver
etcd
kubelet
etcd
kubelet
36. 1. Containers should be controlled via ENV or flags
External access/change of a running container is discouraged
2. Configuration management
Downgraded to bootstrap a host?
36
Immutable vs. Config Mgmt
37. If containers are immutable within pipeline
testing/deployment should be automated
developers should have a production replica
37
Continuous Dev./Integration
38. 38
Docker Momentum
Software Dev
DatacenterOps
IT Tinkering (Hello World)
Continuous Dev/Int/Dep
Microservices, hyper scale
Big Data
High Performance Computing
HPC
Disclaimer: subjective exaggeration
39. Spinning up production-like environment is great
MongoDB, PostreSQL, memcached as separate containers
python2.7, python3.4
39
Docker in Software Development
Like python’s virtualenv on steroids,
iteration speedup through reproducibility
40. Spinning up production-like environment is…
…not that easy
focus more on engineer/scientist, not the software-developer
1. For development it might work
close to non-HPC software dev
2. But is that the iteration-focus?
rather job settings / input data?
40
Docker in HPC development
41. Split input iteration / development from operation
non-distributed stays vanilla
transition to HPC cluster using tech to foster operation
41
Separation of Concerns?
http://gmkurtzer.github.io/singularity
Input/Dev
42. Docker-Engine 1.11 will not be the parent of containers
runC usage under the hood
42
containerd Integration
43. 1. Separat Dev and Ops
don’t block the momentum fostering
iteration speed in Development
2. Using vanilla docker-tech
keep up with the ecosystem and prevent vendor/ecosystem
lock-in
3. 80/20 rule
have caveats on the radar but don’t bother too much
everything is so fast moving - it’s hard to predict
43
Recap aka. IMHO