In this talk, I cover the key elements of running multiple Docker containers per VM, the major frameworks available to assist with this, and when to choose each.
3. Why Devs
• Lightweight
Containers are just isolated processes. We can start a new
container in seconds.
• Portable
My Mac, the Linux EC2 instance, and your Windows PC all run
the exact same container.
• Ecosystem
I can easily share images, manage private images, and use
“official” images for virtually all open source software.
4. Why Devs
• Squeeze More Resources out of a Single Server
Did you know this dirty secret of the Infrastructure-as-a-Service world?
[Chart: Typical Data Center Resource Utilization. Segments: 85% and 15%. Legend: In Use, Free.]
SOURCE: http://radar.oreilly.com/2014/12/why-the-data-center-needs-an-operating-system.html
5. So can I run multiple
containers in a single VM?
6. Something like this?
[Diagram: two VMs, VM-1 and VM-2, each running several containers from Service A, Service B, and Service C.]
8. The Gartner Tech Hype Cycle
Any guesses where the “multi-container VM” paradigm is?
SOURCE: https://setandbma.wordpress.com/2012/05/28/technology-adoption-shift/
9. My Take on This
In reality, the exact spot varies by team,
so this is a bit of a generalization.
10. Today’s talk is about our
options for that red dot.
For each option, we’ll cover:
• Pros
• Cons
• When to use
11. • Full-stack web-app engineer for 12+ years.
• Since I’ve worked with many different teams, I generally
help accelerate the DevOps/AWS learning curve for teams.
• PhxDevOps Clients include: Intel, Infusionsoft, American
Bible Society, CÜR Music, plus multiple startups and web
design companies.
Josh Padnick
These slides are posted on http://joshpadnick.com
I help software teams scale their app
using DevOps and AWS.
http://PhoenixDevOps.com
@OhMyGoshJosh
Want to know more about building scalable apps on AWS? Check out a
12,000+ word article I wrote on how at http://bit.ly/1EtYRbL.
12. Disclaimers
• I have a bias toward AWS and may leave
out solutions from other IaaS providers
such as Azure.
• The solutions we cover today are deep
and diverse. This talk reflects my own
experiences but your mileage may vary!
13. Agenda
• CoreOS in 60 seconds
• Theory of Multi-Container VMs
• The Three Paradigms of Multi-Container VMs
• Cover all the Major Solutions
16. What is CoreOS?
• It’s a super stripped-down Linux. You don’t even
get a package manager.
• The idea is you run everything as a container.
• CoreOS is based on ChromiumOS, which itself is
based on Gentoo Linux.
• Uses systemd for init.
17. CoreOS and This Presentation
• Because CoreOS is “built for Docker”, many
solutions use it as their default OS.
• In reality, you can usually use any OS that runs
Docker natively, but CoreOS is often the
“recommended” Linux distro for Docker.
21. Docker Builder
We need somewhere to build our image.
• If we build from a fresh environment each time, every Docker image/layer is downloaded from scratch.
• Ideally, our “Docker Builder” has pre-downloaded (“seeded”) all our most popular Docker images.
• In practice, this is managed by your build tool, like Jenkins, CircleCI, Shippable, etc.
23. Docker Registry
We need somewhere to store our built images.
• Main options here are:
  • Cloud
    • Docker Hub
    • Quay.io
  • On-Premise
    • Docker Trusted Registry (Paid)
    • Docker Distribution (Free)
    • Quay.io (Paid)
24. Automated Deployment
We need a way to deploy our Docker image into the cluster.
25. Container Scheduling
We need something to decide which host will launch our new container.
30. Container Scheduling
• One of the most important considerations when choosing a host is “who’s got the memory and CPU I need?”
• But we also need to know who’s in a different Availability Zone / Data Center so we can achieve high fault tolerance.
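To make the scheduling decision concrete, here is a minimal sketch of a resource-aware scheduler that also spreads a service across zones. The Host fields, the zone names, and the tie-breaking rule are my own assumptions for illustration; real schedulers weigh many more factors.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    zone: str              # Availability Zone / data center label (assumed)
    free_cpu: float        # CPU still unallocated
    free_mem_mb: int       # memory still unallocated, in MB
    services: list = field(default_factory=list)


def pick_host(hosts, service, cpu, mem_mb):
    """Return the host that should launch one container of `service`, or None."""
    # 1. Keep only hosts with enough spare CPU and memory.
    candidates = [h for h in hosts if h.free_cpu >= cpu and h.free_mem_mb >= mem_mb]
    if not candidates:
        return None

    # 2. Count how many copies of this service each zone already runs.
    per_zone = {}
    for h in hosts:
        per_zone[h.zone] = per_zone.get(h.zone, 0) + h.services.count(service)

    # 3. Prefer the zone with the fewest copies, then the host with the most free memory.
    best = min(candidates, key=lambda h: (per_zone[h.zone], -h.free_mem_mb))
    best.free_cpu -= cpu
    best.free_mem_mb -= mem_mb
    best.services.append(service)
    return best


hosts = [
    Host("vm-1", "us-east-1a", free_cpu=2.0, free_mem_mb=4096),
    Host("vm-2", "us-east-1b", free_cpu=2.0, free_mem_mb=2048),
]
first = pick_host(hosts, "service-b", cpu=0.5, mem_mb=512)
second = pick_host(hosts, "service-b", cpu=0.5, mem_mb=512)
print(first.name, second.name)
```

The second container lands in the other zone even though vm-1 still has more free memory, which is the fault-tolerance trade-off described above.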
31. Routing / Load Balancing
We need a way to route a request to any of our containers.
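As a rough illustration of the routing idea, the sketch below keeps a list of container addresses per service and hands them out round-robin. The Router class, the addresses, and the choice of round-robin are assumptions for the example; a production load balancer would also handle health checks and connection draining.

```python
import itertools


class Router:
    """Maps a service name to its containers and rotates through them."""

    def __init__(self):
        self._backends = {}   # service name -> list of "host:port" strings
        self._cursors = {}    # service name -> round-robin iterator

    def set_backends(self, service, addresses):
        self._backends[service] = list(addresses)
        self._cursors[service] = itertools.cycle(self._backends[service])

    def route(self, service):
        """Return the address that should receive the next request."""
        if not self._backends.get(service):
            raise LookupError(f"no containers registered for {service}")
        return next(self._cursors[service])


router = Router()
router.set_backends("service-b", ["vm-1:49153", "vm-2:49154"])
picks = [router.route("service-b") for _ in range(3)]
print(picks)
```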
33. [Diagram: a request, “GET ServiceB HTTP/1.1”, arrives at a cluster of two VMs (VM-1 and VM-2) running containers for Service A, Service B, and Service C.]
34. [Diagram: the same two VMs, now fronted by a Routing / Load Balancing Solution.]
36. Service Discovery
When we launch new containers, we need to tell our router they exist.
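One way to picture service discovery is a small registry that containers report to when they start or stop, and that pushes every change to whoever is listening (here, a routing table). Everything in this sketch, including the ServiceRegistry class and the addresses, is a hypothetical in-memory stand-in for a real discovery system.

```python
class ServiceRegistry:
    """In-memory stand-in for a service-discovery store."""

    def __init__(self):
        self._instances = {}   # service name -> set of "host:port" addresses
        self._listeners = []   # callbacks invoked on every change

    def subscribe(self, callback):
        self._listeners.append(callback)

    def register(self, service, address):
        self._instances.setdefault(service, set()).add(address)
        self._notify(service)

    def deregister(self, service, address):
        self._instances.get(service, set()).discard(address)
        self._notify(service)

    def _notify(self, service):
        addresses = sorted(self._instances.get(service, set()))
        for callback in self._listeners:
            callback(service, addresses)


routing_table = {}   # what a router would consult for each request


def on_change(service, addresses):
    routing_table[service] = addresses


registry = ServiceRegistry()
registry.subscribe(on_change)
registry.register("service-c", "vm-1:49160")
registry.register("service-c", "vm-2:49161")    # a second container starts
registry.deregister("service-c", "vm-1:49160")  # the first container stops
print(routing_table)
```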
38. [Diagram: the two VMs behind the Routing / Load Balancing Solution receive the instruction “Increase Service C container count from 1 to 2.”]
39. [Diagram: a second Service C container now runs in the cluster, behind the same Routing / Load Balancing Solution.]
43. Auto-Restart Failed Containers
Something needs to know that a container failed and auto-restart it.
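The auto-restart requirement boils down to a loop that compares what should be running with what is running. Here is a minimal sketch of one pass of such a loop; the container names, the state strings, and the fake_restart helper are made up for illustration and do not talk to a real container runtime.

```python
def reconcile(containers, restart):
    """Restart every container that is not running; return the names restarted."""
    restarted = []
    for name, state in list(containers.items()):
        if state != "running":
            restart(name)
            restarted.append(name)
    return restarted


# Hypothetical snapshot of container states as a supervisor might see them.
containers = {
    "service-a.1": "running",
    "service-b.1": "exited",
    "service-c.1": "running",
}


def fake_restart(name):
    # A real supervisor would ask the container runtime to start it again.
    containers[name] = "running"


restarted = reconcile(containers, fake_restart)
print(restarted, containers["service-b.1"])
```

A real supervisor would run this pass on a timer or in response to events, and would usually back off when a container keeps failing.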
44. Extract Container Logs
We need a way to read logs from all containers.
45. Monitor Everything
We need to monitor cluster resources and individual containers.
47. Does our cluster have “state”?
• Yes!
• The router needs to know which containers belong to
which services.
• We need to know which hosts are actually in our
cluster.
48. Storing Cluster State
• This topic alone warrants full books.
• One option for storing state is to simply use a
database like PostgreSQL.
• But the more popular option is for the hosts in the
cluster to replicate state among themselves and agree on
it using a “consensus algorithm.” I call this a cluster
datastore.
• The most popular such solutions are: etcd, Consul,
and ZooKeeper.
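To give a feel for why these datastores need a majority of hosts to be healthy, here is a deliberately simplified majority-quorum write. This is not a consensus algorithm: real implementations also handle leader election, ordering, and recovery. The Node class, key names, and values are assumptions for illustration only.

```python
class Node:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def accept(self, key, value):
        """Store the value and acknowledge, unless this node is down."""
        if not self.up:
            return False
        self.data[key] = value
        return True


def quorum_write(nodes, key, value):
    """Report success only if a strict majority of nodes acknowledge the write."""
    acks = sum(1 for node in nodes if node.accept(key, value))
    return acks > len(nodes) // 2


nodes = [Node("host-1"), Node("host-2"), Node("host-3", up=False)]
committed = quorum_write(nodes, "/services/service-b/vm-1", "10.0.0.5:49153")
print(committed)          # 2 of 3 nodes acknowledged

nodes[1].up = False       # a second host fails
rejected = quorum_write(nodes, "/hosts/host-4", "joining")
print(rejected)           # only 1 of 3 nodes acknowledged
```

With three hosts the cluster tolerates one failure, which is why cluster datastores are typically run with an odd number of members.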
49. Unit of Container Deployment
• We need something that describes what kind of
container to deploy.
• This is typically a declarative file in either YAML
or JSON that declares all aspects of our docker
run command, whether 2+ containers are run
together, etc.
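As a sketch of what “declares all aspects of our docker run command” can mean, the function below turns a small declarative description into docker run arguments. The spec layout and the image name are hypothetical; only the docker run flags themselves (-d, --name, -m, -p, -e) are real.

```python
def docker_run_args(spec):
    """Translate a declarative container description into `docker run` arguments."""
    args = ["docker", "run", "-d", "--name", spec["name"]]
    if "memory" in spec:
        args += ["-m", spec["memory"]]
    for host_port, container_port in spec.get("ports", {}).items():
        args += ["-p", f"{host_port}:{container_port}"]
    for key, value in sorted(spec.get("env", {}).items()):
        args += ["-e", f"{key}={value}"]
    args.append(spec["image"])
    return args


spec = {
    "name": "service-b",
    "image": "example/service-b:1.0",   # hypothetical image name
    "memory": "256m",
    "ports": {8080: 80},                # host port -> container port
    "env": {"LOG_LEVEL": "info"},
}
command = " ".join(docker_run_args(spec))
print(command)
```

In practice the same description would live in a YAML or JSON file, and the cluster tool, rather than a person, would turn it into a running container.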
50. The Theory in Summary
• Docker builder
• Docker registry
• Automated deployment
• Container scheduling
• Routing / Load Balancing
• Service discovery
• Auto-restart failed containers
• Logging
• Monitoring
• Cluster datastore
• Unit of Container Deployment
54. The Big Idea
• You control the infrastructure (e.g. AWS, Azure)
• You’re given an unopinionated set of primitives
on top of which you can build your own solution.
• Primitives include launching containers, but not
full deployment.
57. The Big Idea
• You control the infrastructure (e.g. AWS, Azure)
• Install the PaaS tool on top of your own
infrastructure.
• PaaS tool typically sits on top of a Cluster
Framework.
• You’re not 100% sure how it works, but it solves
your needs today and you can always deep dive
later, or (hopefully) get commercial support.
60. The Big Idea
• You control the infrastructure (e.g. AWS, Azure)
• You’re given an opinionated framework which
has everything you need to deploy.
• You operate at the abstraction level of “cluster”
and really don’t care when individual hosts die.
• These tend to be the most powerful, and the
most complex.
61. Major Data Center Operating
System Frameworks
(We’ll cover each of these today)
67. Mentally Managing the
Hybrids
• Don’t get too caught up on these exotic
combinations.
• Focus first on one of the “non-hybrid”
technologies.
• Then evaluate what can be run on top of your
chosen technology, and whether it will make life
easier for you.
71. How It Works
• Launch a CoreOS cluster on the IaaS platform of your
choice (e.g. AWS, Azure, VMWare, etc.)
• CoreOS comes with fleet, a tool (driven by the fleetctl CLI)
that enables launching containers, but does not constitute
a full deployment system.
• Best thought of as a set of primitives you can work
with, not a full-fledged framework.
• Define systemd unit files to describe the Docker
container you want to launch.
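To show roughly what such a unit file contains, this sketch renders one as text. The unit name, image, port, and the exact set of directives are my own illustrative choices, so treat it as a starting point rather than a recommended template. The [X-Fleet] section is fleet-specific; the rest is ordinary systemd.

```python
def render_unit(name, image, port, conflicts=None):
    """Return the text of a systemd unit that runs one Docker container."""
    lines = [
        "[Unit]",
        f"Description={name}",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        # A leading "-" tells systemd to ignore a failure of this command.
        f"ExecStartPre=-/usr/bin/docker rm -f {name}",
        f"ExecStart=/usr/bin/docker run --name {name} -p {port}:{port} {image}",
        f"ExecStop=/usr/bin/docker stop {name}",
        "Restart=always",
    ]
    if conflicts:
        # fleet-specific: keep units matching this pattern on different machines.
        lines += ["", "[X-Fleet]", f"Conflicts={conflicts}"]
    return "\n".join(lines) + "\n"


unit = render_unit(
    "service-b",
    "example/service-b:1.0",          # hypothetical image name
    8080,
    conflicts="service-b@*.service",
)
print(unit)
```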
72. Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Built into fleet, but no resource-aware scheduling
Deployment: Routing | Roll Your Own
Deployment: Service Discovery | Roll Your Own
Auto-Restart Failed Containers | Built into fleet
Monitoring | Roll Your Own
Logging | Roll Your Own
Cluster Data Store | etcd
Unit of Deployment | systemd unit file
73. Pros
• Relatively mature/stable among Cluster Frameworks.
• Once you’ve set up etcd, everything else “just works”.
• RESTful API into fleet allows for easily building out your own custom
solution.
• Fleet will auto-restart failed containers.
• Tagging cluster nodes allows for clever distribution of containers (e.g.
across Availability Zones).
• CoreOS gives us a well-defined method for updating individual cluster
nodes to the latest CoreOS.
• Commercial support available.
74. Cons
• Setting up etcd can be painful.
• Fleet does not allow resource-aware scheduling, so containers may
run out of resources.
• Fleet does not expose a primitive for “transferring” a container from
one cluster node to another.
• No built-in way to monitor cluster-wide resource consumption.
• Not usable for a production cluster without significant setup
overhead (e.g. setting up service discovery).
• Learning fleet ultimately requires learning systemd and discovering
what fleet commands actually do.
75. When To Use It
• You want to learn the foundations of CoreOS.
• You want high customizability over your setup
and can tolerate non-resource-aware
scheduling.
• You’re willing to manually handle many
operations such as launching additional
containers.
77. How It Works
• Launch at least 3 EC2 instances in AWS.
• Install the ECS agent on each node (or launch an AMI
with the agent pre-installed).
• Cluster setup “just works”
• Define a “Task Definition” to describe how one or more
Docker containers should be launched.
• Define a “Service” that launches one or more instances of
the Task Definition, and ECS auto-deploys your Tasks
(containers).
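For a sense of what a Task Definition and a Service look like, here is a sketch that builds both as plain data and prints the Task Definition as JSON. The field names follow the ECS API as I understand it; the family name, image, and sizes are made-up values, and nothing here calls AWS.

```python
import json


def task_definition(family, image, cpu, memory_mb, container_port, host_port):
    """Build a dict shaped like an ECS task definition with a single container."""
    return {
        "family": family,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "cpu": cpu,              # CPU units (1024 units = one core)
                "memory": memory_mb,     # hard memory limit in MiB
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "hostPort": host_port}
                ],
            }
        ],
    }


definition = task_definition("service-b", "example/service-b:1.0", 256, 512, 80, 8080)

# A Service asks ECS to keep a given number of copies of the task running.
service = {
    "serviceName": "service-b",
    "taskDefinition": definition["family"],
    "desiredCount": 2,
}

print(json.dumps(definition, indent=2))
```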
84. Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware, pluggable scheduler. Can be swapped w/ custom one.
Deployment: Routing | Leverages AWS Elastic Load Balancers
Deployment: Service Discovery | Built in to services
Auto-Restart Failed Containers | Built in to services
Monitoring | Basic monitoring included at cluster level.
Logging | Roll Your Own
Cluster Data Store | Zookeeper (but this is hidden to us)
Unit of Deployment | Task Definition
85. Pros
• Very easy to set up.
• Simple UX.
• Low learning curve.
• Covers most of what you need out of the box,
including built-in routing and service discovery.
• Presumably AWS will keep improving it.
• Supported via AWS.
86. Cons
• Doesn’t support dynamic port mapping from container to host.
• Each service requires its own Elastic Load Balancer, which is $18/month.
(Unless you’re willing to expose a service on a port other than 80/443)
• Supports rolling deployments provided you have a spare node to launch
a new service instance on. Blue/Green deployments are claimed as a
feature, but require out-of-band customization.
• It’s not recommended to use the existing Zookeeper cluster datastore
for your own needs, so you may have to run two cluster datastores
(e.g. Zookeeper + Consul).
• Use of “private subnets” requires two separate clusters, one for public
and one for private.
87. When To Use It
• You use AWS and…
• You want to get up and running quickly with your Docker-
based microservices framework.
• You want to run your monolith using containers today,
knowing you can migrate to other cluster technologies in the
future.
• You want an official solution with official support.
• You want to minimize the number of vendors and technologies you
work with.
89. How It Works
• You use “docker-machine” to launch multiple EC2
instances (or other VMs). Each EC2 instance is
configured with the Docker daemon and the
docker-swarm agent (which is just a container).
• You launch one or more “Swarm Masters”, one of
which is the “master leader.” You use this to
control your cluster.
• You can now run Docker CLI commands to launch
containers on your cluster.
92. Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware, pluggable scheduler. Can be swapped w/ custom one.
Deployment: Routing | Roll Your Own
Deployment: Service Discovery | Roll Your Own
Auto-Restart Failed Containers | Open GitHub Issue: https://github.com/docker/swarm/issues/599
Monitoring | Roll Your Own
Logging | Roll Your Own
Cluster Data Store | Pluggable! See https://docs.docker.com/swarm/discovery/
Unit of Deployment | Docker container, or Docker Compose manifest
93. Pros
• Use the Docker CLI you’ve come to know and love.
• You can run other Docker tools that call the old Docker CLI directly
on top of Swarm and they will “just work”. Does this matter?
• Potentially simpler to program against compared to CoreOS
fleet.
• Resource-aware scheduling.
• An open source event bus allows for interesting possibilities
in response to cluster events (esp. around service discovery).
• Official Docker solution.
94. Cons
• Not yet recommended for production.
• My own experience with docker-machine and
docker-swarm has been underwhelming in
terms of stability.
• Other than “official Docker solution” and “use
Docker CLI”, I don’t see any superior features to
alternatives.
95. When To Use It
• For experiments and curiosity.
• Docker swarm is intriguing, but under heavy
development and doesn’t yet present a clear
value proposition compared to alternatives.
• But check back in 6 months, and it may be a solid
contender. See Project Orca for an exciting
opinionated take on the UX. https://youtu.be/8vSPpPSd00w?t=1h25m16s
97. How Deis Works
[Diagram: a cluster containing Control Plane components, Data Plane components, and a Router per host.]
98. How Deis Works
[Diagram: the same cluster, now running containers for Service A, Service B (two instances), and Service C.]
99. Deis Workflow for Devs
(basically, Heroku on your own infrastructure)
SOURCE: http://docs.deis.io/en/latest/understanding_deis/concepts/
100. How Deis Works for Operators
SOURCE: http://docs.deis.io/en/latest/understanding_deis/architecture/
101. Docker Builder | Included!
Docker Registry | Included!
Deployment: Scheduling | Defaults to resource-unaware fleet. Pluggable schedulers are in tech-preview.
Deployment: Routing | Included!
Deployment: Service Discovery | Included!
Auto-Restart Failed Containers | Included!
Monitoring | Roll Your Own
Logging | Built-in logspout means you can send your logs anywhere.
Cluster Data Store | PostgreSQL + Ceph
Unit of Deployment | Heroku buildpack, Dockerfile, or Docker image.
102. Pros
• Everything in one package, ready to go. Get up and
running pretty quickly.
• Nice workflow for devs
• Open source
• Great community
• Good paradigm of what you would eventually need to
build.
• Commercial support available through Engine Yard.
103. Cons
• Learning curve for operators can feel steep.
• When the PaaS fails, it’s time to start climbing
the learning curve. For example, I once
terminated a node, broke the Ceph cluster, and
had to dig into the guts to figure out how to fix it.
• Deis’s architectural opinions may differ from your
own.
125. Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware scheduler. Can run other “frameworks” side by side.
Deployment: Routing | Roll Your Own
Deployment: Service Discovery | Roll Your Own
Auto-Restart Failed Containers | Included!
Monitoring | Roll Your Own
Logging | Roll Your Own
Cluster Data Store | Zookeeper
Unit of Deployment | Mesos Task (which will usually be a Docker container, usually submitted through Marathon).
126. But wait, there’s more!
• Setting up Mesos involves coordinating many
different moving pieces.
• Also, there’s no immediate way to gain a cluster-
wide view of total memory/CPU/disk space use.
• Also, the learning curve can be steep.
127. Mesosphere DCOS is meant
to solve these problems.
• Offers “turn-key” setup (though the setup itself is
not really production-grade).
• Offers a fancy UI for viewing cluster resource
usage.
• Offers a special CLI for installing frameworks with
1 command.
• It’s very much in active development and would
work best with a Mesosphere support plan.
128. Pros
• I find the Mesos abstraction the most intuitive when it
comes to managing cluster resources.
• Scalability is off the charts. Verizon, Siri, Yelp, Twitter and
OpenTable all use Mesos.
• Growing community.
• Multiple “frameworks” already supported such as Apache
Spark and Cassandra.
• Solomon Hykes called it the “gold standard” for running
Docker containers in a cluster.
129. Cons
• It can take weeks to set up if you need to do it right.
• The learning curve for devs is manageable, but for
operators there are many moving pieces.
• There are certain edge cases that are rare but that would
affect cluster performance over time.
• If you want to run Mesos on CoreOS, either you need to
violate the CoreOS way, or run Mesos Master / Slave
(Agent) in docker containers which is officially not
recommended.
130. When To Use It
• You’re running multiple microservices, and you anticipate
significant scale.
• You want to squeeze as much utilization as possible out of
your large cluster.
• You’re ready to adopt the cluster as the primary abstraction
and expect to co-mingle prod and dev, multiple services,
and multiple frameworks.
• Note: Smallest company I met at MesosCon was ~60
employees. That is probably the lower limit of company size
before Mesos makes sense (IMO).
131. Mesos + Docker Swarm
• At MesosCon (August 2015), Docker showed
Docker Swarm as the CLI-based way to control
Mesos deployments.
133. Disclaimers
• Mr. Padnick may or may not have any
actual real-world experience with
Kubernetes but felt it necessary to include
it here for the sake of completeness.
134. Kubernetes Pods
• A pod is a group of docker containers that
should be run together.
[Diagram: a Pod containing a Web Server and a Content Management Server.]
SOURCE: Illustrations reproduced from https://www.youtube.com/watch?v=Fcb4aoSAZ98
135. Kubernetes Labels
• Labels are key-value pairs attached to pods; they allow Kubernetes
to identify groups of pods.
• The concept of labels is baked into most APIs.
[Diagram: three Pods labeled “FE”, “BI, FE”, and “v2”.]
SOURCE: Illustrations reproduced from https://www.youtube.com/watch?v=Fcb4aoSAZ98
136. Kubernetes
Replication Controllers
• A replication controller is a definition:
“I want to run this pod 5 times.”
• If one of the pods fails, Kubernetes will automatically start a new one.
[Diagram: a Replication Controller (#Pods = 2, Label selector: v1) managing two Pods labeled v1.]
SOURCE: Illustrations reproduced from https://www.youtube.com/watch?v=Fcb4aoSAZ98
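Labels and replication controllers fit together through label selectors. The sketch below shows the matching rule and the “how many more pods do I need?” arithmetic. The label keys (tier, version) and the pod list are assumptions for illustration, and this is plain Python rather than the Kubernetes API.

```python
def matches(selector, labels):
    """A pod matches when it carries every key-value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def replicas_to_start(desired, selector, pods):
    """Return how many pods to start (positive) or stop (negative)."""
    running = [pod for pod in pods if matches(selector, pod["labels"])]
    return desired - len(running)


pods = [
    {"name": "pod-1", "labels": {"tier": "FE", "version": "v1"}},
    {"name": "pod-2", "labels": {"tier": "FE", "version": "v2"}},
    {"name": "pod-3", "labels": {"tier": "BI", "version": "v1"}},
]

# "I want to run this pod 5 times", selecting on version = v1.
delta = replicas_to_start(desired=5, selector={"version": "v1"}, pods=pods)
print(delta)
```

Two pods already match the selector, so three more would need to be started to reach five.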
137. Kubernetes Cluster
[Diagram: two Kubernetes Masters, each running etcd, an API Server, a Controller Manager Server, and a Scheduler Server, alongside a set of Nodes. One Node is expanded to show a kubelet, an agent, a proxy, and Pods labeled FE, v2, and v1.]
138. Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware scheduler.
Deployment: Routing | Included!
Deployment: Service Discovery | Included!
Auto-Restart Failed Containers | Included!
Monitoring | Optimal support with Google Cloud Engine. Limited support for others.
Logging | Optimal support with Google Cloud Engine. Limited support for others.
Cluster Data Store | etcd
Unit of Deployment | Pod
139. Pros
• Produced by Google.
• Very well-documented.
• Open source.
• The “successor” to CoreOS + Fleet.
Commercially supported by CoreOS as Tectonic.
• If run in Google Cloud Engine, can potentially be
quite powerful.
140. Cons
• Preferential support for Google Cloud Engine.
• Produced by Google but not necessarily the exact
system Google uses to run its own cluster (though
based on it).
• I may or may not be aware of additional issues.
141. When To Use It
• You’re running Google Cloud Engine
• You have prior experience from working at Google
• I may or may not be aware of additional use cases.
142. Mesos + Kubernetes
• You can run Kubernetes on top of Mesos as an
alternative to Marathon.
144. Closing Thoughts
• To get started quickly, choose EC2 Container
Service.
• To get a feel for the core technologies, choose a
PaaS like Deis and slowly learn CoreOS.
• To run multi-container VMs at (potentially huge)
scale, choose Mesos.
• There are many more “satellite” projects I didn’t
cover solving unique problems!