CoreOS: A Lightweight OS for Running Docker Containers in Production

CoreOS
What is it and why should I care?
1 / 80

Who am I?
Karl Grzeszczak
Senior Software Engineer - Mediafly
twitter @karl_grz
karlgrz.com
2 / 80

At Mediafly, a lot of our infrastructure is
service-oriented distributed systems
running Docker containers
3 / 80

CoreOS seems like an ideal fit for our
needs, so I decided to investigate
4 / 80

I am -not- affiliated with CoreOS
(I'm just curious and wanted to understand it!)
5 / 80

Lightweight
CoreOS is designed to be a modern, minimal base to build your
platform. Consumes 40% less RAM on boot than an average
Linux installation.
https://coreos.com/
7 / 80

Painless Updating
Utilizes an active/passive dual-partition scheme to update
the OS as a single unit instead of package by package. This
makes each update quick, reliable and able to be easily rolled
back.
https://coreos.com/
8 / 80

Docker Containers
Applications on CoreOS run as Docker containers. Containers
provide maximum flexibility in packaging and can start in
milliseconds.
https://coreos.com/
9 / 80

Clustered By Default
CoreOS works well on a single machine, but it's designed to
be clustered. Easily run application containers across
multiple machines with fleet and connect them together with
service discovery.
https://coreos.com/
10 / 80

Distributed Systems Tools
Built-in primitives such as distributed locking and master
election are the building blocks for large scale distributed
systems.
https://coreos.com/
11 / 80

Service Discovery
Easily locate where services are being run within the cluster
and be notified when something changes. Essential for a
complex, highly dynamic cluster. Built into CoreOS with high
availability and automatic fail-over.
https://coreos.com/
12 / 80

How is it different from other *NIXes?
13 / 80

No package manager
All your applications should run as containers
Linux kernel, docker, systemd, fleetd, etcd, sshd
According to https://coreos.com, it uses 114MB of RAM at
boot, approximately 40% less than an average Linux server
Designed specifically for running distributed systems
14 / 80

This is ideal if you already use docker
15 / 80

What do you have to do differently?
16 / 80

What do you have to do differently?
etcd service discovery
17 / 80

What do you have to do differently?
broadcast your application's key
infrastructure settings back to etcd
18 / 80

What do you have to do differently?
broadcast your application's key
infrastructure settings back to etcd
use fleet to orchestrate your containers
19 / 80

http://github.com/coreos/etcd
21 / 80

A highly-available key value store for
shared configuration and service
discovery. etcd is inspired by Apache
ZooKeeper and doozer
https://github.com/coreos/etcd#readme-version-046
22 / 80

Simple: curl'able user facing API (HTTP+JSON)
Secure: optional SSL client cert authentication
Fast: benchmarked 1000s of writes/s per instance
Reliable: properly distributed using Raft
etcd is written in Go and uses the Raft consensus algorithm
to manage a highly-available replicated log.
https://github.com/coreos/etcd#readme-version-046
23 / 80
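
To make "curl'able" concrete, here is a minimal sketch in Python of what those HTTP calls look like. It assumes an etcd 0.4-era node on 127.0.0.1:4001 (the client port used later in this deck) and the v2 keys endpoint. The helper names are hypothetical, and the requests are only built here, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: a local etcd 0.4-era node on the client port used in this deck.
ETCD = "http://127.0.0.1:4001"

def key_url(key, base=ETCD):
    """Map an etcd key such as /apps/karlgrz_web to its v2 keys URL."""
    return base + "/v2/keys/" + key.lstrip("/")

def build_set(key, value, ttl=None, base=ETCD):
    """Build (but do not send) the PUT request that stores a value."""
    form = {"value": value}
    if ttl is not None:
        form["ttl"] = str(ttl)
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(key_url(key, base), data=body, method="PUT")

def build_get(key, base=ETCD):
    """Build (but do not send) the GET request that reads a value."""
    return urllib.request.Request(key_url(key, base), method="GET")

req = build_set("/apps/karlgrz_web", "hello", ttl=60)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Against a running node, passing either request to urllib.request.urlopen() would return the JSON response.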

Raft Consensus Algorithm
24 / 80

In Search of an Understandable Consensus Algorithm by
Stanford's Diego Ongaro and John Ousterhout
https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
"As a result, each state machine processes the same series
of commands and thus produces the same series of results
and arrives at the same series of states."
http://raftconsensus.github.io/
25 / 80

Raft elects a leader. The leader appends each change to its
log and replicates it to the other nodes in the cluster. It does
not confirm a write until a majority (a quorum) of nodes has
acknowledged it.
If the leader goes AWOL for a certain time, the remaining
nodes hold a new election to choose a new leader and continue.
27 / 80
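
The "majority" rule is easier to see as arithmetic. This is a minimal sketch of the quorum math only, not an implementation of Raft; the function names are my own.

```python
def quorum(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1

def is_committed(acks, cluster_size):
    """A write counts as committed once a majority has acknowledged it.

    acks includes the leader's own copy of the entry.
    """
    return acks >= quorum(cluster_size)

def tolerated_failures(cluster_size):
    """How many nodes can be lost while a majority still exists."""
    return cluster_size - quorum(cluster_size)

for n in (3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

A three-node cluster like the one in the walkthrough needs two nodes to agree, so it can lose one node and keep going.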

For now, just understand...
Raft is similar to Paxos in fault-tolerance and performance
and it makes sure that etcd and your cluster can continue
operating even if some nodes experience partitions (or are
terminated!)
28 / 80

This is an AWESOME animation you should watch because it
explains Raft MUCH better than I can:
http://thesecretlivesofdata.com/raft/
29 / 80

http://github.com/coreos/fleet
31 / 80

fleet ties systemd and etcd together into a distributed init
system
32 / 80

Supported Deployment Patterns
33 / 80

Deploy a single unit anywhere on the
cluster
https://github.com/coreos/fleet#supported-deployment-patterns
34 / 80

Deploy a unit globally everywhere in the
cluster
35 / 80

Automatic rescheduling of units on
machine failure
36 / 80

Ensure that units are deployed together
on the same machine
37 / 80

Forbid specific units from colocation on
the same machine (anti-affinity)
38 / 80

Deploy units to machines only with
specific metadata
39 / 80
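
The last three patterns (colocation, anti-affinity, metadata) all come down to filtering the list of machines. The sketch below is an illustration of that idea under my own assumptions, not fleet's actual scheduler; the dictionary keys, machine IDs, and metadata values are hypothetical.

```python
def eligible_machines(unit, machines, placed):
    """Return IDs of machines that satisfy a unit's placement constraints.

    unit:     dict with optional keys "machine_of", "conflicts", "metadata"
    machines: dict of machine ID -> metadata dict
    placed:   dict of already scheduled unit name -> machine ID
    """
    result = []
    for machine_id, metadata in machines.items():
        units_here = {name for name, m in placed.items() if m == machine_id}
        # Affinity: must share a machine with the named unit.
        if "machine_of" in unit and unit["machine_of"] not in units_here:
            continue
        # Anti-affinity: must not share a machine with any listed unit.
        if units_here & set(unit.get("conflicts", [])):
            continue
        # Metadata: every requested key/value must match the machine.
        wanted = unit.get("metadata", {})
        if any(metadata.get(k) != v for k, v in wanted.items()):
            continue
        result.append(machine_id)
    return result

machines = {
    "m1": {"region": "us-east"},
    "m2": {"region": "us-west"},
    "m3": {"region": "us-east"},
}
placed = {"web.service": "m1"}

print(eligible_machines({"machine_of": "web.service"}, machines, placed))
print(eligible_machines({"conflicts": ["web.service"]}, machines, placed))
print(eligible_machines({"metadata": {"region": "us-east"}}, machines, placed))
```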

It makes it very easy to know what is running in your cluster,
where, and how it's doing
40 / 80

fleet has a LOT of promise, and is my
favorite part of CoreOS
41 / 80

...it's also my least favorite part of
CoreOS
43 / 80

fleet (0.8) seems very early, rough, and
opinionated whereas etcd seems ready
for production
44 / 80

...but it feels like the best option out
there right now
45 / 80

Read this post later:
http://lukebond.ghost.io/deploying-docker-containers-on-a-coreos-cluster-with-fleet/
I found this while putting together this presentation, and I
think it does a great job explaining all this in written form
46 / 80

Walkthrough on Vagrant
http://github.com/coreos/coreos-vagrant
https://coreos.com/docs/running-coreos/platforms/vagrant/
48 / 80

bootstrapping the cluster
karl@karl-mediafly:~$ curl discovery.etcd.io/new
https://discovery.etcd.io/b9845b31a57793fe9f88137220b7f454
49 / 80

Output gets pasted into user-data:
#cloud-config
coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you 'vagrant destroy'
    discovery: https://discovery.etcd.io/b9845b31a57793fe9f88137220b7f454
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
50 / 80
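
Because the token has to be replaced after every 'vagrant destroy', it can help to script the swap. This is a minimal sketch, assuming the user-data file contains exactly one discovery: line; the function name and the OLD_TOKEN / NEW_TOKEN values are placeholders of mine.

```python
import re

def set_discovery_url(user_data, discovery_url):
    """Replace the value of the discovery: key, keeping its indentation."""
    pattern = re.compile(r"^([ \t]*discovery:[ \t]*).*$", re.MULTILINE)
    # A function replacement avoids backslash handling in the new URL.
    updated, count = pattern.subn(lambda m: m.group(1) + discovery_url, user_data)
    if count != 1:
        raise ValueError("expected exactly one discovery: line, found %d" % count)
    return updated

sample = (
    "#cloud-config\n"
    "coreos:\n"
    "  etcd:\n"
    "    discovery: https://discovery.etcd.io/OLD_TOKEN\n"
    "    addr: $public_ipv4:4001\n"
)
print(set_discovery_url(sample, "https://discovery.etcd.io/NEW_TOKEN"))
```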

show all machines in your cluster
core@core-01 ~/share $ fleetctl list-machines
MACHINE      IP            METADATA
78e5ab3e...  172.17.8.103  -
adddf8be...  172.17.8.102  -
df763c2f...  172.17.8.101  -
51 / 80

service unit
[Unit]
Description=karlgrz.com
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill karlgrz_web
ExecStartPre=-/usr/bin/docker rm karlgrz_web
ExecStartPre=/usr/bin/docker pull karlgrz/ubuntu-14.04-base-nginx
ExecStartPre=/bin/sh -c "cd /srv/karlgrz.com && /usr/bin/docker build -t karlgrz/karlgrz_web ."
ExecStart=/usr/bin/docker run --name karlgrz_web -p 8001:8001 karlgrz/karlgrz_web
ExecStop=/usr/bin/docker stop karlgrz_web
52 / 80

start up some units
core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl start fantasy_web.service jcsdoorsolutions_web.service stickfigureninjas_web.service karlgrz_web.service
Unit fantasy_web.service launched on adddf8be.../172.17.8.102
Unit karlgrz_web.service launched on adddf8be.../172.17.8.102
Unit jcsdoorsolutions_web.service launched on 78e5ab3e.../172.17.8.103
Unit stickfigureninjas_web.service launched on 78e5ab3e.../172.17.8.103
53 / 80

list loaded units and their status
core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl list-units
UNIT                           MACHINE                   ACTIVE      SUB
fantasy_web.service            adddf8be.../172.17.8.102  activating  start-pre
jcsdoorsolutions_web.service   78e5ab3e.../172.17.8.103  activating  start-pre
karlgrz_web.service            adddf8be.../172.17.8.102  active      running
stickfigureninjas_web.service  78e5ab3e.../172.17.8.103  activating  start-pre
fantasy_web.service            adddf8be.../172.17.8.102  active      running
jcsdoorsolutions_web.service   78e5ab3e.../172.17.8.103  active      running
stickfigureninjas_web.service  78e5ab3e.../172.17.8.103  active      running
54 / 80

discovery sidekick
[Unit]
Description=Announce karlgrz.com
BindsTo=karlgrz_web.service
[Service]
EnvironmentFile=/etc/environment
ExecStart=/bin/sh -c "while true; do \
  etcdctl set /apps/karlgrz_web \
  '{ \"host\": \"karlgrz.com\", \"appkey\": \"karlgrz_web\", \"ip\" :\"${COREOS_PUBLIC_IPV4}\", \"port\" :\"8001\" }' \
  --ttl 60; sleep 45; done"
ExecStop=/usr/bin/etcdctl rm /apps/karlgrz_web
[X-Fleet]
MachineOf=karlgrz_web.service
55 / 80
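
Why a 60 second TTL with a 45 second sleep? The sidekick has to re-announce before the key expires, and if the sidekick dies without running ExecStop, the key should disappear on its own. Here is a minimal in-memory sketch of that behavior. TTLStore is a hypothetical stand-in of mine, not etcd, and time is passed in explicitly so the example is deterministic.

```python
class TTLStore:
    """Tiny in-memory stand-in for a key-value store with expiring keys."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl, now):
        self._data[key] = (value, now + ttl)

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None or now >= entry[1]:
            # Expired or never set: drop it and report it as missing.
            self._data.pop(key, None)
            return None
        return entry[0]

TTL, INTERVAL = 60, 45  # values used by the sidekick unit

store = TTLStore()
# Healthy sidekick: re-announce every 45 s, before the 60 s TTL runs out.
for t in (0, INTERVAL, 2 * INTERVAL):
    store.set("/apps/karlgrz_web", "up", TTL, now=t)
print(store.get("/apps/karlgrz_web", now=120))  # last set at 90, expires at 150

# Sidekick stopped after t=90: the key ages out without any cleanup.
print(store.get("/apps/karlgrz_web", now=151))
```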

run discovery sidekicks
core@core-01 ~/share/karlgrz-docker/fleet $ etcdctl ls /apps
core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl start fantasy_discovery.service jcsdoorsolutions_discovery.service stickfigureninjas_discovery.service karlgrz_discovery.service
Unit jcsdoorsolutions_discovery.service launched on 78e5ab3e.../172.17.8.103
Unit stickfigureninjas_discovery.service launched on 78e5ab3e.../172.17.8.103
Unit fantasy_discovery.service launched on adddf8be.../172.17.8.102
Unit karlgrz_discovery.service launched on adddf8be.../172.17.8.102
core@core-01 ~/share/karlgrz-docker/fleet $ etcdctl ls /apps
/apps/rethinkdb_services
/apps/fantasy_web
/apps/karlgrz_web
/apps/jcsdoorsolutions_web
/apps/stickfigureninjas_web
56 / 80

etcd values
core@core-01 ~/share/karlgrz-docker/fleet $ etcdctl get /apps/karlgrz_web
{ "host": "karlgrz.com", "appkey": "karlgrz_web", "ip" :"172.17.8.102", "port" :"8001" }
57 / 80
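
On the consuming side, anything that wants to reach karlgrz_web (a load balancer, another service) reads that value and turns it into an address. A minimal sketch, using the exact value shown above; the function name is hypothetical.

```python
import json

def parse_endpoint(raw):
    """Turn an announced JSON value into (host, ip, port)."""
    record = json.loads(raw)
    return record["host"], record["ip"], int(record["port"])

raw = '{ "host": "karlgrz.com", "appkey": "karlgrz_web", "ip" :"172.17.8.102", "port" :"8001" }'
host, ip, port = parse_endpoint(raw)
print("%s -> %s:%d" % (host, ip, port))
```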

list units
fantasy_discovery.service            adddf8be.../172.17.8.102  active  running
jcsdoorsolutions_discovery.service   78e5ab3e.../172.17.8.103  active  running
jcsdoorsolutions_web.service         78e5ab3e.../172.17.8.103  active  running
karlgrz_discovery.service            adddf8be.../172.17.8.102  active  running
rethinkdb_discovery.service          df763c2f.../172.17.8.101  active  running
rethinkdb_services.service           df763c2f.../172.17.8.101  active  running
stickfigureninjas_discovery.service  78e5ab3e.../172.17.8.103  active  running
stickfigureninjas_web.service        78e5ab3e.../172.17.8.103  active  running
58 / 80

run a unit on ONLY one SPECIFIC node
[Unit]
Description=rethinkdb
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill rethinkdb_services
ExecStartPre=-/usr/bin/docker rm rethinkdb_services
ExecStartPre=/usr/bin/docker pull dockerfile/rethinkdb
ExecStart=/usr/bin/docker run --name rethinkdb_services \
  -p 8080:8080 -p 28015:28015 -p 29015:29015 -v /home/core/rethinkdb:/data \
  -t dockerfile/rethinkdb rethinkdb -d /data --bind all
ExecStop=/usr/bin/docker stop rethinkdb_services
[X-Fleet]
MachineID=9f152bf8
59 / 80

see logging output from a running container
core@core-01 ~ $ fleetctl journal fantasy_web
-- Logs begin at Wed 2014-09-24 21:32:32 UTC, end at Thu 2014-09-25 19:55:26 UTC. --
Sep 25 18:22:08 core-02 docker[1572]: Python version: 2.7.6 (default, Mar 22 2014, 23:03:41)
Sep 25 18:22:08 core-02 docker[1572]: Python main interpreter initialized at 0xc53540
Sep 25 18:22:08 core-02 docker[1572]: python threads support enabled
Sep 25 18:22:08 core-02 docker[1572]: your server socket listen backlog is limited to 100 connections
Sep 25 18:22:08 core-02 docker[1572]: your mercy for graceful operations on workers is 60 seconds
Sep 25 18:22:08 core-02 docker[1572]: mapped 72768 bytes (71 KB) for 1 cores
Sep 25 18:22:08 core-02 docker[1572]: *** Operational MODE: single process ***
Sep 25 18:22:09 core-02 docker[1572]: WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0xc53540 pid: 13 (default app)
Sep 25 18:22:09 core-02 docker[1572]: *** uWSGI is running in multiple interpreter mode ***
Sep 25 18:22:09 core-02 docker[1572]: spawned uWSGI worker 1 (and the only) (pid: 13, cores:
60 / 80

core@core-03 ~ $ fleetctl journal karlgrz_web
Sep 25 18:21:58 core-03 sh[1315]: ---> Using cache
Sep 25 18:21:58 core-03 sh[1315]: ---> ce8cd32fe157
Sep 25 18:21:58 core-03 sh[1315]: Step 6 : RUN cd /srv && make publish
Sep 25 18:21:58 core-03 sh[1315]: ---> 83f7f333889b
Sep 25 18:21:58 core-03 sh[1315]: Step 7 : CMD ["nginx"]
Sep 25 18:21:58 core-03 sh[1315]: ---> 4cf274f01dae
Sep 25 18:21:58 core-03 sh[1315]: Successfully built 4cf274f01dae
Sep 25 18:21:59 core-03 systemd[1]: Started karlgrz.com.
61 / 80

core@core-02 ~/share/karlgrz-docker/fleet $ fleetctl journal classholes_web
Sep 25 20:01:40 core-02 systemd[1]: Starting classholes.com...
Sep 25 20:01:40 core-02 docker[3071]: Error response from daemon: No such container: classholes_web
Sep 25 20:01:40 core-02 docker[3071]: 2014/09/25 20:01:40 Error: failed to kill one or more containers
Sep 25 20:01:40 core-02 docker[3085]: Error response from daemon: No such container: classholes_web
Sep 25 20:01:40 core-02 docker[3085]: 2014/09/25 20:01:40 Error: failed to remove one or more containers
Sep 25 20:01:40 core-02 docker[3095]: Pulling repository karlgrz/ubuntu-14.04-base-nginx
Sep 25 20:01:42 core-02 systemd[1]: classholes_web.service: control process exited, code=exited status=1
Sep 25 20:01:42 core-02 systemd[1]: Failed to start classholes.com.
Sep 25 20:01:42 core-02 sh[3110]: /bin/sh: line 0: cd: /home/core/share/classholes: No such file or directory
Sep 25 20:01:42 core-02 systemd[1]: Unit classholes_web.service entered failed state.
62 / 80

terminate a node and see the services running on it moved to
another node in the cluster
karl@karl-mediafly:~/workspace/coreos-vagrant$ vagrant ssh core-03 -- -A
Last login: Thu Sep 25 16:37:01 2014 from 10.0.2.2
CoreOS (beta)
core@core-03 ~ $ shutdown -n
shutdown: invalid option -- 'n'
core@core-03 ~ $ shutdown
Must be root.
core@core-03 ~ $ sudo shutdown -n
shutdown: invalid option -- 'n'
core@core-03 ~ $ sudo shutdown
Shutdown scheduled for Thu 2014-09-25 16:46:14 UTC, use 'shutdown -c' to cancel.
Broadcast message from root@core-03 (Thu 2014-09-25 16:45:14 UTC):
The system is going down for power-off at Thu 2014-09-25 16:46:14 UTC!
63 / 80

core@core-02 ~ $ fleetctl list-units
64 / 80

jcsdoorsolutions_discovery.service   df763c2f.../172.17.8.101  active      running
jcsdoorsolutions_web.service         df763c2f.../172.17.8.101  activating  start-pre
stickfigureninjas_discovery.service  df763c2f.../172.17.8.101  active      running
stickfigureninjas_web.service        df763c2f.../172.17.8.101  activating  start-pre
65 / 80

core@core-02 ~ $ fleetctl list-machines
MACHINE      IP            METADATA
adddf8be...  172.17.8.102  -
df763c2f...  172.17.8.101  -
66 / 80

jcsdoorsolutions_discovery.service   df763c2f.../172.17.8.101  active  running
jcsdoorsolutions_web.service         df763c2f.../172.17.8.101  active  running
stickfigureninjas_discovery.service  df763c2f.../172.17.8.101  active  running
stickfigureninjas_web.service        df763c2f.../172.17.8.101  active  running
67 / 80

Please keep in mind I ran this cluster on my laptop using
Vagrant, not on cloud infrastructure
69 / 80

Clustering just worked
(I didn't even really have to think about failover or replication
myself)
70 / 80

alpha software
fleet and etcd are great, but they both need some more work
before being "production ready"
71 / 80

fleet in particular gets into situations sometimes where I
have destroyed a unit but it still shows in the list of units for
a while
72 / 80

fleet doesn't have a nice mechanism to restart all your units
or groups (at least that I found)
73 / 80

not quite ready for Mediafly
75 / 80

I plan on deploying CoreOS to power my
side projects, blog, and the handful of
sites I run for friends soon
77 / 80

I feel that after a bit of work this will be
the OS that powers distributed systems
in the future
78 / 80
