Docker and the Future of Containers in Production
Bryan Cantrill
CTO
bryan@joyent.com
@bcantrill
Prehistory: Virtualization as cloud catalyst
• In the 1960s — shortly after the dawn of computing! —
pundits foresaw a compute utility that would be public
and multi-tenant
• The vision was four decades too early: it took the
internet + commodity computing + virtualization to yield
cloud computing
• Virtualization is the essential ingredient for multi-tenant
operation — but where in the stack to virtualize?
• Choices around virtualization capture tensions between
elasticity, tenancy, and performance
• tl;dr: Virtualization choices drive economic tradeoffs
Hardware-level virtualization?
• The historical answer, since the 1960s, has been to
virtualize at the level of the hardware:
• A virtual machine is presented upon which each
tenant runs an operating system of their choosing
• There are as many operating systems as tenants
• The singular advantage of hardware virtualization: it can
run entire legacy stacks unmodified
• However, hardware virtualization exacts a heavy price:
operating systems are not designed to share resources
like DRAM, CPU, I/O devices or the network
• Hardware virtualization limits tenancy, elasticity and
performance
Platform-level virtualization?
• Virtualizing at the application platform layer addresses
the tenancy challenges of hardware virtualization
• Added advantage of a much more nimble (&
developer-friendly!) abstraction…
• ...but at the cost of dictating abstraction to the developer
• This creates the “Google App Engine problem”:
developers are in a straitjacket where toy programs
are easy but sophisticated apps are impossible
• Virtualizing at the application platform layer poses many
other challenges with respect to security, containment
and scalability
OS-level virtualization!
• Virtualizing at the OS level hits the sweet spot:
• Single OS (i.e., single kernel) allows for efficient use of
hardware resources, maximizing tenancy and
performance
• Disjoint instances are securely compartmentalized by
the operating system
• Gives users what appears to be a virtual machine (albeit
a very fast one) on which to run higher-level software
• The ease of a PaaS with the generality of IaaS
• Model was pioneered by FreeBSD jails and taken to
its logical extreme by Solaris zones, and then aped
by Linux containers
OS-level virtualization in the cloud
• Joyent runs OS containers in the cloud via SmartOS
(our illumos derivative) — and we have run containers in
multi-tenant production since ~2006
• Core SmartOS facilities are container-aware and
optimized: Zones, ZFS, DTrace, Crossbow, SMF, etc.
• SmartOS also supports hardware-level virtualization —
but we have long advocated OS-level virtualization for
new build out
• We emphasized their operational characteristics
(performance, elasticity, tenancy), and for many years
we were a lone voice...
Containers as PaaS foundation?
• Some saw the power of OS containers to facilitate
up-stack platform-as-a-service abstractions
• For example, dotCloud (a platform-as-a-service
provider) built their PaaS on OS containers
• Hearing that many were interested in their container
orchestration layer (but not their PaaS), dotCloud open
sourced their container-based orchestration layer...
...and Docker was born
Docker revolution
• Docker has used the rapid provisioning + shared
underlying filesystem of containers to allow developers
to think operationally
• Developers can encode dependencies and deployment
practices into an image
• Images can be layered, allowing for swift development
• Images can be quickly deployed — and re-deployed
• Docker will do to apt what apt did to tar
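The layering idea above can be sketched as a toy model, assuming each image layer is just a mapping from file path to contents and that upper layers shadow lower ones. The layer names and file contents are made up for illustration, and this is not Docker's actual storage driver.

```python
from collections import ChainMap

# Hypothetical layers: each maps a file path to its contents.
base_layer = {"/bin/sh": "shell", "/etc/motd": "hello from base"}
deps_layer = {"/usr/lib/libfoo.so": "libfoo 1.0"}
app_layer = {"/app/server": "my service", "/etc/motd": "hello from app"}


def flatten(layers):
    """Return the merged view of a stack of layers.

    `layers` is ordered bottom-first; later (upper) layers shadow
    earlier ones, which is the behaviour assumed in this sketch.
    """
    # ChainMap searches its first mapping first, so reverse the stack.
    return dict(ChainMap(*reversed(layers)))


image = flatten([base_layer, deps_layer, app_layer])
print(image["/etc/motd"])  # the app layer's copy shadows the base layer's
print(sorted(image))
```

Because the lower layers are never modified, two images built on the same base can share it, which is one reason layered images are quick to build and redeploy.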
Docker’s challenges
• The Docker model is the future of containers
• Docker’s challenges are largely around production
deployment: security, network virtualization, persistence
• Security concerns are real enough that for multi-tenancy,
OS containers are currently running in hardware VMs (!!)
• With SmartOS, we have spent a decade addressing these
concerns, and we are proven in production…
• Could we combine the best of both worlds?
• Could we somehow deploy Docker containers as
SmartOS zones?
Docker + SmartOS: Linux binaries?
• First (obvious) problem: while it has been designed to
be cross-platform, Docker is Linux-centric
• While Docker could be ported, the encyclopedia of
Docker images will likely forever remain Linux binaries
• SmartOS is Unix — but it isn’t Linux…
• Could we somehow natively emulate Linux — and run
Linux binaries directly on the SmartOS kernel?
OS emulation: An old idea
• Operating systems have long employed system call
emulation to allow binaries from one operating system
to run on another on the same instruction set architecture
• Combines the binary footprint of the emulated system
with the operational advantages of the emulating system
• Sun first did this with SunOS 4.x binaries on Solaris 2.x
• In the mid-2000s, Sun developed zone-based OS emulation
for Solaris: branded zones
• Several brands were developed — notably including an
LX brand that allowed for Linux emulation
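As a rough illustration of system call emulation, the sketch below routes emulated Linux system call numbers to handlers backed by the host's own facilities. The two numbers are the Linux x86-64 values for write and getpid; the table, the handler names and the dispatch function are hypothetical, and a real brand such as LX does this work in the kernel rather than in Python.

```python
import errno
import os

# Linux x86-64 system call numbers for the two calls this sketch handles.
LINUX_SYS_WRITE = 1
LINUX_SYS_GETPID = 39


def emulate_write(fd, data):
    """Service the emulated write() with the host's own write."""
    return os.write(fd, data)


def emulate_getpid():
    """Service the emulated getpid() with the host's own getpid."""
    return os.getpid()


# Hypothetical translation table: emulated syscall number -> host handler.
HANDLERS = {
    LINUX_SYS_WRITE: emulate_write,
    LINUX_SYS_GETPID: emulate_getpid,
}


def dispatch(number, *args):
    """Route an emulated system call to its handler.

    Unknown calls return -ENOSYS, following the Linux convention of
    reporting errors to user space as negative errno values.
    """
    handler = HANDLERS.get(number)
    if handler is None:
        return -errno.ENOSYS
    return handler(*args)


print(dispatch(LINUX_SYS_GETPID) == os.getpid())
print(dispatch(9999) == -errno.ENOSYS)
```

The hard part in practice is not the table but the long tail of semantics behind each entry (signals, /proc, ioctls), which is where most of the emulation effort goes.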
LX-branded zones: Life and death
• The LX-branded zone worked for RHEL 3 (!): glibc 2.3.2
+ Linux 2.4
• Remarkable amount of work was done to handle device
pathing, signal handling, /proc — and arcana like TTY
ioctls, ptrace, etc.
• Worked for a surprising number of binaries!
• But support was only for 2.4 kernels and only for 32-bit;
2.6 + 64-bit appeared daunting…
• Support was ripped out of the system on June 11, 2010
• Fortunately, this was after the system was open sourced
in June 2005 — and the source was out there...
LX-branded zones: Resurrection!
• In January 2014, David Mackay, an illumos community
member, announced that he was able to resurrect the
LX brand, and that it appeared to work! He wrote:
Linked below is a webrev which restores LX branded zones
support to Illumos:
http://cr.illumos.org/~webrev/DavidJX8P/lx-zones-restoration/
I have been running OpenIndiana, using it daily on my
workstation for over a month with the above webrev applied to
the illumos-gate and built by myself.
It would definitely raise interest in Illumos. Indeed, I have
seen many people who are extremely interested in LX zones.
The LX zones code is minimally invasive on Illumos itself, and
is mostly segregated out.
I hope you find this of interest.
LX-branded zones: Revival
• Encouraged that the LX-branded work was salvageable,
Joyent engineer Jerry Jelinek reintegrated the LX brand
into SmartOS on March 20, 2014...
• ...and started the (substantial) work to modernize it
• Guiding principles for LX-branded zone work:
  • Do it all in the open
  • Do it all on SmartOS master (illumos-joyent)
  • Add base illumos facilities wherever possible
  • Aim to upstream to illumos when we’re done
LX-branded zones: Progress
• Over the course of 2014, progress was difficult but steady:
  • Ubuntu 10.04 booted in April
  • Ubuntu 12.04 booted in May
  • Ubuntu 14.04 booted in July
  • 64-bit Ubuntu 14.04 booted in October (!)
• Going into 2015, it was becoming increasingly difficult to
find Linux software that didn’t work...
LX-branded zones: Working well...
...and, um, well received
Docker + SmartOS: Provisioning?
• With the binary problem being tackled, focus turned to
the mechanics of integrating Docker with the SmartOS
facilities for provisioning
• Provisioning a SmartOS zone operates via the global
zone that represents the control plane of the machine
• docker is a single binary that functions as both client
and server — and with too much surface area to run in
the global zone, especially for a public cloud
• docker has also embedded Go- and Linux-isms that
we did not want in the global zone; we needed to find a
different approach...
Docker Remote API
• While docker is a single binary that can run on the
client or the server, it does not run as both at once…
• docker (the client) communicates with docker (the
server) via the Docker Remote API
• The Docker Remote API is expressive, modern and
robust (i.e. versioned), allowing for docker to
communicate with Docker backends that aren’t docker
• The clear approach was therefore to implement a
Docker Remote API endpoint for SmartDataCenter
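To make the idea of a backend "that isn't docker" concrete, here is a minimal sketch of request routing for a Remote API endpoint. The versioned path prefix and the GET /containers/json listing call are part of the Docker Remote API; the in-memory container list, the route table and the handle function are assumptions for illustration and do not reflect how any real endpoint is implemented.

```python
import json
import re

# Paths look like /v1.16/containers/json; the version prefix is what lets
# clients and servers evolve independently.
VERSIONED_PATH = re.compile(r"^/v(?P<version>\d+\.\d+)(?P<rest>/.*)$")

# Hypothetical in-memory stand-in for whatever actually runs the containers.
CONTAINERS = [{"Id": "abc123", "Image": "ubuntu:14.04"}]


def list_containers():
    """Answer the container listing call from the stand-in data."""
    return 200, CONTAINERS


# Route table for this sketch; a real endpoint would implement many more.
ROUTES = {("GET", "/containers/json"): list_containers}


def handle(method, path):
    """Return (status, JSON body) for one request against the toy backend."""
    match = VERSIONED_PATH.match(path)
    rest = match.group("rest") if match else path
    route = ROUTES.get((method, rest))
    if route is None:
        return 404, json.dumps({"message": "no such endpoint"})
    status, payload = route()
    return status, json.dumps(payload)


print(handle("GET", "/v1.16/containers/json"))
print(handle("GET", "/v1.16/nope"))
```

Anything that answers these requests with the expected JSON looks like a Docker host to the client, regardless of what provisions the containers behind it.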
Aside: SmartDataCenter
• Orchestration software for SmartOS-based clouds
• Unlike other cloud stacks, not designed to run arbitrary
hypervisors, sell legacy hardware or get 160 companies
to agree on something
• SmartDataCenter is designed to leverage the SmartOS
differentiators: ZFS, DTrace and (esp.) zones
• Runs both the Joyent Public Cloud and business-critical
on-premises clouds at well-known brands
• Born proprietary — but made entirely open source on
November 6, 2014: http://github.com/joyent/sdc
SmartDataCenter: Architecture
[Figure: SmartDataCenter architecture. Labels: head-node running the SDC 7 core services (Booter with DHCP/TFTP, AMQP broker, Binder for DNS, public API over public HTTP, customer portal, operator portal, firewall); compute nodes (tens/hundreds per head-node) running AMQP agents (Provisioner, Instrumenter, Heartbeater); SmartOS kernel (flash booted) and SmartOS kernel (network booted); virtual SmartOS (OS virt.), Linux guest (HW virt.), Windows guest (HW virt.) and other virtual OS or machine instances, each with virtual NICs; ZFS-based multi-tenant filesystem.]
SmartDataCenter: Core Services
[Figure: SDC7 core services on the head-node. Labels: analytics aggregator, key/value service (Moray), firewall API (FWAPI), virtual machine API (VMAPI), directory service (UFDS), designation API (DAPI), workflow API, network API (NAPI), compute-node API (CNAPI), image API, alerts & monitoring (Amon), packaging API (PAPI), service API (SAPI); Booter (DHCP/TFTP), AMQP broker, Binder (DNS), public API and customer portal (public HTTP), operator portal; operator services, Manta, other DCs. Notes: service interdependencies not shown for readability; other core services may be provisioned on compute nodes.]
SmartDataCenter + Docker
• Implementing an SDC-wide endpoint for the Docker
remote API allows us to build in terms of our established
core services: UFDS, CNAPI, VMAPI, Image API, etc.
• Has the welcome side-effect of virtualizing the notion of
Docker host machine: Docker containers can be placed
anywhere within the data center
• From a developer perspective, one less thing to manage
• From an operations perspective, allows for a flexible
layer of management and control: Docker API endpoints
become a potential administrative nexus
• As such, virtualizing the Docker host is somewhat
analogous to the way ZFS virtualized the filesystem...
SmartDataCenter + Docker: Challenges
• Some Docker constructs have (implicitly) encoded
co-locality of Docker containers on a physical machine
• Some of these constructs (e.g., --volumes-from) we
will discourage but accommodate by co-scheduling
• Others (e.g., host directory-based volumes) we are
implementing via NFS backed by Manta, our (open
source!) distributed object storage service
• Moving forward, we are working with Docker to help
assure that the Docker Remote API doesn’t create new
implicit dependencies on physical locality
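The interaction between data-center-wide placement and co-scheduling can be sketched as follows. Everything here is hypothetical: the ComputeNode record, the "most free memory" policy and the place function are assumptions chosen to illustrate the constraint, not a description of how SDC actually designates a compute node.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    name: str
    free_mb: int
    containers: set = field(default_factory=set)


def place(nodes, container, memory_mb, volumes_from=None):
    """Pick a compute node for `container` and record the placement.

    If `volumes_from` names an existing container, the new container is
    pinned to that container's node (co-scheduling); otherwise the node
    with the most free memory is chosen. Both policies are assumptions
    made for this sketch.
    """
    if volumes_from is not None:
        candidates = [n for n in nodes if volumes_from in n.containers]
    else:
        candidates = list(nodes)
    candidates = [n for n in candidates if n.free_mb >= memory_mb]
    if not candidates:
        raise RuntimeError("no compute node satisfies the request")
    node = max(candidates, key=lambda n: n.free_mb)
    node.free_mb -= memory_mb
    node.containers.add(container)
    return node.name


nodes = [ComputeNode("cn0", 4096), ComputeNode("cn1", 8192)]
print(place(nodes, "db", 1024))                        # lands on the emptiest node
print(place(nodes, "backup", 512, volumes_from="db"))  # follows "db" to its node
```

The sketch also shows why such constructs are discouraged: a co-scheduled container can only go where its partner already lives, so the scheduler loses the freedom to place it anywhere in the data center.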
SmartDataCenter + Docker: Networking
• Parallel to our SmartOS and Docker work, we have
been working on next-generation software-defined
networking for SmartOS and SmartDataCenter
• Goal was to use standard encapsulation/decapsulation
protocols (i.e., VXLAN) for overlay networks
• We have taken a kernel-based (and ARP-inspired)
approach to assure scale
• Complements SDC’s existing in-kernel, API-managed
firewall facilities
• All done in the open: on the dev-overlay branch of
SmartOS (illumos-joyent) and as sdc-portolan
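For readers unfamiliar with VXLAN, the encapsulation step can be sketched in a few lines. The 8-byte header layout (a flags byte with the VNI-valid bit, a 24-bit VXLAN network identifier, and reserved bits) follows RFC 7348; the function names are made up, the outer UDP/IP headers are left out, and none of this reflects the in-kernel SmartOS implementation.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08          # the "I" flag from RFC 7348
VXLAN_HEADER = struct.Struct("!II")  # two 32-bit big-endian words


def encapsulate(vni, inner_frame):
    """Prefix an inner Ethernet frame with an 8-byte VXLAN header."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: 24-bit VNI, then 8 reserved bits.
    header = VXLAN_HEADER.pack(VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def decapsulate(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    word1, word2 = VXLAN_HEADER.unpack_from(packet)
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word2 >> 8, packet[VXLAN_HEADER.size:]


packet = encapsulate(4242, b"inner ethernet frame")
print(decapsulate(packet))
```

The 24-bit identifier is what allows each tenant's overlay network to stay separate on a shared physical network.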
Putting it all together: sdc-docker
• Our Docker engine for SDC, sdc-docker, implements
the endpoints for the Docker Remote API
• Work is young (started in earnest in early fall 2014), but
because it takes advantage of a proven orchestration
substrate, progress has been very quick…
• We will be deploying it into early access production in
the Joyent Public Cloud in Q1CY15
• It’s open source: http://github.com/joyent/sdc-docker;
you can install SDC (either on hardware or on VMware)
and check it out for yourself!
• A demo is worth a thousand slides...
Future of containers in production
• For nearly a decade, we at Joyent have believed that
OS-virtualized containers are the future of computing
• While the efficiency gains are tremendous, they alone
have not been enough to propel containers into the
mainstream
• We believe that the developer ease of Docker combined
with the proven production substrate of SmartOS and
SmartDataCenter yields the best of all worlds
• The future of containers is one without compromise:
developer efficiency, operational elasticity, multi-tenant
security and on-the-metal performance!
Thank you!
• Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for
their work on LX branded zones
• @joshwilsdon, @trentmick, @cachafla and @orlandov
for their work on sdc-docker
• @rmustacc, @wayfaringrob, @fredfkuo and @notmatt
for their work on SDC overlay networking
• The countless engineers who have worked on or with
illumos because they believed in OS-based virtualization

Docker and the Future of Containers in Production

  • 1. Docker and the Future of Containers in Production CTO bryan@joyent.com Bryan Cantrill @bcantrill
  • 2. Prehistory: Virtualization as cloud catalyst • In the 1960s — shortly after the dawn of computing! — pundits foresaw a compute utility that would be public and multi-tenant • The vision was four decades too early: it took the internet + commodity computing + virtualization to yield cloud computing • Virtualization is the essential ingredient for multi-tenant operation — but where in the stack to virtualize? • Choices around virtualization capture tensions between elasticity, tenancy, and performance • tl;dr: Virtualization choices drive economic tradeoffs
  • 3. • The historical answer — since the 1960s — has been to virtualize at the level of the hardware: • A virtual machine is presented upon which each tenant runs an operating system of their choosing • There are as many operating systems as tenants • The singular advantage of hardware virtualization: it can run entire legacy stacks unmodified • However, hardware virtualization exacts a heavy price: operating systems are not designed to share resources like DRAM, CPU, I/O devices or the network • Hardware virtualization limits tenancy, elasticity and performance Hardware-level virtualization?
  • 4. • Virtualizing at the application platform layer addresses the tenancy challenges of hardware virtualization • Added advantage of a much more nimble (& developer- friendly!) abstraction… • ...but at the cost of dictating abstraction to the developer • This creates the “Google App Engine problem”: developers are in a straightjacket where toy programs are easy — but sophisticated apps are impossible • Virtualizing at the application platform layer poses many other challenges with respect to security, containment and scalability Platform-level virtualization?
  • 5. • Virtualizing at the OS level hits the sweet spot: • Single OS (i.e., single kernel) allows for efficient use of hardware resources, maximizing tenancy and performance • Disjoint instances are securely compartmentalized by the operating system • Gives users what appears to be a virtual machine (albeit a very fast one) on which to run higher-level software • The ease of a PaaS with the generality of IaaS • Model was pioneered by FreeBSD jails and taken to their logical extreme by Solaris zones — and then aped by Linux containers OS-level virtualization!
  • 6. OS-level virtualization in the cloud • Joyent runs OS containers in the cloud via SmartOS (our illumos derivative) — and we have run containers in multi-tenant production since ~2006 • Core SmartOS facilities are container-aware and optimized: Zones, ZFS, DTrace, Crossbow, SMF, etc. • SmartOS also supports hardware-level virtualization — but we have long advocated OS-level virtualization for new build out • We emphasized their operational characteristics (performance, elasticity, tenancy), and for many years we were a lone voice...
  • 7. Containers as PaaS foundation? • Some saw the power of OS containers to facilitate up- stack platform-as-a-service abstractions • For example, dotCloud — a platform-as-a-service provider — build their PaaS on OS containers • Hearing that many were interested in their container orchestration layer (but not their PaaS), dotCloud open sourced their container-based orchestration layer...
  • 9. Docker revolution • Docker has used the rapid provisioning + shared underlying filesystem of containers to allow developers to think operationally • Developers can encode dependencies and deployment practices into an image • Images can be layered, allowing for swift development • Images can be quickly deployed — and re-deployed • Docker will do to apt what apt did to tar
  • 10. Docker’s challenges • The Docker model is the future of containers • Docker’s challenges are largely around production deployment: security, network virtualization, persistence • Security concerns are real enough that for multi-tenancy, OS containers are currently running in hardware VMs (!!) • SmartOS, we have spent a decade addressing these concerns — and are proven in production… • Could we combine the best of both worlds? • Could we somehow deploy Docker containers as SmartOS zones?
  • 11. Docker + SmartOS: Linux binaries? • First (obvious) problem: while it has been designed to be cross-platform, Docker is Linux-centric • While Docker could be ported, the encyclopedia of Docker images will likely forever remain Linux binaries • SmartOS is Unix — but it isn’t Linux… • Could we somehow natively emulate Linux — and run Linux binaries directly on the SmartOS kernel?
  • 12. OS emulation: An old idea • Operating systems have long employed system call emulation to allow binaries from one operating system run on another on the same instruction set architecture • Combines the binary footprint of the emulated system with the operational advantages of the emulating system • Sun first did this with SunOS 4.x binaries on Solaris 2.x • In mid-2000s, Sun developed zone-based OS emulation for Solaris: branded zones • Several brands were developed — notably including an LX brand that allowed for Linux emulation
  • 13. LX-branded zones: Life and death • The LX-branded zone worked for RHEL 3 (!): glibc 2.3.2 + Linux 2.4 • Remarkable amount of work was done to handle device pathing, signal handling, /proc — and arcana like TTY ioctls, ptrace, etc. • Worked for a surprising number of binaries! • But support was only for 2.4 kernels and only for 32-bit; 2.6 + 64-bit appeared daunting… • Support was ripped out of the system on June 11, 2010 • Fortunately, this was after the system was open sourced in June 2005 — and the source was out there...
  • 14. LX-branded zones: Resurrection! • In January 2014, David Mackay, an illumos community member, announced that he was able to resurrect the LX brand —and that it appeared to work! Linked below is a webrev which restores LX branded zones support to Illumos: http://cr.illumos.org/~webrev/DavidJX8P/lx-zones-restoration/ I have been running OpenIndiana, using it daily on my workstation for over a month with the above webrev applied to the illumos-gate and built by myself. It would definitely raise interest in Illumos. Indeed, I have seen many people who are extremely interested in LX zones. The LX zones code is minimally invasive on Illumos itself, and is mostly segregated out. I hope you find this of interest.
  • 15. LX-branded zones: Revival • Encouraged that the LX-branded work was salvageable, Joyent engineer Jerry Jelinek reintegrated the LX brand into SmartOS on March 20, 2014... • ...and started the (substantial) work to modernize it • Guiding principles for LX-branded zone work: • Do it all in the open • Do it all on SmartOS master (illumos-joyent) • Add base illumos facilities wherever possible • Aim to upstream to illumos when we’re done
  • 16. LX-branded zones: Progress • With assiduous work over the course of 2014, progress was difficult but steady: • Ubuntu 10.04 booted in April • Ubuntu 12.04 booted in May • Ubuntu 14.04 booted in July • 64-bit Ubuntu 14.04 booted in October (!) • Going into 2015, it was becoming increasingly difficult to find Linux software that didn’t work...
  • 18. ...and, um, well received
  • 19. Docker + SmartOS: Provisioning? • With the binary problem being tackled, focus turned to the mechanics of integrating Docker with the SmartOS facilities for provisioning • Provisioning a SmartOS zone operates via the global zone that represents the control plane of the machine • docker is a single binary that functions as both client and server — and with too much surface area to run in the global zone, especially for a public cloud • docker has also embedded Go- and Linux-isms that we did not want in the global zone; we needed to find a different approach...
  • 20. Docker Remote API • While docker is a single binary that can run on the client or the server, it does not run in both at once… • docker (the client) communicates with docker (the server) via the Docker Remote API • The Docker Remote API is expressive, modern and robust (i.e. versioned), allowing for docker to communicate with Docker backends that aren’t docker • The clear approach was therefore to implement a Docker Remote API endpoint for SmartDataCenter
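To make the idea of a Docker backend that isn't docker a little more concrete, here is a minimal Python sketch of request routing for such a backend. It assumes two endpoints from the Remote API (`GET /_ping` and `GET /version`) and the optional `/vX.Y` path prefix that clients use to select an API version; the function names and the version value are placeholders for illustration, and this is not how sdc-docker is written.

```python
import re

# Optional version prefix such as "/v1.16" that clients may put in front
# of an endpoint path.
_VERSION_PREFIX = re.compile(r"^/v(\d+\.\d+)(/.*)$")

# Placeholder value: the API version this hypothetical backend speaks.
BACKEND_API_VERSION = "1.16"


def split_version(path):
    """Return (requested_version_or_None, unversioned_path)."""
    match = _VERSION_PREFIX.match(path)
    if match:
        return match.group(1), match.group(2)
    return None, path


def route(method, path):
    """Answer a request the way a non-docker backend might.

    Returns (http_status, body). Only two endpoints are sketched;
    anything else gets a 404.
    """
    _requested, endpoint = split_version(path)
    if method == "GET" and endpoint == "/_ping":
        return 200, "OK"
    if method == "GET" and endpoint == "/version":
        return 200, {"ApiVersion": BACKEND_API_VERSION}
    return 404, {"message": "no such endpoint in this sketch"}
```

Because the client only sees HTTP requests and responses, whatever sits behind `route` can be an entire data center's orchestration layer rather than a single host daemon.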
  • 21. Aside: SmartDataCenter • Orchestration software for SmartOS-based clouds • Unlike other cloud stacks, not designed to run arbitrary hypervisors, sell legacy hardware or get 160 companies to agree on something • SmartDataCenter is designed to leverage the SmartOS differentiators: ZFS, DTrace and (esp.) zones • Runs both the Joyent Public Cloud and business-critical on-premises clouds at well-known brands • Born proprietary — but made entirely open source on November 6, 2014: http://github.com/joyent/sdc
  • 22. SmartDataCenter: Architecture • [Architecture diagram. Head-node (SmartOS kernel, flash booted): SDC 7 core services, with labels for Booter (DHCP/TFTP), AMQP broker, Binder (DNS), Public API (public HTTP), Customer portal and Operator portal. Compute node (SmartOS kernel, network booted; tens/hundreds per head-node): AMQP agents (Provisioner, Instrumenter, Heartbeater), a ZFS-based multi-tenant filesystem, and virtual NICs attached to Virtual SmartOS (OS virt.), Linux Guest (HW virt.), Windows Guest (HW virt.) and other virtual OS or machine instances. Also labeled: Firewall.]
  • 23. SmartDataCenter: Core Services • [Diagram of the SDC7 core services on the head-node: Analytics aggregator; Key/Value Service (Moray); Firewall API (FWAPI); Virtual Machine API (VMAPI); Directory Service (UFDS); Designation API (DAPI); Workflow API; Network API (NAPI); Compute-Node API (CNAPI); Image API; Alerts & Monitoring (Amon); Packaging API (PAPI); Service API (SAPI). Also shown: Booter (DHCP/TFTP), AMQP broker, Binder (DNS), Public API (public HTTP), Customer portal, Operator portal, Operator Services, Manta, Other DCs.] • Note: Service interdependencies not shown for readability • Other core services may be provisioned on compute nodes
  • 24. SmartDataCenter + Docker • Implementing an SDC-wide endpoint for the Docker remote API allows us to build in terms of our established core services: UFDS, CNAPI, VMAPI, Image API, etc. • Has the welcome side-effect of virtualizing the notion of Docker host machine: Docker containers can be placed anywhere within the data center • From a developer perspective, one less thing to manage • From an operations perspective, allows for a flexible layer of management and control: Docker API endpoints become a potential administrative nexus • As such, virtualizing the Docker host is somewhat analogous to the way ZFS virtualized the filesystem...
  • 25. SmartDataCenter + Docker: Challenges • Some Docker constructs have (implicitly) encoded co-locality of Docker containers on a physical machine • Some of these constructs (e.g., --volumes-from) we will discourage but accommodate by co-scheduling • Others (e.g., host directory-based volumes) we are implementing via NFS backed by Manta, our (open source!) distributed object storage service • Moving forward, we are working with Docker to help assure that the Docker Remote API doesn’t create new implicit dependencies on physical locality
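The co-scheduling idea above can be sketched as a small placement function: a container may normally land on any compute node, but one that uses `--volumes-from` is pinned to the node of the container it borrows volumes from. Everything here (the function name, the free-memory metric, the data structures) is a hypothetical illustration under those assumptions, not SDC's actual designation logic.

```python
def place_container(name, nodes, placements, volumes_from=None):
    """Pick a compute node for the container called `name`.

    nodes:        dict of node name -> free memory in MiB (assumed metric)
    placements:   dict of container name -> node name, updated in place
    volumes_from: name of a container whose volumes must be local, if any
    """
    if volumes_from is not None:
        # Co-schedule: the source container's volumes live on one
        # physical machine, so the new container must land there too.
        if volumes_from not in placements:
            raise KeyError(f"unknown source container: {volumes_from}")
        node = placements[volumes_from]
    else:
        # No locality constraint: take the node with the most free memory.
        node = max(nodes, key=lambda n: nodes[n])
    placements[name] = node
    return node
```

For example, with `nodes = {"cn1": 1024, "cn2": 4096}`, placing `db` with no constraint picks `cn2`, and a later `app` placed with `volumes_from="db"` follows it to `cn2` regardless of free memory.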
  • 26. SmartDataCenter + Docker: Networking • Parallel to our SmartOS and Docker work, we have been working on next-generation software-defined networking for SmartOS and SmartDataCenter • Goal was to use standard encapsulation/decapsulation protocols (i.e., VXLAN) for overlay networks • We have taken a kernel-based (and ARP-inspired) approach to assure scale • Complements SDC’s existing in-kernel, API-managed firewall facilities • All done in the open: on the dev-overlay branch of SmartOS (illumos-joyent) and as sdc-portolan
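For readers unfamiliar with VXLAN encapsulation, the sketch below packs and unpacks the 8-byte VXLAN header described in RFC 7348: a flags byte with the VNI-valid bit (0x08), 24 reserved bits, a 24-bit VXLAN Network Identifier (VNI), and 8 more reserved bits. It is a user-space Python illustration only; the function names are made up for this example, the outer UDP/IP headers are omitted, and it does not represent the in-kernel SmartOS implementation.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348
VXLAN_HEADER_LEN = 8


def vxlan_encap(vni, inner_frame):
    """Prepend a VXLAN header for overlay network `vni` to a frame."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: VNI in the top 24 bits, then 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    if len(packet) < VXLAN_HEADER_LEN:
        raise ValueError("truncated VXLAN header")
    word1, word2 = struct.unpack("!II", packet[:VXLAN_HEADER_LEN])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word2 >> 8, packet[VXLAN_HEADER_LEN:]
```

The 24-bit VNI is what lets each tenant's overlay network be distinguished on a shared underlay, which is the multi-tenant property the slide is after.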
  • 27. Putting it all together: sdc-docker • Our Docker engine for SDC, sdc-docker, implements the endpoints for the Docker Remote API • Work is young (started in earnest in early fall 2014), but because it takes advantage of a proven orchestration substrate, progress has been very quick… • We will be deploying it into early access production in the Joyent Public Cloud in Q1CY15 • It’s open source: http://github.com/joyent/sdc-docker; you can install SDC (either on hardware or on VMware) and check it out for yourself! • A demo is worth a thousand slides...
  • 28. Future of containers in production • For nearly a decade, we at Joyent have believed that OS-virtualized containers are the future of computing • While the efficiency gains are tremendous, they alone have not been enough to propel containers into the mainstream • We believe that the developer ease of Docker combined with the proven production substrate of SmartOS and SmartDataCenter yields the best of all worlds • The future of containers is one without compromise: developer efficiency, operational elasticity, multi-tenant security and on-the-metal performance!
  • 29. Thank you! • Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for their work on LX branded zones • @joshwilsdon, @trentmick, @cachafla and @orlandov for their work on sdc-docker • @rmustacc, @wayfaringrob, @fredfkuo and @notmatt for their work on SDC overlay networking • The countless engineers who have worked on or with illumos because they believed in OS-based virtualization