Let Me Contain That For You

Containers @ Google

Victor Marmol (vmarmol@google.com)
Rohit Jnagal (jnagal@google.com)
SF Bay Area Large-Scale Production Engineering: Lightweight Containers Meetup
February 20, 2014

Google Confidential and Proprietary
Containers in the Wild

[Diagram: containers for User 1, User 2, User 3, and User 4 sharing one Linux Kernel]

● Used to provide VM-like instances
● High density (lower costs) and high performance
● Fast to start
● Migration is hard, but possible

The Need for Isolation: A Shared Google Machine

[Diagram: a shared machine with sections TASKS and BACKGROUND; labels: I/O:CPU:Mem Sensitive Task, Front End Task, Back End Task, Alloc, System Daemons, Batch workload, Soaker workload]

Containers @ Google

[Diagram: containers SS1, SS2, SS3, SS4, and Alloc 1 on top of the Linux Kernel, holding tasks (Task 1, Task 2) and subcontainers (Sub 1 to Sub 4)]

● Container-aware tasks use asymmetric subcontainers
● Provide different guarantees of quality of service
● Overcommit resources to achieve high utilization
● Early users, few namespaces, and near-zero overhead

Asymmetric Isolation

Isolating only certain resources (e.g., CPU but not memory).
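As a rough illustration of the idea (not lmctfy code), a container spec can leave a resource unset to mean "do not isolate this". The `Spec` class and its field names below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Spec:
    """Hypothetical container spec: None means the resource is not isolated."""
    cpu_limit: Optional[int] = None
    memory_limit: Optional[int] = None
    net_limit: Optional[int] = None


def isolated_resources(spec: Spec) -> List[str]:
    """Return the names of the resources this spec asks to isolate."""
    labels = {"cpu_limit": "cpu", "memory_limit": "memory", "net_limit": "net"}
    return [label for field, label in labels.items()
            if getattr(spec, field) is not None]


# A container isolated on CPU only, as in the example on the slide.
cpu_only = Spec(cpu_limit=1000)
```

Here `isolated_resources(cpu_only)` returns only `"cpu"`, so memory and network stay shared with the parent.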

[Diagram: grid of resources (CPU, Memory, Net) against Container 1, Container 2, and Container 3]

Containers @ Google Today

● Historically
○ 2004: No isolation
○ 2006: Cgroups
○ Now: Namespaces
● Primarily Linux cgroups + user-space policies and monitoring
● We skipped VMs due to high overhead
● Used everywhere: SaaS, PaaS, IaaS; Android, Chrome OS
● Heterogeneous workloads: Latency, bandwidth, and priority
● High task churn

Goals

● Isolation
○ Tasks do not impact each other
○ The behavior of a Task is the same regardless of what else is on the machine
● Predictability
○ Tasks behave the same each time they run
○ Unless they are specifically configured to use "slack"
● Quality of Service
○ Different tasks get different quality of resources
● Overcommitment
○ Oversell machine resources within QoS guarantees

lmctfy: Let Me Contain That For You
Open source containers stack based on Google’s.

github.com/google/lmctfy/
Provides the Container abstraction to higher levels by abstracting away
the kernel interfaces.
Motivation
● Existing code, systems, and design around containers
● Problems with LXC
○ No abstraction (direct knob exposure)
○ No easy way to access programmatically

lmctfy: Let Me Contain That For You
Objectives
● Abstract away enforcement: separate policy from enforcement
● Scalability and parallel access
● Intent-based container specifications
● Asymmetric isolation
● Subcontainer support
● Provides tiers of quality of service
System Layers
● CL1
○ Container abstraction and enforcement
○ Thin and light layer
○ Current lmctfy
● CL2
○ Sets policy (QoS, overcommitment)
○ Higher level logic, monitoring, and control loops
○ Stateful entity

lmctfy: Fine-tuned resource isolation
Current cgroup API is complicated with lots of knobs (each a cgroup file):
Common: 5+ files
cgroup.clone_children cgroup.event_control cgroup.procs notify_on_release release_agent
CPU: 8+ files
cpuacct.stat cpuacct.usage cpuacct.usage_percpu cpu.cfs_period_us cpu.cfs_quota_us cpu.rt_period_us cpu.rt_runtime_us cpu.shares cpu.stat
Memory: 12+ files
memory.failcnt memory.force_empty memory.limit_in_bytes memory.max_usage_in_bytes memory.move_charge_at_immigrate memory.numa_stat memory.oom_control memory.pressure_level memory.soft_limit_in_bytes memory.stat memory.swappiness memory.usage_in_bytes memory.use_hierarchy
Cpuset: 12+ files
cpuset.cpu_exclusive cpuset.cpus cpuset.mem_exclusive cpuset.mem_hardwall cpuset.memory_migrate cpuset.memory_pressure cpuset.memory_pressure_enabled cpuset.memory_spread_page cpuset.memory_spread_slab cpuset.mems cpuset.sched_load_balance cpuset.sched_relax_domain_level
+DiskIO
+Net
+...
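
To make the knob sprawl concrete, here is a small sketch that groups a handful of the file names listed above by the controller prefix in their name. The grouping rule (the text before the first dot, with un-prefixed and `cgroup.*` files counted as common) is an assumption made for illustration, not something lmctfy defines.

```python
from collections import defaultdict

# A few of the knob files named on this slide.
KNOB_FILES = [
    "cgroup.procs",
    "notify_on_release",
    "cpu.shares",
    "cpu.cfs_quota_us",
    "cpuacct.usage",
    "memory.limit_in_bytes",
    "memory.soft_limit_in_bytes",
    "cpuset.cpus",
]


def group_by_controller(files):
    """Group knob file names by the controller prefix before the first dot."""
    groups = defaultdict(list)
    for name in files:
        prefix, dot, _ = name.partition(".")
        # Files with no dotted prefix, or with the "cgroup" prefix,
        # are the ones the slide lists under "Common".
        controller = prefix if dot and prefix != "cgroup" else "common"
        groups[controller].append(name)
    return dict(groups)


groups = group_by_controller(KNOB_FILES)
```

Even this short sample spreads across five groups, and each group holds several independent files that a caller would otherwise set one by one.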

Released 0.4.0 (This Week!)
Initial version of lowest layer
● Written entirely in C++
● Delivered as a CLI and a C++ library (C and Go bindings soon)
● Isolation for CPU, memory, and perf event
● Full support for subcontainers
● “Stateless” and lightweight
● Initial support for namespaces, more to come in the next week.
Can be augmented with custom kernel patches
● CPU latency and accounting
● OOM priority
Supported configurations
● Target configuration is well supported
● Designed to be flexible, but we test on a limited set of configurations
● More target configurations being added
● Contributions to add more are welcome

Container Specifications
message ContainerSpec {
  optional int64 owner = 1;
  optional CpuSpec cpu = 2;
  optional MemorySpec memory = 3;
  optional DiskIoSpec diskio = 4;
  optional NetworkSpec network = 5;
  optional VirtualHost virtualhost = 6;
  ...
}
message CpuSpec {
  optional SchedulingLatency scheduling_latency = 1;
  optional uint64 limit = 2;
  optional uint64 max_limit = 3;
  ...
}

Create: “cpu:<limit:1000 max_limit:2000>
memory:<limit:4096000 reservation:1024000>”
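
The Create string above appears to follow protobuf text format, with `<...>` delimiting a nested message. The sketch below is a deliberately minimal parser for that shape (nested `name:<...>` groups and `name:value` scalars only), written to show the structure of a spec. It is not lmctfy's parser, and `parse_spec` is a name chosen for this example. With the generated message classes available, protobuf's own `text_format` module would be the natural tool instead.

```python
import re

# Tokens are the three punctuation marks or any run of other non-space characters.
TOKEN = re.compile(r"[<>:]|[^\s<>:]+")


def parse_spec(text):
    """Parse 'cpu:<limit:1000> memory:<limit:2>' into nested dictionaries."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_fields(closing):
        nonlocal pos
        fields = {}
        while pos < len(tokens) and tokens[pos] != closing:
            name = tokens[pos]
            if pos + 2 >= len(tokens) or tokens[pos + 1] != ":":
                raise ValueError(f"expected ':' and a value after {name!r}")
            pos += 2
            if tokens[pos] == "<":
                pos += 1                      # step into the nested message
                fields[name] = parse_fields(">")
                pos += 1                      # step over the closing '>'
            else:
                value = tokens[pos]
                fields[name] = int(value) if value.isdigit() else value
                pos += 1
        return fields

    return parse_fields(None)


spec = parse_spec(
    "cpu:<limit:1000 max_limit:2000> "
    "memory:<limit:4096000 reservation:1024000>"
)
```

After parsing, `spec["cpu"]["limit"]` is the integer 1000 and `spec["memory"]["reservation"]` is 1024000, which mirrors the `CpuSpec` and `MemorySpec` fields of `ContainerSpec`.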

Cgroup Specifications

Create: “cpu:<limit:1000 max_limit:2000
scheduling_latency:PRIORITY>
memory:<limit:4096000 reservation:1024000>”

equivalent lxc cgroup config:
lxc.cgroup.cpu.shares = 2048
lxc.cgroup.cpu.cfs_period_us = 50000
lxc.cgroup.cpu.cfs_quota_us = 10000
lxc.cgroup.cpu.lat = 25
.. cpu performance knobs ..
lxc.cgroup.memory.limit_in_bytes = 4096000
lxc.cgroup.memory.soft_limit_in_bytes = 1024000
.. memory performance knobs ..
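
For the memory fields, the slide shows a one-to-one mapping: `limit` becomes `memory.limit_in_bytes` and `reservation` becomes `memory.soft_limit_in_bytes`. The sketch below renders just that part. The function name is hypothetical, and the CPU knobs are left out because the slide does not state how `limit`, `max_limit`, and `scheduling_latency` translate into the shares, quota, and latency values shown.

```python
def memory_spec_to_lxc(limit, reservation):
    """Render the memory part of a spec as lxc cgroup config lines.

    Mirrors only the two memory lines shown on the slide; the additional
    "memory performance knobs" it alludes to are not covered here.
    """
    return [
        f"lxc.cgroup.memory.limit_in_bytes = {limit}",
        f"lxc.cgroup.memory.soft_limit_in_bytes = {reservation}",
    ]


lines = memory_spec_to_lxc(limit=4096000, reservation=1024000)
```

The resulting `lines` match the two memory entries in the lxc config above, which is the point of the comparison: the spec states intent once, and the layer below expands it into individual knobs.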

C++ API
::containers::lmctfy::ContainerApi
● Create
● Get
● Destroy
● Detect
● InitMachine
::containers::lmctfy::Container
● Update
● Run
● Notifications
● List (threads, PIDs, and subcontainers)
● Stats
● Pause/Resume
● KillAll
CLI is a thin wrapper around the C++ API

Container Names
Path-like hierarchy of container names:
Absolute: /parent/self
Relative: self when in /parent
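
Because the names behave like file paths, resolving a relative name against the caller's container can be sketched with Python's `posixpath` module. This is an illustration of the naming scheme only, not how lmctfy resolves names internally, and `resolve_container_name` is a hypothetical helper.

```python
import posixpath


def resolve_container_name(name, current):
    """Resolve a container name relative to the calling process's container.

    `current` is the absolute name of the container the caller runs in.
    Absolute names are returned normalized; relative names, '.', and '..'
    are interpreted against `current`.
    """
    if not current.startswith("/"):
        raise ValueError("current container must be an absolute name")
    return posixpath.normpath(posixpath.join(current, name))


resolved = resolve_container_name("sub", "/sys")
```

With the caller in `/sys`, the relative name `sub` resolves to `/sys/sub`, while a name with a leading slash such as `/foo_container` is unaffected by the caller's position.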
Container Name                     Refers To
/                                  The root top-level container
/sys                               The sys top-level container
/sys/sub                           The sub subcontainer of the sys top-level container
. or ./                            The current container (current relative to the calling process)
..                                 The parent container (parent relative to the calling process)
./foo_container or foo_container   The foo_container subcontainer of the current container
/foo_container                     The foo_container top-level container

Roadmap
Towards Version 1.0
● Improve VirtualHost support
● Root file systems
● Checkpoint restore
● Support and target most major distros
● Fully compatible with Docker’s use of containers
Higher Layer
● Admission control and feasibility checks
● Monitoring, notifications, and statistics
● Tiers of quality of service guarantees
Contributions Welcome!

Questions?

Repository: https://github.com/google/lmctfy/
Mailing list: lmctfy@googlegroups.com

Victor Marmol: vmarmol@google.com
Rohit Jnagal: jnagal@google.com
