MIRAGEOS 2.0: BRANCH CONSISTENCY 
FOR XEN STUB DOMAINS 
Dave Scott, Citrix Systems (@mugofsoup)
Thomas Gazagnaire, University of Cambridge (@eriangazag)
Anil Madhavapeddy, University of Cambridge (@avsm)
http://openmirage.org 
http://decks.openmirage.org/xendevsummit14/ 
Press <esc> to view the slide index, and the <arrow> keys to navigate.
INTRODUCING MIRAGE OS 2.0 
These slides were written using Mirage on OSX: 
They are hosted in a 938kB Xen unikernel written in statically 
type-safe OCaml, including device drivers and network stack. 
Their application logic is just a couple of source files, written 
independently of any OS dependencies. 
Running on an ARM CubieBoard2, and hosted on the cloud. 
Binaries small enough to track the entire deployment in Git!
NEW FEATURES IN 2.0 
Mirage OS 2.0 is an important step forward, supporting more, and 
more diverse, backends with much greater modularity. 
For information about the new components we cannot cover here, 
see openmirage.org: 
Xen/ARM, for running unikernels on embedded devices.
Irmin, Git-like distributed branchable storage.
OCaml-TLS, a from-scratch native OCaml TLS stack.
Vchan, for low-latency inter-VM communication.
Ctypes, modular C foreign function bindings.
THIS XEN DEV SUMMIT TALK 
This talk covers:
how we have been using Mirage to improve the core Xenstore toolstack using Irmin.
a performance and distribution future for Xenstore.
our plans for upstreaming our patches.
But first, some background...
IRMIN: MIRAGE 2.0 STORAGE 
Irmin is our library database that follows the modular design 
principles of MirageOS: https://github.com/mirage/irmin 
Runs in both userspace and kernelspace 
A key = value store (sound familiar?) 
Git-style: commit, branch, merge 
Preserves history by default 
Backend support for in-memory, Git and HTTP/REST stores. 
Mirage unikernels thus version control all their data, and have a 
distributed provenance graph of all activities.
BASE CONCEPTS 
OBJECT DAG (OR THE "BLOB STORE") 
Append-only and easily distributed. 
Provides stable serialisation of structured values. 
Backend independent storage 
memory or on-disk persistence 
encryption or plaintext 
Position and architecture independent pointers 
such as via SHA1 checksum of blocks.
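A minimal sketch of the blob-store idea, not Irmin's actual implementation: values are stored under the checksum of their contents, so a key is valid on any host and adding the same value twice is a no-op. The module name Blob_store is hypothetical, and the standard library's Digest module (MD5) is used only as a stand-in for the SHA1 checksum mentioned above.

```ocaml
(* Content-addressed, append-only blob store (illustrative sketch).
   ASSUMPTION: Digest is the stdlib MD5 module, standing in for SHA1. *)
module Blob_store = struct
  type key = string (* hex digest of the stored value *)
  type t = (key, string) Hashtbl.t

  let create () : t = Hashtbl.create 16

  (* Append-only: existing entries are never overwritten, and identical
     content always maps to the same key. *)
  let add (t : t) (value : string) : key =
    let k = Digest.to_hex (Digest.string value) in
    if not (Hashtbl.mem t k) then Hashtbl.add t k value;
    k

  let find (t : t) (k : key) : string option = Hashtbl.find_opt t k
end

let store = Blob_store.create ()
let k1 = Blob_store.add store "vm-config-a"
let k2 = Blob_store.add store "vm-config-a" (* same content, same key *)
let k3 = Blob_store.add store "vm-config-b"
```

Because the key depends only on the content, two hosts that store the same value independently agree on its pointer without coordinating.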
BASE CONCEPTS 
HISTORY DAG (OR THE "GIT STORE") 
Append-only and easily distributed. 
Can be stored in the Object DAG store. 
Keeps track of history. 
Ordered audit log of all operations. 
Useful for merge (3-way merge is easier than 2-way) 
Snapshots and reverting operations for free.
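Why history helps merging can be sketched with a 3-way merge over a flat key = value map. This is an illustration under simplifying assumptions (string keys and values, no directories), not Irmin's merge code; of_list, merge_value and merge3 are hypothetical names. Knowing the common ancestor (base) lets the merge tell "only one side changed this key" apart from a real conflict, which a 2-way comparison cannot do.

```ocaml
module M = Map.Make (String)

let of_list l = List.fold_left (fun m (k, v) -> M.add k v m) M.empty l

(* Merge one key. [base] is the common ancestor's value; [a] and [b] are
   the values on the two branches. None means the key is absent. *)
let merge_value ~base a b =
  if a = b then Ok a (* both sides agree *)
  else if a = base then Ok b (* only b changed *)
  else if b = base then Ok a (* only a changed *)
  else Error () (* both changed, differently *)

(* Merge whole maps; on failure, return the first conflicting key. *)
let merge3 ~base a b =
  let keys =
    List.sort_uniq compare
      (List.map fst (M.bindings base @ M.bindings a @ M.bindings b))
  in
  List.fold_left
    (fun acc k ->
      match acc with
      | Error _ -> acc
      | Ok m -> (
          match
            merge_value ~base:(M.find_opt k base) (M.find_opt k a)
              (M.find_opt k b)
          with
          | Ok (Some v) -> Ok (M.add k v m)
          | Ok None -> Ok m (* key deleted on one side, untouched on the other *)
          | Error () -> Error k))
    (Ok M.empty) keys

let base = of_list [ ("/vm/1/name", "a"); ("/vm/1/mem", "512") ]
let ours = of_list [ ("/vm/1/name", "b"); ("/vm/1/mem", "512") ]
let theirs =
  of_list [ ("/vm/1/name", "a"); ("/vm/1/mem", "512"); ("/vm/2/name", "c") ]
let clash = of_list [ ("/vm/1/name", "z"); ("/vm/1/mem", "512") ]
```

Merging ours with theirs succeeds because each key was changed on at most one side; merging ours with clash reports "/vm/1/name" as the conflicting key.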
IRMIN TOOLING 
opam update && opam install irmin 
Command-line frontend that uses: 
storage: in-memory format or Git 
network: custom format, Git or HTTP/REST 
interface: JSON interface for storing content easily 
OCaml library that supplies: 
merge-friendly data structures 
backend implementations (Git, HTTP/REST)
XENSTORE: VM METADATA 
Xenstore is our configuration database that stores VM metadata in 
directories (à la Plan 9).
Runs in either userspace or kernelspace (just like Mirage) 
A key = value store (just like Irmin) 
Logs history by default (just like Irmin...)
TRANSACTION_START is like branch; TRANSACTION_END is like merge
The "original plan" in 2002 was for seamless distribution across 
hosts/clusters/clouds. What happened? Unfortunately the 
previous transaction implementations all suck.
XENSTORE: CONFLICTS 
Terrible performance impact: a transaction involves 100 RPCs 
to set it up (one per r/w op), only to be aborted and retried. 
Longer-lived transactions have a greater chance of conflicting with a
shorter transaction, which forces the longer transaction to be repeated.
Concurrent transactions can lead to live-lock: 
Try starting lots of VMs in parallel! 
Much time wasted removing transactions (from xend)
XENSTORE: CONFLICTS 
Conflicts between Xenstore transactions are so 
devastating, we try hard to avoid transactions 
altogether. However they aren't going away.
XENSTORE: CONFLICTS 
Observe: typical Xenstore transactions (e.g. creating domains)
shouldn't conflict. The conflicts come from a flawed merging algorithm.
If we were managing domain configurations in git, we
would simply merge or rebase and it would work.
Therefore the Irmin Xenstore simply does: 
DB.View.merge_path ~origin db [] transaction >>= function
| `Ok () -> return true
| `Conflict msg ->
  (* if merge doesn't work, try rebase *)
  DB.View.rebase_path ~origin db [] transaction >>= function
  | `Ok () -> return true
  | `Conflict msg ->
    (* A true conflict: tell the client *)
    ...
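To illustrate why domain-creation transactions shouldn't conflict, here is a small sketch that compares the paths two transactions wrote. It is a simplification that ignores reads, and it is not how the Irmin Xenstore detects conflicts; the helper names and the example paths are illustrative assumptions.

```ocaml
module Paths = Set.Make (String)

(* Hypothetical summary of a transaction: the set of paths it wrote. *)
let writes paths = Paths.of_list paths

(* Paths written by both transactions. An empty list means neither
   transaction overwrote the other's data. *)
let overlap a b = Paths.elements (Paths.inter a b)

let create_dom1 =
  writes [ "/local/domain/1/name"; "/local/domain/1/memory/target" ]
let create_dom2 =
  writes [ "/local/domain/2/name"; "/local/domain/2/memory/target" ]
let rename_dom1 = writes [ "/local/domain/1/name" ]
```

Creating domain 1 and domain 2 in parallel touches disjoint subtrees, so there is nothing to report to either client. Only the rename, which writes a path that create_dom1 also wrote, would need real conflict handling.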
XENSTORE: PERFORMANCE
XENSTORE: TRANSACTIONS 
Big transactions give you high-level intent 
useful for debug and tracing 
minimise merge commits (1 per transaction) 
minimise backend I/O (1 op per commit) 
after a crash during a transaction, the client can be told to "abort retry"
Solving the performance problems with big 
transactions in previous implementations greatly 
improves the overall health of Xenstore.
XENSTORE: RELIABILITY 
What happens if Xenstore crashes? 
Rings full of partially read/written packets. No reconnection 
protocol in common use. 
proposal on xen-devel but years before we can rely on it 
Per-connection state in Xenstore: 
watch registrations, pending watch events 
If Xenstore is restarted, many of the rings will be broken 
... you'll probably have to reboot the host
XENSTORE: RELIABILITY 
Irmin to the rescue! 
Data structure libraries built on top of Irmin, for example 
mergeable queues. Use these for (e.g.) pending watch events.
We can persist partially read/written packets so fragments can 
be recovered over restart 
We can persist connection information (i.e. ring information 
from an Introduce) and auto-reconnect on start 
Added bonus: easy to introspect state via xenstore-ls, can
see each registered watch, queue, etc.
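One way a mergeable queue could work, sketched under stated assumptions rather than taken from Irmin's library: each branch only pops from the front and pushes to the back, and a pop of the same element on both branches counts once. The record layout and function names are hypothetical.

```ocaml
(* [pops] counts every pop since the queue was created, so two branches
   can be compared against their common ancestor. *)
type 'a queue = { pops : int; items : 'a list }

let push x q = { q with items = q.items @ [ x ] }

let pop q =
  match q.items with
  | [] -> None
  | x :: rest -> Some (x, { pops = q.pops + 1; items = rest })

let rec drop n l =
  if n <= 0 then l else match l with [] -> [] | _ :: t -> drop (n - 1) t

(* 3-way merge: remove what either branch consumed from the ancestor,
   then append the elements each branch pushed. *)
let merge ~base a b =
  let base_len = List.length base.items in
  let popped q = q.pops - base.pops in
  (* elements pushed on branch q since [base] that are still queued *)
  let pushed q = drop (max 0 (base_len - popped q)) q.items in
  let pops = max (popped a) (popped b) in
  { pops = base.pops + pops; items = drop pops base.items @ pushed a @ pushed b }

let base = { pops = 0; items = [ 1; 2; 3 ] }
(* Branch a consumes one event and queues event 4. *)
let a = match pop base with Some (_, q) -> push 4 q | None -> base
(* Branch b only queues event 5. *)
let b = push 5 base
let merged = merge ~base a b
```

For pending watch events, this means events queued on both branches survive the merge, in a deterministic order (a's pushes before b's).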
XENSTORE: TRACING 
When a bug is reported, the normal procedure is:
stare at Xenstore logs for a very long time 
slowly deduce the state at the time the bug manifested 
(swearing and cursing is strictly optional) 
With Irmin+Xenstore, one can simply: 
git checkout to the revision 
Inspect the state with ls 
In the future: git bisect automation!
XENSTORE: TRACING 
$ git log --oneline --graph --decorate --all 
... 
| | * | 1787fd2 Domain 0: merging transaction 394 
| | |/ 
| * | 0d1521c Domain 0: merging transaction 395 
| |/ 
* | 731356e Domain 0: merging transaction 396 
|/ 
* 8795514 Domain 0: merging transaction 365 
* 74f35b5 Domain 0: merging transaction 364 
* acdd503 Domain 0: merging transaction 363
XENSTORE: DATA STORAGE 
Xenstore contains VM metadata (/vm) and domain metadata
(/local/domain)
But VM metadata is duplicated elsewhere and copied in/out 
xl config files, and xapi database 
(insert cloud toolstack here) 
With current daemons, it is unwise to persist large data. 
What if Xenstore could store and distribute this 
data efficiently, and if application data could be 
persisted reliably?
XENSTORE: THE DATA 
Irmin to the rescue! 
Check in VM metadata to Irmin 
clone, pull and push to move between hosts
expose to host via FUSE, for Plan9 filesystem goodness 
maybe one day even echo start > VM/uuid/ctl 
FUSE code at 
https://github.com/dsheets/profuse 
VM data could be checked in to Irmin 
very important for unikernels that have no native storage
XENSTORE: UPSTREAMING 
Advanced prototype exists using Mirage libraries, but doesn't fully 
pass unit test suite. Before upstreaming: 
Write fixed-size backend for block device 
Preserving history is a good default, but history does need to 
be squashed from time to time. 
Upstream patches: 
switch to using opam to build Xenstore
reproducible builds via a custom Xen remote 
allows using modern OCaml libraries (Lwt, Mirage, etc...) 
In Xapi, delete existing db and replace with Xenstore 2.0
XENSTORE: CODE 
Prototype+unit tests at:
https://github.com/mirage/ocaml-xenstore-server
(can build without Xen on Mac OS X now)
opam init --comp=4.01.0 
eval `opam config env` 
opam pin irmin git://github.com/mirage/irmin 
opam install xenstore irmin shared-memory-ring xen-evtchn io-page 
git clone git://github.com/mirage/ocaml-xenstore-server 
cd ocaml-xenstore-server 
make 
./main.native --enable-unix --path /tmp/test-socket --database /tmp/db& 
./cli.native -path /tmp/test-socket write foo=bar 
./cli.native -path /tmp/test-socket read foo
cd /tmp/db; git log
HTTP://OPENMIRAGE.ORG/ 
Featuring blog posts about Mirage OS 2.0 by: 
Amir Chaudhry, Thomas Gazagnaire, David Kaloper,
Thomas Leonard, Jon Ludlam, Hannes Mehnert, Mindy Preston,
Dave Scott, and Jeremy Yallop.
Mindy Preston and Jyotsna Prakash from OPW/GSoC will also be 
talking about their projects in the community panel! 
More Irmin+Xenstore posts with details: 
Introduction to Irmin 
Using Irmin to add fault-tolerance to Xenstore

XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavapeddy, University of Cambridge

  • 1. MIRAGEOS 2.0: BRANCH CONSISTENCY FOR XEN STUB DOMAINS Dave Scott Citrix Systems @mugofsoup @eriangazag @avsm Thomas Gazagnaire University of Cambridge Anil Madhavapeddy University of Cambridge http://openmirage.org http://decks.openmirage.org/xendevsummit14/ Press <esc> to view the slide index, and the <arrow> keys to navigate.
  • 2. INTRODUCING MIRAGE OS 2.0 These slides were written using Mirage on OSX: They are hosted in a 938kB Xen unikernel written in statically type-safe OCaml, including device drivers and network stack. Their application logic is just a couple of source files, written independently of any OS dependencies. Running on an ARM CubieBoard2, and hosted on the cloud. Binaries small enough to track the entire deployment in Git!
  • 4. NEW FEATURES IN 2.0 Mirage OS 2.0 is an important step forward, supporting more, and more diverse, backends with much greater modularity. For information about the new components we cannot cover here, see openmirage.org: Xen/ARM Irmin OCaml-TLS Vchan Ctypes , for running unikernels on embedded devices . , Git-like distributed branchable storage. , a from-scratch native OCaml TLS stack. , for low-latency inter-VM communication. , modular C foreign function bindings.
  • 5. THIS XEN DEV SUMMIT TALK We focus on how we have been using Mirage to: improve the core Xenstore toolstack using Irmin. a performance and distribution future for Xenstore. plans for upstreaming our patches. But first, some background...
  • 6. IRMIN: MIRAGE 2.0 STORAGE Irmin is our library database that follows the modular design principles of MirageOS: https://github.com/mirage/irmin Runs in both userspace and kernelspace A key = value store (sound familiar?) Git-style: commit, branch, merge Preserves history by default Backend support for in-memory, Git and HTTP/REST stores. Mirage unikernels thus version control all their data, and have a distributed provenance graph of all activities.
  • 7. BASE CONCEPTS OBJECT DAG (OR THE "BLOB STORE") Append-only and easily distributed. Provides stable serialisation of structured values. Backend independent storage memory or on-disk persistence encryption or plaintext Position and architecture independent pointers such as via SHA1 checksum of blocks.
  • 8. BASE CONCEPTS HISTORY DAG (OR THE "GIT STORE") Append-only and easily distributed. Can be stored in the Object DAG store. Keeps track of history. Ordered audit log of all operations. Useful for merge (3-way merge is easier than 2-way) Snapshots and reverting operations for free.
  • 10. IRMIN TOOLING opam update && opam install irmin Command-line frontend that uses: storage: in-memory format or Git network: custom format, Git or HTTP/REST interface: JSON interface for storing content easily OCaml library that supplies: merge-friendly data structures backend implementations (Git, HTTP/REST)
  • 11. XENSTORE: VM METADATA Xenstore is our configuration database that stores VM metadata in directories (ala Plan 9). Runs in either userspace or kernelspace (just like Mirage) A key = value store (just like Irmin) Logs history by default (just like Irmin...)
  • 12. XENSTORE: VM METADATA Xenstore is our configuration database that stores VM metadata in directories (ala Plan 9). Runs in either userspace or kernelspace (just like Mirage) A key = value store (just like Irmin) Logs history by default (just like Irmin...) TRANSACTION_START branch; TRANSACTION_END merge The "original plan" in 2002 was for seamless distribution across hosts/clusters/clouds. What happened? Unfortunately the previous transaction implementations all suck.
  • 13. XENSTORE: CONFLICTS Terrible performance impact: a transaction can take 100 RPCs to set up (one per read/write op), only to be aborted and retried. Longer-lived transactions have a greater chance of conflict than shorter ones, and each conflict means repeating the whole long transaction. Concurrent transactions can lead to live-lock: try starting lots of VMs in parallel! Much time wasted removing transactions (from xend).
  • 14. XENSTORE: CONFLICTS Conflicts between Xenstore transactions are so devastating that we try hard to avoid transactions altogether. However, they aren't going away.
  • 15. XENSTORE: CONFLICTS Observe: typical Xenstore transactions (e.g. creating domains) shouldn't conflict. The problem is a flawed merging algorithm. If we were managing domain configurations in git, we would simply merge or rebase and it would work. Therefore the Irmin Xenstore simply does:
        DB.View.merge_path ~origin db [] transaction >>= function
        | `Ok () -> return true
        | `Conflict msg ->
          (* if merge doesn't work, try rebase *)
          DB.View.rebase_path ~origin db [] transaction >>= function
          | `Ok () -> return true
          | `Conflict msg ->
            (* A true conflict: tell the client *)
            ...
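To see why transactions that touch different keys need not conflict, here is a minimal sketch of a key-by-key 3-way merge over a flat string map. It is an illustration under stated assumptions, not the Irmin or Xenstore merge algorithm: `merge_key` and `merge3` are hypothetical names, the store is modelled as a `Map` from path to value, and the rebase fallback from the slide is not modelled.

```ocaml
module SM = Map.Make (String)

type store = string SM.t

(* Three-way merge of a single key. [base] is the value when the
   transaction started, [ours] the current store, [theirs] the
   transaction. [None] means the key is absent. *)
let merge_key ~base ~ours ~theirs =
  if ours = theirs then Ok ours        (* both sides agree *)
  else if ours = base then Ok theirs   (* only the transaction changed it *)
  else if theirs = base then Ok ours   (* only the store changed it *)
  else Error ()                        (* both changed it differently *)

let merge3 ~(base : store) ~(ours : store) ~(theirs : store) :
    (store, string) result =
  let keys =
    List.sort_uniq compare
      (List.map fst (SM.bindings base @ SM.bindings ours @ SM.bindings theirs))
  in
  List.fold_left
    (fun acc k ->
      match acc with
      | Error _ as e -> e
      | Ok m -> (
          match
            merge_key ~base:(SM.find_opt k base) ~ours:(SM.find_opt k ours)
              ~theirs:(SM.find_opt k theirs)
          with
          | Ok None -> Ok m
          | Ok (Some v) -> Ok (SM.add k v m)
          | Error () -> Error ("conflict on " ^ k)))
    (Ok SM.empty) keys

let () =
  (* Two domain creations that write under different paths merge cleanly. *)
  let base = SM.singleton "/local/domain/0/name" "Domain-0" in
  let ours = SM.add "/local/domain/1/name" "web" base in
  let theirs = SM.add "/local/domain/2/name" "db" base in
  match merge3 ~base ~ours ~theirs with
  | Ok merged -> assert (SM.cardinal merged = 3)
  | Error msg -> failwith msg
```

Only the case where both sides change the same key to different values is reported as a conflict, which corresponds to the "true conflict" branch that has to be passed back to the client.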
  • 17. XENSTORE: TRANSACTIONS Big transactions: give you high-level intent, useful for debugging and tracing; minimise merge commits (1 per transaction); minimise backend I/O (1 op per commit); after a crash during a transaction, the client can be told to "abort retry". Solving the performance problems with big transactions in previous implementations greatly improves the overall health of Xenstore.
  • 18. XENSTORE: RELIABILITY What happens if Xenstore crashes? Rings full of partially read/written packets. No reconnection protocol in common use: there is a proposal on xen-devel, but it will be years before we can rely on it. Per-connection state in Xenstore: watch registrations, pending watch events. If Xenstore is restarted, many of the rings will be broken... you'll probably have to reboot the host.
  • 19. XENSTORE: RELIABILITY Irmin to the rescue! Data structure libraries built on top of Irmin, for example mergeable queues. Use these for (e.g.) pending watch events. We can persist partially read/written packets so fragments can be recovered across a restart. We can persist connection information (i.e. ring information from an Introduce) and auto-reconnect on start. Added bonus: easy to introspect state via xenstore-ls: you can see each registered watch, queue, etc.
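As a rough picture of what "mergeable" could mean for a queue of pending watch events, the sketch below records everything ever pushed plus a pop count, and merges two branches against their common base. This is a hypothetical design, not Irmin's queue library: the `queue` type, the merge policy (keep pushes from both branches, ours first; anything popped on either branch stays popped) and the restriction that a branch may only pop elements already present in the base are all assumptions made for the example.

```ocaml
(* Hypothetical mergeable queue, not Irmin's own data structure.
   A queue is the list of everything ever pushed plus a pop count. *)
type 'a queue = { pushed : 'a list; popped : int }

let empty = { pushed = []; popped = 0 }

let rec drop n l =
  if n <= 0 then l else match l with [] -> [] | _ :: tl -> drop (n - 1) tl

(* Elements still waiting to be consumed, oldest first. *)
let contents q = drop q.popped q.pushed

let push x q = { q with pushed = q.pushed @ [ x ] }

let pop q =
  match contents q with
  | [] -> None
  | x :: _ -> Some (x, { q with popped = q.popped + 1 })

(* Three-way merge under the assumed policy. It is only sound if neither
   branch popped past the elements that were already in [base], so that
   precondition is checked. *)
let merge ~base ~ours ~theirs =
  let n = List.length base.pushed in
  if ours.popped > n || theirs.popped > n then
    invalid_arg "merge: a branch popped its own pushes";
  { pushed = base.pushed @ drop n ours.pushed @ drop n theirs.pushed;
    popped = max ours.popped theirs.popped }

let () =
  let base = empty |> push "ev1" |> push "ev2" in
  (* One branch queues a new event... *)
  let ours = push "ev3" base in
  (* ...while the other delivers ev1 and queues another event. *)
  let theirs =
    match pop base with Some (_, q) -> push "ev4" q | None -> base
  in
  let merged = merge ~base ~ours ~theirs in
  assert (contents merged = [ "ev2"; "ev3"; "ev4" ])
```

Since the whole value is plain data, it could be serialised into the store on every change, which is the property the slide relies on for recovering pending events after a restart.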
  • 20. XENSTORE: TRACING When a bug is reported, the normal procedure is: stare at Xenstore logs for a very long time; slowly deduce the state at the time the bug manifested (swearing and cursing is strictly optional). With Irmin+Xenstore, one can simply: git checkout the revision; inspect the state with ls. In the future: git bisect automation!
  • 21. XENSTORE: TRACING
        $ git log --oneline --graph --decorate --all
        ...
        | | * | 1787fd2 Domain 0: merging transaction 394
        | | |/
        | * | 0d1521c Domain 0: merging transaction 395
        | |/
        * | 731356e Domain 0: merging transaction 396
        |/
        * 8795514 Domain 0: merging transaction 365
        * 74f35b5 Domain 0: merging transaction 364
        * acdd503 Domain 0: merging transaction 363
  • 22. XENSTORE: DATA STORAGE Xenstore contains VM metadata (/vm) and domain metadata (/local/domain). But VM metadata is duplicated elsewhere and copied in and out: xl config files, the xapi database, (insert cloud toolstack here). With current daemons, it is unwise to persist large data. What if Xenstore could store and distribute this data efficiently, and if application data could be persisted reliably?
  • 23. XENSTORE: THE DATA Irmin to the rescue! Check in VM metadata to Irmin: clone, pull and push to move between hosts; expose to the host via FUSE, for Plan 9 filesystem goodness; maybe one day even echo start > VM/uuid/ctl (FUSE code at https://github.com/dsheets/profuse). VM data could be checked in to Irmin: very important for unikernels that have no native storage.
  • 24. XENSTORE: UPSTREAMING An advanced prototype exists using Mirage libraries, but it doesn't fully pass the unit test suite. Before upstreaming: write a fixed-size backend for block devices; preserving history is a good default, but history does need to be squashed from time to time. Upstream patches: switch to using opam to build Xenstore; reproducible builds via a custom Xen remote; allows using modern OCaml libraries (Lwt, Mirage, etc.). In Xapi, delete the existing db and replace it with Xenstore 2.0.
  • 25. XENSTORE: CODE Prototype and unit tests (can now build without Xen on Mac OS X) at https://github.com/mirage/ocaml-xenstore-server
        opam init --comp=4.01.0
        eval `opam config env`
        opam pin irmin git://github.com/mirage/irmin
        opam install xenstore irmin shared-memory-ring xen-evtchn io-page
        git clone git://github.com/mirage/ocaml-xenstore-server
        cd ocaml-xenstore-server
        make
        ./main.native --enable-unix --path /tmp/test-socket --database /tmp/db &
        ./cli.native -path /tmp/test-socket write foo=bar
        ./cli.native -path /tmp/test-socket read foo
        cd /tmp/db; git log
  • 26. HTTP://OPENMIRAGE.ORG/ Featuring blog posts about Mirage OS 2.0 by: Amir Chaudhry, Thomas Gazagnaire, David Kaloper, Thomas Leonard, Jon Ludlam, Hannes Mehnert, Mindy Preston, Dave Scott, and Jeremy Yallop. Mindy Preston and Jyotsna Prakash from OPW/GSoC will also be talking about their projects in the community panel! More Irmin+Xenstore posts with details: "Introduction to Irmin"; "Using Irmin to add fault-tolerance to Xenstore".