VIRTUALSQUARE



 all the virtuality you wanted
  but you were afraid to ask

      Rome, March 5th 2011

                Renzo Davoli
               Università di Bologna
(Master in Scienze e Tecnologie del Software Libero)
        (Associazione per il Software Libero)

                  Renzo Davoli – renzo@cs.unibo.it - Università di Bologna
Virtual...
●   Time
●   Execution Environment
●   User-id
●   Device
●   Machine
●   Networking
●   Memory
●   File System

LXC, schroot, libvirt, chroot, GXemul, Libguestfs, Bochs, PearPC, Qemu, tinc,
VDE, fuse-ext2, Open-VZ, Marionnet, User-mode Linux, UnionFS, JVM, Umview,
PureLibc, VirtualBOX, FUSE, FairVPN, SPICE, LWIPv6, fakeroot, VirtualBricks,
View-OS, fakeroot-ng, KVM

What does Virtualization mean?




User of A
    Interface I(A):  well-known service A
    Interface I(A'): new service A'
    I(A) = I(A'): A' can be used instead of A

What is “real”
●   The service A can be hardware
       –   Machine
       –   Memory
       –   Network
●   The service A can be software
       –   File system
       –   Execution Environment
       –   Identity

Why virtual?
●   Flexibility
         –   prototyping
         –   modify features at run-time (“real” may be hard to
               modify: e.g. hardware or kernel code).
         –   Satisfy several requirements while sharing common
               structures
●   Safety
         –   least privilege
         –   sandboxing
●   Optimization
         –   Server/service consolidation
         –   No need to maintain several “real” items

Simulation-Emulation-Virtuality
●   Simulation:
        –   Provides just a model of the phenomenon under
              study
        –   Simulation never provides virtualization
●   Emulation:
        –   It means: behave in the same way.
        –   Can provide virtualization if the emulation is
             usable in place of the original
●   Virtuality can be provided without emulation
        –   e.g. LXC

Virtual Machines (Smith-Nair)




Virtual Machines
●   Intrusiveness:
       –   User-mode user-access
       –   User-mode superuser-access
       –   Kernel patch/module
       –   Native
●   Paravirtualization:
       –   Change the user interface to optimize
            virtualization

Virtual Machines

[Figure: on the left a user accesses a service directly; on the right each
user accesses the service through a virtualization layer]

Virtual Machine (multiplexing)


[Figure: on the left a single user accesses a service; on the right several
users share one virtually multiplexed service]

Virtual Machines:

●   Qemu-system: system-vm, processor
    emulation (by dynamic code translation),
    user-space/user-permission
●   GXEmul/PearPC/Mac-on-Linux: system-vm,
    processor emulation, user-space/user-permission

Virtual Machines
●   KVM: system-vm, same-instruction set, may
    provide paravirtualization (virtio/vhost-net),
    user-mode monitor, requires processor extensions and
    kernel module (Linux specific/optimized).
●   VirtualBOX(OSE): system-vm, same-instruction set,
    may provide paravirtualization, user-mode monitor,
    requires processor extensions and kernel module (it
    runs on several Operating Systems).
●   XEN: system-vm, same-instruction set, provides
    paravirtualization, native mode, multiplexing.

operating system level VM

●   LXC: Linux Containers/Namespaces: native
    mode, multiplexing, user-mode superuser-access
    (it provides partial virtualization/sharing
    between containers).
●   OpenVZ: native mode, multiplexing, superuser-access.

process VM
●   User Mode Linux: a Linux kernel is the VM
    monitor for processes. Process VM. User-mode,
    user-access. (Almost) the same
    interface (same system call set, although the
    kernel version may differ).
●   View-OS (umview/kmview): partial-modular
    virtualization. Process VM. User-mode,
    user-access. (Almost) the same interface
    (users may define new system calls).

process VM

●   Qemu: process emulation (dynamic
    translation), user-mode, user-access.
●   Application VM (JVM, Mono, ...)

Network Virtualization
●   Virtual Private Networks
         –   For secure remote access
●   Overlay Networks
         –   e.g. Akamai, p2p
●   Networks for Virtual Machines
●   Kernel bridge based virtual networks
●   Virtual Distributed Ethernet: data-link
    layer, Ethernet consistent, user-mode,
    user access.

File system virtualization

●   Chroot/schroot
●   Fakeroot/fakeroot-ng
●   FUSE
       –   Fuse-ext2
       –   Fuse-ssh
       –   Fuse-*


Modular Virtualization
●   View-OS: modular partial virtual machine
        – Umview: based on ptrace (user-mode, user access)
        – Kmview: based on utrace/kernel module (user-mode)
●   Several modules available:
        – File system (mount)
        – File system (patchworking)
        – Device
        – Uname/time...
        – Networking
●   Chroot/fakeroot/fuse/vpn/binfmt... features have been
    implemented on View-OS.

the VirtualSquare view




Virtualsquare
●   Virtualsquare is:
        –   A community....
        –   A container of projects....
●   Virtualsquare is not:
        –   A company
        –   A brand/product line
●   Virtualsquare started at the University of Bologna but now
    it is an international community
        –   A lot of former students now work abroad
        –   Common ideas with other groups (joint projects)

the VirtualSquare view
●   Communication/compatibility
    –   different Virtualities must be interconnected, must
        communicate
●   Integration
    –   different Virtualities can be seen as special cases
        of a broader idea of Virtuality
●   Extension
    –   if a need cannot yet be captured by a kind of
        virtuality, let us create a new one (maybe
        combining existing virtualities).

Communication compatibility
●   VDE
        –   General purpose user-mode networking support
        –   Ethernet data-link consistent
        –   Distributed
        –   Intuitive (it has the same structure as real Ethernet:
              switches, cables)
●   Connected by VDE: KVM, VirtualBOX, User-mode Linux, View-OS, LWIPv6,
    bochs, tuntap, libvirt

Integration
●   View-OS = each process can have its “view” of the
    environment.
●   User-mode/user-permission partial virtual machine approach
●   The features of several existing tools have been
    re-implemented as composable modules.
●   Around View-OS: (s)chroot, fakeroot(ng), VPN, binutils, FUSE, UnionFS,
    Virtual networking, Virtual devices, LXC user-mode

View-OS




GLOBAL VIEW
ASSUMPTION

Extension
●   A new forge for several new concepts and ideas/tools:
        –   Msocket: support for multi-stack applications
        –   LWIPv6: user-space LWIPv4/LWIPv6 hybrid
             networking (multi)stack as a library
        –   Purelibc: process self virtualization
        –   Relativistic virtualization of time: emulation of
             fast machines on slow ones.
        –   Virtual spaces per login shells.
        –   Public Distributed Ethernets
        –   Run-time on-the-fly virtualization

VirtualSquare: the book
             Renzo Davoli, Michael Goldweber
             (editors)
             Virtual Square: Users,
             Programmers & Developers Guide
        ●    Available at lulu books or
             downloadable from
             wiki.virtualsquare.org
        ●    Warning: this book is dynamically
             changing as the project evolves




...a closer look
on virtualsquare projects...




VDE components
[Figure: two real switches joined by a cross cable, mirrored by two VDE
switches joined by a VDE cable (VDE plug, VdeWire, e.g. ssh, VDE plug). VMs
(e.g. QEMU, U-ML) and the TunTap Linux module connect to the VDE switches.]

VDE: Related Work
●   VPN: (OpenVPN) point-to-point, for real machines
●   Overlay Networks: application specific (peer
    to peer, Akamai).
●   VM networking: (tools provided with VM, e.g.
    uml-switch) VM specific
VDE:
●   multipoint, general mesh
●   no need for root (administration) access
●   heterogeneous VMs and non-VM systems can be connected

VDEv2: advertisement
●   VDEv2:
    –   modular design
    –   compatible with user-mode linux, qemu, tuntap,
        (bochs, plex86), umview/lwipv6
    –   through the vdetaplib, potentially compatible with any
        application using tap
    –   VLAN (802.1Q)
    –   FST (fast spanning tree)
    –   manageable at run time via unixterm (telnet or
        web with vdetelweb)
    –   includes slirpvde and wirefilter
    –   status debug
    –   plugin support: snmp/iplog/pdump

VDEv2
●   VDE-Switch
    –   number of ports configurable on command
        line
    –   port0 is reserved for management clients, n-1
        ports are available for connections.
    –   management UNIX socket for management
        clients
         ●   self-describing SMTP-like protocol
    –   modules: datasock (VM conn), tuntap,
        consmgmt (management)

VDE cables


●   VDE-plug
    –   is a VM that converts the Ethernet packets of
        a VDE port into a stream connection (stdin-
        stdout)
●   VDE-wire
    –   can be any application able to give a
        stdin/stdout stream connection
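
Turning Ethernet packets into a byte stream needs some framing, so that the far end can find packet boundaries again. The slides do not say which framing vde_plug uses; the sketch below assumes a 2-byte big-endian length prefix per packet, and the names frame_put and frame_get are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append one packet to buf as <2-byte big-endian length><payload>.
 * Returns the number of bytes written, or 0 if the packet is too long
 * for a 16-bit length or does not fit in the remaining capacity. */
size_t frame_put(uint8_t *buf, size_t cap, const uint8_t *pkt, size_t len)
{
    if (len > 0xffff || cap < len + 2)
        return 0;
    buf[0] = (uint8_t)(len >> 8);
    buf[1] = (uint8_t)(len & 0xff);
    memcpy(buf + 2, pkt, len);
    return len + 2;
}

/* Try to extract one packet from the first `avail` bytes of the stream.
 * Returns the number of bytes consumed and sets *pkt / *len, or returns 0
 * when more bytes are needed before a whole packet is available. */
size_t frame_get(const uint8_t *buf, size_t avail,
                 const uint8_t **pkt, size_t *len)
{
    if (avail < 2)
        return 0;
    size_t n = ((size_t)buf[0] << 8) | buf[1];
    if (avail < n + 2)
        return 0;
    *pkt = buf + 2;
    *len = n;
    return n + 2;
}
```

A reader loop would call frame_get repeatedly on its receive buffer, handing each complete packet to the switch port and keeping any trailing partial packet for the next read.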

Dual Pipe
●   dpipe is a new (general purpose)
    command we have added.
●   Pipes are a well-known abstraction. The
    following command prints the listing of the
    current directory:
    ls | lpr
●   dpipe creates a bidirectional connection
    between the processes:
    dpipe cmd1 = cmd2
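
The bidirectional connection can be sketched with two ordinary pipes: the output of cmd1 feeds the input of cmd2 and vice versa. This is a minimal illustration of the idea, not the source of the real dpipe command; the function names are invented here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and exec argv with stdin/stdout bound to in_fd/out_fd.
 * The child closes all four pipe ends after duplicating the two it needs. */
static pid_t spawn_with_io(char *const argv[], int in_fd, int out_fd,
                           const int all_fds[4])
{
    pid_t pid = fork();
    if (pid == 0) {
        dup2(in_fd, STDIN_FILENO);
        dup2(out_fd, STDOUT_FILENO);
        for (int i = 0; i < 4; i++)
            if (all_fds[i] > STDERR_FILENO)
                close(all_fds[i]);
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }
    return pid;
}

/* Run cmd1 and cmd2 cross-connected: stdout of each is stdin of the other.
 * Returns the exit status of cmd1, or -1 on error. */
int dpipe_run(char *const cmd1[], char *const cmd2[])
{
    int a[2], b[2]; /* a: cmd1 -> cmd2, b: cmd2 -> cmd1 */
    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    const int all[4] = { a[0], a[1], b[0], b[1] };
    pid_t p1 = spawn_with_io(cmd1, b[0], a[1], all);
    pid_t p2 = spawn_with_io(cmd2, a[0], b[1], all);
    for (int i = 0; i < 4; i++)
        close(all[i]); /* the parent keeps no pipe end open */

    int st1 = -1, st2 = -1;
    if (p1 > 0)
        waitpid(p1, &st1, 0);
    if (p2 > 0)
        waitpid(p2, &st2, 0);
    if (p1 < 0 || p2 < 0 || !WIFEXITED(st1))
        return -1;
    return WEXITSTATUS(st1);
}
```

Closing every unused pipe end matters: a process only sees end-of-file on its stdin once all writers on that pipe are gone.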

VDE cables, plugs, wires/dpipe
●   dpipe is used to create VDE-cables:
dpipe vde_plug = ssh vde.students.cs.unibo.it vde_plug
●   this command connects by a dpipe the local
    vde_plug with a vde_plug running on a remote host
    (the wire is ssh)
●   other applications can be used as wire (e.g. netcat)
●   In the example vde_plug refers to the default
    switch. Several switches can run on the same
    host; in that case an extra option is needed.


wirefilter
●   wirefilter can be put on a cable (e.g. for network
    testing)
dpipe vde_plug /tmp/s1 = wirefilter -m /tmp/m = vde_plug /tmp/s2
wirefilter -v /tmp/s1:/tmp/s2
●   packet loss, delays, dup, speed, noise figures,
    mtu, fifoness properties of the line can be
    changed with command line options or at run
    time via a management socket.
●   It is possible to define several “states”. The
    state transition is driven by a Markov-chain
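
A Markov-chain driven state change can be sketched as follows. This is only an illustration of the technique under assumed names (markov_next, struct line_state); it is not wirefilter's code, and the slides do not describe how wirefilter stores its transition probabilities.

```c
/* Pick the next state of an n-state Markov chain.
 * P is a row-major n*n matrix: P[cur*n + j] is the probability of moving
 * from state cur to state j, and each row is expected to sum to 1.
 * u is a uniform random sample in [0,1), for example from drand48(). */
int markov_next(const double *P, int n, int cur, double u)
{
    double acc = 0.0;
    for (int j = 0; j < n; j++) {
        acc += P[cur * n + j];
        if (u < acc)
            return j;
    }
    return n - 1; /* fallback if rounding leaves the row sum just below 1 */
}

/* Hypothetical per-state line parameters: each state of the chain would
 * select one such set (field names are illustrative only). */
struct line_state {
    double loss_pct; /* packet loss percentage */
    double delay_ms; /* added delay in milliseconds */
};
```

With a two-state matrix such as {0.9, 0.1, 0.5, 0.5}, the line stays in state 0 most of the time and occasionally falls into state 1, which could model bursts of loss.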

SlirpVDE

[Figure: two VDE switches joined by a VDE cable (VDE plug, VdeWire, e.g. ssh,
VDE plug). A VM (e.g. QEMU, 10.0.2.15) running Firefox and a VM (e.g. U-ML,
10.0.2.16) reach the outside through SlirpVDE (10.0.2.2): the http connection
goes from slirpVDE, running on the hosting O.S., to ...]

Note: slirp supports IPv4; slirpv6 supports both IPv4 and IPv6.

vde_cryptcab
●   Coded by Daniele Lacamera (danielinux)
●   A vde_cryptcab is a distributed cable
    manager for VDE switches.
●   Server side
    vde_cryptcab -s /tmp/vde2.ctl -p 2100
●   Client side
    vde_cryptcab -s /tmp/vde2.ctl -c foo@remote.machine.org:2100
●   uses a Blowfish channel (random key exchanged
    via scp).

Marionnet (based on VDE)
●   A project by Jean-Vincent Loddo and Luca
    Saiu (et al.), Université Paris 13.




TINC
●   tinc is a Virtual Private Network (VPN)
    daemon that uses tunnelling and
    encryption to create a secure private
    network between hosts on the Internet.
●   Encryption, authentication and compression
●   Automatic full mesh routing
●   Easily expand your VPN
●   Ability to bridge ethernet segments
●   Runs on many operating systems and supports IPv6
●   A project by Ivo Timmermans and Guus Sliepen

LWIPv6
●   It is an LWIPv4/v6 (multi) stack implemented as a library.
●   Forked from the LWIP project (Adam Dunkels <adam@sics.se>)
●   Can be connected to any number of VDE, TUN, TAP
    interfaces.
●   It is a hybrid stack (not a dual-stack). One single IPv6
    “engine” is also able to manage IPv4 packets in
    compatibility mode
    (130.136.1.110 is managed as 0::ffff:130.136.1.110).
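
The compatibility-mode address is the standard IPv4-mapped IPv6 form. A minimal sketch of the conversion with the usual POSIX calls (the helper name ipv4_to_mapped is invented here and is not part of the LWIPv6 API):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d from a dotted-quad
 * string. Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int ipv4_to_mapped(const char *dotted, struct in6_addr *out)
{
    struct in_addr v4;
    if (inet_pton(AF_INET, dotted, &v4) != 1)
        return -1;
    memset(out, 0, sizeof(*out));   /* first 80 bits are zero */
    out->s6_addr[10] = 0xff;        /* then 16 bits of ones */
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4.s_addr, 4); /* IPv4 bytes, network order */
    return 0;
}
```

A single IPv6 code path can then treat every peer as a 128-bit address and recognise IPv4 traffic with IN6_IS_ADDR_V4MAPPED.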




LWIPv6
●   PF_INET, PF_INET6
●   PF_PACKET for raw packet management
    –   support for user-level network analysis tools (e.g.
        sniffers, ethereal)
    –   support for user-level dhcp clients.
●   PF_NETLINK for configuration
●   Packet filtering
●   NEW: dhcp client/server, rarpd, slirp, routing, nat
    on request

LWIPv6 interface definition API
struct stack *lwip_stack_new(void);
void lwip_stack_free(struct stack *stack);
struct stack *lwip_stack_get(void);
void lwip_stack_set(struct stack *stack);

struct netif *lwip_vdeif_add(struct stack *stack, void *arg);
struct netif *lwip_tapif_add(struct stack *stack, void *arg);
struct netif *lwip_tunif_add(struct stack *stack, void *arg);

int lwip_add_addr(struct netif *netif, struct ip_addr *ipaddr, struct ip_addr *netmask);
int lwip_del_addr(struct netif *netif, struct ip_addr *ipaddr, struct ip_addr *netmask);

int lwip_add_route(struct stack *stack,
                        struct ip_addr *addr, struct ip_addr *netmask,
                        struct ip_addr *nexthop, struct netif *netif, int flags);
int lwip_del_route(struct stack *stack,
                        struct ip_addr *addr, struct ip_addr *netmask,
                        struct ip_addr *nexthop, struct netif *netif, int flags);

int lwip_ifup(struct netif *netif);
int lwip_ifdown(struct netif *netif);


LWIPv6 socket API                                  (just add lwip_ prefix)
int lwip_msocket(struct stack *stack, int domain, int type, int protocol);
int lwip_socket(int domain, int type, int protocol);

int   lwip_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
int   lwip_bind(int s, struct sockaddr *name, socklen_t namelen);
int   lwip_shutdown(int s, int how);
int   lwip_getpeername (int s, struct sockaddr *name, socklen_t *namelen);
int   lwip_getsockname (int s, struct sockaddr *name, socklen_t *namelen);
int   lwip_getsockopt (int s, int level, int optname, void *optval, socklen_t *optlen);
int   lwip_setsockopt (int s, int level, int optname, const void *optval, socklen_t optlen);
int   lwip_close(int s);
int   lwip_connect(int s, struct sockaddr *name, socklen_t namelen);
int   lwip_listen(int s, int backlog);
int   lwip_recv(int s, void *mem, int len, unsigned int flags);
int   lwip_read(int s, void *mem, int len);
int   lwip_recvfrom(int s, void *mem, int len, unsigned int flags,
         struct sockaddr *from, socklen_t *fromlen);
int   lwip_send(int s, void *dataptr, int size, unsigned int flags);
int   lwip_sendto(int s, void *dataptr, int size, unsigned int flags,
        struct sockaddr *to, socklen_t tolen);
int   lwip_write(int s, void *dataptr, int size);
int   lwip_select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset,
                struct timeval *timeout);
int   lwip_ioctl(int s, long cmd, void *argp);

LWIPv6: New features
●   Packet forwarding
●   Filtering
●   NAT
●   DHCP server/RADV server onboard
●   SLIRP (v4 and v6)
          struct netif *lwip_add_slirpif(struct stack *stack, void *arg, int flags);



slirpvde6

●   Extension of slirpvde based on LWIPv6
       –   slirp ipv4/ipv6
       –   Stateless translator
       –   Dhcp/radv server
       –   DNS forwarder
       –   Port and X forwarding (in and out)
           slirpvde6 -d -H10.0.2.1/24 -H2001::1/64 -s /tmp/vde.ctl -dhcp -r

Berkeley sockets API: problem #1
●   The Berkeley Sockets API has been
    designed for one protocol stack (per
    protocol family).
    –   Multiple stacks => different networking
        features (per user, per application...)
●   Unix uses the file system as a naming
    space for everything (devices, kernel
    variables, ...) except for networking.
    –   Access control to networking

Solution #1: msockets
#include <msocket.h>
int msocket(char *path, int domain, int type, int protocol);
●   Path is the pathname of the stack
●   domain/type/protocol are the same defined in socket(2).
●   A stack is a special file (new type of special file, see stat(2)):
      #define S_IFSTACK 0160000
●   Each process has a default stack for each protocol family (domain).
     –   If path==NULL, msocket uses the default stack.
●   It is backwards compatible.
     #define socket(d,t,p) msocket(NULL,(d),(t),(p))

Msockets: set the default stack


int msocket(char *path, int domain, int type, int protocol);
●   if type==SOCK_DEFAULT msocket sets the default stack, e.g.
     msocket("/dev/net/ipstack2",PF_INET,SOCK_DEFAULT,0);
     defines /dev/net/ipstack2 as the default stack for IPv4
●   if type==SOCK_DEFAULT && domain==PF_UNSPEC msocket sets the default
    stack for all the protocol families.
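
The per-family default can be pictured as a small table indexed by protocol family. The sketch below is a simulation of that bookkeeping only: SKETCH_SOCK_DEFAULT is a placeholder value (the slides do not give the real value of SOCK_DEFAULT), and the function names are invented.

```c
#include <stddef.h>
#include <sys/socket.h>

#define SKETCH_SOCK_DEFAULT 0x7fff /* placeholder, not the real constant */
#define SKETCH_NFAMILIES    64     /* assumed upper bound on family numbers */

/* Default stack path per protocol family; NULL means "system stack".
 * Pointers are stored as given, so callers must keep the strings alive. */
static const char *default_stack[SKETCH_NFAMILIES];

/* Handle the "set default" form of the call.
 * Returns 0 if a default was recorded, 1 if the request is an ordinary
 * socket request (type is not the default marker), -1 on a bad domain. */
int sketch_set_default(const char *path, int domain, int type)
{
    if (type != SKETCH_SOCK_DEFAULT)
        return 1;
    if (domain == PF_UNSPEC) {
        for (int d = 0; d < SKETCH_NFAMILIES; d++)
            default_stack[d] = path;   /* all protocol families */
        return 0;
    }
    if (domain < 0 || domain >= SKETCH_NFAMILIES)
        return -1;
    default_stack[domain] = path;
    return 0;
}

/* Stack used for a request: the explicit path if given, else the default
 * recorded for that family (NULL if none was set). */
const char *sketch_resolve(const char *path, int domain)
{
    if (path != NULL)
        return path;
    if (domain < 0 || domain >= SKETCH_NFAMILIES)
        return NULL;
    return default_stack[domain];
}
```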




Mstack: backward compatibility
●   Mstack uses msocket: it defines the default stack so that existing
    applications can use different stacks.
$ ip addr
..... ip addr on default net
$ mstack /dev/net/ipstack2 ip addr
.... ip addr of “ipstack2”
$ mstack /dev/net/newstack firefox
.... firefox works on newstack
$ mstack /dev/net/otherstack bash
$ ...this new bash works on otherstack



Msockets: implementation
●   Msockets API is currently supported by lwipv6
    and by view-os.
●   It is a natural extension, backwards compatible
    with the Berkeley sockets.
●   Many applications would benefit from this
    extension (e.g. networkless user accounts).
●   We are studying kernel support for msockets.



Berkeley sockets API: problem #2
●   Berkeley Sockets API does provide support for IPC
    (AF_UNIX)
●   Berkeley Sockets API does not provide support for
    multicast IPC
●   Berkeley Sockets is mainly for point-to-point, client-server
    communication
    –   IP multicast and Ethernet broadcast are provided by “magic”
        addresses.
●   Many applications need multicast IPC (dbus, vde_switch,
    midi-patchbay, mpeg-ts demultiplexing...)


IPN: Inter Process Networking
●   IPN is for IPC (like AF_UNIX)
●   IPN provides fast, kernel implemented,
    multicast communication among
    processes.

[Figure: sender, dispatcher and receivers in two settings: an AF_UNIX based
multicasting service (dbus, vde_switch, tee, ....) and AF_IPN with a policy
submodule]

IPN: implementation
●   A new address family AF_IPN
●   Policies can be provided as submodules.
    –   IPN_BROADCAST (default): each message is delivered
        to all the members but the sender
    –   IPN_VDESWITCH a virtual ethernet switch
    –   IPN_MPEGTS mpeg transport stream demultiplexing
●   Two services (sockopt selectable):
    –   LOSSLESS: bounded buffer approach, late receivers
        delay senders
    –   LOSSY: late receivers lose data.
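
The broadcast policy and the two services can be illustrated with a small user-space simulation. This is not the kernel module's code: the types, names and queue size below are assumptions made for the example.

```c
#define MAX_MEMBERS 8
#define QUEUE_CAP   4 /* bounded buffer size per receiver (arbitrary) */

enum service { SKETCH_LOSSY, SKETCH_LOSSLESS };

struct member {
    int msgs[QUEUE_CAP]; /* pending message ids, oldest first */
    int count;
};

struct ipn_group {
    struct member m[MAX_MEMBERS];
    int nmembers;
    enum service svc;
};

/* Broadcast msg to every member except the sender.
 * LOSSLESS: if any receiver's buffer is full nothing is delivered and -1
 * is returned, meaning the sender has to wait for the late receiver.
 * LOSSY: full receivers simply miss the message.
 * Otherwise returns the number of receivers that got the message. */
int group_send(struct ipn_group *g, int sender, int msg)
{
    if (g->svc == SKETCH_LOSSLESS)
        for (int i = 0; i < g->nmembers; i++)
            if (i != sender && g->m[i].count == QUEUE_CAP)
                return -1;

    int delivered = 0;
    for (int i = 0; i < g->nmembers; i++) {
        struct member *r = &g->m[i];
        if (i == sender || r->count == QUEUE_CAP)
            continue;
        r->msgs[r->count++] = msg;
        delivered++;
    }
    return delivered;
}

/* Pop the oldest pending message of member `who`.
 * Returns 1 and stores it in *msg, or 0 if nothing is pending. */
int member_recv(struct ipn_group *g, int who, int *msg)
{
    struct member *r = &g->m[who];
    if (r->count == 0)
        return 0;
    *msg = r->msgs[0];
    for (int i = 1; i < r->count; i++)
        r->msgs[i - 1] = r->msgs[i];
    r->count--;
    return 1;
}
```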
IPN:direct support for multicast
●   BIND=define and get administration access
    to the socket
    –   “x” permission required
●   CONNECT=join the flow of data
    –   “r” and “w” mean permission to receive or send

struct sockaddr_un sun = {.sun_family = AF_IPN, .sun_path = "/tmp/sockipn"};
int s = socket(AF_IPN, SOCK_RAW, IPN_BROADCAST); /* or a different policy */
int err = bind(s, (struct sockaddr *)&sun, sizeof(sun)); /* "x" permission */
if (err == 0)
    err = connect(s, NULL, 0); /* join the flow of data */

Why IPN instead of...

●   AF_UNIX?
    –   Point-to-point, hub process needed, slow!
●   IP_MULTICAST ttl=0?
    –   No access control, slow!
●   AF_NETLINK?
    –   No access control, designed for
        interface/filtering configuration.
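
The AF_UNIX drawback can be made concrete: since each socket is point-to-point, multicast needs a hub process that reads every message and re-sends one copy per receiver. A minimal sketch of one hub step over datagram socket pairs (the function name is invented here):

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* One step of a user-space hub: read a datagram from hub_fds[from] and
 * copy it to every other connection.
 * Returns the number of copies sent, or -1 if the read failed. */
int hub_forward_once(const int *hub_fds, int n, int from)
{
    char buf[2048];
    ssize_t len = recv(hub_fds[from], buf, sizeof buf, 0);
    if (len < 0)
        return -1;

    int copies = 0;
    for (int i = 0; i < n; i++) {
        if (i == from)
            continue; /* the sender does not get its own message back */
        if (send(hub_fds[i], buf, (size_t)len, 0) == len)
            copies++;
    }
    return copies;
}
```

Every message crosses the kernel boundary once on the way into the hub and once more per receiver on the way out, plus a context switch to the hub process; this is the overhead a kernel-side dispatcher avoids.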


IPN is fast   (time for 1M msgs, 64B per msg)




IPN is fast   (time for 1M msgs, 1024B per msg)




IPN is fast   (time for 1M msgs, 16 receivers)




IPN communication models
●   Code examples here:
http://wiki.virtualsquare.org/index.php/IPN_examples
●   Peer-to-peer
    –   All the member processes are senders and
        receivers (e.g. vde)
●   Publish-subscribe
    –   A process broadcasts messages and client
        processes can join the IPN socket


IPN extra features
●   Out-Of-Band messages from core IPN and policy
    submodules
    –   e.g. number of readers notification to stop
        subscriberless services
●   Networking interfaces TAP+GRAB
    –   TAP: a new virtual interface is defined and connected
        to an IPN socket (in kernel-land)
    –   GRAB: an existing networking interface gets
        connected to an IPN socket (in kernel-land)
●   Char-device interface
    –   Define a character device connected to an IPN socket

VDETELWEB
●   It is the Web/Telnet Server for VDE switch configuration.
●   It uses the LWIPv6 library
●   It has two connections to the controlled VDE switch:
    –   management socket to give commands
    –   port0: the ethernet port used by the TCP-IP stack.
●   It reads the set of commands, descriptions, arguments
    from the switch itself.
●   Telnet has history/command editing and support for
    asynch debug output (NEW)


VDETELWEB: telnet




VDETELWEB: Web Interface




View-OS




         ... a process with a view

Each process should be permitted to have its own
       view of the execution environment


View Components
●   filesystem namespace, including the
    related ownership and permission
    information,
●   networking configuration,
●   system name,
●   current time,
●   devices, etc.
●   ...

Global View Assumption
●   In general processes running on the same
    computer share the same view.
       –   A given pathname refers to the same file
            for all processes.
       –   All processes use one shared TCP-IP stack
            for networking hence all processes share
            the same set of IP addresses and routing
            policies.
       –   All processes share the same notion as to
            which users/processes have special
            privileges.

View-OS vs. VMs and Containers


                         View-OS        VM             Container
Memory Impact            LOW            HIGH           LOW
Running State            User           User or Kernel Kernel
Administered by:         user           user           root
Partial Virtualization   Yes            No             Yes (sharing)




Partial Virtualization

●   Virtualize just what you need:
       –   Virtual and real file systems, devices,
             networks, etc. co-exist in the process'
             view
●   Support for nested virtualization:
       –   e.g. virtual file system defined on virtual
             devices.


How to start a View-OS monitor:
      user@host:~$ umview bash
      This kernel supports: PTRACE_MULTI PTRACE_SYSVM ppoll
      View-OS will use: PTRACE_MULTI PTRACE_SYSVM ppoll

      pure_libc library found: syscall tracing allowed

      rd235 2.6.29-utrace GNU/Linux/View-OS 10585 0
      user@host[10585:0]:~$ 


●   Umview runs on vanilla Linux kernels, Kmview
    requires a kernel module loaded (and utrace).
●   Instead of bash one may run his/her favorite
    executable (e.g. xterm, script....)


View-OS modules
●   View-OS monitor loads only the
    virtualities requested by the user:
       –   Umfuse: file system virtualization
       –   Umnet: networking virtualization
       –   Umdev: device virtualization
       –   Umbinfmt: executable interpreter
            virtualization
       –   Viewfs: file system patchworking
       –   Ummisc: time, system id...

#1: Virtual Installation of Software
    $ um_add_service viewfs
    $ mkdir /tmp/newroot
    $ viewsu
    # mount -t viewfs -o mincow,except=/tmp,vstat /tmp/newroot /
    # apt-get install mynewsoftware


●   Create an empty dir
●   Mount it in “minimal copy on write” mode:
         –   File modifications go to the real file system when allowed.
         –   Modifications are stored in the mounted dir otherwise.
         –   The result is a single consistent view.
         –   vstat: virtualize stat (support for virtual chown,
               chmod/setuid, special files)
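
The "minimal copy on write" rule above can be sketched as a small decision function. This is only an illustration: `mincow_target` is a hypothetical name, the caller is assumed to already know whether the real file system allows the write, and nothing here reflects how viewfs is actually implemented.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of the "minimal copy on write" rule:
 * write in place when the real file system allows it, otherwise
 * redirect the modification under the mounted (overlay) directory. */
static const char *mincow_target(const char *path, const char *overlay,
                                 int real_writable,
                                 char *buf, size_t buflen)
{
    if (real_writable)
        return path;                 /* modification goes to the real file */

    /* otherwise the modified copy lives at <overlay><path> */
    int n = snprintf(buf, buflen, "%s%s", overlay, path);
    if (n < 0 || (size_t)n >= buflen)
        return NULL;                 /* result would not fit in buf */
    return buf;
}
```

With the overlay `/tmp/newroot` from the example, a write to a read-only `/etc/passwd` would be redirected to `/tmp/newroot/etc/passwd`, while a write to a file the user may already modify is left alone.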
#2: Virtual Networking
    $ um_add_service umnet
    $ mount -t umnetlwipv6 none /dev/net/default
    $ ip link set vd0 up
    $ ip addr add 10.1.2.3/24 dev vd0
    $ ip addr
    1: lo0: <LOOPBACK,UP> mtu 0
        link/loopback
        inet6 ::1/128 scope host
        inet 127.0.0.1/8 scope host
    2: vd0: <BROADCAST,UP> mtu 1500
        link/ether 02:02:5a:44:e2:06 brd ff:ff:ff:ff:ff:ff
        inet6 fe80::2:5aff:fe44:e206/64 scope link
        inet 10.1.2.3/24 scope global
●   A network stack can be “mounted.”
●   /dev/net/default is the default stack, but View-OS
    supports multiple stacks.
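
The link-local address listed for vd0 is consistent with the usual modified EUI-64 derivation from the interface MAC address: flip the universal/local bit of the first byte and insert ff:fe in the middle. The sketch below only illustrates that arithmetic; `mac_to_eui64` is a made-up helper name and is not taken from LWIPv6.

```c
#include <stdint.h>

/* Hypothetical helper: build the 8-byte modified EUI-64 interface
 * identifier from a 6-byte MAC address. */
static void mac_to_eui64(const uint8_t mac[6], uint8_t iid[8])
{
    iid[0] = mac[0] ^ 0x02;   /* flip the universal/local bit */
    iid[1] = mac[1];
    iid[2] = mac[2];
    iid[3] = 0xff;            /* fixed filler bytes */
    iid[4] = 0xfe;
    iid[5] = mac[3];
    iid[6] = mac[4];
    iid[7] = mac[5];
}
```

For the MAC 02:02:5a:44:e2:06 this yields 00:02:5a:ff:fe:44:e2:06, which under the fe80::/64 prefix is written fe80::2:5aff:fe44:e206, as in the listing.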
#2: Virtual Networking
$ um_add_service umnet
$ mount -t umnetlwipv6 -o tn0=tunx none /dev/lwip0
$ mount -t umnetlwipv6 -o tp0=tapx,vd0=/tmp/switch none /dev/lwip1
$ mstack /dev/lwip0 ip addr
1: lo0: <LOOPBACK,UP> mtu 0
    link/loopback
    inet6 ::1/128 scope host
    inet 127.0.0.1/8 scope host
2: tn0: <> mtu 0
    link/generic
$ mstack /dev/lwip1 ip addr
1: lo0: <LOOPBACK,UP> mtu 0
    link/loopback
    inet6 ::1/128 scope host
    inet 127.0.0.1/8 scope host
2: vd0: <BROADCAST> mtu 1500
    link/ether 02:02:47:98:ad:06 brd ff:ff:ff:ff:ff:ff
3: tp0: <BROADCAST> mtu 1500
    link/ether 02:02:03:04:05:06 brd ff:ff:ff:ff:ff:ff
$
#3:Mount a Filesystem


       $ um_add_service umfuse
       $ mount -t umfuseext2 -o ro ext2filesystemimage /mnt
       $ mount -t umfusestrangefilesystem strangeimage /mnt2




●   Source compatible with Fuse.
●   Mount file systems unsupported by the kernel.
●   Safe mount, limited to this View.



#4: Filesystem Image partition and mount



●   Step 1: Load the umdev (virtual device)
    module and mount an empty file as a disk
    image.
     $ um_add_service umdev
     $ viewsu
     # dd of=/tmp/diskimage bs=1024 count=0 seek=1024000
     # mount -t umdevmbr /tmp/diskimage /dev/hda
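
The dd invocation above copies no data (count=0); it only positions the output at bs * seek bytes, so the result is a sparse file of that size. A minimal sketch of the same idea in C, assuming a POSIX system; both function names are illustrative only.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Size of the file produced by "dd bs=BS count=0 seek=SEEK". */
static int64_t image_size_bytes(int64_t bs, int64_t seek)
{
    return bs * seek;
}

/* Create (or resize) a file of the given size without writing any
 * data blocks. Returns 0 on success, -1 on error. */
static int create_sparse_image(const char *path, int64_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    int rc = ftruncate(fd, (off_t)size);
    close(fd);
    return rc;
}
```

With bs=1024 and seek=1024000 the size is 1048576000 bytes, which is the disk size fdisk reports in the next step.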




#4: Filesystem Image partition and mount
●   Step 2: partition the file system image:
              # fdisk /dev/hda
              Device contains a valid partition table
              Building a new DOS disklabel with disk identifier 0xd403417d.
              Command (m for help): n
              Command action
                 e   extended
                 p   primary partition (1-4)
              p
              Partition number (1-4): 1
              First cylinder (1-127, default 1): 1
              Last cylinder, +cylinders or +size{K,M,G} (1-127, default 127): 127
              Command (m for help): p
              Disk /dev/hda: 1048 MB, 1048576000 bytes
              255 heads, 63 sectors/track, 127 cylinders
              Units = cylinders of 16065 * 512 = 8225280 bytes
              Disk identifier: 0xd403417d
              Device Boot      Start         End       Blocks    Id System
              /dev/hda1               1          127     1020096    83 Linux
              Command (m for help): w
              The partition table has been altered!
              Calling ioctl() to re-read partition table.
              Syncing disks.
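
The figures printed by fdisk can be cross-checked with a little arithmetic. The sketch below uses the geometry from the listing (255 heads, 63 sectors per track, 512-byte sectors) and additionally assumes that the first partition starts after the first track (sector 63), the usual DOS-compatible layout; that start offset is not shown in the listing and is an assumption here.

```c
#include <stdint.h>

enum { HEADS = 255, SECTORS_PER_TRACK = 63, SECTOR_SIZE = 512 };

/* Sectors in one cylinder: 255 * 63 = 16065. */
static int64_t sectors_per_cylinder(void)
{
    return (int64_t)HEADS * SECTORS_PER_TRACK;
}

/* Whole cylinders that fit in a disk of the given size. */
static int64_t cylinders(int64_t disk_bytes)
{
    return disk_bytes / (sectors_per_cylinder() * SECTOR_SIZE);
}

/* 1 KiB blocks in a partition covering cylinders 1..last_cyl,
 * assuming it starts after the first track (sector 63). */
static int64_t partition_blocks(int64_t last_cyl)
{
    int64_t sectors = last_cyl * sectors_per_cylinder() - SECTORS_PER_TRACK;
    return sectors * SECTOR_SIZE / 1024;
}
```

For the 1048576000-byte image this gives 127 cylinders and 1020096 blocks for /dev/hda1, matching the fdisk output.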
#4: Filesystem Image partition and mount
●   Step 3: Create the filesystem
              # mkfs.ext2 /dev/hda1
              mke2fs 1.41.8 (20-Jul-2009)
              Filesystem label=
              OS type: Linux
              Block size=4096 (log=2)
              Fragment size=4096 (log=2)
              63872 inodes, 255024 blocks
              12751 blocks (5.00%) reserved for the super user
              First data block=0
              Maximum filesystem blocks=264241152
              8 block groups
              32768 blocks per group, 32768 fragments per group
              7984 inodes per group
              Superblock backups stored on blocks:
                     32768, 98304, 163840, 229376
              Writing inode tables: done
              Writing superblocks and filesystem accounting information: do
              This filesystem will be automatically checked every 38 mounts
              180 days, whichever comes first. Use tune2fs -c or -i to over
#4: Filesystem Image partition and mount
●   Step 4: mount the new partition
      # um_add_service umfuse
      # mount -t umfuseext2 -o rw+ /dev/hda1 /mnt
      # ls -l /mnt
      total 16
      drwx------ 2 root root 16384 2009-09-16 11:57 lost+found



●   Example of nested virtualization.
●   Compatible with standard sys-admin
    commands.



#5: User Mode chroot
●   Step 1: create the jail filesystem
       $ mkdir /tmp/root /tmp/root/bin /tmp/root/lib
       $ cp /bin/busybox /tmp/root/bin
       $ cp /lib/libm-2.9.so /lib/libc-2.9.so /tmp/root/lib
       $ cd /tmp/root/lib
       $ ln -s libm-2.9.so libm.so.6
       $ ln -s libc-2.9.so libc.so.6
       $ cd /

       /tmp/root
           bin
               busybox
           lib
               libc.so.6 -> libc-2.9.so
               libm.so.6 -> libm-2.9.so
               libc-2.9.so
               libm-2.9.so

#5: User Mode chroot
●   Step 2: change the file system root:
        –   Core mode: by the virtual chroot system call
      $ exec /usr/sbin/chroot /tmp/root /bin/busybox sh
      BusyBox v1.13.3 (Debian 1:1.13.3-1) built-in shell (ash)
      Enter 'help' for a list of built-in commands.
      / $

        –   By Viewfs:
     $ um_add_service viewfs
     $ exec busybox sh
     BusyBox v1.13.3 (Debian 1:1.13.3-1) built-in shell (ash)
     Enter 'help' for a list of built-in commands.
     / $ mount -t viewfs -o move,permanent /tmp/root /
     / $

#5: User Mode chroot

  ●   Step 3: the process is in the jail:
/ $ ls -lR /
/:
drwxr-xr-x 2 1000   1000    4096 Sep 17 13:37 bin
drwxr-xr-x 2 1000   1000    4096 Sep 17 13:37 lib
/bin:
-rwxr-xr-x   1 1000 1000  401216 Sep 17 13:37 busybox
/lib:
-rwxr-xr-x   1 1000 1000 1302732 Sep 17 13:37 libc-2.9.so
lrwxrwxrwx   1 1000 1000      11 Sep 17 13:37 libc.so.6 -> libc-2.9.so
-rw-r--r--   1 1000 1000  149328 Sep 17 13:37 libm-2.9.so
lrwxrwxrwx   1 1000 1000      11 Sep 17 13:37 libm.so.6 -> libm-2.9.so
/ $


#6: Create a Ramdisk and use it:
$ um_add_service umdev
$ um_add_service umfuse
$ um_add_service umproc
$ mount -t umdevramdisk -o size=100M none /dev/hdx
$ /sbin/mkfs.vfat /dev/hdx
mkfs.vfat 3.0.3 (18 May 2009)
$ mount -t umfusefat -o rw+ /dev/hdx /mnt
$ mount
rootfs on / type rootfs (rw)
/dev/root on / type ext3 (rw,errors=remount-ro,data=ordered)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=755)
... ...
none on /proc/mounts type proc (ro)
none on /dev/hdx type umdevramdisk (size=100M)
/dev/hdx on /mnt type umfusefat (rw+)
$
●   Another example of nested virtualization
●   Umproc virtualizes /proc/mounts
#7: Virtualize running processes
●   Shell #1 (pid 12345, an ordinary shell)
        sh1 $ mkdir /tmp/mnt
        sh1 $ ls /tmp/mnt
        sh1 $

●   Shell #2 (running under ViewOS)
        sh2 $ um_add_service umfuse
        sh2 $ mount -t umfuseext2 /tmp/linux.img /tmp/mnt
        sh2 $ ls /tmp/mnt
        bin  boot dev etc lib lost+found mnt    proc sbin tmp usr
        sh2 $ um_attach 12345
●   Shell #1 has been “attached” to ViewOS
        sh1 $ ls /tmp/mnt
        bin  boot dev etc lib lost+found mnt    proc sbin tmp usr

#8: Process proper time
●   Start two xclocks, one from a standard shell, the other
    from a shell running ViewOS.
    sh1 $ xclock -update 1 &

    sh2 $ xclock -update 1 &
    sh2 $ um_add_service ummisc
    sh2 $ mount -t ummisctime none /tmp/mnt

●   Now change the frequency of the virtual time for
    ViewOS:
     sh2 $ echo 2 > /tmp/mnt/frequency
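
One way to picture a per-process "proper time" is a clock that advances at a configurable rate relative to real time. The model below is an assumption made for illustration (including the reading of frequency 2 as "twice as fast"); it is not the ummisctime implementation, and all names are hypothetical.

```c
/* Hypothetical model of a virtual clock: virtual time advances
 * `frequency` times as fast as real time, counted from the moment
 * the frequency was last set. */
struct vclock {
    double real_base;     /* real time when the frequency was set */
    double virtual_base;  /* virtual time at that moment */
    double frequency;     /* 1.0 = real speed, 2.0 = twice as fast */
};

static double vclock_now(const struct vclock *c, double real_now)
{
    return c->virtual_base + (real_now - c->real_base) * c->frequency;
}

static void vclock_set_frequency(struct vclock *c, double real_now,
                                 double frequency)
{
    c->virtual_base = vclock_now(c, real_now);  /* keep the clock continuous */
    c->real_base = real_now;
    c->frequency = frequency;
}
```

Re-basing the clock whenever the frequency changes keeps virtual time continuous, so a running xclock would speed up or slow down without jumping.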




Behind the Scenes




[Figure: View-OS architecture. Components shown: process; capture layer and nested capture (PURELIBC); inside *mview, the global hash table, the dispatcher, and PCB and fd management; modules with their submodules on top; ptrace or the kmview kernel module (utrace) and the Linux kernel at the bottom.]
Modules & submodules
●   Modules provide support for
    classes of virtualizations, e.g.:
       –   Umfuse: file systems
       –   Umnet: networking
       –   Umdev: devices
●   Submodules are for specific cases,
    e.g.:
       –   Umfuseext2, Umfusefat
       –   Umnetlwipv6, umnetnull
       –   Umdevmbr, umdevramdisk
Modules & submodules
module   description                          submodule             description
umproc   /proc/mounts virtualization
umfuse   User-mode fuse                       umfuseext2            ext2 implementation
                                              umfuseiso9660         iso9660
                                              umfusefat             fat/vfat
                                              umfusentfs3g          ntfs
                                              umfusearchive         tar/cdimages (libarchive)
                                              umfuseramfile         single file virtualization
                                              umfusessh (sshfs)     remote file system via ssh
                                              umfuseencfs (encfs)   encrypted file system
umnet    network multi stack support          umnetnull             null stack
                                              umnetlwipv6           IPv4/v6 hybrid stack
                                              umnetlink             move/merge stacks
                                              umnetcurrent          current stack
umdev    device virtualization                umdevmbr              DOS master boot record
                                              umdevnull             null device
                                              umdevramdisk          ramdisk
                                              umdevvd               VDI, VMDK, VHD disks
                                              umdevtap              virtual tuntap
ummisc   system call based virtualization     ummisctime            time virtualization
                                              ummiscuname           uname id virtualization
viewfs   file system patchworking

Capture user process system calls
●   umview: based on ptrace
       –   Vanilla Linux kernel
       –   (patches proposed for performance)
●   kmview needs a specific kernel
    module based on utrace.
       –   Security enhancement
       –   More complete virtualization
             support (nested View-OS,
             strace/gdb, SIGSTOP).
Global Hash Table
●   Keeps track of active
    virtualizations:
       –   Pathname objects
       –   File System Types
       –   Protocol families
       –   Device Major/Minor ranges
       –   System call numbers
Dispatcher
●   The Dispatcher uses the global
    hash table to route each system
    call to the right module or to the
    kernel.
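
The routing decision for pathname objects can be pictured as a longest-prefix lookup. The sketch below is a simplification under stated assumptions: it uses a linear scan instead of a hash table, it covers only pathnames (not file system types, protocol families, devices or system call numbers), and the structure and function names are hypothetical rather than taken from the View-OS sources.

```c
#include <stddef.h>
#include <string.h>

struct mount_entry {
    const char *prefix;   /* virtualized subtree, e.g. "/mnt" */
    const char *module;   /* module that handles it */
};

/* Return the module for the longest matching prefix, or NULL when the
 * path is not virtualized and the call should go to the kernel. */
static const char *dispatch_path(const struct mount_entry *tab, size_t n,
                                 const char *path)
{
    const char *best = NULL;
    size_t best_len = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tab[i].prefix);
        if (strncmp(path, tab[i].prefix, len) != 0)
            continue;
        /* match only on a component boundary: "/mnt" must not match "/mnt2" */
        if (path[len] != '\0' && path[len] != '/' && len > 1)
            continue;
        if (len >= best_len) {
            best = tab[i].module;
            best_len = len;
        }
    }
    return best;
}
```

Preferring the longest prefix is what lets a virtual file system mounted on top of another virtual object take precedence, which is the nested case shown in the earlier examples.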
Nested Capture
●   View-OS captures (and can
    virtualize) the system calls
    generated by modules and
    submodules.
●   Purelibc is a C library providing
    process self virtualization.
Desiderata 1: Linux Kernel
New ptrace tags for virtualization support:
    –   PTRACE_VM: support for partial virtualization. It is
          possible to skip the current system call and/or
          the second upcall after the system call. (User-Mode
          Linux can use this instead of PTRACE_SYSEMU;
          VM has a simpler implementation than SYSEMU.)
    –   PTRACE_MULTI: process a sequence of ptrace
          requests + PEEK/POKE of large chunks as a
          single call. (ptrace exchanges one memory
          word per call and /proc/{pid}/mem is not
          writable!)
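
To see why a batched request matters, it helps to count how many word-sized transfers a plain ptrace-based monitor needs to copy a buffer out of the traced process. The helper below is only that arithmetic; its name is made up for this note.

```c
#include <stddef.h>

/* Number of PTRACE_PEEKDATA requests needed to read `nbytes` from a
 * traced process when each request transfers one machine word. */
static size_t peek_calls_needed(size_t nbytes)
{
    size_t word = sizeof(long);
    return (nbytes + word - 1) / word;   /* round up to whole words */
}
```

On a system where long is 8 bytes, reading a 4096-byte buffer costs 512 separate ptrace calls, each with its own user/kernel transition; a single batched request would replace all of them.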

Desiderata 2: Open Group/POSIX
#include <msocket.h>
int msocket(char *path, int domain, int type, int protocol);
●   path is the pathname of the stack.
●   domain/type/protocol are the same as defined in socket(2).
●   A stack is a special file (a new type of special file, see stat(2)):
#define S_IFSTACK 0160000
●   Each process has a default stack for each protocol family (domain).
     –   If path==NULL, msocket uses the default stack.
●   It is backwards compatible:
#define socket(d,t,p) msocket(NULL,(d),(t),(p))
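
The proposed file type value fits in the file-type bits of st_mode without clashing with an existing type. The sketch below hard-codes the Linux values of the type mask and of the socket and regular-file types so that it stays self-contained; `is_stack` is a hypothetical analogue of S_ISREG() and similar macros, not part of any standard header.

```c
#define S_IFSTACK     0160000u  /* value proposed on the slide */
#define TYPE_MASK     0170000u  /* S_IFMT on Linux */
#define TYPE_SOCKET   0140000u  /* S_IFSOCK on Linux */
#define TYPE_REGULAR  0100000u  /* S_IFREG on Linux */

/* True when the file-type bits of `mode` denote a network stack. */
static int is_stack(unsigned int mode)
{
    return (mode & TYPE_MASK) == S_IFSTACK;
}
```

A stat(2) result for a mounted stack such as /dev/net/default would then be recognizable by applying `is_stack` to its st_mode field.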

Desiderata 2: Open Group/POSIX
int msocket(char *path, int domain, int type, int protocol);
●   If type==SOCK_DEFAULT, msocket sets the default stack, e.g.
     msocket("/dev/net/ipstack2",PF_INET,SOCK_DEFAULT,0);
     defines /dev/net/ipstack2 as the default stack for IPv4.
●   If type==SOCK_DEFAULT && domain==PF_UNSPEC, msocket sets the default
    stack for all the protocol families.
●   mstack uses msocket: it defines the default stack so that existing applications
    can use different stacks.
$ ip addr
..... ip addr on default net
$ mstack /dev/net/newstack firefox
.... firefox works on newstack
$ mstack /dev/net/otherstack bash
$ ...this new bash works on otherstack
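
The default-stack rules described above can be modelled with a small per-family table. This is a sketch of the bookkeeping only, not of msocket itself: the table size, the helper names and the use of 0 for the unspecified family (the value of PF_UNSPEC on Linux) are assumptions of this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FAMILY    64   /* hypothetical upper bound on domain numbers */
#define FAMILY_UNSPEC 0    /* stands in for PF_UNSPEC */

static const char *default_stack[MAX_FAMILY];   /* NULL = system stack */

/* Models msocket(path, domain, SOCK_DEFAULT, 0): record the default. */
static void set_default_stack(const char *path, int domain)
{
    if (domain == FAMILY_UNSPEC) {
        for (int d = 0; d < MAX_FAMILY; d++)
            default_stack[d] = path;             /* all protocol families */
    } else if (domain > 0 && domain < MAX_FAMILY) {
        default_stack[domain] = path;
    }
}

/* Models the stack choice of msocket(path, domain, type, protocol):
 * an explicit path wins, otherwise the family's default is used. */
static const char *resolve_stack(const char *path, int domain)
{
    if (path != NULL)
        return path;
    if (domain > 0 && domain < MAX_FAMILY)
        return default_stack[domain];
    return NULL;
}
```

In this model, a tool like mstack would call `set_default_stack` once and then exec the target program, whose ordinary socket() calls (path == NULL) would resolve to the chosen stack.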
Desiderata 3: C library ({e}glibc)
●   C libraries are impure: they are a pure C library and an
    interface to system calls at the same time.
●   Self virtualization of system calls is not possible
    for processes using {e}glibc, because library calls are internally
    linked to the system calls (e.g. printf calls write).
●   Purelibc is an (ld preloaded) layer on {e}glibc which
    converts the C library into a pure library.
●   The support for self virtualization should be a feature of
    mainstream {e}glibc.
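
The idea of a "pure" library can be illustrated with a replaceable hook: library routines never enter the kernel directly, they go through a function pointer that the process itself may redirect. The names below (`write_hook`, `pure_puts`, `capture_write`) are invented for this sketch and are not the Purelibc API.

```c
#include <stddef.h>
#include <unistd.h>

typedef long (*write_hook_t)(int fd, const void *buf, size_t len);

/* Default behaviour: forward to the real write(2). */
static long native_write(int fd, const void *buf, size_t len)
{
    return (long)write(fd, buf, len);
}

static write_hook_t write_hook = native_write;

/* A "pure" library routine: it never calls the kernel directly, it
 * always goes through the replaceable hook. */
static long pure_puts(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')
        len++;
    return write_hook(1, s, len);
}

/* Example virtual handler: record the request instead of performing it. */
static size_t captured_len;

static long capture_write(int fd, const void *buf, size_t len)
{
    (void)fd;
    (void)buf;
    captured_len = len;
    return (long)len;
}
```

Assigning `write_hook = capture_write` makes every later `pure_puts` call land in the process's own handler, which is the self virtualization that a library hard-wired to the system call cannot offer.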



Desiderata 4: utrace
                           UTRACE_STOP
●   Utrace supports multiple tracers (engines) on the same process.
●   Utrace sends the notification to all the tracers and then waits for
    utrace_control(..., UTRACE_RESUME) from each tracer which
    returned UTRACE_STOP.
●   This specification is ill-suited to nested virtualization support:
    a notification function inspects the state (e.g. system call
    parameters) and may change it. The next tracer must
    read the state as changed by the previous tracer.
●   Kmview uses a semaphore in its system call notification
    function to stop a process, because UTRACE_STOP as
    specified is useless for this purpose.
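
The requirement that each tracer sees the state left by the previous one amounts to running the notification handlers strictly in sequence. The toy model below shows only that ordering property; the structure, the two example tracers and the function names are hypothetical and unrelated to the utrace API.

```c
#include <stddef.h>

struct syscall_state {
    long nr;        /* system call number */
    long arg0;      /* first argument */
};

typedef void (*tracer_fn)(struct syscall_state *st);

/* Run the tracers strictly one after another, so each one sees the
 * state as left by the previous one. */
static void notify_in_sequence(tracer_fn *tracers, size_t n,
                               struct syscall_state *st)
{
    for (size_t i = 0; i < n; i++)
        tracers[i](st);
}

/* Two toy tracers used for illustration only. */
static void inner_tracer(struct syscall_state *st) { st->arg0 += 1; }
static void outer_tracer(struct syscall_state *st) { st->arg0 *= 10; }
```

Starting from arg0 = 4, running the inner tracer and then the outer one gives 50; if both acted on the original state at the same time, the result would depend on timing, which is the race the slides describe.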
UTRACE_STOP implementation

Actors: PROCESS, UTRACE, Kmview kernel module, Outer kmview, Inner kmview

 1. PROCESS:              syscall request
 2. UTRACE:               notify engine #2
 3. Kmview kernel module: notify user space, return UTRACE_STOP
 4. UTRACE:               notify engine #1
 5. Kmview kernel module: notify user space, return UTRACE_STOP
 6. UTRACE:               wait (all engines)
    Inner kmview:         mgmt of syscall   } concurrent:
    Outer kmview:         mgmt of syscall   } RACE CONDITION!
 7. PROCESS:              run syscall
Desiderata: new UTRACE_STOP

Actors: PROCESS, UTRACE, Kmview kernel module, Outer kmview, Inner kmview

 1. PROCESS:              syscall request
 2. UTRACE:               notify engine #2
 3. Kmview kernel module: notify user space, return UTRACE_STOP
 4. UTRACE:               wait while the Inner kmview manages the syscall
 5. UTRACE:               notify engine #1
 6. Kmview kernel module: notify user space, return UTRACE_STOP
 7. UTRACE:               wait while the Outer kmview manages the syscall
 8. PROCESS:              run syscall
Kmview workaround
UTRACE_SYSCALL_{RUN,ABORT} instead of UTRACE_STOP

Actors: PROCESS, UTRACE, Kmview kernel module, Outer kmview, Inner kmview

 1. PROCESS:              syscall request
 2. UTRACE:               notify engine #2
 3. Kmview kernel module: notify user space, down(sem)
 4. Inner kmview:         mgmt of syscall
 5. Kmview kernel module: up(sem)
 6. UTRACE:               notify engine #1
 7. Kmview kernel module: notify user space, down(sem)
 8. Outer kmview:         mgmt of syscall
 9. Kmview kernel module: up(sem)
10. PROCESS:              run syscall
Multiple meanings of safety...
●   Availability, bug effects confinement:
          – ViewOS runs outside the kernel: errors in modules may
              lead to a crash of the View (not a kernel panic!)
●   Self protection (from mistaken commands):
          – The Global View Assumption often forces the use of root access (or
              powerful capabilities); this is dangerous.
●   Sandbox non-circumvention:
          – At first sight it seems that kernel-based sandboxes
              are safer (e.g. seccomp).
                  ● Kernel-based sandboxes are not flexible
                  ● On/Off security: a bug may compromise the whole
                     system
                  ● Good support for VM can preserve safety
●   The more code, the worse the security. Is the kernel “too fat?”
          – Maintenance problems, side effects, etc.
          – ViewOS can move services outside the kernel.
The missing ring...

●   View-OS modules are similar to microkernel
    servers.
●   View-OS captures some of the benefits of
    microkernels (separation of mechanism and
    policy, flexibility, reliability).
●   View-OS allows microkernel services to be
    implemented (at user level) on monolithic
    kernels.
VirtualSquare

●   VDE
●   LWIPv6
●   PureLibC
●   IPN
●   View-OS
          –   Umview/Kmview

Questions?

Virtualsquare: all the virtuality you always wanted but never dared to ask for

  • 1. VIRTUALSQUARE: all the virtuality you wanted but you were afraid to ask. Rome, March 5th 2011. Renzo Davoli, Università di Bologna (Master in Scienze e Tecnologie del Software Libero) (Associazione per il Software Libero)
  • 2. Virtual...: Time, Execution Environment, user-id, Device, Machine, Networking, Memory, File System
  • 3. LXC, schroot, libvirt, chroot, GXemul, Libguestfs, Bochs, PearPC, Qemu, tinc, VDE, fuse-ext2, Open-VZ, Marionnet, User-mode Linux, UnionFS, JVM, Umview, PureLibc, VirtualBOX, FUSE, FairVPN, SPICE, LWIPv6, fakeroot, VirtualBricks, View-OS, fakeroot-ng, KVM
  • 4. What does Virtualization mean?
  • 5. [diagram] User of A. I(A) = I(A'): A' can be used instead of A. Interface I(A) belongs to the well-known service A; interface I(A') belongs to the new service A'.
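The substitution idea of slide 5 can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the `Storage` interface and the classes `RealStorage` and `VirtualStorage` are hypothetical names chosen here to stand for I(A), A and A'.

```python
from typing import Protocol


class Storage(Protocol):
    """Hypothetical shared interface, standing for I(A) = I(A')."""

    def read(self, key: str) -> str: ...

    def write(self, key: str, value: str) -> None: ...


class RealStorage:
    """Stands for the well-known service A (assumption: a plain dict)."""

    def __init__(self) -> None:
        self._data = {}

    def read(self, key: str) -> str:
        return self._data[key]

    def write(self, key: str, value: str) -> None:
        self._data[key] = value


class VirtualStorage:
    """Stands for the new service A': same interface, different internals."""

    def __init__(self) -> None:
        self._log = []  # append-only list of (key, value) pairs

    def read(self, key: str) -> str:
        # Return the most recent value written for this key.
        for k, v in reversed(self._log):
            if k == key:
                return v
        raise KeyError(key)

    def write(self, key: str, value: str) -> None:
        self._log.append((key, value))


def user_of_a(service: Storage) -> str:
    """The user relies only on the interface, never on the implementation."""
    service.write("greeting", "hello")
    service.write("greeting", "hello again")
    return service.read("greeting")
```

Calling `user_of_a(RealStorage())` and `user_of_a(VirtualStorage())` gives the same result, which is the sense in which A' "can be used instead of A".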
  • 6. What is “real”
      ● The service A can be hardware
          – Machine
          – Memory
          – Network
      ● The service A can be software
          – File system
          – Execution Environment
          – Identity
      [diagram as in slide 5: User of A; I(A) = I(A'); A' can be used instead of A]
  • 7. Why virtual?
      ● Flexibility
          – prototyping
          – modify features at run-time (“real” may be hard to modify: e.g. hardware or kernel code)
          – satisfy several requirements while sharing common structures
      ● Safety
          – least privilege
          – sandboxing
      ● Optimization
          – Server/service consolidation
          – No need to maintain several “real” items
  • 8. Simulation-Emulation-Virtuality
      ● Simulation:
          – Provides just a model of the phenomena to study
          – Simulation never provides virtualization
      ● Emulation:
          – It means: behave in the same way.
          – Can provide virtualization if usable (e.g. It is usable)
      ● Virtuality can be provided without emulation
          – e.g. LXC
  • 9. Virtual Machines (Smith-Nair)
  • 10. Virtual Machines
      ● Intrusiveness:
          – User-mode user-access
          – User-mode superuser-access
          – Kernel patch/module
          – Native
      ● Paravirtualization:
          – Change the user interface to optimize virtualization
  • 11. Virtual Machines [diagram labels: User, User, User; Virtualization, Virtualization; Service, Service, Service]
  • 12. Virtual Machine (multiplexing) [diagram labels: User, User, User, User; Virtually multiplexed service; Service]
  • 13. Virtual Machines:
      ● Qemu-system: system-vm, processor emulation (by direct code translation), user-space/user-permission
      ● GXEmul/PearPC/Mac-on-Linux: system-vm, processor emulation, user-space/user-permission
  • 14. Virtual Machines
      ● KVM: system-vm, same instruction set, may provide paravirtualization (virtio/vhost-net), user-mode monitor, requires processor extensions and kernel module (Linux specific/optimized).
      ● VirtualBOX (OSE): system-vm, same instruction set, may provide paravirtualization, user-mode monitor, requires processor extensions and kernel module (it runs on several operating systems).
      ● XEN: system-vm, same instruction set, provides paravirtualization, native mode, multiplexing.
  • 15. Operating system level VM
      ● LXC: Linux Containers/Namespaces: native mode, multiplexing, user-mode superuser-access (it provides partial virtualization/sharing between containers).
      ● OpenVZ: native mode, multiplexing, superuser-access.
  • 16. Process VM
      ● User Mode Linux: a Linux kernel is the VM monitor for processes. Process VM. User-mode, user-access. (Almost) the same interface: the system call set, although the kernel version may differ.
      ● View-OS (umview/kmview): partial-modular virtualization. Process VM. User-mode, user-access. (Almost) the same interface (users may define new system calls).
  • 17. Process VM
      ● Qemu: process emulation (direct translation), user-mode, user-access.
      ● Application VM (JVM, Mono, ...)
  • 18. Network Virtualization
      ● Virtual Private Networks
          – For secure remote access
      ● Overlay Networks
          – e.g. Akamai, p2p
      ● Networks for Virtual Machines
      ● Kernel bridge based virtual networks
      ● Virtual Distributed Ethernet: data-link layer, Ethernet consistent, user-mode, user access.
  • 19. File system virtualization
      ● Chroot/schroot
      ● Fakeroot/fakeroot-ng
      ● FUSE
          – Fuse-ext2
          – Fuse-ssh
          – Fuse-*
  • 20. Modular Virtualization ● View-OS: modular partial virtual machine – Umview: based on ptrace (user-mode, user access) – Kmview: based on utrace/kernel module (user-mode) ● Several modules available: – File system (mount) – File system (patchworking) – Device – Uname/time... – Networking ● Chroot/fakeroot/fuse/vpn/binfmt... features have been implemented on View-OS.
  • 21. the VirtualSquare view
  • 22. Virtualsquare ● Virtualsquare is: – A community.... – A container of projects.... ● Virtualsquare is not: – A company – A brand/product line ● Virtualsquare started at the University of Bologna but is now an international community – A lot of former students now work abroad – Common ideas with other groups (joint projects)
  • 23. the VirtualSquare view ● Communication/compatibility – different Virtualities must be interconnected, must communicate ● Integration – different Virtualities can be seen as special cases of a broader idea of Virtuality ● Extension – if a need cannot yet be captured by a kind of virtuality, let us create a new one (maybe combining existing virtualities).
  • 24. Communication compatibility ● VDE – General purpose user-mode networking support – Ethernet data-link consistent – Distributed – Intuitive (it has the same structure of real Ethernet: switches, cables) [Diagram labels: KVM, bochs, VirtualBOX, tuntap, User-mode linux, libvirt, View-OS, LWIPv6]
  • 25. Integration ● View-OS = each process can have its “view” of the environment. ● User-mode/user-permission partial virtual machine approach ● The features of several existing tools have been re-implemented as composable modules. [Diagram labels: VPN, fakeroot(ng), (s)chroot, binutils, Virtual networking, View-OS, FUSE, UnionFS, Virtual devices, LXC user-mode]
  • 26. View-OS GLOBAL VIEW ASSUMPTION
  • 27. Extension ● A new forge for several new concepts and ideas/tools: – Msocket: support for multi-stack applications – LWIPv6: user-space LWIPv4/LWIPv6 hybrid networking (multi)stack as a library – Purelibc: process self virtualization – Relativistic virtualization of time: emulation of fast machines on slow ones. – Virtual spaces per login shells. – Public Distributed Ethernets – Run-time on-the-fly virtualization
  • 29. VirtualSquare: the book Renzo Davoli, Michael Goldweber (editors) Virtual Square: Users, Programmers & Developers Guide ● Available at lulu books or downloadable from wiki.virtualsquare.org ● Warning: this book is dynamically changing as the project evolves
  • 30. ...a closer look at virtualsquare projects...
  • 31. VDE components [Diagram labels: VDE switch, VDE switch, switch, switch, cross cable, VDE plug, VdeWire (e.g. ssh), VDE plug, VM (e.g. QEMU), VM (e.g. U-ML), TunTap Linux module]
  • 32. VDE: Related Work ● VPN: (OpenVPN) point-to-point, for real machines ● Overlay Networks: application specific (peer to peer, Akamai). ● VM networking: (tools provided with VM, e.g. uml-switch) specific for VM VDE: ● multipoint, general mesh ● no need for root (administration) access ● heterogeneous: VMs and non-VMs connected
  • 33. VDEv2: advertisement ● VDEv2: – modular design – compatible with user-mode linux, qemu, tuntap, (bochs, plex86), umview/lwipv6 – through vdetaplib, potentially compatible with any application using tap – VLAN (802.1Q) – FST (fast spanning tree) – manageable at run time via unixterm (telnet or web with vdetelweb) – includes slirpvde and wirefilter – status debug – plugin support: snmp/iplog/pdump
  • 34. VDEv2 ● VDE-Switch – number of ports configurable on command line – port0 is reserved for management clients, n-1 ports are available for connections. – management UNIX socket for management clients ● self-describing SMTP-like protocol – modules: datasock (VM conn), tuntap, consmgmt (management)
  • 35. VDE cables ● VDE-plug – is a program that converts the Ethernet packets of a VDE port into a stream connection (stdin-stdout) ● VDE-wire – can be any application able to give a stdin/stdout stream connection
  • 36. Dual Pipe ● dpipe is a new (general purpose) command we have added. ● Pipes are a well-known abstraction. The following command prints the listing of the current directory:
        ls | lpr
    ● dpipe creates a bi-directional connection between the processes:
        dpipe cmd1 = cmd2
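The bi-directional connection can be sketched with two ordinary pipes: the standard output of each command feeds the standard input of the other. This is a minimal illustration of the idea, not the source of the dpipe tool; the names `dual_pipe` and `spawn_end` are made up for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start one end of the dual pipe: stdin reads from in_fd, stdout writes to out_fd. */
static pid_t spawn_end(char *const argv[], int in_fd, int out_fd, const int all_fds[4])
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                  /* parent, or -1 if fork failed */
    dup2(in_fd, STDIN_FILENO);
    dup2(out_fd, STDOUT_FILENO);
    for (int i = 0; i < 4; i++)
        close(all_fds[i]);           /* the child keeps only its stdin/stdout copies */
    execvp(argv[0], argv);
    _exit(127);                      /* reached only if exec failed */
}

/* Run cmd1 and cmd2 with the stdout of each wired to the stdin of the other.
 * Returns 0 if both commands exit with status 0, -1 otherwise. */
int dual_pipe(char *const cmd1[], char *const cmd2[])
{
    int a[2], b[2];                  /* a: cmd1 -> cmd2, b: cmd2 -> cmd1 */
    assert(cmd1 && cmd1[0] && cmd2 && cmd2[0]);
    if (pipe(a) < 0)
        return -1;
    if (pipe(b) < 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    const int fds[4] = { a[0], a[1], b[0], b[1] };
    pid_t p1 = spawn_end(cmd1, b[0], a[1], fds);
    pid_t p2 = spawn_end(cmd2, a[0], b[1], fds);
    for (int i = 0; i < 4; i++)
        close(fds[i]);               /* the parent does not take part in the traffic */

    int ok = (p1 > 0 && p2 > 0);
    int st;
    if (p1 > 0 && (waitpid(p1, &st, 0) < 0 || !WIFEXITED(st) || WEXITSTATUS(st) != 0))
        ok = 0;
    if (p2 > 0 && (waitpid(p2, &st, 0) < 0 || !WIFEXITED(st) || WEXITSTATUS(st) != 0))
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling `dual_pipe` with two NULL-terminated argument vectors plays the role of `dpipe cmd1 = cmd2`; the parent closes all four descriptors so that each child sees end-of-file when its peer exits.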
  • 37. VDE cables, plugs, wires/dpipe ● dpipe is used to create VDE-cables:
        dpipe vde_plug = ssh vde.students.cs.unibo.it vde_plug
    ● this command uses a dpipe to connect the local vde_plug with a vde_plug running on a remote host (the wire is ssh) ● other applications can be used as the wire (e.g. netcat) ● In the example vde_plug refers to the default switch. It is possible to run several switches on the same host; an extra option is needed in this case.
  • 38. wirefilter ● wirefilter can be put on a cable (e.g. for network testing)
        dpipe vde_plug /tmp/s1 = wirefilter -m /tmp/m = vde_plug /tmp/s2
        wirefilter -v /tmp/s1:/tmp/s2
    ● packet loss, delays, dup, speed, noise figures, mtu, fifoness properties of the line can be changed with command line options or in real time via a management socket. ● It is possible to define several “states”. The state transitions are driven by a Markov chain.
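A Markov-chain-driven state change can be sketched as follows: each state has a row of transition probabilities, and a uniform random draw selects the next state. This is a generic illustration under that assumption, not wirefilter's code; `markov_next` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Pick the next state from one row of a transition matrix.
 * row[j] is the probability of moving to state j; u is a uniform draw in [0,1). */
size_t markov_next(const double *row, size_t nstates, double u)
{
    assert(row != NULL && nstates > 0 && u >= 0.0 && u < 1.0);
    double cumulative = 0.0;
    for (size_t j = 0; j < nstates; j++) {
        cumulative += row[j];
        if (u < cumulative)
            return j;
    }
    return nstates - 1;   /* guard against rounding when the row sums to slightly under 1 */
}
```

With a row such as `{0.75, 0.25}` for a "good line" state, three draws out of four on average keep the line in state 0 and the fourth moves it to state 1, where different loss or delay settings could apply.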
  • 39. SlirpVDE [Diagram labels: VDE switch, VDE switch, VDE plug, VdeWire (e.g. ssh), VDE plug, VM (e.g. QEMU), VM (e.g. U-ML), SlirpVDE, Firefox; addresses 10.0.2.15, 10.0.2.16, 10.0.2.2; http connection from slirpVDE running on the hosting O.S. to] ● Note: slirp supports IPv4; slirpv6 supports both IPv4 and IPv6
  • 40. vde_cryptcab ● Coded by Daniele Lacamera (danielinux) ● vde_cryptcab is a distributed cable manager for VDE switches. ● Server side:
        vde_cryptcab -s /tmp/vde2.ctl -p 2100
    ● Client side:
        vde_cryptcab -s /tmp/vde2.ctl -c foo@remote.machine.org:2100
    ● It uses a blowfish channel (random key exchanged by scp).
  • 41. Marionnet (based on VDE) ● A project by Jean-Vincent Loddo and Luca Saiu (et al.), Université Paris 13.
  • 42. TINC ● tinc is a Virtual Private Network (VPN) daemon that uses tunnelling and encryption to create a secure private network between hosts on the Internet. ● Encryption, authentication and compression ● Automatic full mesh routing ● Easily expand your VPN ● Ability to bridge ethernet segments ● Runs on many operating systems and supports IPv6 ● A project by Ivo Timmermans and Guus Sliepen
  • 43. LWIPv6 ● It is a LWIPv4/v6 (multi) stack implemented as a library. ● Forked from the LWIP project (Adam Dunkels <adam@sics.se>) ● Can be connected to any number of VDE, TUN, TAP interfaces. ● It is a hybrid stack (not a dual stack): one single IPv6 “engine” also manages IPv4 packets in compatibility mode (130.136.1.110 is managed as 0::ffff:130.136.1.110).
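The compatibility-mode address in the last bullet is the standard IPv4-mapped IPv6 form: 80 zero bits, 16 one bits, then the 32-bit IPv4 address. The sketch below builds it with POSIX calls only; `ipv4_to_mapped` is a name chosen for this example and is not part of the LWIPv6 API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Convert dotted-quad text into the IPv4-mapped IPv6 address ::ffff:a.b.c.d.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int ipv4_to_mapped(const char *dotted, struct in6_addr *out)
{
    struct in_addr v4;
    assert(dotted != NULL && out != NULL);
    if (inet_pton(AF_INET, dotted, &v4) != 1)
        return -1;
    memset(out, 0, sizeof(*out));          /* first 80 bits are zero */
    out->s6_addr[10] = 0xff;               /* next 16 bits are all ones */
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4, 4);     /* last 32 bits: the IPv4 address in network order */
    return 0;
}
```

For `130.136.1.110` the last four bytes of the result are 130, 136, 1, 110, and the standard macro `IN6_IS_ADDR_V4MAPPED` recognizes the address, which is how a single IPv6 engine can tell the two kinds of traffic apart.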
  • 44. LWIPv6 ● PF_INET, PF_INET6 ● PF_PACKET for raw packet management – support for user-level network analysis tools (e.g. sniffers, ethereal) – support for user-level dhcp clients. ● PF_NETLINK for configuration ● Packet filtering ● NEW: dhcp client/server, rarpd, slirp, routing, nat on request
  • 45. LWIPv6 interface definition API
        struct stack *lwip_stack_new(void);
        void lwip_stack_free(struct stack *stack);
        struct stack *lwip_stack_get(void);
        void lwip_stack_set(struct stack *stack);
        struct netif *lwip_vdeif_add(struct stack *stack, void *arg);
        struct netif *lwip_tapif_add(struct stack *stack, void *arg);
        struct netif *lwip_tunif_add(struct stack *stack, void *arg);
        int lwip_add_addr(struct netif *netif, struct ip_addr *ipaddr, struct ip_addr *netmask);
        int lwip_del_addr(struct netif *netif, struct ip_addr *ipaddr, struct ip_addr *netmask);
        int lwip_add_route(struct stack *stack, struct ip_addr *addr, struct ip_addr *netmask, struct ip_addr *nexthop, struct netif *netif, int flags);
        int lwip_del_route(struct stack *stack, struct ip_addr *addr, struct ip_addr *netmask, struct ip_addr *nexthop, struct netif *netif, int flags);
        int lwip_ifup(struct netif *netif);
        int lwip_ifdown(struct netif *netif);
  • 46. LWIPv6 socket API (just add lwip_ prefix)
        int lwip_msocket(struct stack *stack, int domain, int type, int protocol);
        int lwip_socket(int domain, int type, int protocol);
        int lwip_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
        int lwip_bind(int s, struct sockaddr *name, socklen_t namelen);
        int lwip_shutdown(int s, int how);
        int lwip_getpeername(int s, struct sockaddr *name, socklen_t *namelen);
        int lwip_getsockname(int s, struct sockaddr *name, socklen_t *namelen);
        int lwip_getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
        int lwip_setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
        int lwip_close(int s);
        int lwip_connect(int s, struct sockaddr *name, socklen_t namelen);
        int lwip_listen(int s, int backlog);
        int lwip_recv(int s, void *mem, int len, unsigned int flags);
        int lwip_read(int s, void *mem, int len);
        int lwip_recvfrom(int s, void *mem, int len, unsigned int flags, struct sockaddr *from, socklen_t *fromlen);
        int lwip_send(int s, void *dataptr, int size, unsigned int flags);
        int lwip_sendto(int s, void *dataptr, int size, unsigned int flags, struct sockaddr *to, socklen_t tolen);
        int lwip_write(int s, void *dataptr, int size);
        int lwip_select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, struct timeval *timeout);
        int lwip_ioctl(int s, long cmd, void *argp);
  • 47. LWIPv6: New features ● Packet forwarding ● Filtering ● NAT ● DHCP server/RADV server onboard ● SLIRP (v4 and v6)
        struct netif *lwip_add_slirpif(struct stack *stack, void *arg, int flags);
  • 48. slirpvde6 ● Extension of slirpvde based on LWIPv6 – slirp ipv4/ipv6 – Stateless translator – Dhcp/radv server – DNS forwarder – Port and X forwarding (in and out)
        slirpvde6 -d -H10.0.2.1/24 -H2001::1/64 -s /tmp/vde.ctl -dhcp -r
  • 49. Berkeley sockets API: problem #1 ● The Berkeley Sockets API has been designed for one protocol stack (per protocol family). – Multiple stacks => different networking features (per user, per application...) ● Unix uses the file system as a naming space for everything (devices, kernel variables, ...) except for networking. – Access control to networking
  • 50. Solution #1: msockets
        #include <msocket.h>
        int msocket(char *path, int domain, int type, int protocol);
    ● path is the pathname of the stack ● domain/type/protocol are the same as defined in socket(2). ● A stack is a special file (new type of special file, see stat(2)):
        #define S_IFSTACK 0160000
    ● Each process has a default stack for each protocol family (domain). – If path==NULL, msocket uses the default stack. ● It is backwards compatible:
        #define socket(d,t,p) msocket(NULL,(d),(t),(p))
  • 51. Msockets: set the default stack
        int msocket(char *path, int domain, int type, int protocol);
    ● if type==SOCK_DEFAULT msocket sets the default stack, e.g.
        msocket("/dev/net/ipstack2",PF_INET,SOCK_DEFAULT,0);
      defines /dev/net/ipstack2 as the default stack for IPv4 ● if type==SOCK_DEFAULT && domain==PF_UNSPEC msocket sets the default stack for all the protocol families.
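The default-stack rules on the two msocket slides can be modelled in a few lines. This is only a model of the selection logic: it returns the pathname of the stack that would serve a request instead of a file descriptor, `model_msocket` is a hypothetical name, and the numeric value of `MODEL_SOCK_DEFAULT` is an assumption because the slides do not give one.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

#define MODEL_SOCK_DEFAULT 0x7fff   /* assumed value, for illustration only */
#define MODEL_MAX_FAMILY   64

/* Per-process table: default stack pathname for each protocol family.
 * NULL means "no stack chosen yet", i.e. the system's own stack. */
static const char *default_stack[MODEL_MAX_FAMILY];

/* Returns the pathname of the stack selected by the request.
 * The caller keeps ownership of path; the model only stores the pointer. */
const char *model_msocket(const char *path, int domain, int type)
{
    assert(domain >= 0 && domain < MODEL_MAX_FAMILY);
    if (type == MODEL_SOCK_DEFAULT) {
        if (domain == PF_UNSPEC) {
            for (int d = 0; d < MODEL_MAX_FAMILY; d++)
                default_stack[d] = path;     /* new default for every family */
        } else {
            default_stack[domain] = path;    /* new default for one family */
        }
        return path;
    }
    /* An explicit path wins; NULL falls back to the family's default. */
    return path != NULL ? path : default_stack[domain];
}
```

In this model the backward-compatible `socket(d,t,p)` macro corresponds to `model_msocket(NULL, d, t)`, so an unmodified program follows whatever default a launcher such as mstack set before starting it.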
  • 52. Mstack: backward compatibility ● Mstack uses msocket: it defines the default stack so that existing applications can use different stacks.
        $ ip addr
        ..... ip addr on default net
        $ mstack /dev/net/ipstack2 ip addr
        .... ip addr of “ipstack2”
        $ mstack /dev/net/newstack firefox
        .... firefox works on newstack
        $ mstack /dev/net/otherstack bash
        $ ...this new bash works on otherstack
  • 53. Msockets: implementation ● The msockets API is currently supported by lwipv6 and by view-os. ● It is a natural, backwards-compatible extension of the Berkeley sockets. ● Many applications would benefit from this extension (e.g. networkless user accounts). ● We are studying kernel support for msockets.
  • 54. Berkeley sockets API: problem #2 ● Berkeley Sockets API does provide support for IPC (AF_UNIX) ● Berkeley Sockets API does not provide support for multicast IPC ● Berkeley Sockets is mainly for point-to-point, client-server communication; IP multicast and Ethernet broadcast are provided by “magic” addresses. ● Many applications need multicast IPC (dbus, vde_switch, midi-patchbay, mpeg-ts demultiplexing...)
  • 55. IPN: Inter Process Networking ● IPN is for IPC (like AF_UNIX) ● IPN provides fast, kernel implemented, multicast communication among processes. [Diagram: two panels, each with a sender, a dispatcher and three receivers. First panel: AF_UNIX based multicasting service (dbus, vde_switch, tee, ....). Second panel: AF_IPN with a policy submodule.]
  • 56. IPN: implementation ● A new address family AF_IPN ● Policies can be provided as submodules. – IPN_BROADCAST (default): each message is delivered to all the members but the sender – IPN_VDESWITCH: a virtual ethernet switch – IPN_MPEGTS: mpeg transport stream demultiplexing ● Two services (sockopt selectable): – LOSSLESS: bounded buffer approach, late receivers delay senders – LOSSY: late receivers lose data.
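The difference between the two services shows up only when a receiver's bounded buffer is full. The sketch below models that decision for a single receiver; it is not the IPN kernel code, and every identifier in it is invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum delivery_mode { MODE_LOSSY, MODE_LOSSLESS };
enum delivery_result { DELIVERED, DROPPED, SENDER_MUST_WAIT };

struct receiver_queue {
    size_t capacity;   /* bound on queued messages */
    size_t queued;     /* messages not yet read by this receiver */
};

/* Decide what happens to one message addressed to one receiver. */
enum delivery_result deliver(struct receiver_queue *q, enum delivery_mode mode)
{
    assert(q != NULL && q->queued <= q->capacity);
    if (q->queued < q->capacity) {
        q->queued++;
        return DELIVERED;
    }
    /* Queue full: a late receiver either loses data or delays the sender. */
    return mode == MODE_LOSSY ? DROPPED : SENDER_MUST_WAIT;
}

/* The receiver reads one message and frees a slot. Returns 0 if the queue was empty. */
int consume(struct receiver_queue *q)
{
    assert(q != NULL);
    if (q->queued == 0)
        return 0;
    q->queued--;
    return 1;
}
```

With a capacity of two, a third message is dropped in lossy mode, while in lossless mode the sender has to wait until the receiver consumes something. That is the trade-off the slide describes: one slow reader cannot stall a lossy socket, but it can stall a lossless one.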
  • 57. IPN: direct support for multicast ● BIND = define and get administration access to the socket – “x” permission required ● CONNECT = join the flow of data – “r” and “w” mean permission to receive or send
        struct sockaddr_un sun={.sun_family=AF_IPN,.sun_path="/tmp/sockipn"};
        int s=socket(AF_IPN,SOCK_RAW,IPN_BROADCAST); /* or a different policy */
        err=bind(s,(struct sockaddr *)&sun,sizeof(sun));
        err=connect(s,NULL,0);
  • 58. Why IPN instead of... ● AF_UNIX? – Point-to-point, hub process needed, slow! ● IP_MULTICAST ttl=0? – No access control, slow! ● AF_NETLINK? – No access control, designed for interface/filtering configuration.
  • 59. IPN is fast (time for 1M msgs, 64B per msg)
  • 60. IPN is fast (time for 1M msgs, 1024B per msg)
  • 61. IPN is fast (time for 1M msgs, 16 receivers)
  • 62. IPN communication models ● Code examples here: http://wiki.virtualsquare.org/index.php/IPN_examples ● Peer-to-peer – All the member processes are senders and receivers (e.g. vde) ● Publish/subscribe – A process broadcasts messages and client processes can join the IPN socket
  • 63. IPN extra features ● Out-Of-Band messages from core IPN and policy submodules – e.g. number of readers notification to stop subscriberless services ● Networking interfaces TAP+GRAB – TAP: a new virtual interface is defined and connected to an IPN socket (in kernel-land) – GRAB: an existing networking interface gets connected to an IPN socket (in kernel-land) ● Char-device interface – Define a character device connected to an IPN socket
  • 64. VDETELWEB ● It is the Web/Telnet Server for VDE switch configuration. ● It uses the LWIPv6 library ● It has two connections to the controlled VDE switch: – management socket to give commands – port0: the ethernet port used by the TCP-IP stack. ● It reads the set of commands, descriptions, arguments from the switch itself. ● Telnet has history/command editing and support for asynch debug output (NEW)
  • 65. VDETELWEB: telnet
  • 66. VDETELWEB: Web Interface
  • 67. View-OS ... a process with a view ● Each process should be permitted to have its own view of the execution environment
  • 68. View Components ● filesystem namespace, including the related ownership and permission information, ● networking configuration, ● system name, ● current time, ● devices, etc. ● ...
  • 69. Global View Assumption ● In general processes running on the same computer share the same view. – A given pathname refers to the same file for all processes. – All processes use one shared TCP-IP stack for networking, hence all processes share the same set of IP addresses and routing policies. – All processes share the same notion as to which users/processes have special privileges.
  • 70. View-OS vs. VMs and Containers
                                  View-OS   VM               Container
        Memory Impact             LOW       HIGH             LOW
        Running State             User      User or Kernel   Kernel
        Administered by           user      user             root
        Partial Virtualization    Yes       No               Yes (sharing)
  • 71. Partial Virtualization ● Virtualize just what you need: – Virtual and real file systems, devices, networks, etc. co-exist in the process' view ● Support for nested virtualization: – e.g. virtual file system defined on virtual devices.
  • 72. How to start a View-OS monitor:
        user@host:~$ umview bash
        This kernel supports: PTRACE_MULTI PTRACE_SYSVM ppoll
        View-OS will use: PTRACE_MULTI PTRACE_SYSVM ppoll
        pure_libc library found: syscall tracing allowed
        rd235 2.6.29-utrace GNU/Linux/View-OS 10585 0
        user@host[10585:0]:~$
    ● Umview runs on vanilla Linux kernels; Kmview requires a loaded kernel module (and utrace). ● Instead of bash one may run any other executable (e.g. xterm, script....)
  • 73. View-OS modules ● View-OS monitor loads only the virtualities requested by the user: – Umfuse: file system virtualization – Umnet: networking virtualization – Umdev: device virtualization – Umbinfmt: executable interpreter virtualization – Viewfs: file system patchworking – Ummisc: time, system id...
  • 74. #1: Virtual Installation of Software
        $ um_add_service viewfs
        $ mkdir /tmp/newroot
        $ viewsu
        # mount -t viewfs -o mincow,except=/tmp,vstat /tmp/newroot /
        # apt-get install mynewsoftware
    ● Create an empty dir ● Mount it in “minimal copy on write” mode: – File modifications go to the real file system when allowed. – Modifications are stored in the mounted dir otherwise. – A single consistent view. – Vstat: virtualize stat (support for virtual chown, chmod/setuid, special files)
  • 75. #2: Virtual Networking
        $ um_add_service umnet
        $ mount -t umnetlwipv6 none /dev/net/default
        $ ip link set vd0 up
        $ ip addr add 10.1.2.3/24 dev vd0
        $ ip addr
        1: lo0: <LOOPBACK,UP> mtu 0
            link/loopback
            inet6 ::1/128 scope host
            inet 127.0.0.1/8 scope host
        2: vd0: <BROADCAST,UP> mtu 1500
            link/ether 02:02:5a:44:e2:06 brd ff:ff:ff:ff:ff:ff
            inet6 fe80::2:5aff:fe44:e206/64 scope link
            inet 10.1.2.3/24 scope global
    ● A network stack can be “mounted”. ● /dev/net/default is the default stack, but View-OS supports multiple stacks.
  • 76. #2: Virtual Networking
        $ um_add_service umnet
        $ mount -t umnetlwipv6 -o tn0=tunx none /dev/lwip0
        $ mount -t umnetlwipv6 -o tp0=tapx,vd0=/tmp/switch none /dev/lwip1
        $ mstack /dev/lwip0 ip addr
        1: lo0: <LOOPBACK,UP> mtu 0
            link/loopback
            inet6 ::1/128 scope host
            inet 127.0.0.1/8 scope host
        2: tn0: <> mtu 0
            link/generic
        $ mstack /dev/lwip1 ip addr
        1: lo0: <LOOPBACK,UP> mtu 0
            link/loopback
            inet6 ::1/128 scope host
            inet 127.0.0.1/8 scope host
        2: vd0: <BROADCAST> mtu 1500
            link/ether 02:02:47:98:ad:06 brd ff:ff:ff:ff:ff:ff
        3: tp0: <BROADCAST> mtu 1500
            link/ether 02:02:03:04:05:06 brd ff:ff:ff:ff:ff:ff
        $
  • 77. #3: Mount a Filesystem
        $ um_add_service umfuse
        $ mount -t umfuseext2 -o ro ext2filesystemimage /mnt
        $ mount -t umfusestrangefilesystem strangeimage /mnt2
    ● Source compatible with Fuse. ● Mount file systems unsupported by the kernel. ● Safe mount, limited to this View.
  • 78. #4: Filesystem Image partition and mount ● Step 1: Load the umdev (virtual device) module and mount an empty file as a disk image.
        $ um_add_service umdev
        $ viewsu
        # dd of=/tmp/diskimage bs=1024 count=0 seek=1024000
        # mount -t umdevmbr /tmp/diskimage /dev/hda
  • 79. #4: Filesystem Image partition and mount ● Step 2: partition the file system image:
        # fdisk /dev/hda
        Device contains a valid partition table
        Building a new DOS disklabel with disk identifier 0xd403417d.
        Command (m for help): n
        Command action
           e   extended
           p   primary partition (1-4)
        p
        Partition number (1-4): 1
        First cylinder (1-127, default 1): 1
        Last cylinder, +cylinders or +size{K,M,G} (1-127, default 127): 127
        Command (m for help): p
        Disk /dev/hda: 1048 MB, 1048576000 bytes
        255 heads, 63 sectors/track, 127 cylinders
        Units = cylinders of 16065 * 512 = 8225280 bytes
        Disk identifier: 0xd403417d
        Device Boot      Start         End      Blocks   Id  System
        /dev/hda1            1         127     1020096   83  Linux
        Command (m for help): w
        The partition table has been altered!
        Calling ioctl() to re-read partition table.
        Syncing disks.
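The numbers fdisk reports follow directly from the dd parameters used to create the image (`bs=1024 count=0 seek=1024000`) and from the 255-head, 63-sector geometry shown in the output. The helper names below are made up for this worked example.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the sparse file left by: dd of=FILE bs=BS count=0 seek=SEEK
 * (dd seeks SEEK blocks of BS bytes into the output, copies nothing,
 * and leaves the file truncated at that offset). */
uint64_t sparse_image_bytes(uint64_t bs, uint64_t seek)
{
    return bs * seek;
}

/* Whole cylinders that fit in an image for a given CHS geometry. */
uint64_t whole_cylinders(uint64_t bytes, uint64_t heads,
                         uint64_t sectors_per_track, uint64_t sector_size)
{
    uint64_t cylinder_bytes = heads * sectors_per_track * sector_size;
    assert(cylinder_bytes > 0);
    return bytes / cylinder_bytes;
}
```

`sparse_image_bytes(1024, 1024000)` gives 1048576000 bytes, the disk size fdisk prints, and dividing by the 8225280-byte cylinder gives 127 whole cylinders. The 1020096 blocks of /dev/hda1 are also consistent with this, assuming the partition starts after the first 63-sector track: (127 × 16065 − 63) / 2 = 1020096 one-kilobyte blocks.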
  • 80. #4: Filesystem Image partition and mount ● Step 3: Create the filesystem
        # mkfs.ext2 /dev/hda1
        mke2fs 1.41.8 (20-Jul-2009)
        Filesystem label=
        OS type: Linux
        Block size=4096 (log=2)
        Fragment size=4096 (log=2)
        63872 inodes, 255024 blocks
        12751 blocks (5.00%) reserved for the super user
        First data block=0
        Maximum filesystem blocks=264241152
        8 block groups
        32768 blocks per group, 32768 fragments per group
        7984 inodes per group
        Superblock backups stored on blocks:
                32768, 98304, 163840, 229376
        Writing inode tables: done
        Writing superblocks and filesystem accounting information: do
        This filesystem will be automatically checked every 38 mounts
        180 days, whichever comes first. Use tune2fs -c or -i to over
  • 81. #4: Filesystem Image partition and mount ● Step 4: mount the new partition
        # um_add_service umfuse
        # mount -t umfuseext2 -o rw+ /dev/hda1 /mnt
        # ls -l /mnt
        total 16
        drwx------ 2 root root 16384 2009-09-16 11:57 lost+found
    ● Example of nested virtualization. ● Compatible with standard sys-admin commands.
  • 82. #5: User Mode chroot ● Step 1: create the jail filesystem
        $ mkdir /tmp/root /tmp/root/bin /tmp/root/lib
        $ cp /bin/busybox /tmp/root/bin
        $ cp /lib/libm-2.9.so /lib/libc-2.9.so /tmp/root/lib
        $ cd /tmp/root/lib
        $ ln -s libm-2.9.so libm.so.6
        $ ln -s libc-2.9.so libc.so.6
        $ cd /
    [Diagram: resulting tree: /tmp/root contains bin (busybox) and lib (libc-2.9.so, libm-2.9.so, libc.so.6, libm.so.6)]
  • 83. #5: User Mode chroot ● Step 2: change the file system root: – Core mode: by the virtual chroot system call
        $ exec /usr/sbin/chroot /tmp/root /bin/busybox sh
        BusyBox v1.13.3 (Debian 1:1.13.3-1) built-in shell (ash)
        Enter 'help' for a list of built-in commands.
        / $
    – By Viewfs:
        $ um_add_service viewfs
        $ exec busybox sh
        BusyBox v1.13.3 (Debian 1:1.13.3-1) built-in shell (ash)
        Enter 'help' for a list of built-in commands.
        / $ mount -t viewfs -o move,permanent /tmp/root /
        / $
  • 84. #5: User Mode chroot ● Step 3: the process is in the jail:
        / $ ls -lR /
        /:
        drwxr-xr-x 2 1000 1000    4096 Sep 17 13:37 bin
        drwxr-xr-x 2 1000 1000    4096 Sep 17 13:37 lib
        /bin:
        -rwxr-xr-x 1 1000 1000  401216 Sep 17 13:37 busybox
        /lib:
        -rwxr-xr-x 1 1000 1000 1302732 Sep 17 13:37 libc-2.9.so
        lrwxrwxrwx 1 1000 1000      11 Sep 17 13:37 libc.so.6 -> libc-2.9.so
        -rw-r--r-- 1 1000 1000  149328 Sep 17 13:37 libm-2.9.so
        lrwxrwxrwx 1 1000 1000      11 Sep 17 13:37 libm.so.6 -> libm-2.9.so
        / $
  • 85. #6: Create a Ramdisk and use it:
        $ um_add_service umdev
        $ um_add_service umfuse
        $ um_add_service umproc
        $ mount -t umdevramdisk -o size=100M none /dev/hdx
        $ /sbin/mkfs.vfat /dev/hdx
        mkfs.vfat 3.0.3 (18 May 2009)
        $ mount -t umfusefat -o rw+ /dev/hdx /mnt
        $ mount
        rootfs on / type rootfs (rw)
        /dev/root on / type ext3 (rw,errors=remount-ro,data=ordered)
        tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=755)
        ... ...
        none on /proc/mounts type proc (ro)
        none on /dev/hdx type umdevramdisk (size=100M)
        /dev/hdx on /mnt type umfusefat (rw+)
        $
    ● Another example of nested virtualization ● Umproc virtualizes /proc/mounts
  • 86. #7: Virtualize running processes ● Shell #1 (pid 12345, an ordinary shell)
        sh1 $ mkdir /tmp/mnt
        sh1 $ ls /tmp/mnt
        sh1 $
    ● Shell #2 (running under ViewOS)
        sh2 $ um_add_service umfuse
        sh2 $ mount -t ext2 /tmp/linux.img /tmp/mnt
        sh2 $ ls /tmp/mnt
        bin boot dev etc lib lost+found mnt proc sbin tmp usr
        sh2 $ um_attach 12345
    ● Shell #1 has been “attached” to ViewOS
        sh1 $ ls /tmp/mnt
        bin boot dev etc lib lost+found mnt proc sbin tmp usr
  • 87. #8: Process proper time ● Start two xclocks, one from a standard shell, the other from a shell running ViewOS.
        sh1 $ xclock -update 1 &
        sh2 $ xclock -update 1 &
        sh2 $ um_add_service ummisc
        sh2 $ mount -t ummisctime none /tmp/mnt
    ● Now change the frequency of the virtual time for ViewOS:
        sh2 $ echo 2 > /tmp/mnt/frequency
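One way to read the `frequency` setting is as a scale factor on elapsed time. The model below assumes that the virtual clock advances `frequency` times as fast as the real clock from the moment the value is written; the slides do not state the exact formula used by ummisctime, so treat both the formula and the name `virtual_now` as assumptions.

```c
#include <assert.h>

/* Virtual clock reading, assuming the virtual clock runs `frequency` times
 * as fast as the real one starting from the instant (real_base, virtual_base).
 * All times are in seconds. */
double virtual_now(double real_base, double virtual_base,
                   double real_now, double frequency)
{
    assert(real_now >= real_base && frequency >= 0.0);
    return virtual_base + (real_now - real_base) * frequency;
}
```

Under this model, ten real seconds after writing 2 to the frequency file the xclock started inside ViewOS would have advanced by twenty seconds, while the xclock started from the ordinary shell would have advanced by ten.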
  • 88. Behind the Scenes [Architecture diagram labels: process with PURELIBC; *mview containing the capture layer (ptrace or kmview kernel module (utrace)), nested capture, global hash table, dispatcher, PCB and fd table mgmt; modules and submodules; Linux Kernel]
  • 89. Modules & submodules ● Modules provide support for classes of virtualizations, e.g.: – Umfuse: file systems – Umnet: networking – Umdev: devices ● Submodules are for specific cases, e.g.: – Umfuseext2, Umfusefat – Umnetlwipv6, umnetnull – Umdevmbr, umdevramdisk
  • 90. Modules & submodules
        module    description                         submodule       description
        umproc    /proc/mounts virtualization
        umfuse    User-mode fuse                      umfuseext2      ext2 implementation
                                                      umfuseiso9660   iso9660
                                                      umfusefat       fat/vfat
                                                      umfusentfs3g    ntfs
                                                      umfusearchive   tar/cdimages (libarchive)
                                                      umfuseramfile   single file virtualization
                                                      umfusessh       (sshfs) remote file system via ssh
                                                      umfuseencfs     (encfs) encrypted file system
        umnet     network multi stack support         umnetnull       null stack
                                                      umnetlwipv6     IPv4/v6 hybrid stack
                                                      umnetlink       move/merge stacks
                                                      umnetcurrent    current stack
        umdev     device virtualization               umdevmbr        DOS master boot record
                                                      umdevnull       null device
                                                      umdevramdisk    ramdisk
                                                      umdevvd         VDI, VMDK, VHD disks
                                                      umdevtab        virtual tuntap
        ummisc    system call based virtualization    ummisctime      time virtualization
                                                      ummiscuname     uname id virtualization
        viewfs    file system patchworking
  • 91. Capture user process system calls ● umview: based on ptrace – Vanilla Linux kernel – (patches proposed for performance) ● kmview needs a specific kernel module based on utrace. – Security enhancement – More complete virtualization support (nested View-OS, strace/gdb, SIGSTOP).
  • 92. Global Hash Table ● Keeps track of active virtualizations: – Pathname objects – File System Types – Protocol families – Device Major/Minor ranges – System call numbers
  • 93. Dispatcher ● The Dispatcher uses the global hash table to route each system call to the right module or to the kernel.
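For pathname objects, the routing decision can be pictured as "find the active virtualization whose mount point contains this path, otherwise pass the call to the kernel". The sketch below uses a linear scan with longest-prefix matching in place of the hash table, and the structure, function and module assignments are hypothetical rather than taken from the View-OS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mount_entry {
    const char *prefix;   /* mount point claimed by a module */
    const char *module;   /* name of the module serving it */
};

/* Return the module owning the longest mount point that contains `path`,
 * or "kernel" when no active virtualization claims it. */
const char *route_path(const struct mount_entry *table, size_t n, const char *path)
{
    const char *best = "kernel";
    size_t best_len = 0;
    assert(path != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(table[i].prefix);
        if (len == 0 || strncmp(path, table[i].prefix, len) != 0)
            continue;
        /* Match whole path components only: "/mnt" must not claim "/mntx". */
        int at_boundary = table[i].prefix[len - 1] == '/' ||
                          path[len] == '\0' || path[len] == '/';
        if (at_boundary && len > best_len) {
            best = table[i].module;
            best_len = len;
        }
    }
    return best;
}
```

Preferring the longest match is what lets a nested mount (a file system mounted inside another virtual file system) take precedence over the outer one, in the spirit of the nested virtualization examples shown earlier.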
  • 94. Nested Capture ● View-OS captures (and can virtualize) the system calls generated by modules and submodules ● Purelibc is a C library providing process self virtualization
  • 95. Desiderata 1: Linux Kernel ● New ptrace tags for virtualization support: – PTRACE_VM: support for partial virtualization. It is possible to skip the current system call and/or the second upcall after the system call. (User-Mode Linux can use this instead of PTRACE_SYSEMU; VM has a simpler implementation than SYSEMU.) – PTRACE_MULTI: process a sequence of ptrace requests + PEEK/POKE of large chunks as a single call. (ptrace exchanges one memory word per call and /proc/{pid}/mem is not writable!)
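The cost that PTRACE_MULTI is meant to remove is easy to quantify: with one memory word per request, copying a buffer takes one ptrace call per word. The helper below only does that arithmetic; `word_requests` is a name chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Number of one-word ptrace requests needed to copy nbytes of tracee memory,
 * given that each PEEK or POKE request transfers a single word of word_size bytes. */
size_t word_requests(size_t nbytes, size_t word_size)
{
    assert(word_size > 0);
    return (nbytes + word_size - 1) / word_size;   /* round up to whole words */
}
```

Reading a 4096-byte buffer takes 512 requests with 8-byte words and 1024 with 4-byte words, each one a separate system call, whereas the proposed tag would move the whole chunk in a single call.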
  • 96. Desiderata 2. Open Group/POSIX #include <msocket.h> int msocket(char *path, int domain, int type, int protocol); ● Path is the pathname of the stack ● domain/type/protocol are the same defined in socket(2). ● A stack is a special file (new type of special file, see stat(2)): #define S_IFSTACK 0160000 ● Each process has a default stack for each protocol family (domain). – If path==NULL, msocket uses the default stack. ● It is backwards compatible: #define socket(d,t,p) msocket(NULL,(d),(t),(p)) Renzo Davoli – renzo@cs.unibo.it - Università di Bologna
  • 97. Desiderata 2: Open Group/POSIX int msocket(char *path, int domain, int type, int protocol); ● If type==SOCK_DEFAULT, msocket sets the default stack, e.g. msocket("/dev/net/ipstack2",PF_INET,SOCK_DEFAULT,0); defines /dev/net/ipstack2 as the default stack for IPv4. ● If type==SOCK_DEFAULT && domain==PF_UNSPEC, msocket sets the default stack for all the protocol families. ● Mstack uses msocket: it defines the default stack so that existing applications can use different stacks. $ ip addr ..... ip addr on default net $ mstack /dev/net/newstack firefox .... firefox works on newstack $ mstack /dev/net/otherstack bash $ ...this new bash works on otherstack
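The default-stack rules above can be modelled in plain C without any kernel support. This is a user-space sketch, not the msocket implementation: it only reports which stack a call would select. The names `sketch_msocket_stack`, `SKETCH_SOCK_DEFAULT` (a placeholder value, the slides do not give the real one), `SKETCH_NFAMILIES` and the "(kernel stack)" label are all assumptions made for illustration, as is the choice that a NULL path with SOCK_DEFAULT resets the default.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

#define SKETCH_SOCK_DEFAULT (-1)            /* placeholder, not the real value */
#define SKETCH_NFAMILIES    64              /* assumed bound on family numbers */
#define SKETCH_KERNEL_STACK "(kernel stack)"

/* Per-process default stack for each protocol family; NULL = kernel stack. */
static const char *sketch_default[SKETCH_NFAMILIES];

static void sketch_set_default(int domain, const char *path)
{
    assert(domain >= 0 && domain < SKETCH_NFAMILIES);
    sketch_default[domain] = path;
}

/* Return the stack an msocket(path, domain, type, ...) call would use,
 * or NULL when the domain is outside the modelled range. */
const char *sketch_msocket_stack(const char *path, int domain, int type)
{
    if (domain < 0 || domain >= SKETCH_NFAMILIES)
        return NULL;
    if (type == SKETCH_SOCK_DEFAULT) {
        if (domain == PF_UNSPEC) {
            /* PF_UNSPEC: set the default for every protocol family. */
            for (int d = 0; d < SKETCH_NFAMILIES; d++)
                sketch_set_default(d, path);
        } else {
            sketch_set_default(domain, path);
        }
    } else if (path != NULL) {
        return path;                        /* explicit stack wins */
    }
    return sketch_default[domain] ? sketch_default[domain]
                                  : SKETCH_KERNEL_STACK;
}
```

In this model, a tool like mstack would amount to one SOCK_DEFAULT call followed by exec of the target program, whose later socket() calls arrive with path==NULL and pick up the new default.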
  • 98. Desiderata 3: C library ({e}glibc) ● C libraries are impure: they are pure C libraries and interfaces to system calls at the same time. ● It is not possible to do self-virtualization of system calls for processes using {e}glibc, because library calls are internally linked to the system calls (e.g. printf calls write). ● Purelibc is an (ld-preloaded) layer on {e}glibc which converts the C library into a pure library. ● Support for self-virtualization should be a feature of mainstream {e}glibc.
  • 99. Desiderata 4: utrace UTRACE_STOP ● Utrace supports multiple tracers (engines) on the same process. ● Utrace sends the notification to all the tracers and then waits for utrace_control(..., UTRACE_RESUME) from each tracer which returned UTRACE_STOP. ● This specification is ill-suited to nested virtualization support: a notification function inspects the state (e.g. system call parameters) and may change it. The next tracer must read the state as changed by the previous tracer. ● Kmview uses a semaphore in its system call notification function to stop a process because this UTRACE_STOP specification is useless.
  • 100. UTRACE_STOP implementation [Sequence diagram; participants: process, utrace, kmview kernel module, outer kmview, inner kmview.] Syscall request; notify engine#2; notify user space; return UTRACE_STOP; notify engine#1; notify user space; return UTRACE_STOP; wait (all engines) while both engines do their mgmt of syscall: RACE CONDITION!; run syscall.
  • 101. Desiderata: new UTRACE_STOP [Sequence diagram; participants: process, utrace, kmview kernel module, outer kmview, inner kmview.] Syscall request; notify engine#2; notify user space; return UTRACE_STOP; wait; mgmt of syscall; notify engine#1; notify user space; return UTRACE_STOP; wait; mgmt of syscall; run syscall.
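The point of the desired sequencing is that each engine finishes managing the system call before the next one is notified, so the next engine sees the state as already changed. A minimal C model of that ordering follows; the struct, the two engines and `sketch_notify_serialized` are hypothetical and involve no utrace calls.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified system call state that tracers may inspect and change. */
struct sketch_syscall {
    long number;
    long arg0;
};

typedef void (*sketch_engine)(struct sketch_syscall *sc);

/* Hypothetical first tracer: rewrites the first argument. */
void sketch_first_engine(struct sketch_syscall *sc)
{
    sc->arg0 += 100;
}

/* Hypothetical second tracer: works on whatever value it observes. */
void sketch_second_engine(struct sketch_syscall *sc)
{
    sc->arg0 *= 2;
}

/* Notify the engines one at a time; each call returns only after that
 * engine has finished managing the system call. */
void sketch_notify_serialized(struct sketch_syscall *sc,
                              const sketch_engine *engines, size_t n)
{
    assert(sc != NULL && engines != NULL);
    for (size_t i = 0; i < n; i++)
        engines[i](sc);
}
```

With arg0 starting at 5, running the first engine and then the second gives (5 + 100) * 2 = 210; if both engines had read the original value at the same time, as in the race on the previous slide, the result would depend on timing.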
  • 102. Kmview workaround: UTRACE_SYSCALL_{RUN,ABORT} instead of UTRACE_STOP [Sequence diagram; participants: process, utrace, kmview kernel module, outer kmview, inner kmview.] Syscall request; notify engine#2; notify user space; down(sem); mgmt of syscall; up(sem); notify engine#1; notify user space; down(sem); mgmt of syscall; up(sem); run syscall.
  • 103. Multiple meanings of safety... ● Availability, bug effects confinement: – ViewOS runs outside the kernel; errors in modules may lead to a crash of the View (not a kernel panic!) ● Self-protection (from mistaken commands): – The Global View Assumption often forces the use of root access (or powerful capabilities); this is dangerous. ● Sandbox non-circumvention: – At first sight it seems that kernel-based sandboxes are safer (e.g. seccomp). ● Kernel-based sandboxes are not flexible ● On/Off security: a bug may compromise the whole system ● Good support for VM can preserve safety ● The more code, the worse the security. Is the kernel “too fat?” – Maintenance problems, side effects, etc. – ViewOS can move services outside the kernel.
  • 104. The missing ring... ● View-OS modules are similar to microkernel servers. ● View-OS captures some of the benefits of microkernels (separation of mechanism and policy, flexibility, reliability). ● View-OS allows microkernel services to be implemented (at user level) on monolithic kernels.
  • 105. VirtualSquare ● VDE ● LWIPv6 ● PureLibC ● IPN ● View-OS – Umview/Kmview Questions?