NodeWeaver is an OpenNebula-based hyperconverged platform designed to keep running despite massive hardware, software and networking faults. The talk covers the kinds of issues we faced preparing OpenNebula to run in the strangest places (like behind an NMR machine, or in a pit in the desert), how to test it in ways that could be featured in a horror film, and what OpenNebula allows us to do that would be difficult on other platforms.
YouTube: https://youtu.be/G75unWZGMQE
5. ● Ensuring that the platform runs well in uncontrolled
environments requires some attention to design (focused on the
target) and lots of testing
● Some basic principles:
○ “perfection is finally attained not when there is no longer
anything to add, but when there is no longer anything to take
away” - Antoine de Saint-Exupéry
○ Complexity may be necessary at scale, but not for every
application. Every piece that is added may break at some
point
6. Source: Werner Vogels, Real-time graph of microservice dependencies at http://amazon.com in 2008.
7. ● If you ever ask the user for something, she becomes part of the
system to be tested! …
● … which means that in principle, you should never ask the user for
information that may be obtained in some other (automated) way
● The user may not understand, may not be there, may be confused
by all the knobs and dials, or may be deliberately destructive
8. ● Testing must be done on the complete system -
software+hardware+configs …
● … because software faults are more common than hardware ones
● Faults are complex: stop, corruption, limping…
● Trust only what you measure (as Grace Hopper said: "One
accurate measurement is worth a thousand expert opinions.")
9. ● We model our system as a Petri Net
● We run a group of NodeWeaver images (within NodeWeaver),
each with a set of disks attached to emulate local storage and
multiple virtual Ethernet links
● Within each emulated node, we run a small set of CentOS images
that receive, through contextualization, the number of FIO runs and
the kind of emulated workload (a sketch of the guest-side script
follows this list)
● And we run our little chaos monkey process (actually, some bash
scripts)
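A minimal sketch of what the guest-side script could look like, assuming the contextualization variables are named FIO_RUNS and FIO_WORKLOAD (those names, the file paths, and the FIO parameters are illustrative, not the actual ones from the talk):

    #!/bin/bash
    # Mount the OpenNebula context CD and source the contextualization variables
    mount -o ro /dev/cdrom /mnt
    . /mnt/context.sh            # assumed to define FIO_RUNS and FIO_WORKLOAD

    # Run the requested number of FIO passes with the requested I/O pattern
    for i in $(seq 1 "${FIO_RUNS:-1}"); do
        fio --name="chaos-$i" --filename=/data/fio.test --size=1G \
            --rw="${FIO_WORKLOAD:-randrw}" --bs=4k --iodepth=8 \
            --direct=1 --runtime=60 --time_based
    done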
10. ● Disks:
○ detach disk, then destroy it
○ detach disk, then attach an empty disk
○ detach, wait (random), then reattach
○ inject random data into a random file within the disk image
○ inject random data into the disk image
● Network: virsh domif-setlink (up, down) to simulate a faulty cable
(hint: https://dev.opennebula.org/issues/3219 pretty pleeeease... )
● Virtual node: hard reset + full cluster reset (see the combined
fault-injection sketch after this list)
● Future: wrong BIOS clock (through qemu -rtc base=XXXX), IPMI
emulation, packet loss/latency/bandwidth (through netem: only
25MB!)
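The fault injector itself can be a short bash loop that picks one of these actions at random; a sketch under assumed names (the libvirt domain, disk target, interface and image path are placeholders):

    #!/bin/bash
    DOM="nodeweaver-3"                       # victim domain (placeholder)
    IMG="/var/lib/images/node3-data.img"     # its backing image (placeholder)

    case $((RANDOM % 5)) in
      0) # detach a disk, wait a random time, then reattach it
         virsh detach-disk "$DOM" vdb
         sleep $((RANDOM % 120))
         virsh attach-disk "$DOM" "$IMG" vdb ;;
      1) # inject random data into a random 4k block of the disk image
         dd if=/dev/urandom of="$IMG" bs=4k count=1 \
            seek=$((RANDOM % 25000)) conv=notrunc ;;
      2) # flap the virtual link to simulate a faulty cable
         virsh domif-setlink "$DOM" vnet0 down
         sleep $((RANDOM % 30))
         virsh domif-setlink "$DOM" vnet0 up ;;
      3) # hard reset the virtual node
         virsh reset "$DOM" ;;
      4) # netem-style degradation: add latency and packet loss on the link
         tc qdisc replace dev vnet0 root netem delay 100ms loss 5% ;;
    esac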
11. ● What we discovered:
○ The underlying filesystem is hugely important
○ EXT4 handles most of it, XFS works (but recovery may be very
slow), BTRFS dies in horrible ways, ZFS barely notices
○ Using MySQL as the OpenNebula DB: roughly every 25 crashes it
requires some manual work, and roughly every 150 crashes it
requires non-trivial manual effort
○ Our custom SQLite (with WAL) survives happily (we compensate
for the lack of concurrency with a query sequencer; see the
sketch below)
○ LizardFS is highly tolerant of multiple, parallel failures - disk,
network, whatever
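For reference, enabling WAL is a one-line pragma on the standard OpenNebula SQLite database, and the sequencer idea can be shown in miniature (the flock wrapper below illustrates the concept, it is not our actual implementation):

    # enable write-ahead logging on the OpenNebula database
    sqlite3 /var/lib/one/one.db 'PRAGMA journal_mode=WAL;'

    # sequencer in miniature: serialize all queries behind a single lock
    run_query() {
        flock /var/lock/one-db.lock sqlite3 /var/lib/one/one.db "$1"
    }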
12. ● We took advantage of the exceptionally simple host probes
mechanism to add extra information that is used by the platform
and by the recovery heuristics
● Adding new probes takes very little time and effort - thanks to
OpenNebula's simplicity (see the probe sketch below)
● We continue to add probes (for example, the P-value for
predicted user experience) and use background processes to add
forecasts
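A probe is just an executable that prints KEY=VALUE pairs on stdout, which OpenNebula merges into the host's monitoring attributes. A minimal sketch (the probe directory depends on the OpenNebula version, and the P-value cache file is a hypothetical name):

    #!/bin/bash
    # Install as an executable file in the IM probe directory
    # (e.g. /var/lib/one/remotes/im/kvm-probes.d/ on OpenNebula 4.x)
    echo "CUSTOM_LOAD1=$(cut -d' ' -f1 /proc/loadavg)"

    # hypothetical forecast precomputed by a background process
    [ -r /var/tmp/pvalue ] && echo "PVALUE=$(cat /var/tmp/pvalue)"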
15. ● OpenNebula works exceptionally well under torture, both in
virtual and physical testing
● LizardFS is amazingly resilient (CRC everywhere helps)...
● ...especially on ZFS with its transaction groups
● Chaos-monkey testing does not guarantee that every possible
fault path is tested…
● ...yet it helps in finding paths that we never thought about - but
our customers surely will