This document summarizes a study on optimizing virtual machine placement in data centers. It discusses the motivations of energy management, resource usage optimization, and traffic engineering. It then reviews several approaches to virtual machine placement optimization, including stochastic integer programming, genetic algorithms, bin packing, constraint programming, subgraph isomorphism algorithms, and ant colony optimization heuristics. It also discusses considerations for the different approaches and outlines ideas for future work, such as mapping resource managers to placement algorithms and developing an objective/approach matrix.
A Study of Virtual Machine Placement Optimization in Data Centers (CLOSER'2017)
1. Stéphanie Challita | Fawaz Paraiso | Philippe Merle
Inria Lille – Nord Europe & University of Lille (France)
7th International Conference on Cloud Computing and Services Science
(CLOSER 2017)
24 – 26 April, 2017 Porto, Portugal
Virtualization
“Virtualization is the key concept of Cloud Computing”
How to select the most suitable host for each virtual machine?
[Figure: VMs hosted on three hypervisors]
Motivation 1
Energy Management
Minimizes the cost of powering at the hardware level
Server consolidation
Green Data Centers are a must to fight against the huge power consumption and bills caused by inappropriate virtualization
Motivation 2
Resource Usage Optimization
Resources should be:
Available to applications only as needed
Not allocated statically based on the peak workload demand
This is known as the “Elasticity of the Cloud”
Motivation 3
Traffic Engineering
Maintaining the efficiency of data center applications
requires accurate planning of the network architecture
VL2N-Tree (source: (Fang et al., 2013))
BCube (source: (Wang et al., 2014))
Fat-tree (source: (Fang et al., 2013))
VL2 (source: (Fang et al., 2013))
Approaches
Constraint Programming | Bin Packing (Greedy heuristics, ACO heuristics, Subgraph isomorphism algorithms) | Stochastic Integer Programming | Genetic Algorithm
1. Constraint Programming
Easily extendable: can take additional constraints into account
Relatively long search times
Approaches
2. Bin Packing
The number of PMs used can be reduced by half
This approach might put two interfering VMs on one PM
Approaches
2. Bin Packing
Ant Colony Optimization (ACO) heuristics:
Source: upload.wikimedia.org/wikipedia/commons/thumb/a/af/Aco_branches.svg/2000px-Aco_branches.svg.png
Ant System (AS)
Ant Colony System (ACS)
MAX-MIN Ant System (MMAS)
Approaches
2. Bin Packing
Subgraph isomorphism algorithms:
[Figure: VM graph (nodes VM1 to VM5) and Server graph (nodes S1 to S5)]
An isomorphism between Servers and VMs:
f(S1) = VM1, f(S2) = VM2, f(S3) = VM3, f(S4) = VM4, f(S5) = VM5
Approaches
3. Stochastic Integer Programming
Helpful in estimating the variation in demands and prices
Frequent recomputations are not needed
Users might end up paying more if there is an estimation error
Approaches
4. Genetic Algorithm
It solves the VM interference problem encountered in the Bin Packing approach
It requires more computing time and higher computing resources
Approaches
4. Genetic Algorithm
[Activity diagram]
Population of server capacities
Determine the fitness of each server
Select next generation
Perform reproduction using crossover
Perform mutation
Desired condition reached: display results
Else: loop back
Discussion
Columns: Constraint Programming | Bin Packing | Stochastic Integer Programming | Genetic Algorithm
We know the demands of VMs
we compute the cost functions
The demand is highly variable
Physical machines have the same amount of memory and processing capabilities
We have uncertain parameters on which the cost depends
We need to operate on groups
Objective functions dynamically change
Discussion
Metrics for Future Empirical Studies
SLA violation percentage
Energy amount
Number of VM migrations
Each VM placement algorithm works well under
specific conditions/objectives
Comparative analysis becomes quite tricky
Future Work
Map between Resource Managers and Placement Algorithms
Placement algorithms: PA1 | PA2 | PA3 | PA4
VM resource managers: OpenStack | vMotion
Containers add new efficiency to Cloud Computing
Container resource managers: Mesos | Kubernetes
Future Work
Objective /Approach Matrix
Objectives (columns): Energy | Resources | Traffic
Approaches (rows): Constraint Programming | Bin Packing | Stochastic Integer Programming | Genetic Algorithm | Greedy heuristics | Subgraph Isomorphism algorithms | ACO heuristics
What about a hybrid solution?
Hello everyone, I’m Stéphanie Challita, a PhD student at the University of Lille, France, and a member of an Inria research team.
I’m here to present my paper “A Study of VM Placement Optimization in Data Centers”.
In the cloud computing domain, provisioning Virtual Machines (VMs) is fundamental to providing infrastructure services, so one can say that virtualization is the key concept of cloud computing.
However, VMs need to be adequately placed to fulfill performance goals, to optimize network flows, and to reduce CPU, storage and energy costs. These are the motivations behind this work, which I will detail in the next slides.
So, how do we select the most suitable host for each virtual machine?
In order to answer this question, I present a survey of various approaches studying VM placement, highlighting their key concepts, as well as the state-of-the-art implementations.
As I said, the motivation behind VM placement optimization can be energy-aware, resource-aware, traffic-aware, or a combination of these.
First, energy efficiency in data centers can be improved by applying a suitable VM placement algorithm that minimizes the cost of powering at the hardware level.
Moreover, turning off unused machines, on the basis of server consolidation and energy-aware job scheduling, can also help to solve the energy problem.
In this context, “Green Data Centers” are nowadays a must to fight against the huge power consumption and bills caused by inappropriate virtualization.
Secondly, in order to maintain application performance, isolation and security, each VM requires a certain amount of resources, such as CPU, memory and link bandwidth.
In order to minimize their cost, these resources should be made available to applications only as needed and not allocated statically based on the peak workload demand.
This is known as the “elasticity of the cloud”.
Third, measuring and optimizing data center traffic is important to maintain the efficiency of applications.
A data center, which hosts thousands of devices such as servers, switches and routers, needs accurate planning of its network architecture.
One can distinguish several architectures such as Fat-tree, VL2 and BCube.
VM placement may depend on these architectures.
As shown in this figure, our classification of VM placement algorithms is based on four main approaches:
1) Constraint Programming,
2) Bin Packing,
3) Stochastic Integer Programming,
and 4) Genetic Algorithm.
We start by detailing the first approach which is Constraint Programming.
Since additional constraints can always be taken into account, this approach is easily extendable.
However, when there are many constraints to take into consideration, this approach may take too much time to find the most suitable VM placement.
Therefore, the main challenge consists in finding the optimal solution before the constraint parameters change.
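To make the extendability point concrete, here is a minimal sketch of constraint-based placement using plain backtracking search. It is not the method of any surveyed paper: the function names (`capacity_constraint`, `anti_affinity`, `solve`) and the VM/PM data are hypothetical, and a real deployment would more likely rely on a dedicated constraint solver.

```python
from typing import Callable, Dict, List, Optional

Placement = Dict[str, str]                 # VM name -> PM name
Constraint = Callable[[Placement], bool]   # True if a (partial) placement is acceptable


def capacity_constraint(demand: Dict[str, int], capacity: Dict[str, int]) -> Constraint:
    """No PM may host more demand than its capacity."""
    def check(placement: Placement) -> bool:
        load: Dict[str, int] = {}
        for vm, pm in placement.items():
            load[pm] = load.get(pm, 0) + demand[vm]
        return all(load[pm] <= capacity[pm] for pm in load)
    return check


def anti_affinity(vm_a: str, vm_b: str) -> Constraint:
    """Two interfering VMs must not share a PM."""
    def check(placement: Placement) -> bool:
        if vm_a in placement and vm_b in placement:
            return placement[vm_a] != placement[vm_b]
        return True
    return check


def solve(vms: List[str], pms: List[str], constraints: List[Constraint]) -> Optional[Placement]:
    """Depth-first search with backtracking over VM-to-PM assignments."""
    def extend(index: int, placement: Placement) -> Optional[Placement]:
        if index == len(vms):
            return dict(placement)
        vm = vms[index]
        for pm in pms:
            placement[vm] = pm
            if all(check(placement) for check in constraints):
                result = extend(index + 1, placement)
                if result is not None:
                    return result
            del placement[vm]   # undo and try the next PM
        return None
    return extend(0, {})


# Hypothetical data: three VMs of 1 unit each, two PMs of 2 units each.
demand = {"vm1": 1, "vm2": 1, "vm3": 1}
capacity = {"pm1": 2, "pm2": 2}
base = [capacity_constraint(demand, capacity)]
print(solve(["vm1", "vm2", "vm3"], ["pm1", "pm2"], base))
# Extending the model only means appending one more predicate:
print(solve(["vm1", "vm2", "vm3"], ["pm1", "pm2"], base + [anti_affinity("vm1", "vm2")]))
```

Each added constraint prunes or reshapes the search, but the search itself remains exponential in the worst case, which is consistent with the long search times mentioned above.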
Bin Packing is an NP-hard problem that can be addressed using greedy heuristics, Ant Colony Optimization (ACO) heuristics or subgraph isomorphism algorithms.
Bin Packing can halve the number of physical machines (PMs) used.
However, in doing so, this approach may host two interfering VMs on one PM.
For solving the bin packing problem, we distinguish several greedy heuristics, such as First-Fit (FF).
FF places each VM into “the first bin in which it will fit”.
For example, consider three VMs with different RAM requirements and two servers with 2 GB of RAM each.
Since the first and the second VM both fit in the first server, we place them there. The third VM, however, is placed on the second server for lack of resources on the first.
FF is very quick but is not likely to lead to an optimal solution.
It is more efficient when the list of elements is first sorted in decreasing order. This is First-Fit Decreasing (FFD).
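The two greedy heuristics can be sketched in a few lines. The VM sizes below are hypothetical (the talk does not give them), sizes are expressed in MB to keep the arithmetic exact, and the sketch opens a new bin whenever needed rather than working with a fixed number of servers.

```python
from typing import List


def first_fit(sizes: List[int], capacity: int) -> List[List[int]]:
    """Place each item into the first bin that still has room; open a new bin otherwise."""
    bins: List[List[int]] = []
    for size in sizes:
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity {capacity}")
        for contents in bins:
            if sum(contents) + size <= capacity:
                contents.append(size)
                break
        else:
            # No existing bin had room for this item.
            bins.append([size])
    return bins


def first_fit_decreasing(sizes: List[int], capacity: int) -> List[List[int]]:
    """First-Fit applied to the items sorted from largest to smallest."""
    return first_fit(sorted(sizes, reverse=True), capacity)


# The example from the talk, with assumed RAM demands of 1024, 512 and 1024 MB.
print(first_fit([1024, 512, 1024], capacity=2048))

# An assumed workload where sorting first saves one server.
workload = [512, 1024, 1536, 1024]
print(len(first_fit(workload, 2048)), len(first_fit_decreasing(workload, 2048)))
```

On the second workload, First-Fit opens three bins while First-Fit Decreasing needs only two, which illustrates why sorting first tends to help.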
The second heuristic for the bin packing problem is ACO, a probabilistic technique for solving problems that can be reduced to finding good paths through graphs.
It is inspired by the collective behaviour of social insects.
When searching for food, ants tend to choose paths marked by strong pheromone concentrations.
As soon as an ant finds a food source, it evaluates the quantity and the quality of the food and takes some of it back to the nest.
During the return trip, the quantity of pheromones that an ant leaves on the ground may depend on the quantity and quality of the food.
The pheromone trails guide other ants to the food source and enable them to find the shortest paths between their nest and food sources.
This behaviour, aiming for the shortest paths, can be used to optimize VM migration between PMs.
Several ACO algorithms are presented in the literature, such as Ant System (AS), Ant Colony System (ACS) and MAX-MIN Ant System (MMAS).
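As a rough illustration of how the pheromone idea can be transferred to VM placement, the sketch below lets each "ant" build a placement by choosing a PM for every VM with probability proportional to a pheromone value times a simple consolidation heuristic. It is a simplified illustration and not a faithful implementation of AS, ACS or MMAS: the parameter names, the heuristic, and the rule of reinforcing only the best placement found so far are all assumptions.

```python
import random
from typing import List, Tuple


def aco_placement(demands: List[int], capacity: int, n_ants: int = 10,
                  n_iterations: int = 30, evaporation: float = 0.3,
                  seed: int = 0) -> Tuple[List[int], int]:
    """Return (assignment, pms_used), where assignment[i] is the PM index of VM i."""
    if not demands or max(demands) > capacity:
        raise ValueError("need at least one VM, and every VM must fit on an empty PM")
    rng = random.Random(seed)
    n = len(demands)                      # one candidate PM per VM is always enough
    tau = [[1.0] * n for _ in range(n)]   # pheromone on each (VM, PM) pair
    best: List[int] = list(range(n))      # trivial placement: one VM per PM
    best_used = n

    for _ in range(n_iterations):
        for _ in range(n_ants):
            load = [0] * n
            assignment: List[int] = []
            for vm, demand in enumerate(demands):
                feasible = [pm for pm in range(n) if load[pm] + demand <= capacity]
                # Heuristic term: prefer PMs that are already loaded (consolidation).
                weights = [tau[vm][pm] * (1.0 + load[pm]) for pm in feasible]
                pm = rng.choices(feasible, weights=weights, k=1)[0]
                load[pm] += demand
                assignment.append(pm)
            used = len(set(assignment))
            if used < best_used:
                best, best_used = assignment, used
        # Evaporate all trails, then reinforce the best placement found so far.
        for vm in range(n):
            for pm in range(n):
                tau[vm][pm] *= (1.0 - evaporation)
            tau[vm][best[vm]] += 1.0 / best_used
    return best, best_used


assignment, used = aco_placement([4, 3, 3, 2, 2, 1, 1], capacity=8)
print(assignment, used)
```

Evaporation plays the role of pheromone fading on unused trails, while the deposit step makes placements that used few PMs more likely to be rediscovered and refined.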
The third heuristic for solving the Bin Packing problem relies on subgraph isomorphism algorithms, where two graphs are given as input and one must determine whether the first graph contains a subgraph that is isomorphic to the second graph.
Recently, several algorithms have used subgraph isomorphism to formulate the problem of VM placement, i.e., to model data center topologies and VM clusters.
The two graphs shown here are isomorphic, despite their different-looking drawings.
They have the same number of nodes, connected in the same way.
In graph-theoretic terms, there is an edge-preserving bijection between the node sets of the Server graph and the VM graph.
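The bijection can be checked mechanically. The sketch below handles only the special case of two graphs with the same number of nodes (full graph isomorphism), by brute force over all node permutations, which is viable only for very small graphs. The edge lists are hypothetical, since the slide's exact edges are not recoverable from the text; only the node names come from the slide.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]


def find_isomorphism(nodes_a: List[str], edges_a: List[Edge],
                     nodes_b: List[str], edges_b: List[Edge]) -> Optional[Dict[str, str]]:
    """Return a bijection f from nodes_a to nodes_b that preserves edges, or None."""
    set_a = {frozenset(edge) for edge in edges_a}
    set_b = {frozenset(edge) for edge in edges_b}
    if len(nodes_a) != len(nodes_b) or len(set_a) != len(set_b):
        return None
    for perm in permutations(nodes_b):
        f = dict(zip(nodes_a, perm))
        # f is an isomorphism when it maps the edge set of A exactly onto that of B.
        if {frozenset(f[node] for node in edge) for edge in set_a} == set_b:
            return f
    return None


servers = ["S1", "S2", "S3", "S4", "S5"]
vms = ["VM1", "VM2", "VM3", "VM4", "VM5"]
# Assumed topology: both graphs are 5-node rings, listed in different orders.
server_edges = [("S1", "S2"), ("S2", "S3"), ("S3", "S4"), ("S4", "S5"), ("S5", "S1")]
vm_edges = [("VM3", "VM4"), ("VM1", "VM5"), ("VM2", "VM1"), ("VM5", "VM4"), ("VM2", "VM3")]

print(find_isomorphism(servers, server_edges, vms, vm_edges))
```

With these assumed edges, the first mapping found is the one shown on the slide, f(S1) = VM1 through f(S5) = VM5.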
Stochastic Integer Programming is helpful in estimating the variation in demands and costs.
As a result, frequent recomputations are not needed, but if there is an error in the estimation, users might unfortunately end up paying more.
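A toy two-stage model shows both the benefit and the risk. It assumes a provider that offers cheaper reserved instances bought in advance and more expensive on-demand instances bought once the actual demand is known; the prices, demand scenarios and probabilities are all hypothetical. Because the instance is tiny, the integer decision is found by enumeration instead of a solver.

```python
from fractions import Fraction
from typing import List, Tuple

Scenario = Tuple[int, Fraction]   # (number of VMs demanded, probability)


def expected_cost(reserved: int, scenarios: List[Scenario],
                  reserved_price: int, on_demand_price: int) -> Fraction:
    """First-stage reservation cost plus the expected second-stage on-demand cost."""
    recourse = sum(prob * on_demand_price * max(0, demand - reserved)
                   for demand, prob in scenarios)
    return reserved_price * reserved + recourse


def best_reservation(scenarios: List[Scenario], reserved_price: int,
                     on_demand_price: int) -> Tuple[int, Fraction]:
    """Enumerate every integer reservation level and keep the cheapest in expectation."""
    max_demand = max(demand for demand, _ in scenarios)
    return min(((r, expected_cost(r, scenarios, reserved_price, on_demand_price))
                for r in range(max_demand + 1)), key=lambda pair: pair[1])


# Assumed demand estimate: 2, 4 or 6 VMs with probabilities 0.5, 0.3 and 0.2.
scenarios = [(2, Fraction(5, 10)), (4, Fraction(3, 10)), (6, Fraction(2, 10))]
reserved, cost = best_reservation(scenarios, reserved_price=1, on_demand_price=3)
print(reserved, float(cost))
```

Under these assumptions the plan reserves 4 instances. If the real demand turns out to be only 2 VMs, the user has paid for 4 reserved instances where 2 would have been enough, which is the estimation-error risk mentioned above.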
Last but not least, we have the Genetic Algorithm.
It considers additional constraints while optimizing the cost function, so it solves the VM interference problem encountered in the Bin Packing approach.
However, it requires more computing time and more computing resources than Bin Packing.
This activity diagram explains the genetic algorithm. We start by choosing the population of server capacities.
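The loop in the activity diagram (fitness evaluation, selection, crossover, mutation, stopping condition) can be sketched as follows. The encoding is an assumption made for illustration: a chromosome lists the PM index chosen for each VM, the fitness counts the PMs used plus a penalty for overloaded PMs, and the "desired condition" is simply a fixed number of generations. All names and parameter values are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def fitness(chromosome: List[int], demands: List[int], capacity: int) -> int:
    """Lower is better: PMs used, plus a heavy penalty for each overloaded PM."""
    load: Dict[int, int] = {}
    for vm, pm in enumerate(chromosome):
        load[pm] = load.get(pm, 0) + demands[vm]
    overloaded = sum(1 for total in load.values() if total > capacity)
    return len(load) + 10 * len(demands) * overloaded


def genetic_placement(demands: List[int], capacity: int, n_pms: int,
                      pop_size: int = 30, generations: int = 50,
                      mutation_rate: float = 0.1, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (best chromosome, best fitness per generation). Needs at least 2 VMs."""
    rng = random.Random(seed)
    n = len(demands)

    def score(chromosome: List[int]) -> int:
        return fitness(chromosome, demands, capacity)

    population = [[rng.randrange(n_pms) for _ in range(n)] for _ in range(pop_size)]
    history: List[int] = []
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        next_gen = [population[0][:]]                 # elitism: keep the best as is
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(population, 3), key=score)
            p2 = min(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)                 # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally move a VM to a random PM.
            child = [rng.randrange(n_pms) if rng.random() < mutation_rate else gene
                     for gene in child]
            next_gen.append(child)
        population = next_gen
    best = min(population, key=score)
    history.append(score(best))
    return best, history


best, history = genetic_placement([4, 3, 3, 2, 2, 1, 1], capacity=8, n_pms=4)
print(best, history[0], history[-1])
```

Because the best individual is always carried over unchanged, the best fitness never gets worse from one generation to the next. Extra constraints, such as keeping interfering VMs apart, could be added as further penalty terms in `fitness`.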
For a better understanding of the four approaches, we provide this table that explains the optimal case for using each of these approaches.
An objective function is a function to maximize or minimize.
It is similar to the fitness function in the Genetic Algorithm (GA).
This figure summarizes the classification of 17 methods stated in this work.
In the paper, we have identified for each method the approach to which it belongs, the objective, network architecture if this information exists, evaluation type (simulation experiments, simulation in real environments, experiments with real workload…), as well as the competitor approaches.
We can state that Bin Packing has lately been the most widely used approach. It always generates a good solution in a reasonable amount of time.
It is crucial to choose a VM placement technique that suits the needs of both the cloud user and cloud provider.
However, because several parameters are involved, a uniform comparative analysis of such techniques becomes quite tricky.
In fact, each VM placement algorithm works well under certain specific conditions and objectives.
Therefore, in order to compare the efficiency of the previous algorithms, we propose that future empirical studies be based on the following three metrics to measure and evaluate the performance of the algorithms.
Firstly, one should take into account the amount of energy consumed by data center resources due to the application workloads.
The second metric to be considered is the SLA violation percentage, which expresses the level by which performance requirements defined between the resource provider and consumers are violated.
(The SLA violation can happen when VMs sharing the same PM need a CPU performance that cannot be provided because of energy-aware resource management and consolidation. The provider pays a penalty to the client in case of SLA violation.)
The third metric is the number of VM migrations during the adaptation of the VM placement.
VM migrations consume time, energy and network bandwidth. Thus, it is important to minimize the number of VM migrations.
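The three metrics are simple to compute once the measurements are available. The definitions below are one possible reading and are assumptions, not the paper's formulas: a violation is counted whenever the allocated CPU is below the requested CPU in a sample, energy is integrated from evenly spaced power samples, and a migration is a VM whose host differs between two placements.

```python
from typing import Dict, List


def energy_kwh(power_watts: List[float], interval_seconds: float) -> float:
    """Energy from evenly spaced power samples: sum(P * dt), converted from joules to kWh."""
    return sum(power_watts) * interval_seconds / 3_600_000.0


def sla_violation_percentage(requested: List[float], allocated: List[float]) -> float:
    """Share of samples (in %) where the allocated CPU fell short of the requested CPU."""
    if len(requested) != len(allocated):
        raise ValueError("requested and allocated must have the same length")
    if not requested:
        return 0.0
    violations = sum(1 for r, a in zip(requested, allocated) if a < r)
    return 100.0 * violations / len(requested)


def count_migrations(before: Dict[str, str], after: Dict[str, str]) -> int:
    """Number of VMs present in both placements whose host PM changed."""
    return sum(1 for vm, pm in before.items() if vm in after and after[vm] != pm)


# Hypothetical measurements.
print(energy_kwh([200.0, 200.0, 200.0], interval_seconds=3600.0))
print(sla_violation_percentage([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 3.0, 2.0]))
print(count_migrations({"vm1": "pm1", "vm2": "pm1", "vm3": "pm2"},
                       {"vm1": "pm1", "vm2": "pm2", "vm3": "pm1"}))
```

Reporting the three values together matters because they pull in different directions: aggressive consolidation lowers energy but tends to raise both SLA violations and the number of migrations.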
Many directions have not been explored yet and can be considered for future work.
For example, there are nowadays several resource managers that mostly handle the placement of VMs, like vMotion, the commercial product of VMware, and OpenStack, the open-source cloud manager.
Other resource managers, like Kubernetes, Swarm and Mesos, to name a few, are responsible for the placement of containers.
Therefore, it will be interesting to conduct an exhaustive study of the existing resource managers, and to map each of them to the placement algorithm(s) it uses.
Finally, none of the identified approaches covers all three of the detailed motivations.
Designing a hybrid solution combining several approaches represents a future challenge.
I would like to mention that this work is supported by the French project OCCIware that aims at managing any kind of cloud resources by using the OCCI standard.
Thank you for your attention.
I will be happy to answer your questions.
A virtual machine cluster uses virtual machines as nodes. The main motive behind a virtual machine cluster is to install multiple functionalities on the same server, which improves server utilization.
Virtual machine clusters protect against hardware and software failures of the physical machine. When a physical node fails, the virtual machine can move to another node with no time lag. Virtual machine clustering thus provides a dynamic backup process. It is therefore widely used in organizations where data is of great value, thanks to its easy disaster recovery capabilities.