The document proposes a cost-aware virtual machine placement approach across distributed data centers using Bayesian networks. It designs a Bayesian network to model expert knowledge on cloud infrastructure management. It then uses the GQM method to define measures for criteria based on the Bayesian network outputs. Finally, it applies multi-criteria decision analysis to create a utility function for virtual machine allocation and migration decisions. The approach was evaluated using a cloud simulation framework and real workload and infrastructure data, showing improvements of up to 69% in total costs compared to baseline algorithms.
Cost-Aware Virtual Machine Placement across Distributed Data Centers using Bayesian Networks
1. Cost-Aware Virtual Machine Placement across
Distributed Data Centers using Bayesian Networks
Dmytro Grygorenko*, Soodeh Farokhi*, and Ivona Brandic
Vienna University of Technology, Austria
(*contributed equally to the paper)
12th International Conference on Economics of Grids, Clouds, Systems, and Services
Cluj-Napoca, Romania
September 15 – 17, 2015
2. 12th International Conference on Economics of Grids, Clouds, Systems, and Services (GECON’15), Romania , 15-17 Sep, 2015
Introduction
• The Cloud Computing industry has grown by over 300% in the last 6 years
• 86% of companies use more than one type of Cloud Computing service
• 30 million Cloud servers are geographically distributed all over the world
• Huge environmental impact of Cloud Computing (1-2% of the world's electricity usage)
• Energy usage plans are not optimal, although cost-efficient solutions are possible
• Challenges:
– high Quality-of-Service (QoS) expectations of Cloud customers
– Cloud providers’ challenges for the Costs vs. QoS trade-off
Fig.1: Windows Azure CDN Locations [1]
[1] https://www.simple-talk.com/cloud/development/an-introduction-to-windows-azure-%28part-2%29/
3. Agenda
• Motivation
• Contributions & Challenges
• Approach
• Evaluation
• Conclusion & Future Work
4. Motivation
• How can the high operational cost of running cloud infrastructure be reduced while minimizing the SLA penalty cost?
Problems addressed so far:
– Virtual Machine (VM) placement
– Temperature-aware energy usage
– Performance of VM migration
What is missing: a model that combines these problems and handles the interconnections and dependencies across geo-distributed DCs.
• How can the applicability of the proposed solutions be evaluated?
Existing simulation frameworks (CloudSim, D-Cloud, PreFail, etc.) do not support simulating the necessary aspects:
– Geo-distributed DCs
– Cooling systems
– Weather data
– Power outages
– SLAs
5. Contributions
• An approach to reduce the cloud operating cost by applying VM placement across geo-distributed DCs:
– Leverages cloud expert knowledge and models it in a Bayesian Network (BN)
– The outputs of the BN are used in the proposed VM allocation and consolidation algorithms
• A cloud simulation framework CloudNet [1] with the following features:
– Simulation of cloud infrastructure
– Utilization and generation of various application workloads
– Usage of geo-distributed DCs
– Management of cooling systems
– Usage of synthetic and real weather data
– Scheduling of power outages
– SLA-aware simulation
– Prediction of resource usage
[1] https://github.com/dmitrygrig/CloudNet/
6. Challenges of managing Cloud data centers
• Geo-distributed DCs
– Dynamic electricity market
– Various time zones
– Different weather conditions (e.g., temperature)
• Frequent power outages
• VM migration depends on dynamic factors such as VM RAM size, bandwidth, etc.
• Trade-off (multi-criteria decision problem): reduction of the DCs energy cost vs. customer satisfaction in terms of QoS
Fig. 1: Windows Azure CDN Locations [1]
[1] https://www.simple-talk.com/cloud/development/an-introduction-to-windows-azure-%28part-2%29/
Example: Microsoft Azure
• Operates in several regions around the world
• Electrical downtimes: from 6 min/year in Japan to 20 h/year in Brazil
• Energy prices differ by more than a factor of two between some locations
• Day/night energy price rates
• Outdoor temperatures from -35 °C to +35 °C
7. VM Placement using Bayesian Network
VM Placement Phases:
• Phase 1: Designing the BN to represent expert domain knowledge on cloud infrastructure management
• Phase 2: Using the Goal-Question-Metric (GQM) method to define the underlying measures for the chosen criteria based on the BN’s output
• Phase 3: Applying the Multi-Criteria Decision Analysis (MCDA) method to create the utility function as the final decision-making indicator
[Diagram: Phase 1 (modeling the expert knowledge in a Bayesian Network) → Phase 2 (using GQM method to quantify the chosen criteria) → Phase 3 (applying MCDA to define utility function for each decision). The decisions are VM allocation (when a new VM request arrives) and VM migration across distributed DCs (periodically).]
8. Phase 1: Designing Bayesian Network
Phase 1: designing the BN to represent expert domain knowledge on cloud infrastructure management
• The BN is used as a decision-making model
What are Bayesian Networks?
• graphical models to represent variables of interest (e.g., event occurrences) and probabilistic dependencies among them
• they simulate the mechanism of exploring causal relations between key factors, using Bayes' rule: $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
Why Bayesian Networks?
• applying knowledge about domain to find hidden and causal relationships
• discovering relationships in raw data
• ability to prove the correctness of built models (based on the powerful mathematical background)
Input: various observations of the cloud infrastructure
Output: probabilities of the decision criteria
9. Designing BN for a simplified decision problem (1)
Problem: where should a new VM request be allocated?
Observations: Data centre location (Europe, Asia, etc.), time of day (day/night), season (winter, summer, etc.)
Hidden factors: Weather conditions
Criteria: Energy price, possibility of power outage
Decision Action: Allocate VM
Fig. 2: Designing Bayesian Network. Step 1: Consider energy price and dependent factors
Table 1: CPT of Probability (Energy Price | DC Location, Time of Day)
10. Designing BN for a simplified decision problem (2)
Fig. 3: Designing Bayesian Network. Step 2: Add power outage criterion
Table 2: CPT of Probability (Weather Conditions | DC Location, Season)
Table 3: CPT of Probability (Power Outage | DC Location, Weather Conditions)
11. Designing BN for a simplified decision problem (3)
At the same time in South or North America…
Fig. 4: Querying of the BN for a DC in South America
Fig. 5: Querying of the BN for a DC in North America
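The simplified network from the last three slides can be sketched in a few lines of plain Python. This is only an illustration of how such a network is queried: the CPT values of Tables 1-3 are not reproduced in this transcript, so every probability below is a made-up placeholder, and the function names are hypothetical. The hidden factor (weather conditions) is summed out by enumeration.

```python
# Minimal sketch of the simplified BN: observations -> hidden weather -> criteria.
# All probabilities are illustrative placeholders, NOT the values of Tables 1-3.

# P(Weather | DC location, Season)
P_WEATHER = {
    ("South America", "summer"): {"good": 0.6, "bad": 0.4},
    ("North America", "summer"): {"good": 0.9, "bad": 0.1},
}

# P(Power outage | DC location, Weather)
P_OUTAGE = {
    ("South America", "good"): {"low": 0.5, "middle": 0.3, "high": 0.2},
    ("South America", "bad"): {"low": 0.1, "middle": 0.3, "high": 0.6},
    ("North America", "good"): {"low": 0.9, "middle": 0.08, "high": 0.02},
    ("North America", "bad"): {"low": 0.6, "middle": 0.3, "high": 0.1},
}

# P(Energy price | DC location, Time of day)
P_PRICE = {
    ("South America", "night"): {"low": 0.7, "middle": 0.2, "high": 0.1},
    ("North America", "night"): {"low": 0.8, "middle": 0.15, "high": 0.05},
}


def query_outage(location, season):
    """P(outage | location, season), summing out the hidden weather node."""
    posterior = {"low": 0.0, "middle": 0.0, "high": 0.0}
    for weather, p_weather in P_WEATHER[(location, season)].items():
        for level, p_level in P_OUTAGE[(location, weather)].items():
            posterior[level] += p_weather * p_level
    return posterior


def query_price(location, time_of_day):
    """P(energy price | location, time of day); no hidden node is involved."""
    return dict(P_PRICE[(location, time_of_day)])


if __name__ == "__main__":
    for dc in ("South America", "North America"):
        print(dc, query_outage(dc, "summer"), query_price(dc, "night"))
```

With these placeholder tables, the same season yields a much higher chance of a high outage level for the South American DC than for the North American one, which is the kind of contrast the two queries on this slide illustrate.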
12. Phase 2: using GQM
Phase 2: using GQM method
• definition of underlying measures for the chosen criteria $g_i(a)$
• based on the BN’s output
Level    Power outage    Energy price
Low      1               1
Middle   0.3             0.7
High     0.1             0.5
Table 4: Criteria mapped to values in [0,1] interval using GQM
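The mapping in Table 4 can be written as a simple lookup. The slides do not state how a BN output distribution over Low/Middle/High is collapsed into a single value, so the sketch below shows two assumed options (most probable state, and expected score); the function names and the example posterior are hypothetical.

```python
# Scores from Table 4: each criterion state is mapped to a value in [0, 1].
GQM_SCORES = {
    "power_outage": {"low": 1.0, "middle": 0.3, "high": 0.1},
    "energy_price": {"low": 1.0, "middle": 0.7, "high": 0.5},
}


def most_likely_score(criterion, posterior):
    """Assumed option 1: score of the most probable state."""
    state = max(posterior, key=posterior.get)
    return GQM_SCORES[criterion][state]


def expected_score(criterion, posterior):
    """Assumed option 2: probability-weighted average of the scores."""
    return sum(p * GQM_SCORES[criterion][state] for state, p in posterior.items())


if __name__ == "__main__":
    # Hypothetical BN output for the power outage criterion.
    posterior = {"low": 0.2, "middle": 0.5, "high": 0.3}
    print(most_likely_score("power_outage", posterior))
    print(expected_score("power_outage", posterior))
```

Either variant yields a value in [0, 1] that can serve as $g_i(a)$ in the next phase.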
13. Phase 3: applying MCDA
Phase 3: applying MCDA method to create the utility function as the final decision making indicator
• quantitative measurement of the benefit of a certain decision
• expressed as a utility function based on the set of factors and criteria calculated previously:
$U(a) = \sum_i w_i \, g_i(a)$
$g_i$ is a criterion
$w_i$ is a utility weight that represents the relative importance of each criterion
Decision: migrate VM to a PM with the highest value of the utility function
Region           Power outage (weight 2)    Energy price (weight 1)    Total utility
Europe           1                          0.2                        2.2
Asia             1                          0.2                        2.2
North America    1                          1                          3.0
South America    0.3                        1                          1.6
Table 5: Weighted utilities of different actions according to MCDA
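The totals in Table 5 follow from a weighted sum of the two criteria, with weight 2 for power outage and weight 1 for energy price. The sketch below recomputes them and picks the action with the highest utility; the variable and function names are illustrative, not taken from the authors' implementation.

```python
# Weights and criterion values as listed in Table 5.
WEIGHTS = {"power_outage": 2.0, "energy_price": 1.0}

CANDIDATES = {
    "Europe": {"power_outage": 1.0, "energy_price": 0.2},
    "Asia": {"power_outage": 1.0, "energy_price": 0.2},
    "North America": {"power_outage": 1.0, "energy_price": 1.0},
    "South America": {"power_outage": 0.3, "energy_price": 1.0},
}


def utility(scores, weights=WEIGHTS):
    """U(a) = sum_i w_i * g_i(a) for one candidate action."""
    return sum(weights[name] * scores[name] for name in weights)


def best_action(candidates):
    """Return the candidate with the highest utility."""
    return max(candidates, key=lambda name: utility(candidates[name]))


if __name__ == "__main__":
    for name, scores in CANDIDATES.items():
        print(f"{name}: {utility(scores):.1f}")
    print("Chosen:", best_action(CANDIDATES))
```

For the values in Table 5 this selects North America, the row with total utility 3.0.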
14. Summary of the proposed approach
Problem: which PM should be used for allocation/migration of a VM?
Observations: Data center location, PM & VM resources utilization, temperature, cooling mode, electricity price, power outage statistics
Hidden factors: Dirty Page Rate (DPR), partial power usage effectiveness (pPUE), possibility of VM downtime
Criteria: VM unavailability (𝑔1), PM power consumption (𝑔2), PM CPU utilization (𝑔3), VM migration duration (𝑔4), energy price (𝑔5)
Decision Actions: Allocate/Migrate VM, Switch On/Off PM
Fig. 6: A snapshot of the designed Bayesian Network
15. Evaluation input data
Used real data traces:
• temperature (http://forecast.io/)
• cooling modes (Mechanical, Air, Mixed)
• power outage statistics
• electricity prices
• PM power specifications (SPECpower benchmark)
Fig. 7: Power outage statistics [1]
Fig. 8a: Temperature data traces
Fig. 8b: Cooling modes
Fig. 8c: Energy prices
[1] http://earlywarn.blogspot.co.at/2013/05/international-power-outage-comparisons.html
16. Evaluation Setup
• Simulation period: 1 month (January 1, 2013 – February 1, 2013)
• Interval: 1 hour
• VM specs: 1000 MIPS, 768 MB RAM
• PM specs: 3000 MIPS, 4 GB RAM (HP ProLiant ML110 G3)
Table 6: Evaluation setup configuration
Fig. 9: PM power specifications [1]
[1] http://www.spec.org/power_ssj2008/results/res2011q1/power_ssj2008-20110127-00342.html
17. Evaluation baseline algorithms
• No Migration (NoM): First-Fit VM allocation without migration
• First-Fit-Decreasing (FFD)
– Agreed (FFD-A): resources agreed by the SLA
– Requested (FFD-R): resources required by the VM at runtime
• Bayesian Network Decision Model (BN)
– Last Workload policy (BN-LW): the next workload value equals the last one
– Trend Workload policy (BN-TW): values follow a certain linear trend
– Linear Regression Workload (BN-LRW): applying linear regression on data
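The three workload prediction policies can be illustrated as one-step-ahead predictors over a list of past utilization samples. The slides do not give the exact formulas, so this is a sketch under stated assumptions: the trend policy extrapolates the difference between the last two samples, and the regression policy fits an ordinary least-squares line over the whole history. All function names are hypothetical.

```python
def predict_last(history):
    """Last Workload (BN-LW): the next value equals the last observed one."""
    return history[-1]


def predict_trend(history):
    """Trend Workload (BN-TW), assumed here as last value plus the last step."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])


def predict_linear_regression(history):
    """Linear Regression Workload (BN-LRW): least-squares line, extrapolated one step."""
    n = len(history)
    if n < 2:
        return history[-1]
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # prediction for the next time index


if __name__ == "__main__":
    cpu_history = [10, 30, 20, 40]  # hypothetical CPU utilization samples (%)
    print(predict_last(cpu_history))
    print(predict_trend(cpu_history))
    print(predict_linear_regression(cpu_history))
```

On a noisy series the three policies diverge: the trend policy reacts strongly to the last step, while the regression smooths over the whole window.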
Table 7: CPU provisioning for different allocation strategies
Strategy    Provisioned    Utilized    Agreed
FFD-A       4 CPU          2 CPU       >= 4 CPU
FFD-R       2 CPU          2 CPU       >= 4 CPU
Note: NoM and FFD are the baseline algorithms; the BN variants are the proposed approach (workload prediction policies).
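A First-Fit-Decreasing baseline can be sketched as follows. The PM capacity is taken from the evaluation setup (3000 MIPS, 4 GB RAM); sorting VMs by CPU demand and the runtime demand of 500 MIPS in the usage example are assumptions made for illustration. FFD-A and FFD-R differ only in which demand is passed in: the SLA-agreed resources or the resources requested at runtime.

```python
from dataclasses import dataclass, field


@dataclass
class PM:
    """Physical machine with the capacity used in the evaluation setup."""
    mips: float = 3000.0
    ram_mb: float = 4096.0
    vms: list = field(default_factory=list)  # list of (mips, ram_mb) demands

    def fits(self, vm):
        used_mips = sum(v[0] for v in self.vms)
        used_ram = sum(v[1] for v in self.vms)
        return used_mips + vm[0] <= self.mips and used_ram + vm[1] <= self.ram_mb


def first_fit_decreasing(vm_demands):
    """Place VMs, largest CPU demand first, on the first PM that has room."""
    pms = []
    for vm in sorted(vm_demands, key=lambda v: v[0], reverse=True):
        target = next((pm for pm in pms if pm.fits(vm)), None)
        if target is None:
            target = PM()
            if not target.fits(vm):
                raise ValueError(f"VM demand {vm} exceeds the capacity of one PM")
            pms.append(target)
        target.vms.append(vm)
    return pms


if __name__ == "__main__":
    agreed = [(1000, 768)] * 4    # FFD-A: provision what the SLA agrees
    requested = [(500, 768)] * 4  # FFD-R: hypothetical runtime demand
    print(len(first_fit_decreasing(agreed)), "PMs for FFD-A")
    print(len(first_fit_decreasing(requested)), "PMs for FFD-R")
```

Provisioning by runtime demand packs the same VMs onto fewer PMs, which mirrors the Provisioned column of Table 7.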
18. Evaluation Results
• Improvements in total costs
– up to 69% in comparison to NoM
– up to 45% in comparison to FFD-R, which performs fewer migrations
– up to 18% in comparison to FFD-A, which performs more migrations
• Trend Workload policy
– the best results
• Linear Regression Workload policy
– increases the cost efficiency
– causes more SLA violations
Fig. 9: Evaluation results
19. Conclusion & Future work
Summary
• cost-aware VM placement approach that leverages domain knowledge to reduce the energy cost
– up to 69% in comparison to NoM, the 1st baseline algorithm
– up to 45% in comparison to FFD, the 2nd baseline algorithm
• evaluated in a novel simulation framework, CloudNet, with a rich set of cloud simulation features
Ongoing work
• enhancement of VM placement by using more workload prediction techniques
• utilization of hybrid Bayesian Networks to use continuous (analog) data
20. Thank you for your attention!
Dmytro Grygorenko: dmitrygrig@gmail.com
at.linkedin.com/in/dmitrygrig
Soodeh Farokhi: soodeh.farokhi@tuwien.ac.at
www.infosys.tuwien.ac.at/staff/sfarokhi
at.linkedin.com/in/soodehfa