Abstract: The increasing energy consumption of Physical Machines (PMs) in cloud data centers is now a major problem: it has a negative impact on the environment while at the same time increasing the operational costs of data centers. This fosters the development of more energy-efficient scheduling approaches. In this study, we examine the gaps in current knowledge about energy efficiency in cloud data centers.
A Study on Task Scheduling in Cloud Data Centers for Energy Efficiency
1. A Study on Task Scheduling in Cloud Data Centers for Energy Efficiency
By:
Ehsan Sharifi Esfahani
Awday Korde
Viktor Bakayov
2. Outline
Motivations
Possible approaches
Different taxonomies
Challenges
Main reasons for high energy consumption in data centers
Energy proportionality
Server utilization
Solutions for the problem
Aggregation
DVFS
Trend in Task Scheduling
System Modeling to Design a New Task Scheduler
Energy Modeling
ECS Algorithm
Possible Future Work
Conclusions
3. Motivations
Cost
In 2013, data centers in the US consumed about 91 billion kilowatt-hours of
electricity, enough to power all the households in New York City for two
years. Moreover, this figure is projected to reach 140 billion kilowatt-hours
by 2020, or about $11 billion in electricity costs.
Environmental Impact
CO2 emissions from data centers worldwide are estimated to increase from
80 megatons in 2007 to 340 megatons in 2020, more than double the CO2
emissions of the Netherlands (145 MT).
Improving system performance and usage
High energy consumption has negative impacts on the density, reliability,
and scalability of data center hardware.
4. Global Cloud Data Center IP Traffic Trend
• As cloud traffic increases, the concern about energy
consumption also rises
5. Which actions can be taken to reduce
energy consumption?
Simply turning off unused devices.
Reducing unnecessary traffic.
Using energy-aware algorithms.
Adjusting fan speed to server load and temperature.
Using nano data centers (still under study).
Using more energy-efficient components and elements (we call this
"element-density modification"). Our suggestion!
Etc …
6. Different taxonomies for possible
approaches
Power-aware
Power-aware technologies either use low power energy-efficient
hardware equipment, or reduce energy usage based on the knowledge of
current resource utilization and application workloads.
Thermal-aware (explicitly includes additional parameters)
Thermal-aware approaches take temperature into account in their energy
model.
The temperature depends on the power consumption of each processing
element, its dimensions, and its relative location on the embedded system
platform.
The goal of thermal-aware approaches is to minimize peak air inlet
temperature resulting in minimization of the cost of cooling.
7. Single (single optimization problem)
The aim is only to reduce energy consumption.
Composite (multi-objective optimization problem)
The energy function deals with minimizing energy consumption while improving
another parameter, such as QoS or the execution time of tasks.
Our Suggestion
Software-based approaches
Dynamic power management (DPM)
Energy efficient task scheduling
Hardware-based approaches
Energy efficient hardware
Energy efficient topologies
8. Challenges
Heterogeneous network architectures.
Trade-off between performance, scalability, availability, reliability
and energy consumption.
Designing energy-aware algorithms means moving to multi-optimization
problems, and thereby to more complex algorithms and more energy
consumption.
Significant changes are needed in most applications and equipment, and
how to incorporate these new energy-aware protocols into real systems is
still an open problem.
10. Main reasons for high energy consumption
in servers
Energy proportionality (inefficient trend in server
energy usage)
Utilization
11. Server utilization
The figure shows the average CPU utilization of 5,000 Google servers during
a six-month period in 2007.
A common trend is that, on average, servers spend relatively little aggregate time
at high load levels. Instead, most of the time is spent within the 10–50% CPU
utilization range.
13. Dynamic Voltage and Frequency Scaling
(DVFS)
P_CPU = P_static + P_dynamic
P_dynamic = A C V² f = α V² f
E = P / f
f = k (V − V_th)² / V_dd
P ≈ V³,  E ≈ V²,  T ≈ 1/f ≈ 1/V

C = capacitive load
A = average number of switches of the circuit per unit time
V = supply voltage
f = working frequency
14. Dynamic Voltage and Frequency Scaling
For example, suppose a program runs at frequency f for 10 seconds, and
its power and energy consumption during that period are P and E,
respectively. If the voltage is halved, the frequency is also halved,
and the execution time becomes 20 seconds. The power and energy
consumption then become:

P′ = α (V/2)² (f/2) = (1/8) α V² f = (1/8) P
E′ = P′ · t′ = (1/8) P · 20 = (5/2) P = (5/2) · (E/10) = (1/4) E
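The arithmetic above can be checked with a minimal sketch, assuming the slide's dynamic power model P = αV²f and a linear frequency-voltage relationship; the function name and sample numbers are illustrative:

```python
# Numeric check of the DVFS example above, assuming the slide's dynamic
# power model P = alpha * V^2 * f and frequency proportional to voltage.
# The function name and sample wattage are illustrative assumptions.

def dvfs_scale(power, runtime, v_scale):
    """Scale voltage by v_scale and return the new (power, energy, runtime).

    With f proportional to V, power scales as v_scale**3 and runtime
    as 1 / v_scale.
    """
    new_power = power * v_scale ** 3      # alpha * (sV)^2 * (sf) = s^3 * P
    new_runtime = runtime / v_scale       # lower frequency, longer run
    new_energy = new_power * new_runtime  # E' = P' * t'
    return new_power, new_energy, new_runtime

# Halving the voltage, as in the slide: P drops to P/8, E to E/4, t doubles.
print(dvfs_scale(8.0, 10.0, 0.5))  # → (1.0, 20.0, 20.0)
```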
15. Trend in Task Scheduling on Physical
Machines (PMs) (Our suggestion)
Single machine, single core per processor
Which tasks should be selected?
Single machine, multiple processors or cores
Which task, on which processor or core? (Especially in a heterogeneous environment)
Single machine, multiple cores and multiple processors
Which task, on which core and processor?
Multiple machines, multiple processors, multiple cores
Which task, on which machine, on which processors, on which cores?
Multiple machines, multiple processors, multiple cores with DVFS-enabled
capability
Which task, on which machine, on which processor and core, at which frequency?
(Or which VM at which frequency; in many-core systems, the dark silicon
property should also be taken into account.)
16. Task Scheduling
Static (Offline)
Scheduling is usually done before compile time; the characteristics of the
program are known before execution. This method does not cause any overhead
on the system during runtime.
Dynamic (Online)
Task characteristics are not known before execution, so scheduling has to be
done at runtime according to the state of the system. This strategy is
suitable for independent tasks.
17. System Modeling to Design a New Task
Scheduler
Machines Model:
Homogeneous, Heterogeneous
Fully connected or not
Whether processors are multi-core or many-core
Whether DVFS is available per core or per processor
Etc.
Application Model:
Whether each service runs serially or is multi-task or multi-threaded
Arrival rate
Whether the duration is known in advance
Etc.
18. Virtual machine model
How many VMs are needed for each arriving service request?
Whether each VM can run in its own specific frequency range (f_s^min, f_s^max)
or all VMs use the same protocol.
How many cores will be occupied by each VM?
If VM_{n,s}^t denotes the n VMs of type s at time t, then:
Σ_{x ∈ VM_{n,s}^t} x ≤ Total resources
Workload model
Non-periodic
Independent
Arrivals have the Poisson process property
Server selection model
Goal (performance, time complexity, improving efficiency metrics such as
energy consumption, or communication cost)
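The resource constraint on slide 18 can be sketched as a simple feasibility check, interpreting it as an aggregate limit on the cores occupied by all active VMs; the VM types, per-type core counts, and the function name are illustrative assumptions:

```python
# Minimal sketch of slide 18's resource constraint, read as an aggregate
# capacity check: the cores occupied by all VMs active at time t must not
# exceed the PM's total cores. The VM types, per-type core counts, and the
# function name `fits` are illustrative assumptions.

def fits(vm_counts, cores_per_type, total_cores):
    """vm_counts[s]: number of VMs of type s at time t;
    cores_per_type[s]: cores occupied by one VM of type s."""
    used = sum(n * cores_per_type[s] for s, n in vm_counts.items())
    return used <= total_cores

# Two hypothetical VM types on a 16-core physical machine.
print(fits({"small": 4, "large": 2}, {"small": 1, "large": 4}, 16))  # True  (4 + 8 = 12 cores)
print(fits({"small": 4, "large": 4}, {"small": 1, "large": 4}, 16))  # False (4 + 16 = 20 cores)
```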
19. Energy Model (First model and most common)
P_active = α v f²
P_other = (P_max − P_min) · u + P_min
P_i = P_other + P_active
TEC_t = Σ_{i=1}^{k} Σ_{j=1}^{n} (P_ji · x_ji + P_idle,ji · (1 − x_ji)) · y_i

P_idle,ji is the power consumed by each processor when it is in idle or sleep
mode.
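The linear utilization term P_other = (P_max − P_min)·u + P_min can be sketched as follows; the 100 W idle / 250 W peak figures are illustrative assumptions, not from the slides:

```python
# Sketch of the linear utilization power model from slide 19:
# P_other(u) = (P_max - P_min) * u + P_min, for utilization u in [0, 1].
# The 100 W idle / 250 W peak figures are illustrative assumptions.

def linear_power(u, p_min, p_max):
    """Power draw (watts) of a server at CPU utilization u."""
    return (p_max - p_min) * u + p_min

print(linear_power(0.0, 100.0, 250.0))  # idle server: 100.0 W
print(linear_power(1.0, 100.0, 250.0))  # fully loaded: 250.0 W
print(linear_power(0.3, 100.0, 250.0))  # inside the common 10-50% band
```

Note how even an idle server draws a large fraction of peak power, which is exactly the energy-proportionality problem named earlier in the deck.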
20. Energy Model (Second and Third Models)
p = n·e₁ + m·(e₂ − e₁)
n = number of idle servers; e₁ = amount of energy consumed by them
e₂ = amount of energy consumed by active servers
m = average number of active servers
Disadvantage: not very accurate!

Generic but accurate:
E = ∫₀^makespan p(u(t)) dt
where u(t) is the instantaneous PM utilization at time t and p(u(t)) is the
power consumption associated with u(t).
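The integral model can be approximated numerically, e.g. with a simple Riemann sum over sampled utilization values; the utilization trace and the linear power model used here are illustrative assumptions:

```python
# Numeric sketch of the "generic but accurate" model E = integral of
# p(u(t)) dt over the makespan, approximated as a Riemann sum over
# utilization samples taken every dt seconds. The utilization trace and
# the linear power model are illustrative assumptions.

def total_energy(utilization, p_of_u, dt=1.0):
    """Approximate the energy integral from sampled utilization values."""
    return sum(p_of_u(u) * dt for u in utilization)

power = lambda u: (250.0 - 100.0) * u + 100.0  # linear model from slide 19
trace = [0.2, 0.4, 0.4, 0.1]                   # u(t) sampled once per second
print(total_energy(trace, power))              # joules over a 4-second makespan
```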
21. Energy Conscious Scheduling Algorithm
(ECS)
First, all tasks are ordered according to the b-level_i parameter:

b-level_i = max_{path_m ∈ SPath_i} ( Σ_{n_j ∈ path_m} w_j + Σ_{e_k ∈ path_m} e_k )

Then, the tasks are scheduled one by one from the head of the queue and
assigned to the processor with the higher RS (Relative Superiority) value,
which indicates which processor, at which frequency, consumes less energy.
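The b-level ordering step can be sketched as a longest-path computation over the task DAG; the example DAG and its computation and communication costs below are illustrative assumptions, not from the slides:

```python
# Sketch of the b-level ordering used by ECS: the b-level of a task is the
# longest path (computation plus communication costs) from that task to an
# exit node of the DAG. The tiny DAG and its costs are illustrative
# assumptions, not from the slides.

def b_level(task, succ, w, c, memo=None):
    """succ[i]: successors of task i; w[i]: computation cost of task i;
    c[(i, j)]: communication cost on edge i -> j."""
    if memo is None:
        memo = {}
    if task in memo:
        return memo[task]
    if not succ.get(task):                       # exit node
        memo[task] = w[task]
    else:
        memo[task] = w[task] + max(c[(task, j)] + b_level(j, succ, w, c, memo)
                                   for j in succ[task])
    return memo[task]

succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
w = {"A": 2, "B": 3, "C": 1, "D": 2}
c = {("A", "B"): 1, ("A", "C"): 4, ("B", "D"): 1, ("C", "D"): 1}
order = sorted(w, key=lambda t: -b_level(t, succ, w, c))
print(order)  # tasks in decreasing b-level order: ['A', 'B', 'C', 'D']
```

Each task popped from this queue would then be placed on whichever processor/frequency pair maximizes its RS value.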
23. Conclusion and Future possible works
Task scheduling is an NP-complete problem.
Most of the proposed algorithms are designed for offline environments,
in which the DAG and the other parameters must already be determined.
In such environments, heuristic methods are a common way to solve the
problem.
So, what about unknown systems with an unknown DAG or unknown
parameters?
Our suggestion is to use MDPs and machine learning techniques.
Many standard energy efficiency techniques do not work for cloud
computing environments. This is due to the stratification of the cloud
computing infrastructure into widespread groups.
24. Conclusion and Future possible works
None of the previous works has clearly addressed the energy-efficient
resource management problem from an application engineering
perspective.
Little research has been done with more accurate energy models.
There is a big opportunity for research in real-time systems.
Most algorithms cannot be applied to many-core systems, so new ones
need to be designed for these environments.
Only one striking piece of research, a Ph.D. thesis from 2015, has so far
attempted to improve energy consumption across the whole cloud
environment!