Estimation of the minimum time of executing tasks at optimal distribution of load among processors
Author: Andrey Karpov
Annotation. The article briefly reviews methods for estimating the minimum time of executing tasks under optimal distribution of
load among processors. The methods can be used for both homogeneous and heterogeneous computer systems.
To the reader
This document is part of a series of articles devoted to creating high-quality and efficient software solutions for
modern 64-bit multi-core systems. Other articles are available at http://www.viva64.com.
Introduction
Despite the great computational power of modern computers, there are tasks whose solution in sequential mode takes a long
time. The time needed to solve such tasks can be greatly reduced by using the capabilities of modern multi-core processors.
To take full advantage of these processors, the algorithms for solving tasks must be reworked to allow parallel data
processing by several processors simultaneously. It is also important to distribute calculations in such a way that each
processor is used as fully as possible and the total time of solving a task tends to the minimum.
The article reviews methods for estimating the minimum time of executing tasks under their optimal distribution
among computational nodes. It takes into account situations where several parallel tasks are executed on one system and
consume part of the resources of the computational nodes. In this case the system is considered heterogeneous (anisotropic)
with respect to the program we are interested in.
1. Independent calculations of equal difficulty on homogeneous computational nodes
Suppose we have N independent calculations of equal difficulty. We need to distribute them among P processors which
have equal computational power (figure 1).
The natural solution of this task is to assign $N/P$ calculations to each processor.
Figure 1. Distribution of independent calculations of equal difficulty on homogeneous computational nodes (the figure shows calculations 1 to 11 distributed among Processors 1, 2 and 3).
But this solution is proper only if N is divisible by P ($N \bmod P = 0$). Otherwise there remain from 1 to $P - 1$
unassigned calculations. It would be a mistake to assign all the remaining calculations to one processor, because in this case
the time of termination of all the calculations (measured in units of one calculation) equals
$\lfloor N/P \rfloor + N \bmod P$. It is better to distribute the remaining calculations one by one among $N \bmod P$
processors. In this case the time of termination of all the calculations equals $\lfloor N/P \rfloor + 1$. It is obvious that
$\lfloor N/P \rfloor + 1 \le \lfloor N/P \rfloor + N \bmod P$, which is why the second method can be much better.
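The comparison above can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for the illustration, and time is measured in units of one calculation.

```python
def naive_time(n, p):
    """Completion time when all leftover calculations go to one processor."""
    return n // p + n % p


def balanced_time(n, p):
    """Completion time when leftovers are spread one per processor."""
    return n // p + (1 if n % p else 0)


def balanced_counts(n, p):
    """Number of calculations per processor for the balanced scheme."""
    base, extra = divmod(n, p)
    # The first `extra` processors receive one additional calculation.
    return [base + 1 if i < extra else base for i in range(p)]
```

For N = 11 and P = 3, `naive_time` gives 5, `balanced_time` gives 4, and `balanced_counts` gives `[4, 4, 3]`.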
2. Independent calculations of equal difficulty on heterogeneous computational nodes
Suppose we have N independent calculations of equal difficulty. We need to distribute them among P processors which
have different computational powers $p_i$, $i = 1, \dots, P$ (figure 2).
Figure 2. Distribution of independent calculations of equal difficulty on heterogeneous nodes (the figure shows calculations 1 to 11 distributed between Processors 1 and 2).
The time the processor with computational power $p_i$ spends on executing one calculation equals $1/p_i$. Thus, we need to
split the calculations into P groups with $n_i$, $i = 1, \dots, P$, calculations in each so that the time of termination of
all the calculations is minimal, i.e.:

$$\max_{i = 1, \dots, P} \frac{n_i}{p_i} \to \min, \qquad \sum_{i=1}^{P} n_i = N$$
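One possible way to obtain such a split is a greedy assignment: each calculation is given to the processor that would finish it earliest. The sketch below is an illustration and not a method taken from the article. The name `distribute_equal_tasks` and the representation of powers as a list of positive numbers are assumptions.

```python
import heapq


def distribute_equal_tasks(n, powers):
    """Split n equal calculations among processors with the given powers.

    Returns (counts, makespan), where counts[i] is n_i and makespan is
    max(n_i / p_i).
    """
    counts = [0] * len(powers)
    # Heap entry: (finish time if this processor takes one more task, index).
    heap = [(1.0 / p, i) for i, p in enumerate(powers)]
    heapq.heapify(heap)
    for _ in range(n):
        _, i = heapq.heappop(heap)
        counts[i] += 1
        heapq.heappush(heap, ((counts[i] + 1) / powers[i], i))
    makespan = max(c / p for c, p in zip(counts, powers))
    return counts, makespan
```

For example, `distribute_equal_tasks(11, [2, 1])` assigns 8 calculations to the faster processor and 3 to the slower one, with a completion time of 4.0.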
3. Independent calculations of different difficulty on homogeneous computational nodes
Suppose we have N independent calculations of different difficulty $c_i$, $i = 1, \dots, N$. We need to distribute them among P
processors which have equal computational power (figure 3).
Figure 3. Distribution of independent calculations of different difficulty on homogeneous computational nodes (the figure shows calculations 1 to 11 distributed among Processors 1, 2 and 3).
For the time of termination of all the calculations to be minimal, all P processors must be loaded as evenly as possible,
that is, each processor should be assigned calculations of approximately equal total difficulty.
Thus, the task comes to the following: it is necessary to split the calculations into P groups with $n_i$ calculations in each,
$i = 1, \dots, P$, so that:

$$\max_{i = 1, \dots, P} \sum_{j=1}^{n_i} c_{ij} \to \min,$$

where $c_{ij}$ is the difficulty of the j-th calculation in the i-th group (which contains $n_i$ calculations).
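Finding the exact minimum of this expression is hard in general (it is the classical multiprocessor scheduling problem), so in practice a heuristic is often used. The sketch below shows one well-known heuristic, "longest processing time first" (LPT). It is not taken from the article, it does not guarantee the optimum, and the name `lpt_schedule` is an assumption.

```python
import heapq


def lpt_schedule(difficulties, p):
    """LPT heuristic: take calculations in decreasing difficulty and give
    each one to the processor with the smallest current load.

    Returns (groups, makespan), where groups[i] lists the difficulties
    assigned to processor i.
    """
    groups = [[] for _ in range(p)]
    heap = [(0, i) for i in range(p)]  # (current load, processor index)
    heapq.heapify(heap)
    for c in sorted(difficulties, reverse=True):
        load, i = heapq.heappop(heap)
        groups[i].append(c)
        heapq.heappush(heap, (load + c, i))
    return groups, max(sum(g) for g in groups)
```

For difficulties `[7, 5, 4, 3, 3, 2]` and three processors the heuristic produces group sums of 9, 8 and 7, so the estimated completion time is 9.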
4. Independent calculations of different difficulty on heterogeneous computational nodes
Suppose we have N independent calculations of different difficulty $c_i$, $i = 1, \dots, N$. We need to distribute them among P
processors which have different computational powers $p_i$, $i = 1, \dots, P$ (figure 4).
Figure 4. Distribution of independent calculations of different difficulty on heterogeneous computational nodes (the figure shows calculations 1 to 11 distributed between Processors 1 and 2).
The time the processor with computational power $p_i$ spends on executing one calculation with difficulty $c_j$ equals $c_j / p_i$.
For the minimum time of termination of all the calculations it is necessary that all the processors end calculations
approximately at the same time.
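Thus, the task comes to the following: it is necessary to split the calculations into P groups with $n_i$ calculations in each,
$i = 1, \dots, P$, so that:

$$\max_{i = 1, \dots, P} \frac{\sum_{j=1}^{n_i} c_{ij}}{p_i} \to \min,$$

where $c_{ij}$ is the difficulty of the j-th calculation in the i-th group (which contains $n_i$ calculations).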
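A simple greedy heuristic for this case assigns each calculation, largest first, to the processor that would complete it earliest given its power and current load. This is only an illustrative sketch: it is not the article's method, it does not guarantee the minimum, and the name `greedy_heterogeneous` is an assumption.

```python
def greedy_heterogeneous(difficulties, powers):
    """Assign calculations (largest first) to the processor that would
    finish them earliest.

    Returns (groups, makespan), where makespan is max(sum(c_ij) / p_i).
    """
    loads = [0.0] * len(powers)  # total difficulty assigned to each processor
    groups = [[] for _ in powers]
    for c in sorted(difficulties, reverse=True):
        # Pick the processor with the earliest finish time for this task.
        i = min(range(len(powers)), key=lambda k: (loads[k] + c) / powers[k])
        groups[i].append(c)
        loads[i] += c
    makespan = max(load / p for load, p in zip(loads, powers))
    return groups, makespan
```

With difficulties `[6, 4, 3, 2, 1]` and powers `[2, 1]` the sketch yields the groups `[6, 3, 2]` and `[4, 1]`, which finish at times 5.5 and 5.0.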
5. Dependent calculations of equal difficulty on homogeneous computational nodes
Suppose we have N dependent calculations such that step k of calculation i requires the result of step k of
calculation $i - 1$, i.e. $f_k(i) = g(f_k(i-1))$. Suppose also we have P processors which have equal computational power. Such
calculations can be performed simultaneously if we split all the calculations into P groups, in each of which the calculations
are performed sequentially and in the same order as the source calculations.
This section and the following ones have no illustrations because it is difficult to make them clear.
The task of distributing calculations among processors in this case can be formulated as follows: an ordered set of
calculations C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of elements, in such a way that:

$$\max_{i = 1, \dots, P-1} \bigl|\, |c_i| - |c_{i+1}| \,\bigr| \to \min,$$

where $|c_i|$ is the power (number of elements) of the subset $c_i$.
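For equal difficulties and equal powers the criterion is satisfied by contiguous blocks whose sizes differ by at most one. A minimal sketch follows; the name `split_ordered` is an assumption, and calculations are identified by their indices 0 to N-1.

```python
def split_ordered(n, p):
    """Split the ordered calculations 0..n-1 into p contiguous ranges
    whose sizes differ by at most one."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for i in range(p):
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

For N = 11 and P = 3 this gives the blocks 0..3, 4..7 and 8..10, so the largest difference between neighbouring subset sizes is 1.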
6. Dependent calculations of equal difficulty on heterogeneous computational nodes
Suppose we have N dependent calculations such that step k of calculation i requires the result of step k of
calculation $i - 1$, i.e. $f_k(i) = g(f_k(i-1))$. Suppose also we have P processors which have different computational powers
$p_i$, $i = 1, \dots, P$.
The task of distributing calculations among processors in this case can be formulated as follows:
An ordered set of calculations C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of
elements, in such a way that:

$$\max_{i = 1, \dots, P-1} \left| \frac{|c_i|}{p_i} - \frac{|c_{i+1}|}{p_{i+1}} \right| \to \min,$$

where $|c_i|$ is the power (number of elements) of the subset $c_i$. That is, the maximum difference in the time of performing
calculations in neighboring subsets must be minimal.
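A natural starting point is to make the subset sizes proportional to the processor powers and then evaluate the criterion above. The sketch below rounds the proportional shares with the largest-remainder rule. This is a heuristic chosen for illustration, not the article's algorithm, and it does not guarantee the minimum. The names `proportional_sizes` and `max_neighbor_gap` are assumptions, and a very slow processor may receive an empty subset.

```python
import math


def proportional_sizes(n, powers):
    """Subset sizes |c_i| roughly proportional to the powers p_i."""
    total = sum(powers)
    quotas = [n * p / total for p in powers]
    sizes = [math.floor(q) for q in quotas]
    # Give the remaining calculations to the largest fractional parts.
    by_fraction = sorted(range(len(powers)),
                         key=lambda i: quotas[i] - sizes[i], reverse=True)
    for i in by_fraction[: n - sum(sizes)]:
        sizes[i] += 1
    return sizes


def max_neighbor_gap(sizes, powers):
    """Value of the criterion: max | |c_i|/p_i - |c_{i+1}|/p_{i+1} |."""
    times = [s / p for s, p in zip(sizes, powers)]
    return max((abs(a - b) for a, b in zip(times, times[1:])), default=0.0)
```

For N = 11 and powers `[2, 1, 1]` the sizes are `[5, 3, 3]`, the subset times are 2.5, 3 and 3, and the criterion equals 0.5. Because the order must be preserved, the first 5 calculations go to the first processor, the next 3 to the second, and so on.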
7. Dependent calculations of different difficulty on homogeneous computational nodes
Suppose we have N dependent calculations such that step k of calculation i requires the result of step k of
calculation $i - 1$, i.e. $f_k(i) = g(f_k(i-1))$. Each calculation i has an associated difficulty $w_i$. Suppose also we have P
processors which have equal computational powers.
The task of distributing calculations among processors in this case can be formulated as follows:
An ordered set of calculations C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of
elements, in such a way that:

$$\max_{i = 1, \dots, P-1} \left| \sum_{j \in c_i} w_j - \sum_{j \in c_{i+1}} w_j \right| \to \min,$$

where $\sum_{j \in c_i} w_j$ is the total difficulty of the calculations that make up the subset $c_i$.
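As a small worked illustration of this criterion (the numbers are assumed for the example and do not come from the article), take five ordered calculations with difficulties $w = (3, 1, 2, 2, 4)$ and $P = 2$ processors. Preserving the order leaves four possible positions for the single cut:

```latex
\begin{array}{cccc}
\text{cut after calculation} & \sum_{j \in c_1} w_j & \sum_{j \in c_2} w_j &
\left| \sum_{j \in c_1} w_j - \sum_{j \in c_2} w_j \right| \\
1 & 3 & 9 & 6 \\
2 & 4 & 8 & 4 \\
3 & 6 & 6 & 0 \\
4 & 8 & 4 & 4
\end{array}
```

The criterion reaches its minimum when the cut is placed after the third calculation, which gives the subsets $c_1 = (3, 1, 2)$ and $c_2 = (2, 4)$ with equal total difficulty 6.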
8. Dependent calculations of different difficulty on heterogeneous nodes
Suppose we have N dependent calculations such that step k of calculation i requires the result of step k of
calculation $i - 1$, i.e. $f_k(i) = g(f_k(i-1))$. Each calculation i has an associated difficulty $w_i$.
Suppose also we have P processors which have different computational powers $p_i$, $i = 1, \dots, P$.
The task of distributing calculations among processors in this case is formulated as follows: an ordered set of calculations
C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of elements, in such a way that:

$$\max_{i = 1, \dots, P-1} \left| \frac{\sum_{j \in c_i} w_j}{p_i} - \frac{\sum_{j \in c_{i+1}} w_j}{p_{i+1}} \right| \to \min,$$

where $\sum_{j \in c_i} w_j$ is the total difficulty of the calculations that make up the subset $c_i$.
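For small N and P the criterion can be minimised by simply trying every placement of the $P - 1$ cut points. The sketch below does exactly that. It is an exhaustive search written for illustration, not the article's method, and it is only practical for small inputs. The name `best_ordered_split` is an assumption, as is the requirement that every subset be non-empty (so N must be at least P).

```python
from itertools import combinations


def best_ordered_split(weights, powers):
    """Exhaustively search contiguous splits of `weights` into len(powers)
    non-empty subsets, minimising the largest difference in execution
    time between neighbouring subsets.

    Returns (groups, gap).
    """
    n, p = len(weights), len(powers)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)  # prefix[k] = w_1 + ... + w_k
    best_bounds, best_gap = None, float("inf")
    for cuts in combinations(range(1, n), p - 1):
        bounds = (0, *cuts, n)
        times = [(prefix[b] - prefix[a]) / powers[i]
                 for i, (a, b) in enumerate(zip(bounds, bounds[1:]))]
        gap = max((abs(x - y) for x, y in zip(times, times[1:])), default=0.0)
        if gap < best_gap:
            best_bounds, best_gap = bounds, gap
    groups = [weights[a:b] for a, b in zip(best_bounds, best_bounds[1:])]
    return groups, best_gap
```

For difficulties `[3, 1, 2, 2, 4]` and powers `[2, 1]` the search returns the subsets `[3, 1, 2, 2]` and `[4]`, both of which take time 4, so the criterion equals 0. With equal powers `[1, 1]` the same function covers the homogeneous case.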
Additional sources:
1. M.V. Yakobovskiy, S.A. Sukov. Dynamic load balancing // Materials of the conference "High-performance calculations
and their applications", Chernogolovka, 2000, pp. 34-39.
2. V.P. Ivannikov, N.S. Kovalevskiy, V.M. Metelskiy. Of minimum time of implementing competitive processes in
synchronous operations // Programming, 2000, No. 5, pp. 44-52.
3. E. Tanenbaum. Distributed systems. Principles and paradigms. St. Petersburg: Piter, 2003. 877 p.
4. A.A. Bukatov, V.N. Datsuk, A.I. Zhegulo. Programming of multi-processor computer systems. Rostov-on-Don: Publishing
House OOO "VCRU", 2003. 208 p.
5. S.A. Nemnyugin, O.L. Stesik. Parallel programming for multi-processor computer systems. St. Petersburg:
BHV-Peterburg, 2002. 400 p.
About the Author
Andrey Nikolaevich Karpov, http://www.viva64.com
Develops software solutions for improving the quality and performance of resource-intensive applications. One of the
developers of the Viva64 static analyzer for verifying 64-bit software. Participates in developing the VivaCore open library
for working with C/C++ code.