Unit 2: Modeling of Programs
A function maps inputs to outputs in such a way that each input maps to at most
one output. An important class of problems is those that are solved by
implementing and then applying a function. For example, to compute the area of
a circle we supply its radius to a function, which, in turn, computes and returns the
area.
An algorithm is a written description of how to solve a problem using finite
resources (time and space). Algorithms consist of one or more steps, each of which
requires finite time and space. Its intended audience is one or more people, who
then translate the algorithm into a computer program.
Since translation can be expensive, and because certain programs must be
efficient in time and space, it is important to estimate the resources (time and
space) required to run the program. Estimating resources required for an algorithm
is the subject of study in analysis of algorithms.
How do we measure running time?
1. Run the program and time it, using either the internal CPU clock or an external
wall clock
2. Count the number of steps needed to complete the algorithm (an abstract
clock)
Three types of clocks
1. Wall clock (external)
2. CPU clock (internal)
3. Abstract clock (use of counters)
Comparisons of these three clocks
A. Wall clock
   Advantages:
   - Readily accessible
   - No special training is required
   - Measures real time
   Disadvantages:
   - Not very accurate for fast programs
   - Depends upon the computer load
   - Depends upon the language used
   - Depends upon the speed of the CPU
   - Depends upon the compiler used
   - Depends upon the input data
   - Depends upon the operating system
B. CPU clock
   Advantages:
   - All machines have one
   - More accurate
   - Separates out the CPU load
   Disadvantages:
   - Must learn how to access the clock
   - Depends upon the speed of the CPU
   - Depends upon the compiler used
   - CPU time depends upon the input data
   - Depends upon the operating system
C. Abstract clock
   Advantages:
   - Measures the number of steps a program takes to produce an answer
   - Independent of all of the above restrictions
   - Allows gross comparisons
   Disadvantages:
   - Ignores real time
   - Requires an idea of a "step"
   - Different steps may require different run times
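As a concrete illustration of the first two clocks, here is a minimal C++ sketch (the busy-work loop and constants are our choices, not from the notes) that reads the wall clock via std::chrono and the CPU clock via std::clock:

#include <chrono>
#include <ctime>
#include <iostream>

int main() {
    const int N = 100000000;        // size of a busy-work loop (our choice)
    long long sum = 0;

    auto wallStart = std::chrono::steady_clock::now();  // wall clock
    std::clock_t cpuStart = std::clock();               // CPU clock

    for (int i = 0; i < N; ++i)     // the "program" being timed
        sum = sum + i;

    std::clock_t cpuEnd = std::clock();
    auto wallEnd = std::chrono::steady_clock::now();

    std::chrono::duration<double> wall = wallEnd - wallStart;
    double cpu = double(cpuEnd - cpuStart) / CLOCKS_PER_SEC;

    std::cout << "result:    " << sum << '\n';   // print sum so the loop is not optimized away
    std::cout << "wall time: " << wall.count() << " s\n";
    std::cout << "CPU time:  " << cpu << " s\n";
}

On a loaded machine the wall time will typically exceed the CPU time, which is exactly the "depends upon the computer load" disadvantage listed above.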
Essence of a step:
A step is something that requires a (relatively) fixed amount of time.
Each of the following may be considered a single step in C++:
1. Assignment statement
2. Function calls
3. Procedure calls
4. Input
5. Output
6. Comparisons
Note that the calling of a function or a procedure is a single step, but its execution
may require more than one step.
Summation analysis of simple loops
Consider the following simple loop from the previous class to find the index of the
smallest element in an array:
int a[n];              // array has n elements 0 .. n-1
int i, smallind;
...
smallind = 0;          // guess the first index is the location of the smallest
i = 1;
while (i < n) {
    if ( a[i] < a[smallind] )
        smallind = i;  // correct the guess if it's wrong
    i = i + 1;
}
Analysis of the loop: version 1
This might be called synthetic analysis in that we first determine the running times
of the parts of the loop, and then use this analysis to determine the running time of
the whole loop.
1. Analyze the running time of the loop setup.
       smallind = 0;               // 1 step
       i = 1;                      // 1 step
                                   // 2 steps altogether
2. Analyze the running time of the loop body, during a single execution.
       Check (i < n)               // 1 step
       if ( a[i] < a[smallind] )   // 1 step
           smallind = i;           // 1 step
       i = i + 1;                  // 1 step
                                   // 4 steps max, executing the body
   Note that we are taking a pessimistic view that the execution of the body will
   always execute the smallind = i; statement. Even though this is true for some
   arrays (e.g. when the array is sorted in descending order), it will not be true for
   most arrays. However, we want our analysis to reveal any potential problems
   with the code, so we take the worst-case assumption that the condition will be
   true each time.
3. Analyze the overall runtime behavior of the loop.
   Now we can analyze the entire execution of the loop by recognizing that the
   loop merely causes a block of 4 steps to be executed. The execution pattern
   of the body of the loop is therefore (in its worst case):
       1 + 1 + 1 + 1 = 4 steps
   and therefore the worst case of all executions of the loop will be:
       4 + 4 + 4 + 4 + ... + 4 steps
   where the number of terms in this expression is determined by the number of
   times the loop executes. To determine this, look at the initialization,
   incrementation and termination components of the loop.
       i = 1;
       while (i < n) {
           ...
           i = i + 1;
       }
   In this case, i starts at 1 and is incremented by 1 until it reaches n. This then
   yields this summation:
       sum(i=1, n-1)(4) = 4(n-1) = 4n - 4
   Thus this loop executes 4n - 4 steps while running the loop body.
   We note also that the final test of the loop condition for the loop to exit is not
   included in the analysis above, so the total contribution of the loop to the
   steps of the program is 1 greater: 4n - 3 steps contributed by the loop.
4. Determine the overall runtime behavior of the code.
   To get the total number of steps for the whole program, we must add 2 to the
   above answer for the first two statements in the code.
   Thus the total number of steps is (4n - 3) + 2 = 4n - 1 steps.
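The abstract count can be checked mechanically. The following sketch (our own instrumentation, not part of the notes) counts one step per assignment and comparison in the find-the-minimum loop; the array is filled in descending order so the worst case actually occurs, and the total is compared with 4n - 1:

#include <iostream>

int main() {
    const int n = 10;
    int a[n];
    for (int k = 0; k < n; ++k)
        a[k] = n - k;              // descending order forces the worst case

    long steps = 0;
    int smallind = 0;  ++steps;    // assignment: 1 step
    int i = 1;         ++steps;    // assignment: 1 step
    while (++steps, i < n) {       // loop test: 1 step per evaluation
        ++steps;                   // if-comparison: 1 step
        if (a[i] < a[smallind]) {
            smallind = i;  ++steps;   // assignment: 1 step
        }
        i = i + 1;  ++steps;       // assignment: 1 step
    }
    std::cout << "counted steps: " << steps
              << "   formula 4n - 1: " << 4 * n - 1 << '\n';
}

For n = 10 this prints 39 for both, matching the derivation above.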
Analysis of the loop: version 2
A second way of approaching the analysis of the program is to analyze the
contribution of each statement to the overall run-time behavior of the program,
and then to sum all these contributions.
(1) smallind = 0;
(2) i = 1;
(3) while (i < n) {
(4)     if ( a[i] < a[smallind] )
(5)         smallind = i;   // correct the guess if wrong
(6)     i = i + 1;
    }
A quick inspection of the program indicates that statements (4) and (6) will
execute the same number of times. So will (5), if we again employ our worst-case
assumption. Looking again at the loop control, the count for any one of these is
thus:
    n - 1 steps executing the loop
Thus, these three statements contribute
    3(n - 1) = 3n - 3 steps
to the overall running time of the program.
Additionally, (3) will execute one more time than the other members of the loop,
because it does one final test to terminate the loop, giving n steps in total for (3).
Finally, the first two statements of the code must be included for the total:
    (3n - 3) + n + 2 = 4n - 1 steps
Summary
The analysis above can be boiled down to the following assertion:
    The run-time behavior of "find the minimum" is linear in n, the size of the
    portion of the array to be searched.
Since "step" covers many different types of statements, nothing more specific can
actually be stated. For example, if one algorithm requires fewer steps than
another, as tempting as it may be, we cannot compare the two and say that the
first is "faster" than the second. Why? The reason is that the particular mixture of
steps in the one algorithm may be different than the mixture of steps in the other,
and it is the particular steps that determine its real run-time behavior in a given
environment. Thus, it is entirely possible for the second to be faster than the first,
despite the apparent opposite.
Some common linear algorithms
Linear search
Linear search begins at one end and searches through until the other end is
reached. This may be done in the forward direction (moving up through the array
indices) or in the backward direction (moving down through the array indices).
1. Forward search
// "key" is what is being searched for
{   i = 0;
    while (i < n && key != a[i])
        i = i + 1;
    if (i < n)
        // key is found
    else
        // key is not found
}
2. Backward search
// "key" is what is being searched for
{   i = n - 1;
    while (i >= 0 && key != a[i])
        i = i - 1;
    if (i >= 0)
        // key is found
    else
        // key is not found
}
A general linear search routine follows this pattern:
    start at one end of a list of values
    while not at other end and element not found
        move toward other end
    if still on the list of values
        // found
    else
        // not found
If the list of values to be searched is of size n, then if you double n the running
time will approximately double.
Binary search
If our array is sorted we can apply binary search, which works by successively
splitting the array in half, looking only on the side where the search key might be
found.
// "key" is what is being searched for
{   lb = 0;                    // lower bound -- first element
    ub = n - 1;                // upper bound -- last element
    found = false;
    while (lb <= ub && !found) {
        mdpt = (lb + ub) / 2;  // calculate midpoint
        if (key < a[mdpt])
            ub = mdpt - 1;
        else if (key > a[mdpt])
            lb = mdpt + 1;
        else
            found = true;      // get out of the loop
    }
    if (found)
        // Processing when the key is found
    else
        // Processing when the key is not found
}
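For testing, the search can be wrapped in a self-contained function; the function name and the -1 "not found" convention below are our choices, not part of the notes:

#include <iostream>

// Returns the index of key in the sorted array a[0..n-1], or -1 if absent.
int binarySearch(const int a[], int n, int key) {
    int lb = 0, ub = n - 1;
    while (lb <= ub) {
        int mdpt = (lb + ub) / 2;   // midpoint of the remaining region
        if (key < a[mdpt])
            ub = mdpt - 1;          // search the lower half
        else if (key > a[mdpt])
            lb = mdpt + 1;          // search the upper half
        else
            return mdpt;            // found
    }
    return -1;                      // not found
}

int main() {
    int a[] = {3, 7, 11, 19, 23, 31, 42};
    int n = sizeof(a) / sizeof(a[0]);
    std::cout << binarySearch(a, n, 19) << '\n';  // prints 3
    std::cout << binarySearch(a, n, 5)  << '\n';  // prints -1
}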
Analysis of binary search
A binary search is much more efficient than linear search. Why? Each iteration of
the loop splits the remaining region in half before continuing the search. Clearly
the loop stops when the key is found or when lb <= ub fails. Since each iteration
through the loop reduces by half the distance between lb and ub, the loop can
continue for at most k iterations, where
    2^k >= n.
Solving for k (the number of iterations of the loop), we have
    k = ceil(log2(n))
where ceil(x) means "round x up to the next largest integer".
In a binary search, if you double the size of the array, the running time will
increase by 1 unit of time (the amount of time necessary to split the array once).
Comparing the runtime of linear vs binary search
Comparison of the runtime of a linear and a binary search in terms of the number
of iterations of the loop:
    n       Linear Search    Binary Search
    2       about 2          1
    4       about 4          2
    8       about 8          3
    16      about 16         4
    32      about 32         5
    64      about 64         6
    512     about 512        9
    2^32    about 2^32       32
When To Use Certain Searches
Linear search
1. When the data are not sorted and you are going to do few
searches
2. When the data are few
Binary search
1. Only when the data are sorted
2. When the data are numerous
Big-O notation
Since we are being relatively imprecise in our counting of primitive operations, the
exact coefficients are not crucial in the final estimate when we wish to compare
two algorithms. For example, suppose one analysis yields n^2 + 4n + 5 and the
other yields 2n^2 + n. For large n the factor that dominates both values is the n^2
term, because when n is sufficiently large, the n^2 term dwarfs everything else in
terms of its impact on the actual running time. In cases such as this we say the
running time is O(n^2).
Formally,
    f(n) is O(g(n)) iff there exist two constants c and n0 such that
    |f(n)| <= c|g(n)| for all n >= n0.
Note that the absolute value signs surrounding f(n) and g(n) are not necessary if f
and g both represent running times, because a running time function must always
be positive.
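A finite check is no substitute for a proof, but it can make the definition concrete. In this sketch (the functions and the constants c and n0 are the ones used in the polynomial proof below) we test f(n) <= c*g(n) over a range of n:

#include <iostream>

int main() {
    auto f = [](double n) { return n * n + 4 * n + 5; };  // sample f
    auto g = [](double n) { return n * n; };              // sample g
    const double c = 10.0;   // constants from the polynomial proof below
    const int n0 = 1, limit = 1000000;

    bool holds = true;
    for (int n = n0; n <= limit && holds; ++n)
        holds = (f(n) <= c * g(n));   // the Big-O inequality, tested pointwise

    std::cout << (holds ? "f(n) <= c*g(n) held for all tested n\n"
                        : "bound violated somewhere in the range\n");
}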
The counter-intuitive aspect of this definition is that n is O(n^2), just as n^2 is
O(n^2). There are some known instances where two algorithms are of the same
order, but the constants are so extreme that they are important to consider.
Normally, however, we are content to say that two algorithms of the same order
are comparable in efficiency. True constants can only be determined for a given
compiler, computer and operating system.
What is clearly the case is that some algorithms are clearly better than others in
the limit:
    O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(2^n)
Here is a "natural" interpretation of these run times:
    Name         Running time   What happens when the problem size (n) doubles
    Constant     O(1)           The run time stays the same
    Log          O(log n)       The run time goes up by one unit of time
    Linear       O(n)           The run time doubles
    Log-linear   O(n log n)     The run time doubles + "a bit more"
    Quadratic    O(n^2)         The run time quadruples
    Cubic        O(n^3)         The run time goes up by a factor of 8
    Polynomial   O(n^k)         The run time goes up by a factor of 2^k
    Exponential  O(2^n)         The run time goes up by a factor of 2^n
Clearly, algorithms whose running times grow faster than polynomial are to be
avoided.
The "a bit more" can be seen by putting 2n in for n in n log(n), which simplifies to
2n log(n) + 2n. (The 2n is the "a bit more".)
Example growth rates
The following table illustrates the relative growth rates of various functions.
    1    log n    n    n log n    n^2      n^3      2^n
    ---- -------- ---- ---------- -------- -------- --------------
    1    0        1    0          1        1        2
    1    1        2    2          4        8        4
    1    2        4    8          16       64       16
    1    3        8    24         64       512      256
    1    4        16   64         256      4096     65536
    1    5        32   160        1024     32768    4,294,967,296
An interesting question ... are there functions whose growth rate is bigger than
exponential?
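The table can be regenerated with a few lines of C++ (a sketch; the formatting choices are ours):

#include <cmath>
#include <cstdio>

int main() {
    std::printf("%4s %6s %4s %8s %6s %7s %14s\n",
                "1", "log n", "n", "n log n", "n^2", "n^3", "2^n");
    for (double n = 1; n <= 32; n *= 2) {   // n = 1, 2, 4, ..., 32
        double lg = std::log2(n);
        std::printf("%4d %6.0f %4.0f %8.0f %6.0f %7.0f %14.0f\n",
                    1, lg, n, n * lg, n * n, n * n * n, std::pow(2.0, n));
    }
}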
Big-Oh proofs
Proving that a particular function f(n) is O(g(n)) requires finding the constants c
and n0 that ensure |f(n)| <= c|g(n)| for all n >= n0.
Polynomials
1. Let f(n) = n and g(n) = n^2. Show f is O(g).
   Proof: Let c = 1 and n0 = 1. Then we must show n <= n^2 for all n >= 1. This is
   obviously true: 1 <= n implies n <= n^2 (by multiplying each side by n).
2. To see that constant coefficients do not matter, let f(n) = 10n and
   g(n) = n^2. Show f is O(g).
   Proof: Let c = 10 and n0 = 1. Then we must show 10n <= 10n^2 for all n >= 1.
   This is obviously true: 1 <= n implies 10n <= 10n^2 (by multiplying each side
   by 10n).
3. To see that lower-order terms do not matter, let f(n) = n^2 + 4n + 5
   and g(n) = n^2. Choose c = 1 + 4 + 5 = 10 and n0 = 1. Then we
   must show
       n^2 + 4n + 5 <= 10n^2 for all n >= 1.
   For n >= 1, n^2 <= n^2, 4n <= 4n^2, and 5 <= 5n^2, so it
   must be that n^2 + 4n + 5 <= 10n^2.
4. Clearly the trick employed here of summing the absolute values of the
   coefficients used in f to compute c will work for any polynomial. So, in general,
   any polynomial of degree less than or equal to k is O(n^k).
General results
f is O(g) iff there is a c and an n0 such that |f(n)| <= c|g(n)| for all n >= n0.
1. If f is O(g) then for any constant k, k*f is O(g).
2. If f is O(g) and g is O(h), then f is O(h).
3. If f1 and f2 are O(g), then f1 + f2 is O(g).
4. If f1 and f2 are O(g), then f1 * f2 is O(g^2).
Based on these rules we see that any polynomial of degree k is O(p) where p is a
polynomial of degree j, where j >= k.
Big-Theta
Saying f is O(g) simply says f grows no faster than g. If we also know that g is
O(f), then g grows no faster than f. In this case we say f and g have the same
growth rate. We write this as: f is Theta(g).
When we speak of the running time of an algorithm, we prefer Theta if it is
known, since this tells us the most narrow category of running time for the
algorithm. f is Theta(g) means f and g grow at the same rate as n increases.
We can state this formally via a limit:
    If for some constant c > 0 the limit as n approaches infinity of f(n)/g(n)
    equals c, then f is Theta(g).
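The limit can be watched numerically. In this sketch the sample functions are ours: for f(n) = 3n^2 + 5n and g(n) = n^2, the ratio f(n)/g(n) settles toward c = 3, suggesting f is Theta(n^2):

#include <iostream>

int main() {
    auto f = [](double n) { return 3 * n * n + 5 * n; };  // sample f
    auto g = [](double n) { return n * n; };              // sample g
    for (double n = 10; n <= 1e6; n *= 10)                // watch the ratio settle
        std::cout << "n = " << n << "   f(n)/g(n) = " << f(n) / g(n) << '\n';
}

The printed ratios are 3.5, 3.05, 3.005, ... closing in on the constant 3.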
Rules for estimating the running time of individual constructs
    Code construct       Abstract running time
    read                 1
    write                1
    assignment           1
    condition            1
    if B then S1         Tmin = 1 + min(Tmin(S1), Tmin(S2))
    else S2              Tmax = 1 + max(Tmax(S1), Tmax(S2))
    while B do S         Tmin = 1
                         Tmax = infinity
Running time of a linear loop
In many cases we can figure out the individual counts for a loop. For example,
suppose we have a counting loop:
    i = m;           // 1 unit
    while (i <= n)   // 1 unit each execution + 1 initial
    {
        i = i + 1;   // 1 unit each execution
    }
    Tmin = Tmax = 2 + sum(i=m, n)(1 + 1) = 2 + 2*(n - m + 1)
The initial 2 comes from the execution of the initialization assignment and the
first test on the condition to see if the loop should be executed at all. The sum
counts the repeated execution of the body, followed by the condition, in this case
2 units of time.
Generalizing the above, we have the following for a counting loop:
    i = m;           // 1 unit
    while (i <= n)   // 1 unit each execution + 1 initial
    {
        S;           // T(S) units for each execution
    }
    Tmax = 2 + sum(i=m, n)(1 + Tmax(S))
    Tmin = 2
(Tmin = 2 covers the case m > n, where the body never executes.)
More Examples
{1} x = x + 1
Assignment is a primitive operation that requires 1 step.
{1} for (i = 0; i < n; i++)
{2}     x = x + 1
The loop will iterate n times. Each iteration performs 3 steps (the conditional test,
the increment, and the body). The initialization requires 1 more step. Thus:
    T(n) = 1 + sum(i=0, n-1)(3)
Note this may also be written as
    T(n) = 3n + 1
It is O(n).
{1} for (i = 0; i < n; i++)
{2}     for (j = i; j < n; j++)
{3}         x = x + 1
The outer loop executes n times. For each iteration i of the outer loop the inner
loop executes n - i times. The running time would be:
    T(n) = sum(i=0, n-1) sum(j=i, n-1)(1) = sum(i=0, n-1)(n - i) = n(n+1)/2
It is O(n^2).
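We can confirm the closed form by letting the loop count itself (a quick check of our own, not part of the notes):

#include <iostream>

int main() {
    const int n = 100;
    long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i; j < n; j++)
            count++;                      // stands in for x = x + 1
    std::cout << "counted: " << count
              << "   n(n+1)/2 = " << (long)n * (n + 1) / 2 << '\n';
}

For n = 100 both values are 5050.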
A nested loop that is still linear
Not all nested loops are O(n^2). Consider the following:
{1} for (i = 0; i < n; i++)
{2}     for (j = 0; j < m; j++)
{3}         x = x + 1
If m is not related to n (that is, m is a constant independent of n), we have
    T(n) = sum(i=0, n-1) sum(j=0, m-1)(1) = n*m
which is O(n), since m is a constant.
You definitely need to become fluent in your ability to understand and manipulate
summations. You are encouraged to study the summation review sheet.
Analysis of Simple Sorts
Overview of sorting
There are three basic sorting strategies:
1. Selection -- emphasizes selecting the smallest (or largest) element and
moving it to its permanent position
2. Exchange -- emphasizes the idea of exchanging elements that
are out of
order
3. Insertion -- emphasizes the idea of placing elements into an
already sorted list
These are general categories of sorting; each category has
several sorts of varying
complexity.
    Sorting Category    Slow Version              Fast Version
    Selection sorts     Straight selection sort   Tree sort
    Exchange sorts      Bubble sort               Quick sort
    Insertion sorts     Straight insertion sort   Merge sort
The point of this table is that each approach to sorting has both slow and fast
versions.
Below we look at one slow sort (selection sort) and one faster sort (merge sort).
Straight Selection sort
Illustration of how the sort works
Selection sort works in the following way, for a list of size n (a[0]..a[n-1]):
1. Select the smallest element in the list, starting at position 0
2. Swap that element with the element at position 0
3. Select the smallest element in the list, starting at position 1
4. Swap that element with the element at position 1
5. Select the smallest element in the list, starting at position 2
6. Swap that element with the element at position 2
7. Select the smallest element in the list, starting at position 3
8. Swap that element with the element at position 3
9. Etc. ...
10. Select the smallest element in the list, starting at position n-2
11. Swap that element with the element at position n-2
This method is illustrated in the following list of numbers; the * indicates the
position being swapped to, and the ^ indicates the position where the smallest in
the rest of the list is found:
    2 6 3 1 9 5 4 7 8
    *     ^
    1 6 3 2 9 5 4 7 8
      *   ^
    1 2 3 6 9 5 4 7 8
        *
        ^
    1 2 3 6 9 5 4 7 8
          *     ^
    1 2 3 4 9 5 6 7 8
            * ^
    1 2 3 4 5 9 6 7 8
              * ^
    1 2 3 4 5 6 9 7 8
                * ^
    1 2 3 4 5 6 7 9 8
                  * ^
    1 2 3 4 5 6 7 8 9
The sorting algorithm
Let's use the variable i to represent the position of * and smallind to represent
the position of ^ in the algorithm. We begin with a few observations:
1. i <= smallind <= n-1
2. 0 <= i < n-1; setting i to n-1 would leave only one element, which is already
in place.
3. Swapping occurs only after i has been incremented and smallind has been set.
These observations lead to the following algorithm:
# To sort the first n elements of the array a
for each position i in 0 .. n-2
    smallind = i;                    // set guess
    for each position j in i+1 .. n-1
        if a[j] <= a[smallind]
            smallind = j;
    swap(a[i], a[smallind])
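Here is a direct C++ rendering of the pseudocode (a sketch; the function name is our choice), run on the example list from the illustration:

#include <iostream>
#include <utility>   // std::swap

void selectionSort(int a[], int n) {
    for (int i = 0; i <= n - 2; i++) {       // each position i in 0 .. n-2
        int smallind = i;                    // set guess
        for (int j = i + 1; j <= n - 1; j++) // each position j in i+1 .. n-1
            if (a[j] <= a[smallind])
                smallind = j;                // correct the guess
        std::swap(a[i], a[smallind]);
    }
}

int main() {
    int a[] = {2, 6, 3, 1, 9, 5, 4, 7, 8};   // the list from the illustration
    int n = sizeof(a) / sizeof(a[0]);
    selectionSort(a, n);
    for (int i = 0; i < n; i++)
        std::cout << a[i] << ' ';
    std::cout << '\n';
}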
Here is a summation analysis of the above. For the inner loop:
    sum(j=i+1, n-1)(1) = n - 1 - i
Now for the outer loop:
    sum(i=0, n-2)(n - 1 - i) = (n-1) + (n-2) + ... + 1 = n(n-1)/2
We don't have to wait for code to get an estimate of the running time of a
program; we can analyze the algorithm directly. There are two simplifying steps in
our analysis:
1. We will only analyze the line of code that will be executed the most. Since
every other line of code will be executed no more than that, this will give us an
upper bound on the running steps by simply multiplying our result by the
number of lines in the program. As we have seen, this multiplying constant is
irrelevant, since it does not affect the category into which we will classify the
algorithm.
2. The second simplifying method is to analyze the body of any loop before
analyzing the loop itself and summarize it with T(S). Thus, we never have to
worry about more than one loop at a time in our analysis.
Here we go:
1. The body of the interior loop is analyzed first, then the outer loop. The
summation for the interior loop is:
    sum(j=i+1, n-1)(1) = n - 1 - i
The inner loop is clearly O(n).
2. The outer loop executes O(n) times as well, so the overall behavior is
    O(n) * O(n) = O(n^2)
This means the algorithm is O(n^2).
Insertion sort
Insertion sort works in a straightforward way. We start with the observation that
an array of length 1 is sorted. We then go through the rest of the array, shifting
each element toward the beginning of the array until it finds its final position.
#include <iostream>
using namespace std;

void insertionSort (int a[], int size) {
    int i, j, copy;
    for (i = 0; i < size-1; i++) {
        j = i+1;
        copy = a[j];                // element to be inserted
        while (j > 0 && copy < a[j-1]) {
            a[j] = a[j-1];          // shift right
            j--;
        }
        a[j] = copy;                // drop it into its final position
    }
}

void print(int a[], int size) {
    int i;
    for (i = 0; i < size; i++)
        cout << a[i] << ' ';
    cout << endl;
}

int main () {
    int a[] = {34, 3343, 334, 644, 33, 31, 112, 119};
    int size = sizeof(a) / sizeof(int);
    insertionSort (a, size);
    print (a, size);
    return 0;
}
Your assignment is to do a summation analysis similar to the one done above for
selection sort and then determine the Big-O.
Analyzing the running time for recursive programs
Recursive programs are not readily amenable to a summation analysis. A typical
recursive program has the following structure:
Solution solveProblem (Problem p of size n) {
    if (p is trivial)
        return trivial solution
    else {
        split p into subproblems of smaller size
        solve each subproblem recursively
        combine the subsolutions into a solution for p
        return that solution
    }
}