2. Agenda
• Introduction
• Performance, does it matter?
• How do we measure performance?
• Analysis of Insertion Sort
• Simplifying things with asymptotic notation
• Designing algorithms
• Solving recurrences
• Questions.
3. Introduction
• Gary Short
• Head of Gibraltar Labs
• C# MVP
• gary.short@gibraltarsoftware.com
• @garyshort
• http://www.facebook.com/theothergaryshort
4. Performance – Does it Matter?
Performance is the most important thing in
software engineering today...
7. How do we Measure Performance?
• What do we care about?
– Memory?
– Bandwidth?
– Computational time?
8. We Need a Model to Work With
• RAM Model
– Arithmetic – add, subtract, etc
– Data movement – load, copy, store
– Control – branching, subroutine call, return
– Data – Integers, floats
• Instructions are run in series
– And take constant time
• Not really, but shhh! –Ed.
9. Analysis of Insertion Sort
InsertionSort(A)
  for j = 2 to A.length
    key = A[j]
    i = j - 1
    while i > 0 and A[i] > key
      A[i+1] = A[i]
      i = i - 1
    A[i+1] = key
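The pseudocode above translates almost line for line into Python. This is a minimal sketch, not the speaker's own code: the function name insertion_sort is an assumption, and because Python lists are 0-indexed the loop bounds shift by one relative to the slide.

```python
def insertion_sort(a):
    """Sort the list `a` in place in ascending order and return it."""
    # The slide's pseudocode is 1-indexed; Python lists are 0-indexed,
    # so j starts at 1 and the inner loop stops once i drops below 0.
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift every element larger than key one slot to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```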
12. Sum Running Time for Each Statement...
T(n) = c1n + c2(n-1) + c3(n-1) + c4 sum_{j=2..n} tj
     + c5 sum_{j=2..n} (tj - 1) + c6 sum_{j=2..n} (tj - 1) + c7(n-1)
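To make tj concrete, here is a small sketch that records it for each j. It assumes the usual textbook convention that tj is the number of times the while test runs for a given j, including the final failing test; the helper name while_test_counts is hypothetical.

```python
def while_test_counts(a):
    """Run insertion sort on a copy of `a` and return [t_2, ..., t_n]."""
    a = list(a)  # work on a copy so the caller's list is untouched
    counts = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        t = 1  # the last test, the one that fails, always runs once
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            t += 1  # one more test ran for this successful iteration
        a[i + 1] = key
        counts.append(t)
    return counts


print(while_test_counts([1, 2, 3, 4, 5]))  # [1, 1, 1, 1]
print(while_test_counts([5, 4, 3, 2, 1]))  # [2, 3, 4, 5]
```

For already-sorted input every tj is 1; for reversed input tj equals j, which is what drives the two cases analysed next.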
13. Best Case Running Time
If the input (A) is already sorted then...
A[i] <= key when i has its initial value of j-1, thus tj = 1 for every j.
And so...
T(n) = c1n+c2(n-1)+c3(n-1)+c4(n-1)+c7(n-1)
= (c1+c2+c3+c4+c7)n-(c2+c3+c4+c7)
Which can be expressed as an+b for constants a
and b that depend on ci
So T(n) is a linear function of n
16. Worst Case Scenario
If the input (A) is in reverse sorted order then...
We have to compare each A[j] with each
element in the subarray A[1..j-1], thus tj = j.
And so...
T(n) = (c4/2 + c5/2 + c6/2)n^2
     + (c1 + c2 + c3 + c4/2 - c5/2 - c6/2 + c7)n
     - (c2 + c3 + c4 + c7)
Which can be expressed as an^2 + bn + c
So T(n) is a quadratic function of n
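The step hidden behind "And so..." can be sketched as follows. It assumes tj = j in the worst case and uses the two standard arithmetic-series sums; the constants c1..c7 are the per-statement costs from the running-time sum.

```latex
\begin{align*}
\sum_{j=2}^{n} j &= \frac{n(n+1)}{2} - 1, \qquad
\sum_{j=2}^{n} (j-1) = \frac{n(n-1)}{2} \\
T(n) &= c_1 n + c_2 (n-1) + c_3 (n-1)
        + c_4\left(\frac{n(n+1)}{2} - 1\right) \\
     &\quad + c_5\,\frac{n(n-1)}{2} + c_6\,\frac{n(n-1)}{2} + c_7 (n-1) \\
     &= \left(\frac{c_4}{2} + \frac{c_5}{2} + \frac{c_6}{2}\right) n^2
        + \left(c_1 + c_2 + c_3 + \frac{c_4}{2} - \frac{c_5}{2} - \frac{c_6}{2} + c_7\right) n \\
     &\quad - (c_2 + c_3 + c_4 + c_7)
\end{align*}
```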
20. Simplifying Things With Asymptotic Notation
• Asymptotic notation characterises functions
by their growth rates
• Functions with the same growth rates have
the same Asymptotic notation
21. How Does That Help Us?
Let’s say we have a function with running time
T(n) = 4n^2 - 2n + 2
If n = 500 then
4n^2 is 1000 times bigger than 2n
So...
We can ignore smaller order terms and
coefficients
T(n) = 4n^2 - 2n + 2 can be written T(n) = O(n^2)
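The arithmetic is easy to check. The sketch below evaluates the example function at n = 500 and then shows T(n)/n^2 settling towards the leading coefficient as n grows; the function name t is just a label for the slide's T(n).

```python
def t(n):
    """The example running time T(n) = 4n^2 - 2n + 2."""
    return 4 * n**2 - 2 * n + 2


n = 500
leading = 4 * n**2  # the quadratic term
linear = 2 * n      # the linear term
print(leading // linear)  # 1000

# The lower-order terms barely move the total: this ratio is close to 1.
print(leading / t(n))

# T(n) / n^2 approaches the constant 4, which big-O then discards.
for size in (10, 1_000, 100_000):
    print(size, t(size) / size**2)
```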
22. A Short Note on The Abuse of “=”
If T(n) = 4n^2 -2n +2
Then saying T(n) = O(n^2) is not strictly correct
Rather T(n) is in the set O(n^2) and the above
should be read as T(n) is O(n^2) and not T(n)
equals O(n^2)
But really only Maths geeks care – Ed.
23. So Back to Insertion Sort
So now we can say of Insertion Sort that...
Best case it’s O(n)
And worst case it’s O(n^2)
And since we only care about worst case...
We say that Insertion Sort is O(n^2)
Which sucks! – Ed.
25. Optimizing Algorithms is Child’s Play
• Sit at table
• Foreach item in itemsOnPlate
– Eat item
• Wait(MealComplete)
• Foreach dish in dishesUsed
– WashDish
– DryDish
• Resume Play
26. Child Will Optimize To…
• Pause Game
• Set Speed = MaxInt
• Run to table
• Take sliceBread(1)
• Foreach item on Plate
– Place item on bread
• Take sliceBread(2)
• Run Outside
• Resume Game
27. Divide And Conquer
• Divide
– Divide the problem into sub problems
• Conquer
– Solve the sub problems recursively
• Combine
– Combine the solutions to the sub problems into the
solution for the original problem.
28. Merge Sort
• Divide
– Divide the n elements into two n/2 element arrays
• Conquer
– Sort the two arrays recursively
• Combine
– Merge the two sorted arrays to produce the
answer.
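The three steps map directly onto code. This is one possible sketch, assuming a version that returns a new list rather than sorting in place; the names merge and merge_sort are illustrative.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # <= keeps equal elements in their original order (stable sort).
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list; the input is left untouched."""
    if len(items) <= 1:               # base case: nothing left to divide
        return list(items)
    mid = len(items) // 2             # divide
    left = merge_sort(items[:mid])    # conquer
    right = merge_sort(items[mid:])
    return merge(left, right)         # combine


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```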
31. So What’s The Running Time?
In the general case...
If the divide step yields ‘a’ sub problems
Each 1/b the size of the original
It takes T(n/b) time to solve one problem of n/b size
So it takes aT(n/b) to solve ‘a’ of them
Then, if it takes D(n) time to divide the problem
And C(n) time to combine the results
Then we get the recurrence...
T(n) = aT(n/b) + D(n) + C(n).
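A recurrence like this can be evaluated numerically to get a feel for it. The sketch below makes several assumptions that are not on the slide: n/b is rounded down, a problem of size 1 or less costs base_cost, and the example costs (constant divide, linear combine) are hypothetical. The names make_recurrence, divide_cost, combine_cost and base_cost are illustrative.

```python
from functools import lru_cache


def make_recurrence(a, b, divide_cost, combine_cost, base_cost=1):
    """Return T where T(n) = a*T(n // b) + D(n) + C(n) and T(n <= 1) = base_cost."""

    @lru_cache(maxsize=None)
    def t(n):
        if n <= 1:
            return base_cost
        return a * t(n // b) + divide_cost(n) + combine_cost(n)

    return t


# Hypothetical costs: constant-time divide, linear-time combine.
t = make_recurrence(a=2, b=2, divide_cost=lambda n: 1, combine_cost=lambda n: n)
print([t(2**k) for k in range(5)])  # [1, 5, 15, 39, 95]
```

With these particular costs the values follow n log2 n + 2n - 1 for powers of two, so the n log n term is what dominates as n grows.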
32. Apply That to Merge Sort...
• Divide
– Computes the middle of the subarray, taking
constant time so, D(n) = O(1)
• Conquer
– Recursively solve two sub problems each of size
n/2 contributing 2T(n/2) to the running time
• Combine
– Merge procedure O(n)
• Giving us a recurrence of T(n) = 2T(n/2) + O(n)
33. Solve The Recurrence Using The Master Method
For a Recurrence in the form
T(n) = aT(n/b) + f(n)
Then
If f(n) = O(n^(log_b a - k)) for some constant k > 0 then T(n) = O(n^(log_b a))
If f(n) = O(n^(log_b a)) then T(n) = O(n^(log_b a) log n)
If f(n) = Omega(n^(log_b a + k)) for some constant k > 0 and if af(n/b) <=
cf(n) for some constant c < 1 then T(n) = O(f(n))
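For the common situation where f(n) is a plain power of n, the three cases reduce to comparing two exponents. The sketch below covers only that simplified situation, f(n) = n^d, and is not the full theorem; the function name master_method and the parameter d are assumptions made for illustration.

```python
import math


def master_method(a, b, d):
    """Upper bound for T(n) = a*T(n/b) + f(n), assuming f(n) = n^d exactly."""
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("need a >= 1, b > 1 and d >= 0")
    critical = math.log(a, b)  # the exponent log_b a
    if math.isclose(d, critical):
        # f(n) grows like n^(log_b a): pick up an extra log factor.
        return f"O(n^{d:g} log n)"
    if d < critical:
        # f(n) is polynomially smaller: the recursion dominates.
        return f"O(n^{critical:g})"
    # f(n) is polynomially larger; for f(n) = n^d the regularity
    # condition holds with c = a / b^d, which is below 1 here.
    return f"O(n^{d:g})"


print(master_method(2, 2, 1))  # O(n^1 log n)
print(master_method(8, 2, 2))  # O(n^3)
print(master_method(1, 2, 1))  # O(n^1)
```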
34. What?!
• More simply we are comparing f(n) with the
function n^(log_b a) and intuitively
understanding that the bigger of the two
determines the solution to the recurrence.
35. And So...
• With Merge Sort a = 2 and b = 2, so n^(log_b a) = n, which
matches f(n) = O(n): the second case of the Master Method, thus...
• T(n) = O(n log n)
• Which is much better than the O(n^2) of
Insertion Sort
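How much better depends on n. The sketch below compares the two growth functions directly; it ignores constant factors, so it says nothing about real timings, and the helper name speedup is hypothetical.

```python
import math


def speedup(n):
    """Ratio n^2 / (n log2 n), i.e. how much faster n log n grows than n^2."""
    return (n * n) / (n * math.log2(n))


for n in (10, 1_000, 1_000_000):
    print(f"n={n:>9,}  n^2={n * n:.3g}  n log2 n={n * math.log2(n):.3g}  "
          f"ratio={speedup(n):.3g}")
```

The ratio itself grows with n, which is the point: the gap between the two algorithms widens as the input gets larger.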
37. What We Learned
• Performance is important
• Therefore algorithmic optimization is too
• We have a model to analyse against (the RAM model)
• And a notation (asymptotic notation)
• Divide and conquer
• Master Method
• Other resources.