The document discusses different sorting algorithms:
- Insertion sort is good for small datasets but has poor performance for large datasets.
- Merge sort has predictable, stable performance but uses more memory than quicksort.
- Quicksort uses little extra memory and has average case performance comparable to merge sort, but can have worst case quadratic performance in some situations.
- Benchmarking shows built-in Ruby .sort outperforms implementations of merge sort, quicksort, and insertion sort in Ruby due to optimizations in the C implementation.
The document then provides pseudocode to implement quicksort in Ruby.
1. Review!
● Insertion Sort
– Good for small data sets
● Used as the recursive base case for Rubinius' .sort method
– Terrible big O: O(n**2) in the average and worst case
– Very minimal memory usage
– Stable
● Merge Sort
– Great all-around choice
● Extremely predictable: always O(n log n)
– A bit memory intensive
● O(n) extra memory
– Stable
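As a refresher, the two review algorithms could be sketched roughly like this. This is a minimal sketch, not the presenter's code: the method names insertion_sort, merge_sort, and merge are made up for illustration, and both methods return a new array rather than sorting in place.

```ruby
# Insertion sort: walk left to right, sliding each value back
# into place among the already-sorted values before it.
def insertion_sort(array)
  sorted = array.dup
  (1...sorted.size).each do |i|
    value = sorted[i]
    j = i - 1
    # Strict > means equal values never jump over each other (stable).
    while j >= 0 && sorted[j] > value
      sorted[j + 1] = sorted[j]
      j -= 1
    end
    sorted[j + 1] = value
  end
  sorted
end

# Merge sort: split in half, sort each half, merge the results.
def merge_sort(array)
  return array if array.size <= 1
  mid = array.size / 2
  merge(merge_sort(array[0...mid]), merge_sort(array[mid..-1]))
end

# The merged array is the O(n) extra memory.
def merge(left, right)
  merged = []
  i = j = 0
  while i < left.size && j < right.size
    # <= takes from the left half on ties, which keeps the sort stable.
    if left[i] <= right[j]
      merged << left[i]
      i += 1
    else
      merged << right[j]
      j += 1
    end
  end
  merged.concat(left[i..-1]).concat(right[j..-1])
end
```

Both return the same result for the same input; the difference is in how much work and memory they need to get there.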
3. What is this infernal contraption?
● Quicksort: space-optimized, divide-and-conquer
recursive sorting, of course!
● Most direct competitor to merge sort
● Instead of pulling the entire array apart
into bite-size chunks, it measures the
entire thing against one value (the
pivot), moves that value to where it
permanently belongs, then sorts each
“half” on either side the same way
4. What is this infernal contraption?
● Not adaptive, officially
– The algorithm does not take advantage
of any existing order in the data passed
to it, but luck on the pivot choice and
repeated values can still affect runtime
● Not stable
– Elements with equal keys can end up
in a different order relative to each other
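A tiny example makes the "not stable" point concrete. This is a sketch under assumptions: it sorts [key, label] pairs by key only, uses a midpoint pivot with a single left-to-right partition pass, and the names partition_by_key and quicksort_by_key are invented for illustration.

```ruby
# Partition the range left..right around the key at pivot_index.
# Only the first element of each pair (the key) is compared.
def partition_by_key(array, left, right, pivot_index)
  pivot_key = array[pivot_index][0]
  array[pivot_index], array[right] = array[right], array[pivot_index]
  store_index = left
  (left...right).each do |n|
    if array[n][0] < pivot_key
      array[n], array[store_index] = array[store_index], array[n]
      store_index += 1
    end
  end
  array[store_index], array[right] = array[right], array[store_index]
  store_index
end

def quicksort_by_key(array, left = 0, right = array.size - 1)
  if left < right
    pivot_index = (left + right) / 2
    new_pivot_index = partition_by_key(array, left, right, pivot_index)
    quicksort_by_key(array, left, new_pivot_index - 1)
    quicksort_by_key(array, new_pivot_index + 1, right)
  end
  array
end

records = [[1, "a"], [1, "b"], [0, "c"]]
sorted = quicksort_by_key(records.dup)
# sorted is [[0, "c"], [1, "b"], [1, "a"]]
```

The keys come out in order, but "b" now sits ahead of "a" even though both have key 1 and "a" came first in the input. A stable sort would have kept "a" before "b".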
5. What is this infernal contraption?
● When compared to merge sort:
– Worse worst-case runtime
● Mergesort: always O(n log n)
● Quicksort: usually O(n log n), though in rare cases (consistently
bad pivots) it can be O(n**2)
– Less extra memory required
● Mergesort: O(n) extra memory for auxiliary operations
● Quicksort
–Uses an in-place “pivot” for comparison
–After comparing to the pivot, each half is sorted
recursively, which requires O(log n) extra stack space on
average (O(n) in the worst case, when the pivots are bad)
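One way to see the rare O(n**2) case is to count comparisons. The sketch below is an illustration only: it deliberately uses a naive pivot (always the last element of the range), and count_comparisons is a made-up name. With that pivot choice, already-sorted input is the worst case.

```ruby
# Sorts array in place with a last-element pivot and returns
# how many element comparisons were made along the way.
def count_comparisons(array, left = 0, right = array.size - 1)
  return 0 unless left < right
  pivot_value = array[right]
  store_index = left
  comparisons = 0
  (left...right).each do |n|
    comparisons += 1
    if array[n] < pivot_value
      array[n], array[store_index] = array[store_index], array[n]
      store_index += 1
    end
  end
  array[store_index], array[right] = array[right], array[store_index]
  comparisons +
    count_comparisons(array, left, store_index - 1) +
    count_comparisons(array, store_index + 1, right)
end

n = 200
sorted_input = (1..n).to_a

# Sorted input: the pivot is always the maximum, so every call
# only peels off one element. 199 + 198 + ... + 1 = 19900.
worst_case = count_comparisons(sorted_input.dup)

# Shuffled input (fixed seed so the run is repeatable): far fewer.
shuffled_input = sorted_input.shuffle(random: Random.new(42))
typical_case = count_comparisons(shuffled_input)
```

The sorted run also recurses 200 levels deep instead of roughly log n levels, which is where the worst-case stack usage comes from.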
7. Built-in .sort still owns though
● 10,000 elements, 1..10,000 already sorted
– 0.000106 seconds (built-in .sort) vs 0.021049 seconds (mine)
● 10,000 elements, 1..10,000 reverse sorted
– 0.000112 seconds vs 0.021986 seconds
● 10,000 elements, random values up to 10,000
– 0.001657 seconds vs 0.026452 seconds
● 20,000 elements, random values up to 20,000
– 0.003481 seconds vs 0.057677 seconds
● 30,000 elements, random values up to 30,000
– 0.005254 seconds vs 0.095844 seconds
● 5 million elements, random values up to 5 million
– 1.288245 seconds vs 20.530810 seconds
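Timings like these could be collected with Ruby's standard Benchmark module. This is only a sketch of the approach, not the script behind the numbers above: simple_quicksort is a stand-in (a short, not-in-place, three-way quicksort), compare_sorts is a made-up helper, and only 10,000-element inputs are used so it runs quickly.

```ruby
require "benchmark"

# Stand-in for "mine": builds new arrays instead of sorting in place.
def simple_quicksort(array)
  return array if array.size <= 1
  pivot = array[array.size / 2]
  smaller = array.select { |x| x < pivot }
  equal   = array.select { |x| x == pivot }
  larger  = array.select { |x| x > pivot }
  simple_quicksort(smaller) + equal + simple_quicksort(larger)
end

# Times both sorts on the same input and checks that they agree.
def compare_sorts(input)
  builtin_result = nil
  mine_result = nil
  builtin_time = Benchmark.realtime { builtin_result = input.sort }
  mine_time = Benchmark.realtime { mine_result = simple_quicksort(input) }
  { builtin: builtin_time, mine: mine_time, same_result: builtin_result == mine_result }
end

size = 10_000
inputs = {
  "already sorted" => (1..size).to_a,
  "reverse sorted" => (1..size).to_a.reverse,
  "random values"  => Array.new(size) { rand(1..size) }
}

results = inputs.map { |label, input| [label, compare_sorts(input)] }.to_h
results.each do |label, r|
  puts format("%-15s built-in: %.6f s, mine: %.6f s", label, r[:builtin], r[:mine])
end
```

Exact numbers will vary by machine and Ruby version, so treat any single run as a rough comparison rather than a precise measurement.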
8. Built-in .sort still owns though
● Which is funny because I lied accidentally...
– Although Rubinius uses merge sort with insertion sort to handle
base cases, MRI Ruby uses quicksort with various pointers
● What the hell, why is mine slower?
–Ruby's .sort is written in C and optimized by people who
know exactly what they're doing
–Since we wrote ours in actual Ruby, it has to deal with
integer (or string) objects instead of just values
● Remember? EVERYTHING in Ruby is an object
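A quick way to see the "everything is an object" point: even a plain integer comparison is, conceptually, a method call on an object. The variable names below are arbitrary, and the interpreter may take shortcuts for core types, so read this as an illustration of the object model rather than a profile of where the time goes.

```ruby
a = 3
b = 7

integer_is_object = a.is_a?(Object)       # integers are objects too
operator_result   = a < b                 # the usual operator syntax...
method_result     = a.public_send(:<, b)  # ...is the same method, called explicitly
spaceship_result  = a <=> b               # -1, 0, or 1: the comparison sorts are built on
```

A sort written in Ruby does this kind of comparison over and over, which is part of the overhead a C implementation can avoid.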
11. Bro, do you even logic?
● First method: “Quicksort” (three params: array, left
(default = 0), right (default array.size-1))
– If left < right
● pivot_index = the midpoint of the current range, (left + right) / 2
● new_pivot_index = the result of calling secondary method
“partition”
● Recursively run the “quicksort” method on the left part of the
range, from left up to (but not including) the “new_pivot_index”
● Recursively run the “quicksort” method on the right part of the
range, from the “new_pivot_index” (but not including it) to
right
– Return the array
12. Bro, do you even logic? Pt. 2
● Second method: “Partition” (four params: array, left, right,
pivot_index)
– Find the pivot value using array[pivot_index]
– Switch the pivot value with the rightmost value of the range
(array[pivot_index] with array[right])
● (Parallel assignment)
– Create a store_index variable and set it equal to left
– From left to right exclusive do |n|
● If the value at array[n] is less than pivot_value
– Switch array[n] with array[store_index]
– Increment store_index
– Switch the rightmost value (array[right]) with array[store_index]
– Return the final value of the store_index variable
● This value gets assigned to the “new_pivot_index” variable from the first
method call, so we can then finish the first call we made
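Putting the two methods together, the steps above could be sketched in Ruby roughly as follows. This is one reading of the pseudocode, not the presenter's actual code: it assumes the pivot is the midpoint of the current left..right range and that the array is sorted in place.

```ruby
# Second method: move everything smaller than the pivot to the left
# side of the range and return the pivot's final index.
def partition(array, left, right, pivot_index)
  pivot_value = array[pivot_index]
  # Park the pivot at the right end (parallel assignment).
  array[pivot_index], array[right] = array[right], array[pivot_index]
  store_index = left
  (left...right).each do |n|
    if array[n] < pivot_value
      array[n], array[store_index] = array[store_index], array[n]
      store_index += 1
    end
  end
  # Drop the pivot into the spot where it permanently belongs.
  array[store_index], array[right] = array[right], array[store_index]
  store_index
end

# First method: sort array in place between left and right (inclusive).
def quicksort(array, left = 0, right = array.size - 1)
  if left < right
    pivot_index = (left + right) / 2
    new_pivot_index = partition(array, left, right, pivot_index)
    quicksort(array, left, new_pivot_index - 1)
    quicksort(array, new_pivot_index + 1, right)
  end
  array
end
```

For example, quicksort([3, 1, 2]) returns [1, 2, 3]. Because the sort happens in place, pass in a copy (array.dup) if the original order still matters.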