Lecture 9

The Limits of Sorting Algorithms

So far, we have seen two classes of sorting algorithms: those that take O(n²) time and those that take O(n ln n) time. We discard the O(n²) algorithms as inefficient in all but the simplest cases. Those that are left, e.g., merge sort, heap sort, Quicksort, and some other O(n ln n) algorithms, all seem to run up against the same lower bound, i.e., Ω(n ln n). Can we do any better than this? Is it just a weird coincidence that all of these "efficient" sorts have the same asymptotic lower bound?

Comparison Sorts and General Purpose Sorts

A comparison sort is a sorting algorithm where the final order the items end up in is determined only by comparisons between individual items of input. All of the sorts we have seen so far are comparison sorts: if we ignore the procedural aspects of these algorithms and look only at the data being sorted, we see that each comparison results in at most one change in the order of the array, e.g., two elements might be swapped, or nothing might happen at that step. Without loss of generality, let's assume that each array element is different. This makes the analysis easier and is often not far from the truth. We can think of this process as a search through a binary tree where each node is a permutation (a particular order) of the array. The root of this tree is the order of the array as the algorithm initially encounters it. What we're searching for is the node where the permutation of elements is sorted. The left and right children of a node are the two resulting permutations when the comparison is "less than" and "greater than," respectively. It is up to the algorithm which two elements to compare. For example, the following decision tree shows the movement of data in the bubble sort algorithm performed on three items (the tree is not shown in full; it is large):
                            { a b c }
                       /                 \
                      /                   \
                     /                     \
            a < b   /                       \   a > b
                   /                         \
               { a b c }                   { b a c }
        b < c  /       \ b > c      a < c  /       \ a > c
              /         \                 /         \
         { a b c }   { a c b }        { b a c }   { b c a }
                       /   \            /   \       /   \
A general purpose sort is a sorting algorithm that works on any kind of ordered data. You provide the algorithm with an ordering on the data, and the algorithm sorts them for you. It is generally thought that a general purpose sort and a comparison sort are the same thing: you provide the comparison sort with a way to compare two items of data, and the algorithm sorts them for you. The standard C function qsort is a good example of a general purpose sort:
#include <stdlib.h>
void qsort(void *base, size_t nel, size_t width,
     int (*compar) (const void *, const void *));
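As a small illustration of the general purpose idea, here is a sketch of sorting an array of ints with qsort; the comparator name compare_ints and the sample data are our own, not part of the standard library:

#include <stdio.h>
#include <stdlib.h>

/* Comparator for qsort: returns negative, zero, or positive depending on
 * whether *a is less than, equal to, or greater than *b. */
static int compare_ints(const void *a, const void *b)
{
        int x = *(const int *) a;
        int y = *(const int *) b;
        return (x > y) - (x < y);       /* avoids overflow that x - y could cause */
}

int main(void)
{
        int data[] = { 42, 7, 19, 3, 25 };
        size_t n = sizeof data / sizeof data[0];

        qsort(data, n, sizeof data[0], compare_ints);

        for (size_t i = 0; i < n; i++)
                printf("%d ", data[i]);
        printf("\n");
        return 0;
}

All qsort learns about the data is what the comparator tells it, which is exactly the comparison-sort model described above.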
So sorting is like a search from the initial permutation (root) to the sorted permutation (some node in the tree). In the worst case, the sorted permutation may be a leaf node, requiring a number of comparisons proportional to the height of the tree. So a worst case lower bound on comparison sorting is the height of this decision tree. If our algorithm is clever, its decision tree will be an almost-complete binary tree. The height of a binary tree with m nodes is Ω(ln m).

How many nodes are there in the decision tree for an array of size n? Since there is a node for every permutation of the array, there are n! nodes (i.e., n factorial, n * (n-1) * (n-2) * (n-3) * ... * 1 nodes). So the height of the decision tree is Ω(ln(n!)). In Chapter 2.12, we see that a lower bound on the factorial function is:

(2πn)^(1/2) (n/e)^n <= n!
for all n. If we take logarithms on both sides and use the properties that log ab = log a + log b and log a/b = log a - log b, and some asymptotic notation to hide constants, we get:
Θ(1) + Θ(ln n) + n ln n - Θ(n) <= ln(n!)
Since the n ln n term dominates the Θ(n) term being subtracted, this works out to simply
ln(n!) = Ω(n ln n)
So the height of the decision tree has a lower bound of Ω(n ln n). In the worst case, the sorting algorithm will have to "search" all the way down to a leaf node, so Ω(n ln n) comparisons is the best a comparison sort can be expected to do. Since the running time of a comparison sort is at least proportional to the number of comparisons it performs, this is a lower bound on the worst case time complexity of any comparison sort.
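To see the bound concretely, the following small C sketch (our own check, not part of the lecture) tabulates the decision-tree lower bound ceil(lg(n!)) next to n lg n for small n; both grow like n lg n:

#include <stdio.h>
#include <math.h>

/* Illustrative check: the decision tree has n! leaves, so a comparison sort
 * needs at least ceil(lg(n!)) comparisons in the worst case. */
int main(void)
{
        for (int n = 2; n <= 10; n++) {
                double lg_fact = 0.0;
                for (int i = 2; i <= n; i++)
                        lg_fact += log2(i);     /* lg(n!) = sum of lg i */
                printf("n = %2d   ceil(lg(n!)) = %3.0f   n lg n = %5.1f\n",
                       n, ceil(lg_fact), n * log2(n));
        }
        return 0;
}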

Linear-Time Sorting Algorithms

Any sorting algorithm at all, comparison or not, has a trivial Ω(n) lower bound time complexity; it has to at least examine all n elements of the array before it can guarantee they are sorted. So this is definitely "the best we can do." Are there any sorts that realize this optimistic time complexity? As we have just seen, comparison sorts, which correspond to the notion of a general purpose sort, must take Ω(n ln n) time in the worst case. But there are sorts for specialized data that run even faster.

Counting Sort

Let's first consider a very simple problem: given an array A[1..n] of bits (0's and 1's), sort them according to the order 0 < 1. We could use Quicksort or merge sort, but these are really overkill. A very easy method is to just count the number m of 0's, then fill A[1..m] with 0's and A[m+1..n] with 1's. Counting the 0's takes Θ(n) time, and filling the array takes another Θ(n), so the whole time to sort is simply Θ(n). We can generalize this notion to sort an array where the elements come from a set of small integers. This is the idea behind counting sort, whose pseudocode follows the short sketch below (note that this is different from the version in the book).
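First, a minimal C sketch of the two-pass bit sort just described (the function name sort_bits and the use of an int array are our own choices for illustration):

#include <stddef.h>

/* Sort an array of 0's and 1's in Theta(n) time:
 * one pass to count the 0's, one pass to rewrite the array. */
void sort_bits(int a[], size_t n)
{
        size_t m = 0;                           /* number of 0's */
        for (size_t i = 0; i < n; i++)
                if (a[i] == 0)
                        m++;
        for (size_t i = 0; i < n; i++)          /* first m slots get 0, rest get 1 */
                a[i] = (i < m) ? 0 : 1;
}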
// A is the array to sort.
// The array elements may be in the set of integers [0..k].
// C is an array from [0..k]; C[i] will tell how many times i occurs in A

Counting-Sort (A, k)
        for i = 0 to k do
                C[i] = 0                // all counts are initially 0
        end for
        for j = 1 to length(A) do
                C[A[j]]++               // count each element
        end for
                                        // C[i] is now the # of times
                                        // i occurs in A
        i = 1                           // i is the index in A[1..length(A)]
        j = 0                           // j is the index in C[0..k]
        while j <= k do                 // while we have more elements...
                if C[j] != 0 then       // if there are more j's in A
                        A[i++] = j      // place a copy of j into A
                        C[j]--          // one less j
                else
                        j++             // next item in order
                end if
        end while
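Here is one possible C rendering of this pseudocode; the fixed bound K, the function name counting_sort, and the small driver with ages are our own illustrative choices:

#include <stdio.h>
#include <string.h>

#define K 100                                    /* elements are assumed to lie in 0..K */

/* Counting sort: a[] holds n integers in the range 0..K. */
void counting_sort(int a[], size_t n)
{
        int count[K + 1];
        memset(count, 0, sizeof count);          /* all counts start at 0 */

        for (size_t j = 0; j < n; j++)           /* count each element */
                count[a[j]]++;

        size_t i = 0;                            /* index back into a[] */
        for (int v = 0; v <= K; v++)             /* emit each value, in order, */
                while (count[v]-- > 0)           /* as many times as it occurred */
                        a[i++] = v;
}

int main(void)
{
        int ages[] = { 23, 19, 45, 19, 67, 23, 21 };    /* e.g., ages in years */
        size_t n = sizeof ages / sizeof ages[0];

        counting_sort(ages, n);
        for (size_t i = 0; i < n; i++)
                printf("%d ", ages[i]);
        printf("\n");
        return 0;
}

The array count plays the role of C in the pseudocode above; note that it needs K+1 entries no matter how small n is.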
This sort takes Θ(k + n) time: the times to process C and A. If k is a small constant, particularly small compared to the values of n we expect to see (i.e., k = O(n)), then this sort takes Θ(n) time. We require only "constant" storage and time to store and process the array C. This sort is very sensitive to the kinds of data to be sorted; they must be integral (like integers and characters) and they must be in a very small range. Sorting even moderately sized integers, like 32-bit integers in the range -2e9..2e9, is just impossible because the array C would have to contain four billion elements. Of course, we can forget about sorting floats altogether; what is C[3.14159]? But if we're sorting, say, the ages (in years) of people at UTSA, where k is around 100 and n is in the several thousands, counting sort would be much faster than any of the O(n ln n) sorts. It turns out we can use a stable version of counting sort as the basis for another sort called radix sort, which can sort a much wider range of data, like character strings and numbers with small decimal representations.