RadixSorts Reading

Complete the Reading Quiz by 3:00pm before lecture.

So far, the sorting algorithms we have learned in this class rely on comparisons to order values. While we’ve discussed some optimizations, it’s still true that any comparison-based sort needs on the order of N log N comparisons in the worst case. Can we do better than this? The answer is yes, if we use a different approach!

CountingSort

Assume we’re given this list of (wonderful) TAs with corresponding keys. We are guaranteed that the keys are unique and cover every value from 0 to N-1, where N is the number of TAs. Our goal is to sort the input in ascending order by key.

Key	TA (Value)
3	Anish
2	Jade
5	Farrell
0	Howard
1	Yifan
4	Yuma
6	Amanda

Because we are guaranteed that the keys are unique and 0–6, we know that the keys in our output will be in order 0, 1, 2, 3, 4, 5, 6. Since we know the location at which each key should be placed in our output, we can iterate through the input once and place each item in its correct location in the output. This is radically different from the sorting algorithms we’ve seen before! Notice how we’re using the input data to directly calculate an index; this technique should feel familiar if you recall the data-indexed set introduced in our hashing reading or the data-indexed map introduced in our trie reading.

For now, we’ll refer to this technique as CountingSort, because it does not compare keys to each other but rather locates a position for each item in counting order.

After moving Anish, Jade, and Farrell (remember we go through the given data in order), our auxiliary output will look like this:

Key	TA (Value)
0 (inferred)	?? (not filled yet)
1 (inferred)	?? (not filled yet)
2	Jade
3	Anish
4 (inferred)	?? (not filled yet)
5	Farrell
6 (inferred)	?? (not filled yet)

Note that this is a simplified version of CountingSort because we were guaranteed unique keys that cover exactly the values 0 to N-1. We knew that keys 0, 1, 4, and 6 would be filled by the time the sort concluded, so we could infer their presence and their position.
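To make the simplified version concrete, here is a minimal Python sketch. It assumes the guarantee above holds (keys are unique and cover exactly 0 to N-1); the function name simple_counting_sort and the (key, value) tuple layout are illustrative choices, not part of the reading.

```python
def simple_counting_sort(items):
    """Sort (key, value) pairs, assuming keys are unique and cover 0..N-1."""
    output = [None] * len(items)      # auxiliary output, one slot per key
    for key, value in items:          # a single pass over the input
        output[key] = (key, value)    # the key itself is the output index
    return output


tas = [(3, "Anish"), (2, "Jade"), (5, "Farrell"), (0, "Howard"),
       (1, "Yifan"), (4, "Yuma"), (6, "Amanda")]
sorted_tas = simple_counting_sort(tas)
print([name for _, name in sorted_tas])
# ['Howard', 'Yifan', 'Jade', 'Anish', 'Yuma', 'Farrell', 'Amanda']
```

No two keys are ever compared: each item is placed using only its own key, which is why a single pass is enough.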

Now, however, let’s look at a slightly more complex problem where we don’t have these guarantees. Each TA in the list now has a card from a standard deck. Our new goal is to sort the TAs by card suit, in the order {♣, ♠, ♥, ♦}. You can think of clubs (♣) as the smallest and diamonds (♦) as the largest. All TAs within a given suit are considered equal, and we want the sort to be stable. Remember that a sort is stable if items with equal keys appear in the same relative order in the output as they did in the input.

Suit (Key)	TA (Value)
♠	Anish
♣	Jade
♦	Farrell
♣	Howard
♥	Yifan
♥	Yuma
♠	Amanda

Notice how this problem requires a more advanced version of CountingSort. We no longer have the guarantee of unique keys, so we can’t assume that a TA with the second-smallest key, ♠ (spades), will land at index 1 of the sorted array. If multiple TAs have ♣ (clubs), at which index should the ♠ (spades) start?

What will be the index of the first heart ♥? How do we know that?

The cards that come before ♥ are ♣ and ♠, so we can count the number of cards with suit ♣ or ♠ to determine the index of the first ♥. In this case, there are four cards of suit ♣ or ♠, so the first ♥ will be at index 4. In fact, our final sorted output will look like:

Suit (Key)	TA (Value)
♣	Jade
♣	Howard
♠	Anish
♠	Amanda
♥	Yifan
♥	Yuma
♦	Farrell
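The starting index of each suit can be computed by counting how many cards fall in every smaller suit. The sketch below is one way to do that in Python; the names SUIT_ORDER and starting_indices are hypothetical, and the suit order is the one given in this reading.

```python
SUIT_ORDER = ["♣", "♠", "♥", "♦"]   # smallest to largest, as defined above


def starting_indices(keys, alphabet):
    """Return the output index at which each symbol's group begins."""
    counts = {symbol: 0 for symbol in alphabet}
    for key in keys:                  # pass 1: count occurrences of each key
        counts[key] += 1

    starts = {}
    position = 0
    for symbol in alphabet:           # running total of all smaller keys
        starts[symbol] = position
        position += counts[symbol]
    return starts


suits = ["♠", "♣", "♦", "♣", "♥", "♥", "♠"]   # input order of the TAs' cards
starts = starting_indices(suits, SUIT_ORDER)
print(starts)
# {'♣': 0, '♠': 2, '♥': 4, '♦': 6}
```

The entry for ♥ is 4 because two ♣ and two ♠ come before it, which matches the table above.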

Using this type of analysis to determine the starting index of each group, we can generalize our CountingSort to work for non-unique, non-consecutive, and non-numeric keys!
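Putting the pieces together, a generalized CountingSort might look like the following Python sketch. It assumes the caller supplies the alphabet in sorted order and that every key belongs to it; the function name counting_sort and the (key, value) tuples are illustrative, not taken from lecture.

```python
def counting_sort(items, alphabet):
    """Stably sort (key, value) pairs whose keys come from `alphabet`.

    `alphabet` lists every possible key from smallest to largest.
    """
    # Pass 1: count how many items carry each key.
    counts = {symbol: 0 for symbol in alphabet}
    for key, _ in items:
        counts[key] += 1

    # Turn the counts into the next free output index for each key.
    next_index = {}
    position = 0
    for symbol in alphabet:
        next_index[symbol] = position
        position += counts[symbol]

    # Pass 2: place items in input order, which keeps the sort stable.
    output = [None] * len(items)
    for key, value in items:
        output[next_index[key]] = (key, value)
        next_index[key] += 1
    return output


tas = [("♠", "Anish"), ("♣", "Jade"), ("♦", "Farrell"), ("♣", "Howard"),
       ("♥", "Yifan"), ("♥", "Yuma"), ("♠", "Amanda")]
result = counting_sort(tas, ["♣", "♠", "♥", "♦"])
print([name for _, name in result])
# ['Jade', 'Howard', 'Anish', 'Amanda', 'Yifan', 'Yuma', 'Farrell']
```

Because the second pass walks the input from front to back and each key's index only moves forward, Jade stays ahead of Howard and Anish stays ahead of Amanda, as stability requires.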

We will use CountingSort as the foundation for the RadixSorts we’ll discuss in lecture. The term “radix” means the number of different digits or characters in a given alphabet. For example, the radix for card suits from a standard deck is 4, because there are four “letters” (elements) in our “alphabet” (domain): ♣, ♠, ♥, and ♦.

In lecture, we will examine the properties of various RadixSort flavors and learn which situations each one is well suited for.

Reading Quiz