Comparison Sorts Reading
Sorting
A sort is a permutation (rearrangement) of a sequence of keys that puts the keys into non-decreasing order relative to a given ordering relation.
For example, suppose we want to sort strings by their length from shortest to longest.
Using string length as the ordering relation, give two sorts for ["cows", "get", "going", "the"].
There are two valid sorts since “the” is considered equivalent to “get” when comparing by string length.
- [“the”, “get”, “cows”, “going”]
- [“get”, “the”, “cows”, “going”]
Stability
A sort is considered stable if the relative order of equivalent keys is maintained after sorting.
As we saw above, there are two valid sorts for [“cows”, “get”, “going”, “the”]. However, a stable sorting algorithm is guaranteed to return [“get”, “the”, “cows”, “going”]; “get” and “the” are equivalent-length strings, and “get” appears before “the” in the original input.
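The guarantee above can be checked with a minimal Java sketch. The class and method names (StableLengthSort, sortByLength) are made up for illustration. It relies on Collections.sort, which the Java documentation describes as stable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class StableLengthSort {
    // Returns a new list sorted by string length, shortest first.
    // Collections.sort is documented as stable, so equal-length strings
    // keep their original relative order.
    static List<String> sortByLength(List<String> words) {
        List<String> result = new ArrayList<>(words);
        Collections.sort(result, Comparator.comparingInt(String::length));
        return result;
    }

    public static void main(String[] args) {
        List<String> words = List.of("cows", "get", "going", "the");
        System.out.println(sortByLength(words)); // [get, the, cows, going]
    }
}
```

Because "get" precedes "the" in the input, a stable sort cannot produce the other valid ordering.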
Maintaining the relative ordering of equivalent keys can be useful. For example, if a list of email messages is already sorted by date and is then stably sorted by sender, the result groups messages by sender name, and each sender’s messages remain in chronological order. Maintaining the relative ordering of equivalent keys may also matter when our data has multiple fields.
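The email scenario can be sketched in Java as two sorting passes. The Message class, its fields, and the integer "day" standing in for a date are all hypothetical and not taken from any real mail API. The second pass uses Collections.sort, which is documented as stable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class EmailSortDemo {
    // Hypothetical message type with two fields, for illustration only.
    static final class Message {
        final String sender;
        final int day; // stand-in for a date; smaller means earlier

        Message(String sender, int day) {
            this.sender = sender;
            this.day = day;
        }

        @Override
        public String toString() {
            return sender + ":" + day;
        }
    }

    // Sorts by date first, then stably by sender, and returns a label
    // ("sender:day") for each message in the final order.
    static List<String> sortByDateThenSender(List<Message> inbox) {
        List<Message> sorted = new ArrayList<>(inbox);
        Collections.sort(sorted, Comparator.comparingInt((Message m) -> m.day));
        // The stable second pass keeps each sender's messages in date order.
        Collections.sort(sorted, Comparator.comparing((Message m) -> m.sender));
        List<String> labels = new ArrayList<>();
        for (Message m : sorted) {
            labels.add(m.toString());
        }
        return labels;
    }

    public static void main(String[] args) {
        List<Message> inbox = List.of(
            new Message("bob", 3), new Message("alice", 2),
            new Message("bob", 1), new Message("alice", 4));
        System.out.println(sortByDateThenSender(inbox));
        // [alice:2, alice:4, bob:1, bob:3]
    }
}
```

With an unstable second pass, the messages would still be grouped by sender, but the chronological order within each group would no longer be guaranteed.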
Give an example of a Java data type for which using an unstable sorting algorithm instead of a stable one has no effect on any client program.
Primitive data types such as int or double are not affected by stability since numbers do not have other fields. Mixing up the relative ordering of two equal numbers has no effect on any client programs.
In contrast, objects can have many fields, and not all of them might be used when calculating equals. For objects that are considered equal, a client program might be surprised by an unstable sort. For this reason, Java’s sorting methods (such as Arrays.sort) use a faster but unstable sort for primitive types and a slower but stable sort for reference types.
In Place
A sort is considered in place if it only requires O(1) extra memory.
This means that the sorting algorithm will modify the input rather than copying the data into a new data structure. In place sorts are useful when it’s important to minimize memory usage. Conversely, there are times when modifying the input data structure is undesirable or even impossible.
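As a rough illustration of the trade-off, the Java sketch below contrasts an in-place sort with one that leaves its input untouched. Insertion sort is used only because it is short. The reading does not name a specific algorithm, and the class and method names are made up for this example.

```java
import java.util.Arrays;

class InPlaceDemo {
    // Sorts the array in place: apart from a few local variables,
    // no extra memory is used, so the extra space is O(1).
    static void insertionSortInPlace(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift larger elements one slot right to make room for key.
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Leaves the input unmodified by sorting a copy: O(n) extra memory.
    static int[] sortedCopy(int[] a) {
        int[] copy = Arrays.copyOf(a, a.length);
        insertionSortInPlace(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 1};
        int[] copy = sortedCopy(data);
        System.out.println(Arrays.toString(data)); // [5, 2, 4, 1]
        System.out.println(Arrays.toString(copy)); // [1, 2, 4, 5]
        insertionSortInPlace(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 4, 5]
    }
}
```

The copying version suits callers who still need the original order, at the cost of memory proportional to the input size.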
Knowing which sort to choose in different scenarios is a valuable skill. In lecture, we’ll explore sorting algorithms in more detail.