Sort Primer

A sort is any procedure (algorithm) that takes an unordered list of things

[12, 5, 8, 1]

and orders it

[1, 5, 8, 12]

The list does not always have to be numbers; comparison-based sorts can operate on anything that can be compared, such as alphabetical order:

['h', 'e', 'a', 'c'] -> ['a', 'c', 'e', 'h']

Or size:

[Buffalo, Brontasaurus, Bee, Bird] -> [Bee, Bird, Buffalo, Brontasaurus]

There is a subclass of sort called counting sorts which only operate on integers, but that's outside the scope of this primer.

As a human bean, it is pretty trivial to sort lists. You just put the small elements first and the big elements last. It feels automatic. And, like most things human beans do, it becomes nontrivial when the list gets big.

[2, 7, 85, 3, 5, 6, 2, 1, 9, 10, 21, 32, 55, 3, 3, 4, 96, 36, 1, 2, 4, 3, 6, 1, 2, 3, 0, -3, 9, -6, 44, 75]

You can probably still sort that, it'll just take you a long time. The technique you will use to sort it is probably to go through the list, find a few of the smallest value (maybe 5 of them if your memory is good), write them down in order and cross them out, then rinse and repeat.

This is an algorithm:

  1. Scan a list
  2. Find the smallest unsorted element
  3. Write it down at the end of a sorted list

It is called selection sort. We'll visualize it later. The pseudocode for a human selection sort is

function selection_sort(list):
  result = []

  while list has elements not crossed_off:
    number = smallest_element(list)
    result.append(number)
    list.cross_off(number)

  return result

As you might guess, this is a really shitty way to sort big lists. It's not the worst way, not by a long shot, but we've found more clever ways to order things.

Time Complexity

Selection sort is O(n^2), or quadratic in time complexity; whenever our list-to-be-sorted doubles in size, it becomes 4x harder to sort.

You can probably feel this when you do the sort. Let's say we have two lists, A and B:

A = [44, 8, 16, 3, 24]

B = [16, 24, 46, 47, 32, 44, 38, 21, 14, 7, 6, 45, 27, 29, 38, 16, 31, 7, 1, 10, 16, 21, 32, 9, 10, 28, 9, 35, 2, 42, 15, 23, 37, 13, 20, 26, 29, 48, 28, 25, 28, 15, 6, 45, 3, 33, 49, 26, 48, 26, 37, 12, 30, 35, 32, 23, 8, 9, 24, 47, 33, 39, 27, 11, 48, 19, 2, 2, 29, 28, 9, 16, 42, 18, 27, 38, 27, 40, 22, 28, 25, 2, 22, 24, 34, 40, 45, 11, 19, 3, 20, 35, 23, 21, 10, 43, 11, 44, 35, 20, 41, 20, 0, 19, 30, 6, 49, 23, 36, 24, 41, 40, 24, 16, 17, 11, 14, 6, 14, 46, 6, 41, 48, 23, 40, 43, 22, 41, 11, 42, 16, 48, 13, 14, 22, 5, 1, 37, 38, 15, 8, 27, 24, 4, 48, 31, 15, 34, 31, 26, 49, 9, 14, 11, 21, 32, 31, 30, 5, 14, 42, 29, 31, 4, 3, 1, 2, 16, 48, 40, 44, 36, 22, 19, 29, 34, 16, 25, 35, 12, 21, 11, 7, 7, 17, 33, 9, 45, 37, 46, 21, 48, 5, 5, 4, 34, 25, 44, 5, 28]

You've probably already sorted list A in your head. If you were asked to sort list B on a test, you'd probably be pretty damn annoyed. List B is 200 items; List A is 6. List B is 33.33x the size of list A, and so is 33.33^2 = 1089x as hard to sort as List B. O(n^2) time algorithms kinda suck ass.

Compare this to an O(n), or linear time procedure, like max(list). Try finding the maximum value of each list; it's probably not that hard right? Even for list B?

The fastest available sorts are O(n) (sort of) and O(n log n), which is pretty close to O(n) in difficulty; it's a bit peculiar to think that sorting, a problem that we've shown to be somewhat tough for humans on large inputs, to be nearly as efficient as just finding the max value if you do it in a clever way.

Sorts are important, not because computer scientists and software engineers are constantly tripping over unsorted lists of numbers, but because many problems are in fact just unsorted lists of numbers in disguise. Pathfinding (Google Maps), convex hull (shrinkwrapping a mesh), and Huffman coding (an early form of signal compression) all are heavily related to sorting.

Also, perhaps more importantly, it is very satisfying to watch a computer make beep-boop noises as it shuffles and then re-orders rainbow graphs.

Let's head back to the visualizer and do some beep-booping.