DEV Community

Erik

All you need to know about Sorting

Sorting means rearranging a sequence, such as a list of numbers, so that its elements are put in a specific order (e.g. ascending or descending). In computer science sorting is quite a wide topic: there are dozens of sorting algorithms, each with pros and cons, and various attributes of them are studied, e.g. an algorithm's time complexity, its stability etc. Sorting algorithms are a favorite subject of programming classes as they provide a good exercise in programming and in analysis of algorithms, and can be nicely put on tests :)

Some famous sorting algorithms include bubble sort (a simple KISS algorithm), quicksort and merge sort (some of the fastest algorithms) and stupid sort, also known as bogosort (just tries different permutations until it hits the jackpot).

In practice, however, we oftentimes end up using one of the simplest sorting algorithms (such as bubble sort) anyway, unless we're programming a database or otherwise dealing with enormous amounts of data. If we need to sort just a few hundred items and/or the sorting doesn't occur very often, a simple algorithm does the job well, and is sometimes even faster because a very complex algorithm may carry an initial overhead. So always consider the KISS approach first.
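
To make the "simple algorithm" idea concrete, here is a minimal sketch of bubble sort. The function name `bubble_sort` and the choice to return a sorted copy are assumptions made for illustration; the early exit when a pass makes no swaps is a common optional tweak, not something the text above requires.

```python
def bubble_sort(items):
    """Return a new list with the items in ascending order (bubble sort)."""
    result = list(items)  # work on a copy so the caller's sequence is untouched
    n = len(result)
    # After each pass the largest remaining element has "bubbled" to index `end`.
    for end in range(n - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                # Neighbours are out of order: swap them.
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            break  # a pass without swaps means the list is already sorted
    return result


print(bubble_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

For a few hundred items the quadratic number of comparisons is usually not a problem, which is the trade-off described above.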

Attributes of sorting algorithms we're generally interested in are the following:

  • time and space complexity: Time and space complexity hint at how fast the algorithm will run and how much memory it will need; specifically we're interested in the best, worst and average case as a function of the length of the input sequence. Of course we ideally want the fastest algorithm, but a better time complexity doesn't necessarily imply a faster run time in practice, especially with shorter sequences. An algorithm that's extremely fast in the best case may be extremely slow in non-ideal cases. With memory, we are often interested in whether the algorithm works in place; such an algorithm only needs a constant amount of memory in addition to the memory that the sorted sequence takes, i.e. the sequence is sorted in the memory where it resides.
  • implementation complexity: A simple algorithm is better if it's good enough. It may lead to e.g. smaller code size, which may be a factor e.g. in embedded systems.
  • stability: A stable sorting algorithm preserves the relative order of elements that are considered equal. With pure numbers this of course doesn't matter, but if we're sorting more complex data structures (e.g. sorting records about people by their names), this attribute may become important.
  • comparative vs non-comparative: A comparative sort only requires a single operation that compares any two elements and says which one has a higher value. Such an algorithm is general and can be used for sorting any data, but its average-case time complexity can't be better than O(n * log(n)). Non-comparative sorts can be faster as they may take advantage of other operations available on the data, e.g. integer operations.
  • recursion and parallelism: Some algorithms are recursive in nature, some are not. Some algorithms can be parallelised, e.g. on a GPU, which can greatly increase their speed.
  • other: There may be other specifics, e.g. some algorithms are slow when sorting an already sorted sequence (which is addressed by adaptive sorting), so we may also have to consider the nature of the data we'll be sorting. Other times we may be interested e.g. in what machine instructions the algorithm will compile to etc.
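
Two of the attributes above, stability and non-comparative sorting, can be illustrated together with a small counting sort. This is only a sketch under stated assumptions: the keys are small non-negative integers bounded by a known `max_key`, and the names `counting_sort_by_key`, `people` and `by_age` are made up for this example. Note that no two records are ever compared with each other; the key is used directly as an index.

```python
def counting_sort_by_key(records, key, max_key):
    """Stable, non-comparative sort of records by an integer key in 0..max_key."""
    # Count how many records have each key value.
    counts = [0] * (max_key + 1)
    for record in records:
        counts[key(record)] += 1

    # Turn the counts into the starting output position of each key value.
    starts = []
    position = 0
    for count in counts:
        starts.append(position)
        position += count

    # Place records left to right, so records with equal keys keep
    # their original relative order (this is what makes the sort stable).
    output = [None] * len(records)
    for record in records:
        k = key(record)
        output[starts[k]] = record
        starts[k] += 1
    return output


people = [("Ann", 30), ("Bob", 25), ("Cid", 30), ("Dee", 25)]
by_age = counting_sort_by_key(people, key=lambda p: p[1], max_key=120)
print(by_age)
# [('Bob', 25), ('Dee', 25), ('Ann', 30), ('Cid', 30)]
```

Bob still comes before Dee and Ann before Cid, as in the input, which is the stability property. The running time grows with the number of records plus `max_key` rather than with n * log(n), at the price of extra memory and of only working for suitable keys.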

In practice not only the algorithm but also its implementation matters. For example, if we have a sequence of very large data structures to sort, we may want to avoid physically rearranging these structures in memory, as this could be slow. In such a case we may want to use indirect sorting: we create an additional list whose elements are indices into the main sequence, and we only sort this list of indices.
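
Indirect sorting could be sketched roughly as follows. The `records` list and its `name` and `blob` fields are hypothetical stand-ins for "very large data structures". In Python a list already stores references, so this only illustrates the idea; the saving matters mostly in languages where elements are stored by value.

```python
# Hypothetical large records; the 1 KiB blob stands in for bulky data.
records = [
    {"name": "Carol", "blob": bytes(1024)},
    {"name": "Alice", "blob": bytes(1024)},
    {"name": "Bob", "blob": bytes(1024)},
]

# Sort a list of indices by the name each index points to.
# The records themselves are never moved.
order = sorted(range(len(records)), key=lambda i: records[i]["name"])
print(order)  # [1, 2, 0]

# Walk the main sequence in sorted order through the index list.
names_in_order = [records[i]["name"] for i in order]
print(names_in_order)  # ['Alice', 'Bob', 'Carol']
```

The main sequence stays exactly where it was, and several index lists (e.g. one per sort criterion) can coexist over the same data.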
