Merge sort is a useful algorithm for sorting a list of items. It's an efficient and predictable algorithm. Inevitably it'll come up during your interviews, either you'll be asked directly, or you'll find it useful to solve a code challenge. This is a quick reference for merge sort.
Merge sort is a recursive algorithm that takes an unsorted list and produces a sorted version of it.
It is a stable sorting algorithm.
The divide-and-conquer algorithm splits the input in half, sorts both sides, and then combines the result.
Merge-Sort( input ) if length( input ) < 1 return input left-half, right-half = divide-in-half( input ) left = Merge-Sort( left-half ) right = Merge-Sort( right-half ) out =  while not empty left or not empty right if front( left ) < front( right ) out.append( pop-front(left) ) else out.append( pop-front(right) )
Refer to sample code in Python.
The algorithm has a stack depth of
log N. This is how deeply the recursive calls to
Merge-Sort are nested. At each of these levels, it has half the input size, but the call is made twice. So effectively at each level, across all calls, the whole input is operated on. This is useful for the complexity.
The merge sort algorithm's complexity is not dependent on the input. The worst-case complexity is the same as the best-case complexity. It always performs the same number of comparisons and makes the same number of copies.
The number of comparisons it the number of times two items are compared with each other. This is typically done using only the less than
At each of the
log n recursive levels the recombining requires
n comparisons. This results in
θ(n · log n) comparisons.
The number of copies is the number of times an item is copied to a new location, either from the input or within the output.
At each of the
log n recursive levels all of
n items are copied into new output arrays during recombining. This results in
θ(n · log n) copies.
The initial splitting may also do the same amount of copying. This does not change the upper bound, though in practical terms doubles the copying time. This is language dependent, it's possible to use a view on the input to avoid copying during the division stage.
Each level of the recursion requires a copy of the input, as each level needs to combine the elements into new lists. This requires
θ(n · log n) space.
Merge sort does not require random access to the elements, making it suitable for multiple container types, such as linked lists and arrays.
It allows for external sorting, meaning not all the data needs to be in memory at the same time.
View my video class that covers the design, complexity, and code of the algorithm.