DEV Community

Cover image for Merge Sort Quick Reference
edA‑qa mort‑ora‑y
edA‑qa mort‑ora‑y

Posted on • Originally published at interview.codes

Merge Sort Quick Reference

Merge sort is a useful algorithm for sorting a list of items. It's an efficient and predictable algorithm. Inevitably it'll come up during your interviews, either you'll be asked directly, or you'll find it useful to solve a code challenge. This is a quick reference for merge sort.

Algorithm Design

Merge sort is a recursive algorithm that takes an unsorted list and produces a sorted version of it.

It is a stable sorting algorithm.

The divide-and-conquer algorithm splits the input in half, sorts both sides, and then combines the result.

Merge-Sort( input )
    if length( input ) < 1
        return input

    left-half, right-half = divide-in-half( input )
    left = Merge-Sort( left-half )
    right = Merge-Sort( right-half )

    out = []
    while not empty left or not empty right
        if front( left ) < front( right )
            out.append( pop-front(left) )
        else
            out.append( pop-front(right) )
Enter fullscreen mode Exit fullscreen mode

Refer to sample code in Python.

Algorithm Complexity

The algorithm has a stack depth of log N. This is how deeply the recursive calls to Merge-Sort are nested. At each of these levels, it has half the input size, but the call is made twice. So effectively at each level, across all calls, the whole input is operated on. This is useful for the complexity.

The merge sort algorithm's complexity is not dependent on the input. The worst-case complexity is the same as the best-case complexity. It always performs the same number of comparisons and makes the same number of copies.

Time: Comparisons

The number of comparisons it the number of times two items are compared with each other. This is typically done using only the less than < comparison.

At each of the log n recursive levels the recombining requires n comparisons. This results in θ(n · log n) comparisons.

Time: Copies

The number of copies is the number of times an item is copied to a new location, either from the input or within the output.

At each of the log n recursive levels all of n items are copied into new output arrays during recombining. This results in θ(n · log n) copies.

The initial splitting may also do the same amount of copying. This does not change the upper bound, though in practical terms doubles the copying time. This is language dependent, it's possible to use a view on the input to avoid copying during the division stage.

Space

Each level of the recursion requires a copy of the input, as each level needs to combine the elements into new lists. This requires θ(n · log n) space.

Notes

Merge sort does not require random access to the elements, making it suitable for multiple container types, such as linked lists and arrays.

It allows for external sorting, meaning not all the data needs to be in memory at the same time.

View my video class that covers the design, complexity, and code of the algorithm.

Top comments (4)

Collapse
 
theodesp profile image
Theofanis Despoudis

The merge part is useful in cases where you want to merge two sorted arrays, or find the kth element of two sorted arrays.

Collapse
 
mortoray profile image
edA‑qa mort‑ora‑y

Yes, the merge part is useful as an independent algorithm. Knowing how these algorithms work are thus useful for constructing new algorithms.

Collapse
 
ovieokeh profile image
Ovie Okeh

I found the most difficult part of implementing it was actually the merge function. After that, it's almost ridiculously easy.

Collapse
 
mortoray profile image
edA‑qa mort‑ora‑y • Edited

I admit my pseudo-code ignores some of the details, making it look easier than it is.

My front function has to deal with the end of a list and provider a proper comparison when no item is available. It can be done this way.