DEV Community

loading...

The difference between partitioning an array and partitioning an array in Quicksort

kennethlum profile image Kenneth Lum Updated on ・3 min read

It is not difficult to partition an array in-place into two parts, no matter it is for even / odd numbers, positive / negative numbers, or numbers less than a pivot vs numbers greater than or equal to a pivot.

But when you use Quicksort, the partition is quite a bit trickier.

Quicksort is really simple by the way. It can be just 4 lines:

    function quicksort(arr, lo, hi) {
        if (lo < hi) {
            let iPivot = partition(arr, lo, hi);
            quicksort(arr, lo, iPivot - 1);
            quicksort(arr, iPivot + 1, hi);
        }
    }

The tricky part is partition(). There are two reasons:

  1. You need to pick a pivot that can divide the numbers into two parts. If you just pick a number, such as on the leftmost number of the partition region, then it will be a really bad partition if the array passed in is already sorted. Likewise if you choose the rightmost number. How about if you choose the middle number or a random pivot?

  2. One additional method is to shuffle the whole array before doing any sorting.

  3. However, it still doesn't solve the issue, say, if there are many repeating numbers. For example, what if you have a million numbers of ratings, each one is a rating from 1 to 10? Then this array will have a lot of duplicates. Now imagine if you have 12800 elements in the partition region, and at some stage of the Quicksort recursion, many or all of those numbers in the partition region are the same? Then even if you choose a middle or random pivot, if you return the partition point, you need to be careful not to return the start or end of the partition region, because one of the partition will have size 0, and the other partition having the size of "original array size - 1". This can make Quicksort keep on recursing, reducing the array size 1 by 1, and eventually stack overflow. The Lomuto partition in Cormen, Jon Bentley, and Skiena algorithm books will do exactly that. Quicksort depends on having a good partition point so that the algorithm is efficient.

One of the solutions is to use something that will "burn the candle" at both ends, and reach a midpoint of the partition region, even when there are many or all duplicate numbers. This is C. A. R. Hoare's partition algorithm:

    function partition(nums, begin, end) {
        let iPivot = Math.floor((begin + end)/2);
        let low = begin, high = end + 1, pivot = nums[iPivot];

        [nums[begin], nums[iPivot]] = [nums[iPivot], nums[begin]];

        while (true) {
            while (nums[++low] < pivot) if (low === end) break;
            while (nums[--high] > pivot) if (high === begin) break;
            if (low >= high) break;
            [nums[low], nums[high]] = [nums[high], nums[low]];
        }
        [nums[begin], nums[high]] = [nums[high], nums[begin]];
        return high;        
    }

And we probably want to make a wrapper for quicksort:

    function quicksortWithRange(arr, lo, hi) {
        if (lo < hi) {
            let iPivot = partition(arr, lo, hi);
            quicksortWithRange(arr, lo, iPivot - 1);
            quicksortWithRange(arr, iPivot + 1, hi);
        }
    }

    function quicksort(arr) {
        quicksortWithRange(arr, 0, arr.length - 1);
    }

We can use the same partition when we use Quickselect, which is like a one-sided Quicksort, for finding the median or kth order statistic.

The partition function is quite interesting to write. Try spending some time to write different versions of it that can handle all data cases in O(n log n) time. Google used to ask this question a lot in their interviews (but not any more). Now I understand why they do that.

Discussion (0)

pic
Editor guide