Should we always use quicksort than insertion sort?

#sort #complexity #algorithms

It's often said that "Use always O(n*log(n)) algorhithms". However, depending on the situation, it might not always be right. In this article, I will compare quicksort and insertion sort.

When written in Big O notation, the computational complexity of quicksort is O(n*log(n)), while that of insertion sort is O(n^2). So, it seems like the former would be faster at a glance. However, this notation only describes the behavior when n approaches infinity, so in real-world programs, quicksort is not always faster.

Let's compare the speeds of both algorithms in a program with the following specifications:

Sort an array with a given number of integer elements and output the time it takes to do so.
First argument (len): Number of elements.
Second argument (type): Type of algorithm. 'i' for insertion sort, 'q' for quicksort.

The following sort program implements these specifications.

#include <time.h>
#include <stdio.h>
#include <stdlib.h>

#define NSECS_PER_MSEC 1000000UL
#define NSECS_PER_SEC 1000000000UL

static void insertion_sort(int *a, int len)
{
    int i, tmp;
    for (i = 1; i < len; i++) {
        tmp = a[i];
        if (a[i - 1] > tmp) {
            int j;
            j = i;
            do {
                a[j] = a[j - 1];
                j--;
            } while (j > 0 && a[j - 1] > tmp);
            a[j] = tmp;
        }
    }
}

int pivot(int a[], int l, int r){
    int i = l, j = r-1;
    int p = a[r];
    int tmp;
    for (;;) {
        while (a[i] < p)
            i++;
        while (i < j && p < a[j])
            j--;
        if (i >= j)
            break;
        tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
    tmp = a[i];
    a[i] = a[r];
    a[r] = tmp;

    return i;
}

void quick_sort_inner(int *a, int l, int r){
    if (l >= r)
        return;
    int v = pivot(a, l, r);
    quick_sort_inner(a, l, v-1);
    quick_sort_inner(a, v+1, r);
}

static void quick_sort(int *a, int len)
{
    quick_sort_inner(a, 0, len-1);
}

static inline long diff_nsec(struct timespec before, struct timespec after)
{
    return ((after.tv_sec * NSECS_PER_SEC + after.tv_nsec)
        - (before.tv_sec * NSECS_PER_SEC + before.tv_nsec));

}

static void prepare_data(int *a, int len)
{
    int i;
    for (i = 0; i < len; i++)
        a[i] = rand();
}

static int comp(const void *a, const void *b)
{
    return *((int *)a) - *((int *)b);
}

static char *progname;

int main(int argc, char *argv[])
{
    progname = argv[0];

    if (argc < 3) {
        fprintf(stderr, "usage: %s <len> <i|q>\n", progname);
        exit(EXIT_FAILURE);
    }

    int len = atoi(argv[1]);

    char type = argv[2][0];

    if (!((type == 'i') || (type == 'q'))) {
        fprintf(stderr, "%s: type should be 'i or q'\n", progname);
        exit(EXIT_FAILURE);
    }

    int *a;
    a = malloc(len * sizeof(int));
    prepare_data(a, len);

    struct timespec before, after;
    if (type == 'i') {
        clock_gettime(CLOCK_MONOTONIC, &before);
        insertion_sort(a, len);
        clock_gettime(CLOCK_MONOTONIC, &after);
    } else {
        clock_gettime(CLOCK_MONOTONIC, &before);
        qsort(a, len, sizeof(int), comp);
        clock_gettime(CLOCK_MONOTONIC, &after);
    }

    printf("%lu\n", diff_nsec(before, after));

    exit(EXIT_SUCCESS);
}

Let's build this progam.

$ CFLAGS=-O3 make sort
$

I ran this program with the following parameters.

Parameter	Value
First argument (`len`)	2^(0, 1, 2, ..., 15)
Second argument (`type`)	"i", "q"

I plotted the results on the following graph. Note that the x-axis is 2^(len) and the y-axis is log(elapsed time).

As len gets larger, quicksort indeed becomes faster, and the difference only grows. However, when len is small, it can be seen that insertion sort is faster up to around 2^8 (=128). This is because quicksort performs more complex processing compared to insertion sort. Therefore, using quicksort is generally not a bad choice, but if you want to fine-tune your program and know that len will not be that large, trying to speed up using insertion sort is one option. However, as the saying goes, "premature optimization", there is no need to start with this kind of fine-tuning.

One last point. For the reasons mentioned above, the qsort() function in glibc, internally uses insertion sort for a somewhat small len (<=MAX_THRESH). There are many other implementations that work this way. If you're interested, it's a good idea to take a look at various quicksort implementations.

DEV Community

Should we always use quicksort than insertion sort?

Oldest comments (0)