Why Sorting?
I started learning Golang last week, and to solidify my understanding of the syntax, I practiced on LeetCode. This led me to revisit the algorithms I learned in my university years, which I had long forgotten. The merge sort algorithm, in particular, sparked my interest and motivated me to learn more about sorting.
Here are the references for the sorting algorithms:
- Bubble Sort, which has a time complexity of O(n^2) (a minimal sketch follows this list)
  - more on bubble sort: https://www.geeksforgeeks.org/bubble-sort/
- Merge Sort, which has a time complexity of O(n*log(n))
  - more on merge sort: https://www.geeksforgeeks.org/merge-sort/
- Quick Sort, which has an average time complexity of O(n*log(n))
  - more on quick sort: https://www.geeksforgeeks.org/quick-sort/
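For reference, here is a minimal bubble sort in Go. This is a generic sketch of the algorithm, not necessarily the exact implementation used in the experiment:

```go
// BubbleSort sorts arr in place by repeatedly swapping adjacent
// out-of-order elements. Each pass bubbles the largest remaining
// element to the end, giving O(n^2) comparisons in the worst case.
func BubbleSort(arr []int) {
	for i := 0; i < len(arr)-1; i++ {
		for j := 0; j < len(arr)-1-i; j++ {
			if arr[j] > arr[j+1] {
				arr[j], arr[j+1] = arr[j+1], arr[j]
			}
		}
	}
}
```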
Since people rarely use bubble sort due to its time complexity, and perhaps because they prefer the built-in sort function, I wondered how inefficient it actually is. To find out, I conducted a simple experiment.
Experiment
The experiment aims to track the time needed for the merge, bubble, quick, and built-in (sort.Ints) sorting functions to sort an array. The whole experiment setup can be seen in my GitHub repo: Comparison Between Sorting Algorithm (with Go).
My Post
Check out my post on dev.to: https://dev.to/peterchu999/sorting-algorithm-theory-vs-implementation--40ga/
Thought
In theory, bubble sort would have O(n^2) complexity, and the rest (merge, quick, and the built-in sort function) would have O(n*log(n)) complexity. Here is an example graph:

But when I implemented all of the sorting algorithms in Go, here is the graph for the 300-array-length test:
Let's look closely at the benchmark for the first 50 random arrays. We can see that up to an array size of ~25, bubble sort performs better than the merge and built-in sorts. However, according to the theory, it isn't supposed to. Here is the graph comparison:

Weird, isn't it? But when we look at the bigger picture, the theory aligns with the implementation. Let's look at the 1000-array-size run:
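To get a concrete feel for these theoretical curves, here is a small sketch (with all constant factors assumed to be 1) that prints n^2 against n*log2(n) for a few sizes:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Compare the theoretical curves n^2 and n*log2(n),
	// assuming constant factors of 1 for both.
	for _, n := range []float64{10, 25, 50, 100, 300, 1000} {
		fmt.Printf("n=%4.0f  n^2=%8.0f  n*log2(n)=%6.0f\n", n, n*n, n*math.Log2(n))
	}
}
```

Even at n = 25, n^2 (625) is already several times n*log2(n) (~116), so the crossover we measured has to come from constant factors hidden by the big-O notation, not from the curves themselves.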
Running Workflow
- `./run-all.sh [n]`, changing `[n]` to the maximum array size for the benchmark
Running Python Visualization
- `pip install -r` …
This is how the time tracking works:
```go
// SortFunction is any in-place sorting function over an int slice,
// matching the signature of sort.Ints.
type SortFunction func([]int)

func TestSorting(sfc SortFunction, arr []int) int64 {
	scopedTestList := make([]int, len(arr))
	copy(scopedTestList, arr) // copy the unsorted array to avoid side effects on other tests

	startTime := time.Now() // track start time
	sfc(scopedTestList)     // call sorting function
	endTime := time.Now()   // track end time

	return endTime.Sub(startTime).Nanoseconds() // return the elapsed time (endTime - startTime)
}
```
And here is the test setup:
```go
for i := 1; i <= arrLen; i++ {
	testCase := GenerateUniqueRandomNumbers(i, 100) // generate an i-length array of unique random numbers

	mergeT := utils.TestSorting(MergeSortVoid, testCase)
	time.Sleep(time.Millisecond * 5) // in case the computation degrades the machine and makes later cases worse

	bubbleT := utils.TestSorting(BubbleSortVoid, testCase)
	time.Sleep(time.Millisecond * 5) // in case the computation degrades the machine and makes later cases worse

	quickT := utils.TestSorting(QuickSortVoid, testCase)
	time.Sleep(time.Millisecond * 5) // in case the computation degrades the machine and makes later cases worse

	biT := utils.TestSorting(sort.Ints, testCase)

	// mergeT, bubbleT, quickT, and biT are recorded for later visualization
}
```
- Loop until `arrLen`, the maximum array length.
- For each iteration, generate a random number array with the length of the current iteration index (`i`).
- For each iteration, track the time needed for each sorting function to sort the array.
The data gathered from the experiment above were saved and visualized using Python code. The visualization plots the time taken to sort each array length.
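As an illustration of how each row of timings could be written out for the Python step, here is a sketch; the `saveRow` helper is hypothetical, and the actual repo may store the results differently:

```go
import (
	"encoding/csv"
	"strconv"
)

// saveRow appends one CSV row per array length: the length n followed
// by the measured nanoseconds for merge, bubble, quick, and built-in sort.
// (Hypothetical helper; the repo may persist results differently.)
func saveRow(w *csv.Writer, n int, mergeT, bubbleT, quickT, biT int64) error {
	return w.Write([]string{
		strconv.Itoa(n),
		strconv.FormatInt(mergeT, 10),
		strconv.FormatInt(bubbleT, 10),
		strconv.FormatInt(quickT, 10),
		strconv.FormatInt(biT, 10),
	})
}
```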
Result
Interesting results are shown when using a small array length. Let's examine the visualized graph comparing the complexity of theoretical sorting algorithms with the actual implementation time.
Up to an array length of around 30, bubble sort, which theory deems slower, was actually faster than the merge and built-in sort functions. It's quite surprising 🤯!
However, when we look at the bigger picture, the theory and implementation align. Here is the visualization of the theoretical sort complexity versus the sort implementation time for an array length of 1000.
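For n = 1000 the gap is stark: n^2 is 1,000,000 while n*log2(n) is only about 9,966, roughly a 100x difference, so the constant factors that favored bubble sort on tiny arrays can no longer hide the asymptotic behavior.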
Conclusion
In the end, when the array size is small, the sorting operation takes a very short amount of time (just a few nanoseconds), so no one really notices the difference. What truly matters is that the n*log(n) sorting algorithms (merge sort, quick sort) we learned perform significantly better on larger array sizes.
Please check out my code and feel free to experiment with it. If you find any flaws or errors, please create an issue on GitHub or leave a comment 💬 below!
I'm still unsure why bubble sort performs better on small datasets. My guess is that the operation to create a new array in merge sort takes longer than simply swapping values between indices. What do you think 🤔? Please share your thoughts and comments 🙏!
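One way to probe this guess is with Go's standard benchmark tooling, since `b.ReportAllocs()` shows whether an algorithm allocates at all. The sketch below assumes the BubbleSort sketch from earlier lives in the same package (in a _test.go file); swapping in a merge sort would let you compare its allocation count:

```go
package sorting

import (
	"math/rand"
	"sort"
	"testing"
)

const smallN = 20 // a size where bubble sort measured faster

func randomSlice(n int) []int {
	arr := make([]int, n)
	for i := range arr {
		arr[i] = rand.Intn(1000)
	}
	return arr
}

func BenchmarkBubbleSmall(b *testing.B) {
	b.ReportAllocs() // bubble sort should report zero allocations
	src := randomSlice(smallN)
	buf := make([]int, smallN)
	for i := 0; i < b.N; i++ {
		copy(buf, src)
		BubbleSort(buf)
	}
}

func BenchmarkBuiltinSmall(b *testing.B) {
	b.ReportAllocs()
	src := randomSlice(smallN)
	buf := make([]int, smallN)
	for i := 0; i < b.N; i++ {
		copy(buf, src)
		sort.Ints(buf)
	}
}
```

If merge sort shows allocations per operation while bubble sort shows none, that supports the swap-vs-allocate explanation. It's also worth noting that Go's standard library sort switches to insertion sort for very small ranges, precisely because simple in-place algorithms win at that scale.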
Acknowledgment
- https://unsplash.com/photos/long-exposure-photography-of-road-and-cars-NqOInJ-ttqM (cover image)
- https://geeksforgeeks.org/
- https://blog.boot.dev/golang/quick-sort-golang/ (quick sort)
- https://www.tutorialspoint.com/how-to-generate-a-array-of-unique-random-numbers-in-golang (generate random number array)
Top comments (2)
So, the first thing here is the order of growth: bubble sort grows quadratically, whereas merge sort has n*log(n). As n becomes large
(taking the limit), n^2 grows faster than n*log(n), but this is just an upper bound / asymptotic analysis, which is done after ignoring the lower-order terms, constants, and other external factors such as compiler optimizations and hardware configs. Intuitively it feels (at least for me) that for smaller array sizes, the instructions needed for swapping elements are much faster than loading the recursive stack back and forth into memory.
Yep, agree with ur thought 😄.
Thanks for Commenting ❤️