Hi folks, and welcome to my blog.
If you’ve learned something new or found something useful feel free to leave a like, or share the post..
Recently I’ve been delving into the more academic side of software engineering. In my day to day job I’ve been finding myself mostly just going through the motions re-using the same tools day in day out or following the same patterns and strategies that had been in place for years and not really fully understanding the why behind them, well not in much depth anyways. I knew the basics and how it was implemented and why it was needed but not the underlying structures or how efficient they are or if there were other ways of doing the same things that could be used to improve the underlying systems.
As such I’ve decided to start reading and studying up on some of the core concepts that form the backbone of some of these systems and up first is sorting algorithms.
In this post I’m going to go over the most common sorting algorithms that I had found being used and their complexity.
Bubble Sort
Bubble sort is probably the simplest of the sorting algorithm (that I know of).
In it you essentially check each element in the array against each other and swap them if needed. Eventually with enough swaps the array will be arranged in order. See below for an example:
Here is one way to implement it, in C#:
public static int[] BubbleSort(int[] array, bool ascending)
{
//Returning early as you need at least two numbers to swap
if (array.Length <= 1)
{
return array;
}
//The maximum number of swaps possible
int maxSwapCount = array.Length - 1;
//Iterate through all elements in the array
for (int i = 0; i < maxSwapCount; i++)
{
/*Iterates through the newly swapped array, with each
iteration at least 1 element will be in the correct place*/
for (int j = 0; j < maxSwapCount - i; j++)
{
int currentNumber = array[j];
int nextIndex = j + 1;
int nextNumber = array[nextIndex];
if ((ascending && currentNumber > nextNumber) || currentNumber < nextNumber)
{
array[j] = nextNumber;
array[nextIndex] = currentNumber;
Console.WriteLine(string.Join(',', array));
}
}
}
return array;
}
The complexity of the bubble sort is O(N²) as the array needs to be cycled through by two separate for loops each as such it has a quadratic complexity and that grows exponentially.
Bubble sorts are fairly inefficient and really there aren’t too many use cases to use it over other sorting algorithms. Now there are a few tricks that can be implemented to speed up a bubble sort such as reducing the number of iterations under the assumption that at least one number will be in the correct position per inner iteration. I’ve seen them mostly being used on smaller datasets that need to be sorted in either descending or ascending order. Things like the responses from an API where the items in an array need to be sorted by time or ID.
Quick Sort
The quick sort is a recursive algorithm. In it a pivot value is determined; this is used to determine how to partition or split the array of elements into values that are greater than the pivot or values that are less than the pivot. From here the process continues recursively on the sub arrays until the array has been sorted. Below is an example of the output and swaps made in a quicksort:
Below is an example of how a Quick sort can be implemented:
public static int[] QuickSort(int[] array)
{
if (array.Length <= 1)
{
return array;
}
return QuickSort(array, 0, array.Length - 1);
}
static int[] QuickSort(int[] array, int leftIndex, int rightIndex)
{
int pivot = array[leftIndex];
int currentLeftIndex = leftIndex;
int currentRightIndex = rightIndex;
//While the two sorting indices don't meet, this is to localise the sorting to a given range and essential define our partition for the Quick Sort
while (currentLeftIndex <= currentRightIndex)
{
/* Comparing if the left value is less than the pivot value
then moving the left index along until the left value >= the pivot*/
while (array[currentLeftIndex] < pivot)
{
currentLeftIndex++;
}
/* Comparing if the right value is greater than the pivot value
then moving the right index along until it is <= the pivot value */
while (array[currentRightIndex] > pivot)
{
currentRightIndex--;
}
//Swap the values when the number on the left is less then the number on the right
if (currentLeftIndex <= currentRightIndex)
{
Console.WriteLine($"Swapping {array[currentLeftIndex]} at {currentLeftIndex} with {array[currentRightIndex]} at {currentRightIndex}");
(array[currentRightIndex], array[currentLeftIndex]) = (array[currentLeftIndex], array[currentRightIndex]);
currentLeftIndex++;
currentRightIndex--;
}
}
/*If the starting index on the left side is less than the current right index then sort again. This is done using our new indices that will act as a partition of the array and localise sorting to that range.
as there are more swaps to be made before the array has been sorted */
if (leftIndex < currentRightIndex)
{
QuickSort(array, leftIndex, currentRightIndex);
}
/*If the starting index on the right side is greater than the current left index then sort again. This is done using our new indices that will act as a partition of the array and localise sorting to that range.
as there are more swaps to be made before the array has been sorted */
if (rightIndex > currentLeftIndex)
{
QuickSort(array, currentLeftIndex, rightIndex);
}
return array;
}
In terms of complexity the Quick Sort is in the worst case O(N²). However on average the performance of the Quick Sort is closer to O(N log N) where N is the total number of partitions.
Merge Sort
A merge sort is a recursive method to sort an array. It works by subdividing the original array over and over again until each subsequent array has only one element in it:
From here the arrays are merged back together sorted by swapping the elements in the joining arrays in a specified sorting direction(ascending or descending). In the example below the numbers are sorted by ascending order.
Codewise how this can be done is by first splitting the array into two parts representing the left side and right side until the array contains only one element in it. Next the arrays are rejoined back together again sorted using the sorting algorithm.
public static int[] MergeSort(int[] array, bool ascending)
{
if (array.Length == 0)
{
return [];
}
//Determine the middle index of the array
int middle = array.Length / 2;
int[] leftArray = array.Take(middle).ToArray();
int[] rightArray = array.Skip(middle).Take(array.Length - middle).ToArray();
if (leftArray.Length > 1)
{
MergeSort(leftArray, ascending);
}
if(rightArray.Length > 1)
{
MergeSort(rightArray, ascending);
}
MergeArray(leftArray, rightArray, array, ascending);
return array;
}
The merge sort algorithm is fairly straightforward. Essentially, compare the left side and right side and then if left > right (for ascending ordering) the sorted array at the current index equals the number from the left array else it contains the number from the right array. Then if not all elements from either array had been sorted they are added to the end of the sorted array first using the left elements then the right elements as this was how we were comparing them (left to right).
static void MergeArray(int[] leftArray, int[] rightArray, int[] sortedArray, bool ascending)
{
int leftIndex = 0;
int rightIndex = 0;
int sortedIndex = 0;
while (leftIndex < leftArray.Length && rightIndex < rightArray.Length)
{
if (ascending)
{
/* If sorting by ascending order we need to set the number in the sorted at
the current index to the lowest value between the two arrays */
sortedArray[sortedIndex++] = leftArray[leftIndex] <= rightArray[rightIndex] ?
leftArray[leftIndex++] :
rightArray[rightIndex++];
}
else
{
/* If sorting by descending order we need to set the number in the sorted at
the current index to the highest value between the two arrays */
sortedArray[sortedIndex++] = leftArray[leftIndex] >= rightArray[rightIndex] ?
leftArray[leftIndex++] :
rightArray[rightIndex++];
}
}
//If there are any missing elements in the swapped arrays then add them to the sorted array
while (leftIndex < leftArray.Length)
{
sortedArray[sortedIndex++] = leftArray[leftIndex++];
}
while (rightIndex < rightArray.Length)
{
sortedArray[sortedIndex++] = rightArray[rightIndex++];
}
Console.WriteLine($"{string.Join(',', leftArray)} + {string.Join(',', rightArray)} = {string.Join(',', sortedArray)}");
}
To test the algorithm I’ve implement the following in the program.cs file:
int[] numbers = new int[] { 10, 5, 16, 99, 87, 45, 20, 35,17 };
Console.WriteLine($"Input: {string.Join(',', numbers)}");
int[] sortedArray = MergeSortAlgorithim.MergeSort(numbers, true);
Console.WriteLine($"Output: {string.Join(',', sortedArray)}");
Console.ReadKey();
Console log:
In terms of complexity the merge sort algorithm is one of the more efficient ones that we had explored in the blog post.
If you want the Big O notation of it, it is O(N log N).
O represents the complexity of operations. Where N is the length of the array as we need to search the array to sort it. Log N represents the merge process per split array as we need to compare each element in each of the split arrays in the MergeArray method with each other.
Speed Comparisons
Out of pure curiosity, I decided to create a little speed test using the Stopwatch for the algorithms and code written above with the inclusion of the most common method I have seen (in recent times) for sorting an array of integers and that is using Linq Orderby.
Here is the code I wrote to test the performance of the algorithms in the post:
internal class SpeedTest
{
const string LINQ_ORDERBY = "Linq OrderBy";
const string BUBBLE_SORT = "Bubble Sort";
const string QUICKSORT = "Quicksort";
const string MERGESORT = "Mergesort";
public static void TestAlgorithims(int minElements, int maxElements)
{
int[] testArray = generateRandomArray(minElements, maxElements);
//Console.WriteLine($"Length of Array: {testArray.Length}");
Stopwatch stopWatch = new();
testLinqSort(ref testArray, stopWatch);
testBubbleSort(ref testArray, stopWatch);
testQuickSort(ref testArray, stopWatch);
testMergeSort(ref testArray, stopWatch);
}
private static void testMergeSort(ref int[] testArray, Stopwatch stopWatch)
{
stopWatch.Start();
MergeSortAlgorithim.MergeSort(testArray, false);
stopWatch.Stop();
writeResults(stopWatch, MERGESORT, testArray.Length);
stopWatch.Reset();
}
private static void testQuickSort(ref int[] testArray, Stopwatch stopWatch)
{
stopWatch.Start();
QuickSortAlgorithim.QuickSort(testArray);
stopWatch.Stop();
writeResults(stopWatch, QUICKSORT, testArray.Length);
stopWatch.Reset();
}
private static void testBubbleSort(ref int[] testArray, Stopwatch stopWatch)
{
stopWatch.Start();
BubbleSortAlgorithim.BubbleSort(testArray, false);
stopWatch.Stop();
writeResults(stopWatch, BUBBLE_SORT, testArray.Length);
stopWatch.Reset();
}
private static void testLinqSort(ref int[] testArray, Stopwatch stopWatch)
{
stopWatch.Start();
testArray.OrderBy(x => x);
stopWatch.Stop();
writeResults(stopWatch, LINQ_ORDERBY, testArray.Length);
stopWatch.Reset();
}
private static void writeResults(Stopwatch stopWatch, string sortingAlgorithim, int arrayLength)
{
TimeSpan ts = stopWatch.Elapsed;
//string elapsedTime = String.Format($"Total Nanoseconds: {ts.TotalNanoseconds}");
// Console.WriteLine($"Execution Time for {sortingAlgorithim} {elapsedTime}");
Console.WriteLine($"{sortingAlgorithim} : {arrayLength} : {ts.TotalNanoseconds}");
}
private static int[] generateRandomArray(int minElements, int maxElements)
{
Random r = new();
int count = r.Next(minElements, maxElements);
int[] testArray = new int[count];
int currentIndex = 0;
while (currentIndex < count)
{
testArray[currentIndex] = r.Next(0, maxElements);
currentIndex++;
}
return testArray;
}
}
Here is the output from the testing (the green columns are the most efficient for that number of elements). The time is measured in nanoseconds (one nanosecond = one billionth of a second). You’ll notice from the table below that there are two entries for 4 elements for each algorithm, this is because during testing I noticed that the first test for each algorithm would always have inflated times but, seem to be corrected on the second set of tests:
As you can see from the testing, Linq Orderby is much more efficient at larger datasets then smaller ones afterwards it’s the bubble sort for what seems to be datasets of less than 100. Viewing the graph overall on average Linq does come on top followed by, mergesort, quicksort and lastly bubble sort.
In conclusion, use Linq Orderby. (・‿・)ノ⌒*:・゚✧
Viewing the graph above, all other algorithms seem to be outperformed by Linq Orderby after the number of elements in the array grew beyond 100 elements.
Now is any of the performance testing a definitive guide on what to use when, no, I don’t think so but, it does seem to indicate what could work well under specific conditions.
If you’ve learned something new or found something useful feel free to leave a like, or share the post and I will see you in the next post >>
Sources:
Top comments (0)