Continuing on with my series in sorting algorithms, next I'd like to talk about another crowd favorite: merge sort!
Merge sort is another one of the algorithms that is high on the interview study sheet, so I thought it'd be great to go over this one for anyone out there interviewing at this time.
Overview
If you've read the first post in this series, you'll be able to recognize that merge sort shares quite a few similarities with quick sort. Merge sort also follows the "divide and conquer" strategy and involves recursion.
However, merge sort is not an in-place sorting algorithm. This means that it does not take a constant amount of memory and therefore takes O(n) space. The amount of memory it takes is proportional to the amount of elements in the array.
Unlike the space complexity, the time complexity is always O(n log n), in average case run time as well as worst case. This is actually quite fast compared to quick sort.
Theory/Break Down
Alright, let's break down the basic idea behind merge sort.
Essentially, merge sort has two main functionalities going on. The original array is being broken down by halves until only one value remains in each half and then the values are sorted and merged back together.
First, we'll go over the breaking down of the original array. The array is split into two halves, and these halves are split into two more halves, and this continues recursively until we reach one value in each array. In theory, if there is only one value remaining, it has to be sorted right?
Upon merging, the value that is smaller gets chosen to be placed first into the new, correctly sorted array. This continues until both of the sorted halves are empty and all of the values have been placed into the new array. In the end, the original array can be pieced back together with correctly sorted sub arrays (or the smaller halves).
The merging function is being called multiple times, but you can see how it works at each level with the diagrams below. We are starting with the final merge, the two original sorted halves are now going to be merged together into the final original array. The two sorted arrays take turns checking which value is smaller. When they've decided, that value is put into the original array, replacing its previous value.
I've labeled the diagram below for clarity as we continue on. You can see that the left pointer begins at index 0 of the left array, the right pointer begins at index 0 of the right array, and our array pointer begins at the 0 index of our final array.
First, we will check left pointer value against the right pointer value. Here 1 is smaller than 7, so 1 will be put in the 0 index of the final array. Then we will increment the left pointer value and the array pointer value.
Next, we'll compare the left pointer value to the right pointer value again, and again, 2 is smaller than 10 so it will be put into the 1 index in the final array.
This continues until one of the arrays are empty or the array is correctly sorted.
Actual Code
def merge_sort(array)
if array.length <= 1
return array
end
array_size = array.length
middle = (array.length / 2).round
left_side = array[0...middle]
right_side = array[middle...array_size]
sorted_left = merge_sort(left_side)
sorted_right = merge_sort(right_side)
merge(array, sorted_left, sorted_right)
return array
end
def merge(array, sorted_left, sorted_right)
left_size = sorted_left.length
right_size = sorted_right.length
array_pointer = 0
left_pointer = 0
right_pointer = 0
while left_pointer < left_size && right_pointer < right_size
if sorted_left[left_pointer] < sorted_right[right_pointer]
array[array_pointer] = sorted_left[left_pointer]
left_pointer+=1
else
array[array_pointer] = sorted_right[right_pointer]
right_pointer+=1
end
array_pointer+=1
end
while left_pointer < left_size
array[array_pointer] = sorted_left[left_pointer]
left_pointer+=1
array_pointer+=1
end
while right_pointer < right_size
array[array_pointer] = sorted_right[right_pointer]
right_pointer+=1
array_pointer+=1
end
return array
end
In the code above, the majority of the merge sort function is splitting up the array into the two halves and then passing them into the merge function. Originally I had the merge function named sort, but I felt that it should be more correctly named merge as that is what is taking place inside the function. The array that is returned from the merge function is passed back up via recursion to get our final, sorted array.
This is just my implementation, so feel free to play around with it and leave your implementation in the comments! It's amazing and I love to see everyone's different ideas and knowledge come together.
Resources
Here are a few videos behind the theory and implementation of merge sort in case you'd like more explanation:
Also a bonus new resource I found for recursion:
I hope that this could be helpful for you and I'll catch you for the next installment of my sorting algorithm series! Happy coding 😊
Top comments (2)
Hi Megan,
thanks for sharing this article, this is great work.
I'm not sure if it's more readable, but I think you can merge (no pun intended :)) the three while as they look a lot similar in your merge method. I tried the following refactor and it seems to work:
I love it! Thanks a bunch for that Matthieu!
And I appreciate you following along with the series as well. 😊