Continuing our recent dive into search algorithms, we will now be covering the Jump Search algorithm. Similar to Binary Search the Jump Search algorithm is a searching algorithm for pre-sorted arrays.
The basic idea is to check fewer elements than Linear Search by jumping ahead by a fixed increment and backtracking when a larger value than the one searched for is found, running a Linear Search from there to find if the value exists. This may sound strange at first but by the end of this article, all will be clear.
This algorithm saves time and is thus faster when compared to a Linear Search since we do not look at all elements in the collection. It is however slower than Binary Search since the Jump Search algorithm runs with a best case Big O of
Jump search has one advantage over Binary Search however and that is that we only go backwards once whereas Binary Search can jump back as many as
O(Log n) times. This being the case, in situations where Binary Search seems to be running slow due to backward lookups, Jump Search might be able to help you in such a scenario to speed things up.
from main import jumpSearch; def test_strings(): assert jumpSearch(["Dave", "James"], "Dave") == 0; assert jumpSearch(["Dave", "James"], "abc") == -1; def test_integers(): assert jumpSearch([1,2,3,4], 3) == 2; assert jumpSearch([1,2,3,4], 5) == -1; def test_floats(): assert jumpSearch([3.14], 3.14) == 0; assert jumpSearch([3.141], 3.14) == -1; def test_booleans(): assert jumpSearch([False, True], True) == 1; assert jumpSearch([False, True], False) == 0; def test_lists(): assert jumpSearch([[3.14]], [3.14]) == 0; assert jumpSearch([[3.141]], [3.14]) == -1; def test_nested_lists(): assert jumpSearch([[3.14, ]], [3.14, ]) == 0; assert jumpSearch([[3.141]], [3.14]) == -1; def test_sets(): assert jumpSearch([set()], set()) == 0; assert jumpSearch([set()], set([1,2])) == -1; def test_tuples(): assert jumpSearch([tuple()], tuple()) == 0; assert jumpSearch([tuple()], tuple([1, 2])) == -1;
The difference in tests comes, as with binary search, in the comparison operators we are using. In our case the tests for
dict were removed and the test for
None have also been removed. These are not possible to have due to these not being able to compare to one another using the less than (
<) operator which is used in the Jump Search algorithm implementation.
You can run and explore through the tests via the repl below:
from math import sqrt; from typing import List, TypeVar; from operator import eq as deep_equals, ge as greater_than_or_equal, ne as not_equal, lt as less_than; T = TypeVar('T') def linearSearch(collection: List[T], value: T) -> int: for index, item in enumerate(collection): if deep_equals(item, value): return index; return -1; def jumpSearch(collection: List[T], value: T) -> int: n = len(collection); step = sqrt(n); prev = 0; while less_than(collection[int(min(step, n) - 1)], value): prev = step; step *= 2; if greater_than_or_equal(prev, n): return -1; foundIndex = linearSearch(collection[int(prev):], value); if not_equal(foundIndex, -1): foundIndex += int(prev); return foundIndex;
Note 1: Since Jump Search utilises Linear Search, we will use the implementation we built previously in this article series for that.
jumpSearch function, we require a
collection and a
value to be provided and will return an
int (integer) representing the index of the
value if it is found and
-1 if it is not. The
value itself is what we will be looking for in the
Note 2: The
collectionshould be a
Listof items matching the same type as the
valuewe are searching for and thus, if the
valueis of type
str(string) then we expect the
collectionto be a
Whence we enter the
jumpSearch function body, we declare 3 variables:
- The variable
nto represent the amount of items in the
- The variable
stepto be the value of how far we should jump each time, this is defined as the square root of
- The variable
prevto represent the last index visited when the last
From here we run a
while loop as long as the current item at
int(min(step, n) - 1) is less than the
value we are searching for. You may be wondering why we use
int(min(step, n) - 1). Let's break it down from the inside out. Let's use the first iteration of the
while loop for a Pseudocode example:
collection = [1, 2, 3]; value = 2; n = len(collection); # 3 step = sqrt(n); # 1.73205080757 prev = 0; # 0 nextIndex = min(step, n) - 1; # 0.73205080757 indexAsInteger = int(nextIndex) # 0 itemAtIndex = collection[indexAsInteger]; # 1 condition = less_than(itemAtIndex, value); # True while condition is True: # now we run the loop since the condition is True
That is the essential breakdown of the
while loops condition and hopefully that makes sense as to why we run
int(min(step, n) - 1) in the implementation now.
Inside the loop we set the
prev to equal the current
step and then add the
step to itself for the next iteration of the
while loop. During the loop execution we check that the
prev variable is not greater than or equal to the length of the
collection which we stored in the variable
n. If it is, we haven't found the
value in the
collection and return
-1 to represent that it wasn't found.
If we exit the loop and didn't return
-1, we must have the
value still in the
collection and so we then run the
linearSearch function with a new array created from a slice of the
collection running from the
prev step index to the end of the collection and pass in the
value we are looking for also. If the
linearSearch run doesn't return
-1, we have found the
value and all we need to do is add the
prev step index value to the
Note 3: We do this because if
-1wasn't returned and thus the
valuewas found. If you remember we took a slice of the
collectionbeginning from the
prevstep to the end of the
linearSearchto run over and thus if the value was found by
1then it would be in the
prev + 1.
This works well for us since we return the
foundIndex either way and so if the
value is not found we will just return the
Jump Search is useful in places where Binary Search is potentially slow due to backwards lookups. It is faster than a regular Linear Search and tries to maximise efficiency by using jumps through the sorted collection.
I hope you found some value in this article and if you have any questions or comments, feel free to write them below!