Jump Search

#computerscience #algorithms #data #python

Continuing our recent dive into search algorithms, we will now be covering the Jump Search algorithm. Similar to Binary Search the Jump Search algorithm is a searching algorithm for pre-sorted arrays.

The basic idea is to check fewer elements than Linear Search by jumping ahead by a fixed increment and backtracking when a larger value than the one searched for is found, running a Linear Search from there to find if the value exists. This may sound strange at first but by the end of this article, all will be clear.

This algorithm saves time and is thus faster when compared to a Linear Search since we do not look at all elements in the collection. It is however slower than Binary Search since the Jump Search algorithm runs with a best case Big O of O(√n).

Jump search has one advantage over Binary Search however and that is that we only go backwards once whereas Binary Search can jump back as many as O(Log n) times. This being the case, in situations where Binary Search seems to be running slow due to backward lookups, Jump Search might be able to help you in such a scenario to speed things up.

Tests

from main import jumpSearch;

def test_strings():
  assert jumpSearch(["Dave", "James"], "Dave") == 0;
  assert jumpSearch(["Dave", "James"], "abc") == -1;

def test_integers():
  assert jumpSearch([1,2,3,4], 3) == 2;
  assert jumpSearch([1,2,3,4], 5) == -1;

def test_floats():
  assert jumpSearch([3.14], 3.14) == 0;
  assert jumpSearch([3.141], 3.14) == -1;

def test_booleans():
  assert jumpSearch([False, True], True) == 1;
  assert jumpSearch([False, True], False) == 0;

def test_lists():
  assert jumpSearch([[3.14]], [3.14]) == 0;
  assert jumpSearch([[3.141]], [3.14]) == -1;

def test_nested_lists():
  assert jumpSearch([[3.14, [1]]], [3.14, [1]]) == 0;
  assert jumpSearch([[3.141]], [3.14]) == -1;

def test_sets():
  assert jumpSearch([set([1])], set([1])) == 0;
  assert jumpSearch([set([1])], set([1,2])) == -1;

def test_tuples():
  assert jumpSearch([tuple([1])], tuple([1])) == 0;
  assert jumpSearch([tuple([1])], tuple([1, 2])) == -1;

You may notice that these are similar to the tests for Linear Search and Binary Search that we wrote in previous articles and this is on purpose.

The difference in tests comes, as with binary search, in the comparison operators we are using. In our case the tests for dict were removed and the test for boolean vs None have also been removed. These are not possible to have due to these not being able to compare to one another using the less than (<) operator which is used in the Jump Search algorithm implementation.

You can run and explore through the tests via the repl below:

Implementation

from math import sqrt;
from typing import List, TypeVar;
from operator import eq as deep_equals, ge as greater_than_or_equal, ne as not_equal, lt as less_than;

T = TypeVar('T')

def linearSearch(collection: List[T], value: T) -> int:
  for index, item in enumerate(collection):
    if deep_equals(item, value): return index;
  return -1;

def jumpSearch(collection: List[T], value: T) -> int: 
  n = len(collection);
  step = sqrt(n);
  prev = 0;

  while less_than(collection[int(min(step, n) - 1)], value):
    prev = step;
    step *= 2;
    if greater_than_or_equal(prev, n): return -1;

  foundIndex = linearSearch(collection[int(prev):], value);
  if not_equal(foundIndex, -1): foundIndex += int(prev);
  return foundIndex;

Note 1: Since Jump Search utilises Linear Search, we will use the implementation we built previously in this article series for that.

For the jumpSearch function, we require a collection and a value to be provided and will return an int (integer) representing the index of the value if it is found and -1 if it is not. The value itself is what we will be looking for in the collection.

Note 2: The collection should be a List of items matching the same type as the value we are searching for and thus, if the value is of type str (string) then we expect the collection to be a List[str] for example.

Whence we enter the jumpSearch function body, we declare 3 variables:

The variable n to represent the amount of items in the collection
The variable step to be the value of how far we should jump each time, this is defined as the square root of n
The variable prev to represent the last index visited when the last step was taken

From here we run a while loop as long as the current item at int(min(step, n) - 1) is less than the value we are searching for. You may be wondering why we use int(min(step, n) - 1). Let's break it down from the inside out. Let's use the first iteration of the while loop for a Pseudocode example:

collection = [1, 2, 3];
value = 2;
n = len(collection); # 3
step = sqrt(n); # 1.73205080757
prev = 0; # 0
nextIndex = min(step, n) - 1; # 0.73205080757
indexAsInteger = int(nextIndex) # 0
itemAtIndex = collection[indexAsInteger]; # 1
condition = less_than(itemAtIndex, value); # True
while condition is True: # now we run the loop since the condition is True

That is the essential breakdown of the while loops condition and hopefully that makes sense as to why we run int(min(step, n) - 1) in the implementation now.

Inside the loop we set the prev to equal the current step and then add the step to itself for the next iteration of the while loop. During the loop execution we check that the prev variable is not greater than or equal to the length of the collection which we stored in the variable n. If it is, we haven't found the value in the collection and return -1 to represent that it wasn't found.

If we exit the loop and didn't return -1, we must have the value still in the collection and so we then run the linearSearch function with a new array created from a slice of the collection running from the prev step index to the end of the collection and pass in the value we are looking for also. If the linearSearch run doesn't return -1, we have found the value and all we need to do is add the prev step index value to the foundIndex.

Note 3: We do this because if -1 wasn't returned and thus the value was found. If you remember we took a slice of the collection beginning from the prev step to the end of the collection for the linearSearch to run over and thus if the value was found by linearSearch at index 1 then it would be in the collection at index prev + 1.

This works well for us since we return the foundIndex either way and so if the value is not found we will just return the -1 anyway.

Conclusions

Jump Search is useful in places where Binary Search is potentially slow due to backwards lookups. It is faster than a regular Linear Search and tries to maximise efficiency by using jumps through the sorted collection.

I hope you found some value in this article and if you have any questions or comments, feel free to write them below!

DEV Community

Jump Search

Tests

Implementation

Conclusions

Latest comments (0)

Read next

Bubble Sort Algorithm in JavaScript

What is RAG, and why should I care?

CPP VS Python Benchmark Testing

Python Data Engineering: Comprehensive Workflow for Data Modeling, Analytics with DuckDB