Binary search is a blazingly-fast search algorithm that has a worst case Big O of `Ξ(log n)`

, the average case is `O(log n)`

and the best case is `O(1)`

. In short it's a search algorithm that is about as fast as you can get.

Binary Search utilises divide and conquer to reduce much of the potential runtime. In fact, it is a lot like merge sort in how it functions but of course one algorithm is used to sort data and this one is used to find data.

For the algorithm to run properly, the data must be pre-sorted and be have elements of the same or similar types held within. So if your collection contains strings and integers, it won't work, you must have elements using a consistent data type withinin your collection.

When the binary search is executed, it will take an input `collection`

and a `value`

to find and if the value exists we will return the index of that `value`

within the `collection`

and if it doesn't we will return `-1`

out of convention to represent that the `value`

doesn't exist in the `collection`

after all.

## Tests

```
from main import binarySearch;
def test_strings():
assert binarySearch(["Dave", "James"], "Dave") == 0;
assert binarySearch(["Dave", "James"], "abc") == -1;
def test_integers():
assert binarySearch([1, 2, 3,4], 3) == 2;
assert binarySearch([1, 2, 3, 4], 5) == -1;
def test_floats():
assert binarySearch([3.14], 3.14) == 0;
assert binarySearch([3.141], 3.14) == -1;
def test_booleans():
assert binarySearch([False, True], True) == 1;
assert binarySearch([True], False) == -1;
def test_lists():
assert binarySearch([[3.14]], [3.14]) == 0;
assert binarySearch([[3.141]], [3.14]) == -1;
def test_nested_lists():
assert binarySearch([[3.14, [1]]], [3.14, [1]]) == 0;
assert binarySearch([[3.141]], [3.14]) == -1;
def test_sets():
assert binarySearch([set([1])], set([1])) == 0;
assert binarySearch([set([1])], set([1,2])) == -1;
def test_tuples():
assert binarySearch([tuple([0,2]), tuple([1])], tuple([1])) == 1;
assert binarySearch([tuple([1])], tuple([1, 2])) == -1;
```

In python, as with many other languages, some elements can't compare with one another in a non-hacky manner, even if they are both the same type. An example of this is that types such as `dict`

cannot run comparisons like `{'a': 1} > {'a': 1}`

.

For binary search we do need to use comparison operators like that though and so I represented only the types that are comparible with the `==`

, `>`

, `>=`

, `<`

and `<=`

operators that binary search utilises.

Here is a repl to run and play with the tests:

## Implementation

There are a few ways to skin this particular cat. For that reason, I will show you two implementations for the Binary Search algorithm. One will be using an imperative approach and another using Recursion.

Both implementations expect uniform data and are typed for this reason. We expect data to contain consistent types in the input collection so that the search is guaranteed to be able to compare the items in the collection against one another. If we didn't do this we may get unexpected errors such as this:

`TypeError: '>' not supported between instances of 'list' and 'str'`

Just remember to normalise and sort your data before you begin using the implementations that will be outlined below or if you decide to build your own implementation in the future as these traits are a requirement of the algorithm.

### The imperative approach

Imperative programming is a programming paradigm which describes *how* a programme should run. This paradigm uses statements to alter a programmes state and generate values.

```
from typing import List, TypeVar;
from operator import eq as deep_equals, gt as greater_than, ne as not_equal, le as less_than_or_equal;
from math import floor;
T = TypeVar('T');
def binarySearch(collection: List[T], value: T) -> int:
start = 0;
end = len(collection) - 1;
middle = floor((start + end) / 2);
while(not_equal(collection[middle], value) and less_than_or_equal(start, end)):
if greater_than(value, collection[middle]): start = middle + 1;
else: end = middle - 1;
middle = floor((start + end) / 2);
return middle if deep_equals(collection[middle], value) else -1;
```

We begin by importing some helpers from the python standard library which has a whole swath of helper functions for us to use in our code.

A point worth noting is the use of

`TypeVar`

from the`typing`

module to align our input types for the`collection`

and`value`

.All this means is that when we recieve the collection, it should have items of consistent type

`T`

and the`value`

we want to search for should also be of type`T`

.So if we have the

`collection`

of type`List[str]`

for example then value must also be an`str`

but instead of hard-coding this, we let`TypeVar`

do the work for us.

Our `binarySearch`

function will recieve a `collection`

to search within and a `value`

to look for in that `collection`

. If the `value`

is found then we return it's location (index) in the `collection`

.

To find the index we setup the `start`

, `end`

and `middle`

indexes of the `collection`

. While the `value`

is not equal to the value in the `middle`

of the `collection`

and the `start`

index is less than or equal to the `end`

index, we check if the `value`

is bigger than the `middle`

value, if it is, we set the `start`

index to begin on the next iteration from the `middle`

, otherwise we set the end to be the `middle`

.

We do this because if the

`value`

is greater than the value in the`middle`

of the`collection`

, the`value`

must be on the right half of the collection and vice versa for the opposite case.

Either way the condition falls we will then reset the `middle`

index to be in between the new `start`

and `end`

indexes. We continue with this process until the condition on the while loop is no longer true.

When the while loop exits, we check if the `middle`

indexes value is the `value`

we are looking for. If it is, we return the `middle`

index and if it isn't then the `value`

wasn't in the input collection and we return `-1`

to represent this case.

### The recursive approach

```
from typing import List, TypeVar, Union;
from operator import eq as deep_equals, gt as greater_than, ne as not_equal, le as less_than_or_equal, ge as greater_than_or_equal;
from math import floor;
T = TypeVar('T');
def binarySearch(collection: List[T], value: T, start: int = 0, end: Union[int, None] = None) -> int:
if deep_equals(end, None):
end = len(collection) - 1;
if greater_than_or_equal(end, start):
middle = floor((start + end) / 2);
if deep_equals(collection[middle], value):
return middle;
if greater_than(collection[middle], value):
return binarySearch(collection, value, start, middle - 1);
return binarySearch(collection, value, middle + 1, end);
return -1;
```

Exactly like the imperative approach outlined above, we begin by importing some helpers from the python standard library which has a whole swath of helper functions for us to use in our code.

A point worth noting is the use of

`TypeVar`

from the`typing`

module to align our input types for the`collection`

and`value`

.All this means is that when we recieve the collection, it should have items of consistent type

`T`

and the`value`

we want to search for should also be of type`T`

.So if we have the

`collection`

of type`List[str]`

for example then value must also be an`str`

but instead of hard-coding this, we let`TypeVar`

do the work for us.

The recursive version naturally calls itself again and again and requires a couple more parameters to be provided on each call than the imperative approach does. Namely we need to know where to the `start`

index and `end`

index are for the current recursive call of the `binarySearch`

function.

Initially the `start`

will be at `0`

which makes sense since we want to begin at the start of the `collection`

and the `end`

initially will be set to `None`

but will be set to the length of the `collection`

mins one once the function first runs.

Note: We set

`end`

to initialise itself with a value of`None`

because in python you can't reference other parameters to use as default values of other parameters. To get around this, we set it's type to`Union[None, int]`

which just means this can be of type`None`

or`int`

. Once the function first executes we set the`end`

value to the length of the`collection`

mins one because if a`collection`

contains 5 items the last index would be 4 since lists/arrays are always 0 indexed.

If the `end`

index is greater than or equal to the starting index we calculate where the `middle`

index of the current `collection`

is and consider 3 cases:

- If the item at the
`middle`

index equals the value we are looking for, return the`middle`

index - If the item at the
`middle`

index is greater than the value we are looking for then run a recursive call to`binarySearch`

to check the left half of the`collection`

for the`value`

- If the item at the
`middle`

index is less than the value we are looking for then run a recursive call to`binarySearch`

to check the right half of the`collection`

for the`value`

If none of these recursive calls manage to find the `value`

in the `collection`

then we return `-1`

by default.

## Conclusions

The Binary Search algorithm is always a fantastic choice to find values within sorted and type-consistent data sets.

Personally I try to always keep data within applications as uniform as possible and thus, I find myself using this algorithm relatively often in my day to day work. I highly recommend looking into it further and seeing how it can benefit you and your work too!

I hope you learned something new today and if you have any questions, ideas or comments, feel free to comment them below!

## Latest comments (0)