# How to find an impostor binary search implementation in Python! :-)

Recently I have been reimplementing the C++ STL algorithms in Python (here). Along the way I ran into a classic problem: how do you test an implementation of the binary search algorithm? Let us write some tests first.
You can write the tests with any Python testing framework, such as `pytest` or `unittest`. Here I am using `unittest`, which is part of the Python standard library.

``````python
import random
import unittest

from binary_search import binary_search


class BinarySearchTestCase(unittest.TestCase):

    def test_empty(self):
        # Nothing can be found in an empty list.
        arr = []
        self.assertFalse(binary_search(arr, 5))

    def test_true(self):
        arr = [1, 2, 3, 4, 5]
        self.assertTrue(binary_search(arr, 4))

    def test_false(self):
        arr = [1, 2, 3, 4, 5]
        self.assertFalse(binary_search(arr, 99))

    def test_on_random_list_false(self):
        # 999 lies outside the generated range, so it can never be present.
        arr = [random.randint(-500, 500) for _ in range(500)]
        arr.sort()
        self.assertFalse(binary_search(arr, 999))


if __name__ == '__main__':
    unittest.main()
``````

The test cases fall into three groups:

• Searching for any element in an empty list should return `False`.
• Searching for an element present in the list should return `True`.
• Searching for an element not present in the list should return `False`.

The above test cases seem reasonable. To make them more robust, we can use the `hypothesis` library, a property-based testing library for Python inspired by Haskell's QuickCheck. You can install it with `pip install hypothesis`.
The tests using hypothesis are as below:
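The original listing is missing at this point. The sketch below takes the three property tests from the complete test code at the end of this post. The `binary_search` stand-in built on `bisect_left` is an assumption added only so the snippet runs on its own; the post imports `binary_search` from its own module instead.

```python
import random
import unittest
from bisect import bisect_left

from hypothesis import given
import hypothesis.strategies as st


def binary_search(arr, target):
    # Stand-in (assumption) so this snippet is self-contained; the post
    # imports binary_search from its own binary_search module.
    i = bisect_left(arr, target)
    return i < len(arr) and arr[i] == target


class BinarySearchTestCase(unittest.TestCase):

    @given(st.integers())
    def test_empty(self, target):
        # Any target searched in an empty list must not be found.
        self.assertFalse(binary_search([], target))

    @given(st.lists(st.integers(), min_size=1))
    def test_binary_search_true(self, arr):
        # An element picked from the list must be found.
        arr.sort()
        target = random.choice(arr)
        self.assertTrue(binary_search(arr, target))

    @given(st.lists(st.integers(), min_size=1))
    def test_binary_search_false(self, arr):
        # One more than the maximum can never be in the list.
        arr.sort()
        target = arr[-1] + 1
        self.assertFalse(binary_search(arr, target))
```

Each `@given` decorator tells Hypothesis which kind of input to generate for the test method it wraps.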

Hypothesis automatically generates many different test cases from the given specification, which in this case is a list of integers.

Now the fun part is the binary search code:
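The listing itself did not survive here. Going by the description that follows (a linear search that looks at every element), the impostor would look roughly like this sketch. The exact body is an assumption; only the function name `binary_search` comes from the test code.

```python
def binary_search(arr, target):
    # Impostor: despite the name, this walks the list from front to back
    # and never halves the search space.
    for x in arr:
        if x == target:
            return True
    return False


print(binary_search([1, 2, 3, 4, 5], 4))   # True
print(binary_search([1, 2, 3, 4, 5], 99))  # False
print(binary_search([], 5))                # False
```

It returns the right answer for every input, which is exactly why tests that only check the answer cannot tell it apart from the real thing.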

Let us run the test now.

``````
$ python test.py
...
----------------------------------------------------------------------
Ran 3 tests in 0.380s

OK
``````

The above code is nowhere near a binary search implementation, yet it passes all the tests! A linear search algorithm passes the binary search test cases! What?? Now how can we rule out this impostor code?

The problem with these tests is that they don't use any property specific to the binary search algorithm; they only check the properties of a generic searching algorithm.

We know one property of binary search: it inspects at most floor(log2(n)) + 1 items, because it discards half of the search space at every iteration.
Here `n` is the total number of elements in the array.
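To make the bound concrete, here is a small sketch that evaluates it for a few list sizes. The helper name `max_probes` is made up for this illustration.

```python
import math


def max_probes(n):
    """Upper bound on the elements a binary search reads in n sorted items."""
    if n < 1:
        return 0  # nothing to read in an empty list
    return int(math.log2(n)) + 1


for n in (1, 3, 8, 1000):
    print(n, max_probes(n))
# 1 1
# 3 2
# 8 4
# 1000 10
```

A linear search may read all 1000 elements of the last list, while a binary search never needs more than 10.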

So we write a class which behaves like a list by implementing the `__iter__`, `__getitem__` and `__len__` special methods.

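``````python
class Node:
    """List-like wrapper that counts how many elements are read."""

    def __init__(self, arr):
        self.arr = arr
        self.count = 0

    def __iter__(self):
        # Each element handed out during iteration counts as one read.
        for x in self.arr:
            self.count += 1
            yield x

    def __getitem__(self, key):
        # Each indexed access counts as one read.
        self.count += 1
        return self.arr[key]

    def __len__(self):
        # Asking for the length does not touch any element.
        return len(self.arr)
``````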

We now have a `Node` class which is similar to the `list` class but additionally has a `count` attribute, which is incremented every time an element is accessed. This lets us keep track of how many elements the binary search code checks.
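A quick way to see the counter at work is to wrap a small list and read from it. The class is repeated here only so the snippet runs on its own, and the sample values are arbitrary.

```python
class Node:
    def __init__(self, arr):
        self.arr = arr
        self.count = 0

    def __iter__(self):
        for x in self.arr:
            self.count += 1
            yield x

    def __getitem__(self, key):
        self.count += 1
        return self.arr[key]

    def __len__(self):
        return len(self.arr)


node = Node([10, 20, 30, 40])

node[0]
node[3]
print(node.count)  # 2: two indexed reads

len(node)
print(node.count)  # 2: len() does not read any element

for _ in node:
    pass
print(node.count)  # 6: a full iteration reads all four elements
```

Indexing and iteration both bump the counter, so neither an index-based nor a `for`-loop-based search can look at an element unnoticed.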

In Python, there is a saying: if it walks like a duck and quacks like a duck, it is a duck. That is why `binary_search` accepts a `Node` wherever it expects a list.

We add this extra test case using the above `Node` class.

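``````python
import math

class BinarySearchTestCase(unittest.TestCase):
    # ... earlier tests unchanged ...

    @given(st.lists(st.integers(), min_size=1))
    def test_binary_search_with_node(self, arr):
        arr.sort()
        target = arr[-1]  # search for the last (largest) element
        max_count = int(math.log2(len(arr))) + 1
        arr = Node(arr)  # wrap the list so element reads are counted
        ans = binary_search(arr, target)
        self.assertTrue(ans)
        self.assertTrue(arr.count <= max_count)
``````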

Let us run the tests again now:

``````
$ python test.py
..Falsifying example: test_binary_search_with_node(
    self=<__main__.BinarySearchTestCase testMethod=test_binary_search_with_node>,
    arr=[0, 0, 1],
)
F.
======================================================================
FAIL: test_binary_search_with_node (__main__.BinarySearchTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "code.py", line 48, in test_binary_search_with_node
    def test_binary_search_with_node(self, arr):
  File "/home/tmp/venv/lib/python3.6/site-packages/hypothesis/core.py", line 1162, in wrapped_test
    raise the_error_hypothesis_found
  File "code.py", line 54, in test_binary_search_with_node
    self.assertTrue(arr.count <= math.log2(len(arr)) + 1)
AssertionError: False is not true

---------------------------------------------------------------------------
Ran 4 tests in 0.435s

FAILED (failures=1)
``````

The test fails because the impostor checks each and every element once, which a binary search never does: it discards half of the search space at every iteration. Hypothesis also reports the minimal test case that failed, which here is an array of size 3. For that size the bound is int(log2(3)) + 1 = 2 reads, but a linear search needs 3 reads to reach the last element.
Impostor code found!
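For contrast, a genuine binary search should stay within the bound. The following is a minimal sketch of a textbook iterative version, not the implementation from the author's repository. The `worst_case_reads` helper is hypothetical and added only to measure the counter, and `Node` is repeated so the snippet is self-contained.

```python
import math


class Node:
    def __init__(self, arr):
        self.arr = arr
        self.count = 0

    def __iter__(self):
        for x in self.arr:
            self.count += 1
            yield x

    def __getitem__(self, key):
        self.count += 1
        return self.arr[key]

    def __len__(self):
        return len(self.arr)


def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        value = arr[mid]  # exactly one element read per iteration
        if value == target:
            return True
        if value < target:
            lo = mid + 1  # discard the left half
        else:
            hi = mid - 1  # discard the right half
    return False


def worst_case_reads(n):
    # Hypothetical helper: largest read count over every present target
    # and two absent ones (-1 and n) in the sorted list [0, 1, ..., n - 1].
    worst = 0
    for target in range(-1, n + 1):
        wrapped = Node(list(range(n)))
        binary_search(wrapped, target)
        worst = max(worst, wrapped.count)
    return worst


for n in (1, 3, 8, 1000):
    print(n, worst_case_reads(n), int(math.log2(n)) + 1)
```

Storing `arr[mid]` in a local variable matters here: reading it twice per iteration would double the count and fail the test even though the algorithm is a real binary search.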

### Complete test code

``````python
import random
import math
import unittest

from hypothesis import given
import hypothesis.strategies as st

from binary_search import binary_search


class Node:
    """List-like wrapper that counts how many elements are read."""

    def __init__(self, arr):
        self.arr = arr
        self.count = 0

    def __iter__(self):
        for x in self.arr:
            self.count += 1
            yield x

    def __getitem__(self, key):
        self.count += 1
        return self.arr[key]

    def __len__(self):
        return len(self.arr)


class BinarySearchTestCase(unittest.TestCase):

    @given(st.integers())
    def test_empty(self, target):
        arr = []
        arr.sort()
        self.assertFalse(binary_search(arr, target))

    @given(st.lists(st.integers(), min_size=1))
    def test_binary_search_true(self, arr):
        arr.sort()
        target = random.choice(arr)
        self.assertTrue(binary_search(arr, target))

    @given(st.lists(st.integers(), min_size=1))
    def test_binary_search_false(self, arr):
        arr.sort()
        target = arr[-1] + 1  # larger than every element, so never present
        self.assertFalse(binary_search(arr, target))

    @given(st.lists(st.integers(), min_size=1))
    def test_binary_search_with_node(self, arr):
        arr.sort()
        target = arr[-1]
        arr = Node(arr)  # wrap the list so element reads are counted
        max_count = int(math.log2(len(arr))) + 1
        ans = binary_search(arr, target)
        self.assertTrue(ans)
        self.assertTrue(arr.count <= max_count)


if __name__ == '__main__':
    unittest.main()
``````

### Where to go from here?

• Check out this awesome talk by John Hughes, Testing the hard stuff and staying sane, where he talks about how he used `QuickCheck` to find and fix bugs for different companies.
• Check out this talk on `hypothesis`, a `QuickCheck`-style library for Python, by Zac Hatfield-Dodds.
• Read more on the `unittest` framework here.

Happy learning!