DEV Community

Cover image for Bogosort
James Robb
James Robb

Posted on

Bogosort

In this article we will be taking a look at one of the worst sorting algorithms in existence. Bogosort is an algorithm that has a best case big O of O(n) which might make you think it's not that bad but when you see the average case big O which is O((n + 1)!) then you begin to realise the problem.

If you don't know much about big O notation it is basically just a way to calculate and measure the time and space complexity of our implementations. In this case the best case performance is dependant on the size of the collection being sorted but the average case is the size of the collection plus 1 factorial.

So, how does Bogosort actually work? Well, it takes a collection and while the collection is not in order it shuffles the whole collection on each iteration until it is sorted. Imagine this like taking a pack of cards, throwing them into the air, picking up each card randomly until all are collected and then checking if they are in order and if they aren't then repeat the process until they are in order. 😅

Clearly this is highly inneficient and actually I have found that collections over 15-20 items tend to take so long to sort that you might aswell give in. To show why this is, let's do the math on how many comparisons bogosort has to make on average for collections of 0 - 20 items in size:

Collection size Comparisons run
0 1
1 2
2 6
3 24
4 120
5 720
6 5040
7 40320
8 362880
9 3628800
10 39916800
11 479001600
12 6227020800
13 87178291200
14 1307674368000
15 20922789888000
16 355687428096000
17 6402373705728000
18 121645100408832000
19 2432902008176640000
20 51090942171709440000

Ok, so, let's take a second to comprehend these numbers, take a deep breath and repeat the immortal words of Ron Burgundy:

Well, that escalated quickly

Tests

<?php

declare(strict_types=1);

use PHPUnit\Framework\TestCase;

final class BogoSortTests extends TestCase {
  public function testSort(): void {
    $expected = [0, 1, 2, 3, 4];
    $actual = bogosort([3, 1, 2, 4, 0]);
    $this->assertEquals($expected, $actual);
  }

  public function testExceptionForInvalidInput(): void {
    $this->expectException(TypeError::class);
    bogosort("oops!");
  }
}

final class InOrderTests extends TestCase {
  public function testFalsePath(): void {
    $expected = false;
    $actual = in_order([3, 1, 2, 4, 0]);
    $this->assertEquals($expected, $actual);
  }

  public function testTruePath(): void {
    $expected = true;
    $actual = in_order([0, 1, 2, 3, 4]);
    $this->assertEquals($expected, $actual);
  }

  public function testExceptionForInvalidInput(): void {
    $this->expectException(TypeError::class);
    in_order("oops!");
  }
}
Enter fullscreen mode Exit fullscreen mode

Since I want the tests to run without too much runtime required I have kept the array to be tested at only 5 items in size. The tests are written in PHPUnit which is an awesome testing framework for PHP so if you haven't used it before I recommend it.

Anyway, effectively we just test that the sorting actually works when the bogosort function is run and that invalid input throws as expected. We also test the true, false and invalid input paths of the in_order function.

Implementation

<?php

declare(strict_types=1);

function bogosort(array $array) {
  while (!in_order($array)) {
    shuffle($array);
  }

  return $array;
}

function in_order(array $array) {
  for ($i = 1; $i < count($array); $i++) {
    if ($array[$i] < $array[$i - 1]) {
      return false;
    }
  }

  return true;
}
Enter fullscreen mode Exit fullscreen mode

When the bogosort function is ran it expects an array as input and while that array is not in order it uses PHPs default array shuffle function to pseudo-randomly shuffle the collection. This is inkeeping with the throwing a deck of cards in the air and randomly collecting them again analogy that was used in the introduction to this article. When the in_order function eventually returns true the sorted array is returned.

The in_order function itself loops over the array and checks if the current item is less than the item to the left of it and if it is returns false since the array cannot be in ascending order. If all items are in the right order then it will return true.

Conclusions

Although bogosort is a fun algorithm to ponder and see in action, it is of course one that shouldn't be used unless your aim is to seize up your system or production environment the second a middling collection comes along.

If I were to change one thing about my implementation though, it would be to use something like the Fisher-Yates shuffle algorithm over the PHP default shuffle function since the default shuffle mutates the input collection and this is something we would want to generally avoid.

The implementation of Fisher-Yates I would implement would look something like this:

<?php

declare(strict_types=1);

function fisherYates(array $array) {
  $output = $array;

  for($index = count($output) - 1; $index >= 0; $index--) { 
    $randomIndex = rand(0, $index + 1);

    if(array_key_exists($randomIndex, $output)) {
      $tmp = $output[$index]; 
      $output[$index] = $output[$randomIndex]; 
      $output[$randomIndex] = $tmp; 
    }
  }

  return $output;
}
Enter fullscreen mode Exit fullscreen mode

If we did use this implementation then the bogosort function would keep the original input array unaffected and thus preserve immutability. The new implementation of the bogosort function would end up looking like this if we were to use the fisherYates function to handle the shuffling:

function bogosort(array $array) {
  $output = $array;

  while (!in_order($output)) {
    $output = fisherYates($output);
  }

  return $output;
}
Enter fullscreen mode Exit fullscreen mode

I hope you found some value in this article and that you go forth and use your new knowledge wisely but remember:

Knowledge is power

Oldest comments (6)

Collapse
 
aminnairi profile image
Amin

Hi James, thanks for your article.

Can you explain why you need to check if the generated random index is in the output's indexes in your Fisher Yates implementation?

Collapse
 
jamesrweb profile image
James Robb • Edited

Good question Amin!

Imagine the following code:

$input = [1, 3, 4, 6, 2, 6];

print_r(
  fisherYates($input) // Example output: [1, 4, 3, 2, 6, 6]
);

There is a chance that we get a notice which starts off like this PHP Notice: Undefined offset: 6.... This becomes understandable when we go through what is happening on each line of the fisherYates implementation.

Let's look at our example code from above in action assuming we didn't have the array_key_exists conditional with a focus on the first loop iteration:

function fisherYates([1, 3, 4, 6, 2, 6]) {
  $output = [1, 3, 4, 6, 2, 6];

  for($index = 5; 5 >= 0; $index--) { 
    $randomIndex = rand(0, 6); // Let's say 6 comes back

    $tmp = $output[5]; 
    $output[5] = $output[6]; // Notice triggered here!
    $output[6] = $tmp;
  }

  return $output;
}

Clearly in this example 6 is not a valid offset in the array since arrays are indexed from 0 and thus the array in this example can only go up to offset 5. We still need the $index + 1 to be passed to the rand() function call though to increase the chances and possibilities for a swap, especially at lower indexes.

To resolve this we add the check for array_key_exists for the minority of cases where the rand() call returns an invalid offset and move to allow iteration to continue without a notice being triggered.

From my testing this case with the notice appearing happens around once in every 4 fisherYates calls if the array_key_exists conditional isn't part of the implementation but ofcourse it's hard to accurately averange since it depends on the output of the rand() call which is inherintly random. Either way with the implementation I wrote it will never happen anyway but hopefully you understand why it is there now.

Hopefully that makes sense and was helpful to you!

Collapse
 
aminnairi profile image
Amin

I see, indeed that makes totally sense, thanks for the clarification.

I still don't understand why we need to increase the index to increase our chances but there must be something I'm missing in the Fisher Yates' algorithm.

Thread Thread
 
jamesrweb profile image
James Robb • Edited

Don't worry, that's a simple one too!

Imagine the following:

$index = 1;
$randomIndex = rand(0, $index);

In this case only index 0 or 1 would be valid to come back but if we do $index + 1 instead then the possibile indexes for swapping are 0, 1 and 2.

This way we have more possible indexes that can be used to swap with the current item and more than that it also helps us to make the output a little more random during the shuffle since more indexes can be made available for the swap during each iteration.

Thread Thread
 
aminnairi profile image
Amin

Thank you. That was a great explanation.

Thread Thread
 
jamesrweb profile image
James Robb

Anytime, glad to of been able to clear that up for you 😊. Thanks for your comments and for stopping by to read the article! 📚