Backtracking can be confusing at times and solving problems using backtracking can be even a tedious job. Through this post let's decode how to write backtracking algorithms.
The most important prerequisite of backtracking is recursion. Though beginners usually ignore this but through this post let's change the perspective. Let's learn about recursion.
In layman's terms, recursion is a function calling itself from itself with different parameters. Such a function that calls itself is called recursion function.
def function(a): b = # perform calculations or operations on a function(b)
But we don't want it to go on forever, so we will introduce a base condition to stop recursion to go forever. This is what we call a base case. A recursion function can have multiple base cases.
def function(a): if a == # some base value return #some value b = # perform calculations or operations on a function(b)
Let's discuss Staircase problem. Here is the link. In this problem, we require to find the no. of ways to climb the top of the staircase and we can either climb 1 or 2 steps at a time. Let's take the top of the staircase 4 steps and solve this problem using recursion.
So we have the following ways to climb stairs:
- Ground-1 step→ A -1 step→ B -1 step→ C -1 step→ D
- Ground-2 step→ B -1 step→ C -1 step→ D
- Ground-1 step→ A -2 step→ C -1 step→ D
- Ground-2 step→ B -2 step→ D
- Ground-1 step→ A -1 step→ B -2 step→ D
For this problem, we need to think of the base cases. The base cases are the cases in which we are aware of the output of the function for certain input.
In recursion, we try to break a problem into subproblems and when we hit the base cases, we return the output to that instance of the function which called the current instance of the function. We will discuss this in the next section.
Coming back to the problem, we know that if the top of the staircases is 1, the no. of ways will always be 1 and if the top of the stack is 2, the no. of ways will always be 2. Here we get are 2 base cases and half of our job is over.
# n: no. of staircase def staircase(n): if n == 1: return 1 if n == 2: return 2 staircase()
The next thing is to find the new value of n with keeping in mind we need to covert this problem into sub problem. So, at a time we can either climb 1 step or 2 steps. If I climb 1 step the no. of reaming steps will be n - 1 and if I climb 2 step the no. of reaming steps will be n - 2. We can climb n - 1 and n - 2 again either taking 1 or 2 steps at a time. So if I already climbed 1 stair we will be left with n - 3 steps if I take 1 more step and n - 4 if I take 2 steps. This goes on till we n as 1 or 2.
We deduced problem into sub problems. Now the staircase function will look like this.
# n: no. of staircase def staircase(n): if n == 1: return 1 if n == 2: return 2 result = staircase(n - 1) + staircase(n - 2) return result
Each instance of the function will return the value of the result to its previous instance and when we visit the first instance the function will return the final value of result to its caller function.
2. Call stack
We just need to focus on stack here. Every process maintains a stack in the memory that is also known as call stack. Every function that need to be executed or called during execution of the program is pushed in this stack and when it completes is execution, it is poped out of the call stack. for example, if we have following code.
def another_function(a, b): return a + b def function(a, b): a = b % a b += 1 return another_function(a, b) if__name__== "__main__": function(10, 20)
So, our call stack will look like this:
The main function which is called first will be in the last of the stack and the another_function which is recently called will be at the top.
Let's sneak peek into the call stack for the recursion function staircase. We know different sub-problems for the staircase problem. We have n stairs, so the stack will look like:
Then we again call staircase with parameters n - 1 and the function with the parameter will subsequently call the function with parameter n - 2. When we reach the base case, the functions on the top of the stack will get executed and poped outh of the stack.
Now, each instance of recursive function in the stack can be viewed as a choice we make.
staircase(n - 2) means we choose to take 2 steps and
staircase(n - 1)means we choose to take 1 step. Call stack maintains all of the states or instances of function with the parameters as the choices we make before calling it. Here is an example of the call stack in between the execution of the Sudoku solver program.
1. The Big Picture
In the previous section, we learned about recursion and call stack. Now let's use them to write a backtracking algorithm. We start with 3 basic rules.
Make the choices
In this part, we just make a choice from the choice space without caring about it's validity. We set the choice space based on the problem we are solving.
- For N Queens Problem where we have to place queens in n x n board such that no two queens attack each other, our choices will be every square coordinate in the chess board.
- For Sudoku Solver where we have to solve a Sudoku, choices are number 1 - 9 that need to be filled in the empty cell.
- For Maze Problem where we have to traverse a path to reach the goal, choices are the coordinates of the available square.
Once we make a choice, we now need to validate it. If the current instance of the recursive function runs out of the choices, then backtracking will come into play. We will then backtrack to the instance of a recursive function that is at the top of the call stack which called the current one since the current one exhausted all it's choices and completed its execution. The function we backtracked then will make the next choice and try to change the fate of successive functions to be called.
Validate the choices
Now, we need to validate every choice we make, if the choice is valid we recursively call the function and pass next point as the parameter for which we need to make another choice.
- For N Queens Problem, a choice is valid if the queen is placed not in the same row, column or in diagonal.
- For Sudoku Solver, a choice is valid if the number we place in an empty cell is not found in the same row, column and 3 x 3 square.
- For Maze Problem a choice is valid if the path or coordinate we choose isn't blocked.
Setting the goals
We need to specify the goals else the algorithm keeps on running forever. The goal tells that with current choices that fulfilled the constraints the problem is now solved. We can tell the algorithm the goal is reached by mentioning the final position of the pointers we use for iteration.
- For N Queens Problem, a goal is reached when the pointer reaches the last column or n - 1 column (assuming index start from 0).
- For Sudoku Solver, a goal is reached when the pointer reaches the n - 1, n - 1 coordinate (assuming index start from 0).
- For Maze Problem, a goal is reached when the pointer reaches the n - 1, n - 1 coordinate (assuming index start from 0).
After all the 3 steps are completed, voila the backtracking algorithm is completed.
2. Putting the pieces together
Enough of theory let's write a Sudoku solver using backtracking.
The structure of most of the backtracking algorithm is as follows:
def is_valid(n): # write the constraints here def algo_util(n): # we set goal here # we make choices here # we validate choices by calling is_valid # we determine whether to recursively call algo_util with the new point as parameter. # we return solution to driver function def driver_function(n): # calls algo_util with first basic choice.
Let's implement the above code structure to solve Sudoku. Starting with the driver function, we need to start Sudoku from the 1st cell (0,0).
sudoku_util is the
algo_util function from above template.
def driver_function(n): sudoku_util(0, 0)
We will start by setting the goals. The goal is if we reach the last cell of the sudoku (8,8).
# i is iterating row and j is iterating column def sudoku_util(i, j): if j > 8: j = 0 i += 1 if i == 9: return True
After setting the goal let's think of the choices, we can add 1 - 9 in each cell
# check element already exist # If entry exists the recursively calling the function with next coordinates. if board[i][j] != '.': return sudoku_util(i, j + 1) else: for num in ["1", "2", "3", "4", "5", "6", "7", "8", "9"]: # Here are the choices we make from choice space.
Let's continue the function by validating choices.
for num in ["1", "2", "3", "4", "5", "6", "7", "8", "9"]: if is_valid(i, j, num): # check for choice validity board[i][j] = num if sudoku_util(i, j + 1): # recursively call the function with next coordinate return True # inform previous function instance that the choice else: board[i][j] = '.'
To put the final nail in the coffin, we need to write the is_valid function that's validates the choice we make.
def is_valid(i, j, number): # check that row for col in range(n): if board[i][col] == number: return False # check that column for row in range(m): if board[row][j] == number: return False x = (i // 3) * 3 y = (j // 3) * 3 # check that box for row in range(x, x + 3): for column in range(y, y + 3): if board[row][column] == number: return False return True
And we are done.
Here is the full code:
board=  # global variable def is_valid(i, j, number): # check that row for col in range(n): if board[i][col] == number: return False # check that column for row in range(m): if board[row][j] == number: return False x = (i // 3) * 3 y = (j // 3) * 3 # check that box for row in range(x, x + 3): for column in range(y, y + 3): if board[row][column] == number: return False return True def sudoku_util(i, j): if j > 8: j = 0 i += 1 if i == 9: return True # check element already exist if board[i][j] != '.': return sudoku_util(i, j + 1) else: for num in ["1", "2", "3", "4", "5", "6", "7", "8", "9"]: if is_valid(i, j, num): board[i][j] = num if sudoku_util(i, j + 1): return True else: board[i][j] = '.' def driver_function(sudoku): # type sudoku: List[List[str]] board = sudoku sudoku_util(0, 0)
Happy coding :)