If you understand the underlying concept of recursion but struggle to implement it like myself, or you don't even know what it is, then this post if for you.
Recursion is defined as solving a complex problem by breaking the problem into smaller versions of itself that can be solved. In programming, this is done by having a function call itself.
The underlying principle of recursion are these two cases:
Base Case: this case defines the smallest instance of a problem. In certain recursive solutions, there can be multiple base cases but for the purpose of this post's explanation, my example will only incorporate one. When coding, the base case prevents an infinite loop of the function calling itself.
Recursive Case: this case manipulates the problem in order to approach the base case. In programming, this is the case in which the function calls itself.
Put simply, the recursive case will breakdown the problem until it arrives at the base case, after which the the sub solutions build up to solve the original problem.
Do not worry if your understanding of recursion is hazy because the above definitions will make sense when implementing a recursive solution in Python.
In order to truly understand how recursion is implemented in Python, it is necessary to familiarize yourself with the call stack.
The call stack uses the stack data structure to keep track of local variables and previous function calls. The stack is a last-in, first-out data structure. This means that the last item to be placed in the stack will be the first to be removed. To help visualize, think of a stack of plates:
Whatever plate is placed last on the stack is the first one to be removed.
In terms of programming, the stack has two simple operations: push and pop. Push adds an item onto the stack and pop removes an item from the stack, returning its value.
In python, the call stack consists of frames. The bottom most frame is the global or module frame. This consists of all the global variables. Every time a function is called a new frame is pushed onto the stack containing its local variables and function arguments. When a function call returns, the frame is popped off the stack and its value is passed to where it was called in the previous frame.
The factorial of a number is defined as so:
4! = 4 * 3 * 2 * 1 = 24
If we were to define the factorial of a number n as a mathematical equation, it would be:
factorial(n) = n * factorial(n-1)
The right side of the equation will be the recursive case of our solution. Notice, how the recursive case takes the original problem, factorial(n), and makes it a simpler problem of multiplying n by whatever factorial(n-1) is.
We can define a function factorial that takes a parameter n. However, if we only defined this function with the above equation, it would call itself infinitely leading to an error.
This is because we did not define a base case. The most basic factorial is the factorial of 1 which equals 1. So we can tell the program that once it reaches the factorial of 1, instead of calling 1 * factorial(0), to just return 1. After which, the Python call stack will handle multiplying out the numbers.
After defining both the recursive and base cases, we arrive at the Python program below:
def factorial(n): if n <= 1: return 1 else: return n * factorial(n-1)
Before calling the factorial function, the Python call stack would look like:
Then, we call the factorial function passing in an argument of 4.
A frame for the factorial function would be pushed onto to the call stack, setting the argument n equal to 4.
The Python interpreter will then process the function and since n is not <= 1, it will evaluate n * factorial(n-1). However before the function can return a value, it must evaluate factorial(n-1). This will cause another factorial frame to be pushed on the stack, with the argument n set to 3.
Again, since n is not <= 1, the function will evaluate n * factorial(n-1). Likewise, the function must evaluate factorial(n-1) before returning a value.
This pattern will continue until the factorial frame with the argument n equal to 1 is pushed onto the stack.
When this occurs, the function reaches the base case, so it instead returns 1. As I mentioned early when explaining the call stack, when a function returns, it pops its frame off the stack, transferring its value to where it was called in the previous frame.
Now since the value for factorial(n-1) has been evaluated, the factorial function can evaluate the product and return it to the previous frame, pushing the current frame off the stack.
This pattern continues until we reach the original function call...
As you might have guessed, the factorial function with argument 4 returns a value of 24.
When writing a recursive solution in Python, think of the problem in terms of a base case and recursive case. Ask yourself: "How can I define a recursive case that will help me reduce the problem towards a base case?"
When trying to understand a recursive solution, I find it best to visualize the call stack in my head. Every time a function is called an item is added onto the stack and any time you encounter the return keyword, it pops an item off the stack and returns its value to the previous function call.