Recursion Algorithm Practice: Fibonacci Sequence Generator

kaelscion on December 19, 2018

for more of our great content, visit the Coding Duck Blog at: ccstechme.com/coding-duck-blog Hey there Dev community! So, I've got an interview ... [Read Full]
markdown guide
 

All the best for the interview!

Another interesting approach using iteration is making the function into a generator which can be used as part of for loops, list comprehensions, etc

def gen_fib():
    a,b = 1,1
    yield a
    yield b
    while True:
        a,b = b,a+b
        yield b
g = gen_fib()
# Generate the first 200,000 Fibonacci numbers
fibs = [next(g) for _ in range(200000)]

I am generally wary of writing code using pure recursion in Python as it can easily blow the stack due to recursion limit. For example if we try to solve using recursion as:

def fib(n):
    if n <= 1:
        return 1
    return fib(n-2)+fib(n-1)

This takes exponential time because it recalculates fib(n-2) for fib(n-1).

If we try to make it linear time using functools.lru_cache which will cache the previous calls at cost of O(n) memory, this still gives maximum recursion depth exceeded in comparison error for fib(200000).

@lru_cache(maxsize=None)
def fib_3(n):
    if n <= 1:
        return 1
    return fib_3(n-2) + fib_3(n-1)

if Tail call optimization was supported in Python, the following recursive variant will be "optimized" to iterative version:

def fib_tail(n):
  def fib_helper(a, b, n):
    return fib_helper(b, a + b, n-1) if n > 0 else a
  return fib_helper(0, 1, n)

But due to no TCO, it still results in same maximum recursion depth exceeded in comparison for fib_tail(200000)

 

I'd probably fail this interview...

I look forward to reading thoughts other folks have.

 

hahaha! I'm sorry to hear that. I've always described myself as an engineer, nothing more, nothing less. Machines: easy. People: Too hard...Not doing it. I look forward to what people have to say too. I'm self-taught for the most part in that I've never spent a day on a University Campus as a student. And I live in Portland, ME now which is very much a .NET and Node area of the country. So up until about 6-weeks ago, I've lived in my little corner or Portland in total obscurity. But now, Boston has gotten too expensive for a lot of startups and they've moved to Portland. As such, suddenly being a Mid-Senior level Python Engineer is hugely in demand in my area and I'm one of a dozen developers locally with that skillset. So, almost overnight, I've gone from "We don't do Python here" obscurity to "Holy crap, there's a dev that does Python here? And he's a freelancer so, therefore, available!?!?! Could you come in and chat with us for a minute?" 😆😆😆

 

That's amazing!

Being in the right place, at the right time, with the right skill :)

 
 

Oh man, every time I see the Fibonacci sequence I remember one of my interviews when I was to write such generator using JavaScript on a piece of paper. I got it working, even though my writings aren't very legible and most of the time I had to read it to my interviewer.

Also, have you tried doing it immutable? That way you'll be doing real recursion.

 

I have not gone immutable with it. Python has a recursion depth limit that, while usually a good safety net, tends to become an obstacle when working with more and more recursions. Finding the 200,000th fib number in Python with true recursion would hit the depth limit really early and cProfiles only measures to the thousands/sec decimal place so, even if I didn't hit the limit, the iterative solution I have would have the same exact level of performance as true recursion. My solution though, on second look, seems to be fast because I leverage RAM quite a bit by storing the current fib number, the previous fib number, and their sum in variables in each iteration. Granted, the same variables are assigned different data on each pass, but the numbers get larger and larger meaning that each variable goes from storing a single-digit int to an int with thousands of digits, using more and more RAM per pass.

Another commenter suggested a generator function which would be likely to leverage the CPU to calculate current, previous, and next fib number rather than simply recalling them from RAM, which would be more efficient memory-wise, but put a heavier load on the CPU. In this particular use case, that wouldn't be a huge deal. But in more complex calculations with numbers that big, the CPU could strain big time if it doesn't have a good deal of horsepower. I'd like to learn the generator function anyway though for the sake of understanding both environments. In a cheap Digital Ocean droplet, 1GB of RAM will go further than 1 CPU core, so iteration would be best. On an expensive EC2 instance, you've got CPU power for days and RAM can get expensive, so a generator would be best. Either way, it couldn't hurt to know both :D

 

That's a great explanation of your choice though!

Hired :P !

 

Good luck kaelscion!

I would have chosen the version with a generator wrote by @rrampage with memoization.

Yours is definitely good!

Anyhow, no recursion, 200k is definitely past the default recursion limit:

>>> import sys
>>> sys.getrecursionlimit()
1000

Python doesn't optimize tail recursions so you're kind of into the unknown with a numeric algorithm like that

Let us know how did it go! :D

 

As it turns out, the solution proposed by @rrampage included a list comprehension that demonstrated a potential use case of the generator (they responded to let me know). Admittedly, I simply copy/pasted that solution and ran the tests on it without actually reading it 😑 .

bad-llama

The second solution I proposed didn't have that list comp in it neither did their revised solution. There was a slight performance hit in x-time, but the RAM usage was much better without the generator being appended to a list. So for raw performance, I still favor the iterative solution. But for a well-balanced approach that is lighter on RAM but trivially different in x-time, I agree the generator is best.

As far as the interview went, I feel it was good. It was a 3-hour interview process. I met with 8 people from different teams who all questioned me on the topics they found most important. There was a discussion on brands, Big-O quizzing and data structure questions, product and culture-fit questions, comfort level with Agile, and finally, the fourth team of 2 gave me a whiteboard challenge.

Here is where, I feel, I struggled mightily for two reasons: 1) I was totally fried from the previous 3 teams and their questioning and 2) This was my first official engineering interview in at least 5 years. I also did not have a whiteboard available in that conference room so I was figuring out my solution in my notebook. This meant that I kept running out of room on a page, and would flip to the next page and re-write what was on the first page so that the interviewers could see what I was doing. I also overengineered the algorithm. Big time. In the end, I "couldn't see the forest through the trees" and ended up with a quasi-solution that I didn't particularly care for. Both interviewers chuckled along with me though and told me that they had done the same exact thing I did the week before when they had first seen that problem in preparation for my interview.

Ultimately, I think my thought process was good and I did talk them through my thoughts the whole time. But writing code from memory on a piece of notebook paper in the span of 30-minutes or so is never easy so I think I'll be forgiven. The thought process is what most interviewers really look for (I hope).

Either way, I had a blast, loved the people and the interviewers, and am really hoping things work out. If nothing else, it was a bit of a confidence boost because the majority of my programming experience is in the freelance space and I have no degree, just curiosity. So to be able to go in there and stand toe-to-toe with career software engineers and make any sort of impression is great.

 

So for raw performance, I still favor the iterative solution. But for a well-balanced approach that is lighter on RAM but trivially different in x-time, I agree the generator is best.

Yeah, that's usually it. The advantage with the generator is that you can pause it and resume it. They are basically a simplified version of continuation passing style.

Both interviewers chuckled along with me though and told me that they had done the same exact thing I did the week before when they had first seen that problem in preparation for my interview.

That's cool. If they saw that, it means they empathize and recognized that you were working towards a solution

Either way, I had a blast, loved the people and the interviewers, and am really hoping things work out.

Well, that's really important. Sometimes we go into interviews hoping the company will like us but forgetting that we have to like them as well :D

Crossing fingers, especially now that your skills are in demand!

 

Thank you all so much for your comments so far! I'm really enjoying this discussion and collaboration because it is forcing me to try things I previously wrote off, or research methods I wasn't actively aware of. Please, keep them coming! If nothing else, I am really enjoying not only testing your solutions and researching the principles they are based on but returning to my own solution and verifying why it is I like it in the face of all of these other awesome suggestions. Please keep the discussion rolling! I never thought a CS101 algo like Fib numbers would get so much discussion from so many. Love it! 😁😁😁

 

They want to see that you understand memoization, and that you have a general idea how big its impact can be for a problem like computing fibonacci numbers. Some example code:

import time

known = {1:1,2:1}
def fibonacci_memo(n):
    global total_calls
    total_calls += 1
    if n in known:
        result = known[n]
    else:
        result = fibonacci_memo(n-1) + fibonacci_memo(n-2)
        known[n] = result
    return result

def fibonacci(n):
    global total_calls
    total_calls += 1
    if n < 3:
        result = 1
    else:
        result = fibonacci(n-1) + fibonacci(n-2)
    return result

def fibtest(f,n):
    t1 = time.time()
    rc = f(n)
    t2 = time.time()
    print("%s(%d) = %d\n- total function calls = %d\n- elapsed time = %3.6f seconds\n\n" %
        (f.__name__, n, rc, total_calls, (t2 - t1)))


total_calls = 0
fibtest(fibonacci_memo,40)

total_calls = 0
fibtest(fibonacci,40)

The results on a recent Macbook Air:

$ python fib.py
fibonacci_memo(40) = 102334155
- total function calls = 77
- elapsed time = 0.000025 seconds


fibonacci(40) = 102334155
- total function calls = 204668309
- elapsed time = 31.633700 seconds

Calling the recursive function less than 100 times vs calling it over 200 million times. I would focus on making it clear you understand how and why this works the way it does. Then you can go on to explain why the lack of tail call elimination in python makes either solution problematic for large values of n.

 

For sure! I ran your suggestions through the same series of tests the others have gone through so far and came up with the following:

If you take a look at my first solution that deals with iteration rather than "true" recursion, the 3 variables used are a stand-in for the dictionary known as a form of memoization, giving us the performance boost of allowing our previously known values to be stored and re-assigned with each iteration, rather than stored separately in an ever-growing dictionary. I find this way of going about it simply more to my personal taste because we are never allocating more RAM than those three values take up combined on that particular pass through the loop. But, if we needed to not only find the nth fib number but also store all previous fib numbers up until that point, I wouldn't have even considered using a dictionary as a cache so thanks for pointing out that use case to me because I'm sure they will want to know these types of solutions with "one-off-calculation" use cases, as well as persistent, referenceable use cases.

Also, using "true recursion" is the better solution for numbers of passes in the hundreds. But the use-case I'm gunning for (hundreds of thousands of passes), true recursion hits that stupid Python depth limit well before I can get where I want to go. The iterative solution is a nice, safe, and quick-to-implement O(n) algo that I can write quickly, debug with minimal fuss (or really, no fuss) and have the peace of mind that I know what I'm going to get with a fairly predictable, linear increase in execution time with an increase in the dataset's size.

 
 

Thank you! I'm confident that I know the tech aspect of it, but that whole "I haven't worked with a team in almost 6 years maybe I'm not as knowledgeable as I thought" thing is making me quite nervous. Thanks for the well wishes!

 

Good luck! Couple of commenters already pointed out some good resources, I have written one on Tail Call Optimization so that might be helpful (Python doesn't have it, but it helps to understand)

There's also dynamic programming that you could look into - quora.com/What-is-the-difference-b...

 
code of conduct - report abuse