
Do you even recurse?... And if you do, do you do it safely?

J. Pichardo on June 05, 2018

Background Have you ever written a factorial function? If you have then you might have done something like: function factorial(n) { ...
Nested Software • Edited

I think this solution will keep a growing linked list of all the ret callbacks in heap memory, adding one link with each loop of the trampoline. Once ret(1) is executed in the base case for factorial, the whole chain of n ret functions will get called one after the other:

function factorial(n, ret = res => res) {
  return n > 1 
    /* here the closure that is returned will keep a 
       reference to the above 'ret' parameter that is 
       passed in to 'factorial'. Repeatedly calling 
       'factorial' from 'trampoline' will create a 
       chain of these references */
    ? () => factorial(n - 1, res => ret(n * res)) 
    : ret(1);
}

This is rather wasteful: Yes, we get around the problem of overflowing the stack, but we're just moving that memory overhead from the stack over to the heap.
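To make that concrete, here is a rough trace (my own sketch of what I'd expect to happen, assuming the same trampoline driver shown further down) of the chain that builds up for factorial(3):

// first call: factorial(3, ret0) returns () => factorial(2, res => ret0(3 * res))
//             (ret0 is the default identity callback res => res)
// next:       factorial(2, ret1) returns () => factorial(1, res => ret1(2 * res))
//             (ret1 = res => ret0(3 * res))
// finally:    factorial(1, ret2) hits the base case and calls ret2(1),
//             which calls ret1(2), which calls ret0(6), giving 6
//
// Each 'ret' closes over the one before it, so all n callbacks stay
// reachable on the heap until the base case finally unwinds the chain.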

However, on reflection, it occurs to me that we can implement tail call optimization for the trampolined version of the code too. That way we don't have to keep all those callbacks around! Here is the code with the tail call optimization added:

let stackTrace = require('stack-trace');

//tail call optimization added here
function factorial(n, acc = 1) {
    const trace = stackTrace.get();
    console.log(`Call Stack: ${trace.length}`);
    return n > 0 ? () => factorial(n - 1, n*acc) : acc;
}

//remaining code stays the same
function trampoline(fn) {
    return function(...args) {
        let result = fn(...args);

        // keep running the returned thunks until we get back a
        // non-function value, which is the final result
        while (result && typeof result === 'function') {
            result = result();
        }

        return result;
    };
}

const tcoFact = trampoline(factorial);

const finalResult = tcoFact(50);

console.log('final result = ' + finalResult);

I think this is the way to go. We realize the full benefit of tail call optimization: We're neither wasting space on the stack nor on the heap.

@mortoray seems to suggest that the overhead of using the trampoline is itself a performance penalty, but I don't really agree with that idea. To put it another way: If we're concerned about that level of performance, we should be coding in something other than JavaScript in the first place, like C, C++, Rust, Go, etc...

edA‑qa mort‑ora‑y

My concern is about the creation of the closure () => factorial(n - 1, n*acc). This involves the creation of an object, presumably on the heap. Your tail-call-optimization won't change this: you still allocate the same number of objects.

For a low-call-count function like factorial the performance loss is likely negligible. I worry about such structures though, because it's easy to start creating too many such wrappers and end up suffering layers of overhead. As you do this you likely prevent the optimizers from making up for JavaScript's innate inefficiency. If you were to start writing high-call-count functions this way you might start seeing performance issues.

I know premature optimization is bad, but there's also a problem if we blindly ignore potential performance issues.

J. Pichardo • Edited

That is right, the thunks would be allocated on the heap, so for a simple function like factorial a solution like this might just be overkill; for more complex algorithms, however, it could work. Still, I'm doing more research on the topic of continuations in JavaScript in order to find a way to improve efficiency.

@nestedsoftware @mortoray

Nested Software • Edited

That's an interesting point, that higher-order functions can potentially obscure what's going on from the VM and thereby prevent the JIT from performing certain optimizations. I hadn't thought about that at all, but it makes sense and wouldn't surprise me. It's hard to know what effect it would have, but I guess you're right that it's an area where one should tread with some caution if performance is a consideration.

Yes, adding the tail call optimization doesn't reduce the number of factorial 'thunks' that are created. I believe it (hopefully) will make a thunk available for garbage collection each time that the trampoline loop runs rather than having to wait for the very end of the calculation.
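Roughly speaking, here is what I'd expect the trampoline to see with the accumulator version (again just my own sketch, for tcoFact(50)):

// first call: factorial(50, 1) returns () => factorial(49, 50)
// next step:  running that thunk returns () => factorial(48, 49 * 50)
// ...
// Each thunk closes only over its own n and acc, with no reference to the
// previous thunk, and the trampoline only keeps the latest one, so each
// thunk should become collectable as soon as it has been run.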

At one point I did some reading on how objects are represented in memory in JavaScript, and it seems to allocate a lot of stuff on the heap, so I'm pretty certain that you're right: The thunks would likely be allocated on the heap as well.

Adrian B.G.

Actually TCO is not implemented in the majority of mainstream languages, so I avoid recursion whenever I can. Hopefully things will change now that Java and C# are adding FP goodies.

You can do hacks like these in dynamic languages, but you won't convince a lot of people to adopt them.

edA‑qa mort‑ora‑y

I continue to use recursion where it's difficult to implement an algorithm without it (lots of tree structures come to mind), or where unrolling the recursion makes it harder to understand. So long as it isn't too deep it's fine.
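For example, a simple tree walk (just a sketch, assuming a plain { left, right } node shape) reads far more naturally as recursion, and the depth is bounded by the height of the tree rather than the total number of nodes:

function countNodes(node) {
    // recursion depth here is the height of the tree, not the number of
    // nodes, so the stack stays shallow for anything reasonably balanced
    if (!node) return 0;
    return 1 + countNodes(node.left) + countNodes(node.right);
}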

Adrian B.G.

Ah yes, if the amount of data is fairly predictable and small and the iterative version is too cumbersome, I do the same.

edA‑qa mort‑ora‑y

This is definitely interesting, and cool, but I'm not sure it's practical.

You might be trading stack space for heap space by using the trampoline. Closures, those returned functions, will require some kind of allocation and tracking. That is likely the dominant cost in the trampoline setup. I would thus argue that efficiency can't be the motivation.

If the motivation is clarity, I also tend to see the trampoline version as less clear than either the original recursive form or the loop form.
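For reference, the loop form I have in mind is just something like the following sketch (my own, not code from the article), with no thunks, no closures, and no extra allocation per step:

function factorial(n) {
    let acc = 1;
    // plain accumulator loop: constant stack depth, no per-step objects
    for (let i = 2; i <= n; i++) {
        acc *= i;
    }
    return acc;
}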

J. Pichardo

You are absolutely correct, closures can lead to memory leaks, especially in this type of situation. That's why this technique should be used carefully and only when necessary. Be that as it may, I would have seriously loved for TCO to have been implemented.

Simon Taddiken

Just wondering: has anyone ever run into a stack overflow that was not caused by a faulty implementation but by actually running out of stack space? If so, I'm very interested in that actual case.

J. Pichardo

Hi Simon, actually I just ran into this Stack Overflow question