Programming is the art of conquering complicated problems with combinations of small pieces of logic, and we often handle such problems with loops or function iteration (recursion).
For complicated cases, I prefer function iteration over loops: in theory it is more graceful, readable, and straightforward. However, the limitations of real environments can introduce more complexity than we imagine. You have probably encountered some of them before, like:
The "Maximum call stack" exception will make the program inexecutable. The call stack does not care how simple or effective logic inside the function at all. It throws if the iteration runs many times before getting the final output. This means functional programming fans could suffer a lot from it.
If the call path is deep enough, even non-functional code using callback-based JS APIs will suffer.
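For instance, even the most trivial recursion hits the limit once it is deep enough:

function f(n) {
  // No matter how simple the body is, each call adds a stack frame.
  return n === 0 ? 0 : f(n - 1);
}
f(1000000); // RangeError: Maximum call stack size exceeded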
Here are some tricks to help:
Trick 1: turning function iterations into loops
The fewer functions called, the fewer stack frames used. For example, we could convert an iteration like this:
function retry(doSth, times) {
  try {
    return doSth();
  } catch (e) {
    if (--times) {
      return retry(doSth, times);
    }
    throw e;
  }
}
Into a loop like this:
function retry(doSth, times) {
  while (times--) {
    try {
      return doSth();
    } catch (e) {
      if (!times) throw e;
    }
  }
}
Most of the time, though, unlike the simple example above, the loop version ends up more deeply nested and miserably hard to read. Besides, it sometimes takes a lot of effort to restructure the code entirely.
Here an interesting technique called a trampoline comes in. It avoids wide refactoring and big unreadable loops by calling functions in a loop:
function trampoline(fn) {
  var op = fn;
  while (op != null && typeof op === 'function') {
    op = op();
  }
  // The last non-function value is the final result.
  return op;
}
However, this requires the code to always return a function that runs the next part of the logic, and not all logic can be conveniently converted to that style.
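For reference, here is a sketch of the retry example adapted to trampoline style (retryStep is a name introduced here; it assumes doSth itself never returns a function, since the trampoline would keep calling it):

function retryStep(doSth, times) {
  try {
    return doSth();
  } catch (e) {
    if (--times) {
      // Return a thunk for the next attempt instead of recursing,
      // so the call stack never grows.
      return () => retryStep(doSth, times);
    }
    throw e;
  }
}

// doSth being whatever fallible operation you are retrying:
const result = trampoline(() => retryStep(doSth, 100000));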
Trick 2: running in multiple micro- or macrotasks
JavaScript has the well-known event loop mechanism, which introduces two task queues: microtasks and macrotasks. A task is complete, and the next one can start, only once the call stack has emptied (explained in more detail here). This means the maximum call stack limitation only applies within a single task.
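A minimal sketch to see this in action: each queued callback starts with an empty call stack, so a chain of tasks can run far past the depth that would crash plain recursion.

// 100000 "recursive" steps, each in its own microtask with a fresh stack:
function step(n) {
  if (n === 0) return console.log('done');
  queueMicrotask(() => step(n - 1));
}
step(100000);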
Running in multiple Promises
The simplest way of splitting tasks is to turn synchronous code asynchronous (microtasks) with the async and await keywords, which introduce minimal changes to the existing code structure.
For example, the retry function above:
function retry(doSth, times) {
  try {
    return doSth();
  } catch (e) {
    if (--times) {
      return retry(doSth, times);
    }
    throw e;
  }
}
Could be made asynchronous with the async and await keywords. Note that async alone is not enough: without an actual await, the recursive call still runs synchronously on the same call stack, so we await the result to force a suspension point:
async function retry(doSth, times) {
  try {
    // The await suspends this function, so when doSth rejects, the catch
    // block (and the recursive call) resumes in a new microtask with a
    // fresh call stack. If doSth throws synchronously instead of
    // rejecting, the recursion stays synchronous and can still overflow.
    return await doSth();
  } catch (e) {
    if (--times) {
      return retry(doSth, times);
    }
    throw e;
  }
}
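A quick sanity check; the flaky task here is hypothetical, built to reject until the last allowed attempt:

let attempts = 0;
const flaky = async () => {
  // Rejects (asynchronously, as an async function) until the 100000th call.
  if (++attempts < 100000) throw new Error('not yet');
  return 'done';
};

retry(flaky, 100000).then(console.log); // 'done', and no stack overflow

Each attempt leaves a pending promise behind, though, which is exactly the overhead discussed next.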
However, we have embraced the remarkable benefits of asynchronous code so much that we forget about the overhead behind it.
As Microsoft explains here for C#, asynchrony is not the natural way the system works; there is plenty of machinery running behind the scenes. What we end up seeing is a crazy increase in memory usage.
You may only get the chance to see the "JavaScript heap out of memory" exception in a NodeJS app, as a browser will hang or crash before showing anything. In NodeJS, you can work around it by increasing the heap size via the option --max-old-space-size=<memory in MB>.
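For example, to give the process 4 GB of heap (app.js standing in for your entry point):

node --max-old-space-size=4096 app.js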
Manually enqueueing micro/macrotasks
When searching for solutions to maximum call stack errors, I often see recommendations (like this blog) to use:

- process.nextTick() - microtask in NodeJS
- setTimeout() - macrotask
- setImmediate() - macrotask
They all work by pushing a new micro/macrotask, since the maximum call stack limitation only applies within one task, as explained above.
But it is easier said than done. With the promise solution discussed above, we can still retrieve the output at the end if we want; manual task splitting is not the same story.
Think about how you would get the output from this:
function runMacroTask(input) {
  setTimeout(() => {
    return 'output'; // returned to the timer internals, lost to the caller
  });
}
We may use a Promise wrapper like this:
function runMacroTask(input) {
  return new Promise(resolve => {
    setTimeout(() => {
      resolve('output');
    });
  });
}
Well, that just turns us back to the previous solution; nothing gets better.
Another possible solution is similar to a state machine: keep updating a shared state to signal completion. For example:
let completed = false;

function runMacroTask(times) {
  times--;
  setTimeout(() => {
    if (times) {
      runMacroTask(times);
    } else {
      completed = true;
    }
  });
}

runMacroTask(10000);
// waitForCompleteAndDo stands for anything that watches the flag:
waitForCompleteAndDo(() => {
  // do next thing
});
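waitForCompleteAndDo is left abstract above; one possible sketch is a poller that re-checks the flag from the macrotask queue:

function waitForCompleteAndDo(next) {
  setTimeout(function check() {
    // Poll on the macrotask queue until runMacroTask flips the flag.
    if (completed) next();
    else setTimeout(check);
  });
}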
Or, simply keep one macrotask flow isolated, making sure nothing external relies on it or affects its result; then you can happily run all the macrotasks to the end:
const fs = require('fs');

function runMacroTask(times) {
  times--;
  setTimeout(() => {
    if (times) {
      runMacroTask(times);
    } else {
      // writeFileSync takes the path first, then the data.
      fs.writeFileSync(process.cwd() + '/output.txt', 'Completed!');
    }
  });
}

runMacroTask(10000);
Summing up
Note that all of these tricks come at a price. It can be arduous yet fruitless to refactor the logic, ending up with code that is hard to read and maintain and makes us pay more in the long run.
Instead of applying any of the tricks above, maybe what you really need is to rethink whether this is the right solution in the first place, or whether it is over-engineered.
In my case, a NodeJS app, I switched back to synchronous code, reduced callbacks, and converted some parts into loops; the app ended up even faster than with any of the tricks above.