This article is based on a Brown Bag session I did at comparethemarket.com on “Five Misconceptions on How NodeJS Works”.
NodeJS is a massive platform built from a number of interesting building blocks, as the above diagram describes. However, because many NodeJS developers lack an understanding of how these internal pieces work, they make false assumptions about the behaviour of NodeJS and develop applications which suffer from serious performance issues as well as hard-to-trace bugs. In this article, I’m going to describe five such false assumptions which are quite common among NodeJS developers.
NodeJS EventEmitter is intensively used when writing NodeJS applications, but there’s a misconception that the EventEmitter has something to do with the NodeJS Event Loop, which is incorrect.
NodeJS Event Loop is the heart of NodeJS which provides the asynchronous, non-blocking I/O mechanism to NodeJS. It processes completion events from different types of asynchronous events in a particular order.
(Please check out my article series on the NodeJS Event Loop, if you are not familiar with how it works!)
In contrast, the NodeJS EventEmitter is a core NodeJS API which allows you to attach listener functions to a particular event; these listeners are invoked once the event is fired. This behaviour looks asynchronous because the event handlers are usually invoked at a later time than when they were registered.
An `EventEmitter` instance keeps track of all events and the listeners associated with each event within the instance itself. When the `emit` function is called on the `EventEmitter` instance, the emitter will SYNCHRONOUSLY invoke the listener functions registered to the event, one after another, in the order in which they were registered.
If you consider the following snippet:
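A minimal sketch of such a snippet, assuming an emitter with three listeners registered on an event named `myevent` (the variable name `emitter` is an assumption made for illustration):

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();

// Register three listeners for the same event.
emitter.on('myevent', () => console.log('handler1: myevent was fired!'));
emitter.on('myevent', () => console.log('handler2: myevent was fired!'));
emitter.on('myevent', () => console.log('handler3: myevent was fired!'));

// emit() runs every listener before it returns.
emitter.emit('myevent');
console.log('I am the last log line');
```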
The output of the above snippet would be:
handler1: myevent was fired!
handler2: myevent was fired!
handler3: myevent was fired!
I am the last log line
Since the event emitter synchronously executes all the event handlers, the line `I am the last log line` won’t be printed until all the listener functions are invoked.
Whether a function is synchronous or asynchronous depends on whether the function creates any asynchronous resources during the execution of the function. With this definition, if you are given a function, you can determine that the given function is asynchronous if it:
- Calls a native NodeJS async function (e.g., the async functions in the NodeJS core modules)
- Uses the Promise API (including async/await)
- Calls a function from a C++ addon which is written to be asynchronous (e.g., bcrypt)
Accepting a callback function as an argument does not make a function asynchronous. However, asynchronous functions usually do accept a callback as the last argument (unless they are wrapped to return a `Promise`). This pattern of accepting a callback and passing the results to the callback is called the Continuation Passing Style. You can still write a 100% synchronous function using the Continuation Passing Style.
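To illustrate the last point, here is a small sketch of a function that accepts a callback yet is entirely synchronous. The helper `sumSync` is hypothetical and exists only for this example:

```javascript
// Hypothetical helper: takes a callback but creates no asynchronous resources.
function sumSync(numbers, callback) {
  let total = 0;
  for (const n of numbers) total += n;
  callback(null, total); // invoked before sumSync returns
}

const order = [];
sumSync([1, 2, 3], (err, total) => order.push(`callback: ${total}`));
order.push('after call');

console.log(order); // [ 'callback: 6', 'after call' ]
```

The callback's entry is recorded before `'after call'`, which shows that the callback ran before `sumSync` returned.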
Synchronous and asynchronous functions differ significantly in how they use the stack during execution.
A synchronous function occupies the stack for the entire duration of its execution, and nothing else can occupy the stack until it returns.
In contrast, an asynchronous function schedules some async task and returns immediately, thereby removing itself from the stack. Once the scheduled async task is completed, any callback provided will be called, and the callback function will be the one occupying the stack. At this point, the function which initiated the async task is no longer on the stack since it has already returned.
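One way to observe this difference is to capture a stack trace inside the callback and check whether the initiating function still appears in it. This is only a sketch: both helper functions are hypothetical, and their names were chosen so they are easy to spot in a trace.

```javascript
// Hypothetical helpers used only to compare the two styles.
function callsBackSynchronously(callback) {
  callback(); // this function is still on the stack while the callback runs
}

function callsBackLater(callback) {
  setImmediate(callback); // schedules the callback and returns immediately
}

const seen = {};

callsBackSynchronously(() => {
  // Is the initiating function part of the current stack trace?
  seen.sync = new Error().stack.includes('callsBackSynchronously');
});

callsBackLater(() => {
  seen.async = new Error().stack.includes('callsBackLater');
  console.log(seen);
});
```

In the synchronous case the initiator is expected to show up in the trace. In the deferred case it should not, because it returned before the callback was invoked.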
With the above definition in your mind, try to determine whether the following function is asynchronous or synchronous.
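A minimal sketch of such a function, assuming the hypothetical name `writeToMyFile` and an illustrative error message:

```javascript
const fs = require('fs');

// Hypothetical function: reports an error for falsy data, otherwise writes to a file.
function writeToMyFile(data, callback) {
  if (!data) {
    // No async work is scheduled: the callback runs before the function returns.
    callback(new Error('No data provided'));
  } else {
    // fs.writeFile is asynchronous: the callback runs after the file I/O completes.
    fs.writeFile('myfile.txt', data, callback);
  }
}
```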
In fact, the above function can be either synchronous or asynchronous depending on the value passed as `data`.
If `data` is a falsy value, the `callback` will be called immediately with an error. In this execution path, the function is 100% synchronous as it does not perform any asynchronous task.
If `data` is a truthy value, the function will write `data` into `myfile.txt` and will call the `callback` after the file I/O operation is completed. This execution path is 100% asynchronous due to the async file I/O operation.
Writing functions in such an inconsistent way (where the function behaves both synchronously and asynchronously) is highly discouraged because it makes an application’s behaviour unpredictable. Fortunately, these inconsistencies can easily be fixed as follows:
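A minimal sketch of the fix, assuming a hypothetical function `writeToMyFile(data, callback)` that reports an error for falsy `data` and otherwise writes `data` to `myfile.txt`:

```javascript
const fs = require('fs');

// Hypothetical function: both execution paths now invoke the callback asynchronously.
function writeToMyFile(data, callback) {
  if (!data) {
    // Defer the error callback instead of calling it before returning.
    process.nextTick(callback, new Error('No data provided'));
  } else {
    fs.writeFile('myfile.txt', data, callback);
  }
}
```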
`process.nextTick` can be used to defer the invocation of the callback function, thereby making that execution path asynchronous as well.
Alternatively, you can use `setImmediate` instead of `process.nextTick` in this case, which will give more or less the same result. However, `process.nextTick` callbacks have a comparatively higher priority, so they run sooner than `setImmediate` callbacks.
If you need to learn more about the difference between `process.nextTick` and `setImmediate`, have a look at the following article from my Event Loop series.
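The priority difference can be seen in a small sketch like the one below, where the `order` array is only there to record what ran when:

```javascript
const order = [];

// Scheduled first, but runs last.
setImmediate(() => {
  order.push('setImmediate');
  console.log(order); // [ 'synchronous code', 'nextTick', 'setImmediate' ]
});

// Runs as soon as the current synchronous code has finished.
process.nextTick(() => order.push('nextTick'));

order.push('synchronous code');
```

Even though the `setImmediate` callback is registered first, the `process.nextTick` callback runs before it.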
It is a widely known fact that CPU-intensive operations block the Node.js Event Loop. While this statement is true up to a certain extent, it is not 100% true as there are some CPU-intensive functions which do not block the event loop.
In general, cryptographic operations and compression operations are highly CPU-bound. For this reason, there are async versions of certain crypto functions and zlib functions which are written to perform their computations on the `libuv` thread pool so that they do not block the event loop. Some of these functions are:
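One function that the Node.js documentation lists as using the libuv thread pool is `crypto.pbkdf2()`. The sketch below uses it to show that the calling code carries on while the key is being derived. The password, salt and iteration count are illustrative values only.

```javascript
const crypto = require('crypto');

const events = [];

// Key derivation is CPU-heavy; the async variant does its work off the main thread.
crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, derivedKey) => {
  if (err) throw err;
  events.push('pbkdf2 done');
  console.log(events, derivedKey.length); // [ 'main thread continues', 'pbkdf2 done' ] 64
});

// This runs right away because the call above returned without waiting.
events.push('main thread continues');
```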
However, as of this writing, there is no way to run your own CPU-intensive JavaScript code on the `libuv` thread pool.
Modern operating systems have built-in kernel support to facilitate native asynchrony for network I/O operations in an efficient way using event notifications (e.g., epoll on Linux, kqueue on macOS, IOCP on Windows). Therefore, network I/O is not performed on the libuv thread pool.
However, when it comes to file I/O, there are a lot of inconsistencies across operating systems and, in some cases, within the same operating system. This makes it extremely hard to implement a generalised platform-independent API for file I/O. Therefore, file system operations are performed on the `libuv` thread pool to expose a consistent asynchronous API.
The `dns.lookup()` function in the `dns` module is another API which utilises the `libuv` thread pool. The reason is that resolving a domain name to an IP address using `dns.lookup()` is a platform-dependent operation, and it is not 100% network I/O.
You can read more about how NodeJS handles different I/O operations here:
The belief that NodeJS should not be used for CPU-intensive applications is not really a misconception, but rather a once well-known fact about NodeJS which became obsolete with the introduction of Worker Threads in Node v10.5.0. Although it was introduced as an experimental feature, the `worker_threads` module has been stable since Node v12 LTS and is therefore suitable for production applications with CPU-intensive operations.
Each Node.js worker thread has its own V8 instance and its own event loop. (The libuv thread pool, by contrast, is process-wide and shared by all event loops.) Therefore, one worker thread performing a blocking CPU-intensive operation does not affect the other worker threads’ event loops, which remain available for any incoming work.
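As a rough sketch of how this can look in practice, the example below moves a CPU-bound loop into a worker thread. The `sumInWorker` helper and the worker source are hypothetical, and the worker code is passed as a string with `eval: true` only to keep the example in a single file.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical CPU-bound task, passed as source text so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Runs the loop on a separate thread and resolves with the result.
function sumInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { limit } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// The main thread's event loop stays free while the worker computes.
sumInWorker(1e6).then((sum) => console.log('sum from worker:', sum));
```

In a real application the worker code would normally live in its own file, with the file path passed to the `Worker` constructor instead.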
If you are interested in learning how Worker Threads work in detail, I encourage you to read the following article:
However, at the time of this writing, IDE support for worker threads is not the greatest. Some IDEs do not support attaching the debugger to code running inside a worker thread other than the main thread. That said, development support will mature over time, as a lot of developers have already started adopting worker threads for CPU-bound operations such as video encoding.
I hope you learned something new after reading this article, and please feel free to provide any feedback you have by responding to this.
- Designing APIs for Asynchrony (Isaac Z. Schlueter) https://blog.izs.me/2013/08/designing-apis-for-asynchrony
- My Event Loop article series https://blog.insiderattack.net/event-loop-and-the-big-picture-nodejs-event-loop-part-1-1cb67a182810