DEV Community

Kassim

Notes on Node

What is node?

Let's begin with the dictionary definition before getting into the finer details: Node.js is an open-source, cross-platform, back-end JavaScript runtime environment that runs on the V8 engine and executes JavaScript code outside a web browser. So what does this mean exactly? A diagram outlining Node's architecture is a good place to start.

[Image: diagram of Node's architecture]
At the top level we have the actual JavaScript programs that we write. Once written, we eventually run these programs at the command line.

So when we run node index.js, we are invoking the Node project. Like many JavaScript projects, it is backed by dependencies that it uses to actually execute our code. Two of the most important of these are V8 and libuv.

libuv

libuv gives Node access to the operating system so that it can perform tasks such as filesystem operations and time-scheduled tasks.

V8

V8 interprets and executes the Javascript code, allowing it to run outside the browser.
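One quick way to see these two dependencies from inside a running program is process.versions, a built-in object that lists the versions of the components bundled with the Node binary. A minimal sketch:

```javascript
// process.versions is built in; no imports are needed.
const { node, v8, uv } = process.versions;

console.log('Node :', node);
console.log('V8   :', v8);
console.log('libuv:', uv); // the libuv version is reported under the key "uv"
```

The exact version strings depend on which Node release you have installed.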

Node Event Loop

Whenever we execute a Node program, Node creates a single thread and executes all our code within that thread, and within that thread lies the event loop. The event loop essentially dictates what task our program is carrying out at any given time.

How does the event loop work?

When we execute a node program in the command line, the entire content of the file is first executed, and then the event loop is initiated.
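A small sketch of that ordering: even a timer with a 0 ms delay has to wait until the whole file has run, because its callback can only be called from the event loop. The order array is just a hypothetical helper for recording what ran when.

```javascript
const order = [];

// Scheduled with a 0 ms delay, but it cannot run until the event loop starts.
setTimeout(() => {
  order.push('timer callback');
  console.log(order.join(' -> '));
}, 0);

order.push('first line of the file');
order.push('last line of the file');
```

The timer callback is always recorded last, after both synchronous lines.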

We can think of the event loop as a while loop that checks a few conditions before continuing execution. As long as a condition remains true, the loop executes again and again. Each iteration of the loop is known as a 'tick'.

So what conditions does the event loop check, to determine whether it should continue for another tick?

First the event loop will check if there are any pending timer events, such as setTimeout and setInterval.

Then it'll check if there are any pending OS tasks, such as a server listening on a given port.

It also checks whether there are any pending long-running operations, such as fs module calls like reading from a file.
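The three checks above can be sketched as a plain while loop. This is only a mental model, not Node's real implementation: the three arrays and the shouldContinue function are hypothetical stand-ins for state that Node and libuv track internally.

```javascript
// Hypothetical stand-ins for state that Node and libuv track internally.
const pendingTimers = [];
const pendingOSTasks = [];
const pendingOperations = [];

// The three checks described above, in the same order.
function shouldContinue() {
  return (
    pendingTimers.length > 0 ||
    pendingOSTasks.length > 0 ||
    pendingOperations.length > 0
  );
}

// Simulate a program that registered two timers before the loop started.
pendingTimers.push('timer A', 'timer B');

let ticks = 0;
while (shouldContinue()) {
  ticks += 1;
  pendingTimers.shift(); // pretend one timer completes on each tick
}

console.log(`loop exited after ${ticks} ticks`);
```

Once nothing is pending, the condition is false, the loop ends, and in real Node the program exits back to the terminal.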

Once Node determines it should process another tick, what then actually happens?

The first step is that Node looks at pending timers and sees if any functions are ready to be called. That is, Node looks at setTimeout and setInterval and checks whether any of the functions passed to them are ready to be executed.

Node then follows this up by looking at any pending OS tasks and operations, and calls the associated callbacks for these tasks if they're ready to be executed.

After this step, execution is paused temporarily whilst Node waits for new events to occur. Following this, the callbacks of any setImmediate timers are executed. Finally, 'close' event callbacks are handled, e.g. socket.on('close', …)

So this is how each tick of the event loop is handled.

Is Node single threaded?

Single threaded means that instructions are executed in a single sequence, so in essence one thing happens at a time. This can be a bottleneck on performance, especially on multicore processors, since a single thread cannot take advantage of the extra cores.
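To see "one thing happens at a time" in practice, here is a small sketch in which a timer asks to be called after 10 ms, but the thread is kept busy for roughly 100 ms. The 10 ms and 100 ms figures are arbitrary values chosen for illustration.

```javascript
const start = Date.now();
let timerFiredAfter = null;

// Ask for a callback after 10 ms.
setTimeout(() => {
  timerFiredAfter = Date.now() - start;
  console.log(`timer fired after ${timerFiredAfter} ms`);
}, 10);

// Keep the single thread busy for about 100 ms.
while (Date.now() - start < 100) {
  // busy-wait: nothing else can run while this loop spins
}
```

The timer cannot fire until the busy loop has finished, so the reported time is at least 100 ms rather than 10 ms.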

So is Node single threaded, and is that a bad thing? Well, Node isn't single threaded per se. The event loop of Node is single threaded, but parts of the Node framework and standard library are not.

For some functions, such as filesystem (fs) module functions and some crypto module functions, amongst others, libuv (a C library that Node depends on) creates a thread pool, allowing Node to take advantage of multiple threads.

const crypto = require('crypto');

const start = Date.now();

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('1:', Date.now() - start);
});


Take this program, named threads.js, for instance. When I execute it, this is the output: it takes around 400ms to complete execution.

Screenshot 2021-09-21 at 09.16.13

Now if we look at the following program, the same function call is repeated 5 times. If Node were entirely single threaded, this would take roughly five times as long.

const crypto = require('crypto');

const start = Date.now();

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('1:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('2:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('3:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('4:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('5:', Date.now() - start);
});

However, when executed we have the following,
Screenshot 2021-09-21 at 09.19.23

Well, something interesting happens here. The first 4 functions all finish at nearly the same time, but the fifth takes a bit longer. Why is this? The thread pool that libuv creates has 4 threads by default, so the fifth call has to wait for a thread to become free. We can change this by setting process.env.UV_THREADPOOL_SIZE (it needs to be set before the pool is first used, which is why it goes at the top of the file). Let's change the threadpool size to 5 threads and see if there's any difference.

Now our program looks like this.

process.env.UV_THREADPOOL_SIZE = 5;
const crypto = require('crypto');

const start = Date.now();

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('1:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('2:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('3:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('4:', Date.now() - start);
});

crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
  console.log('5:', Date.now() - start);
});


When executed we get the following:
Screenshot 2021-09-21 at 09.22.54

We can now see that all functions take roughly the same amount of time to execute. This doesn't mean you can keep creating more threads to get better performance. The number of threads you can take advantage of is a function of your computer's resources, so it is limited, and spamming new threads will lead to diminishing returns.
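One way to keep that limit in mind is to cap the pool size at the number of logical cores reported by os.cpus(). This is only a rule of thumb for CPU-bound work like the hashing above, not something Node or libuv requires, and the pickPoolSize helper and the requested value of 8 are hypothetical.

```javascript
// This needs to run before anything uses the thread pool.
const os = require('os');

// Hypothetical helper: never ask for more threads than there are logical
// cores, and never for fewer than 1.
function pickPoolSize(wanted) {
  const cores = os.cpus().length || 1;
  return Math.max(1, Math.min(wanted, cores));
}

// Environment variables are strings, so convert explicitly.
process.env.UV_THREADPOOL_SIZE = String(pickPoolSize(8));
console.log('UV_THREADPOOL_SIZE =', process.env.UV_THREADPOOL_SIZE);
```

On a 4-core machine this would leave the pool at 4 threads even though 8 were requested.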

Threadpools aren't the only way that Node isn't single threaded. Some tasks, such as networking carried out with Node's http module, are actually handled by the operating system. libuv delegates these tasks to the OS, so our code is not blocked while they run.

const https = require('https');
const crypto = require('crypto');
const fs = require('fs');

const start = Date.now();

function doRequest() {
  https
    .request('https://www.google.com', (res) => {
      res.on('data', () => {});
      res.on('end', () => {
        console.log('Network:', Date.now() - start);
      });
    })
    .end();
}

function doHash(e) {
  crypto.pbkdf2('a', 'b', 100000, 512, 'sha512', () => {
    console.log(`Hash: ${e}`, Date.now() - start);
  });
}

doRequest();

fs.readFile('multitask.js', 'utf8', () => {
  console.log('FS: ', Date.now() - start);
});

doHash(1);
doHash(2);
doHash(3);
doHash(4);

If we look at the program above, multitask.js, we have a network request using the https module, a hashing function using the crypto module, and a file system function. The network request is called first, followed by the file read, followed by the hashing function. Any idea how these will execute? Take a minute to see if you can figure it out.

Well this is what we get when we execute the program.
Screenshot 2021-09-21 at 09.34.27

But wait, I thought you said network requests are delegated to the OS, so why is it taking so much longer than the other tasks? Well, this is probably a function of my internet connection as I write this article. If you copy the program and run it yourself, chances are you'll get a much better result.

Why is reading the file taking just as long as the hashing functions? Surely reading a file from my hard drive should be faster? This is a function of the default threadpool size: we have 4 hashing functions and a readFile operation competing for 4 threads. They take nearly the same time because there are idle points in the readFile process, where its thread is waiting on the disk. At those points a hashing function is allocated the thread so it isn't left idle, and the file read can only finish once a thread frees up again. If we increase the threadpool size to 5, like we did previously, this is our result.
Screenshot 2021-09-21 at 09.41.24

As we can see the file system operation is carried out much much faster.

These are just some interesting things I've picked up whilst learning about Node. I hope you find them useful as well.

Top comments (2)

Favour Afolayan

Thanks for this detailed write up. I can't seem to get enough on event loops

Ross Ragsdale

awesome explanation with fantastic examples.