Running CPU Intensive task in Nodejs

Badewa kayode
Software Dev. | I create for fun(ds) 🤓

Moving my articles from Medium to Dev.to

This article was originally posted here:

The code for the article can be found here.

Nodejs is good for IO-intensive tasks but bad for CPU-intensive tasks. The reason is that your JavaScript code runs on the event loop, which runs on a single thread.

The event loop is responsible for everything that runs in the user-land of Nodejs. This event loop runs on a single thread. When this thread is blocked, all other tasks have to wait for the thread to be unblocked before they can be executed.
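
To see this in a few lines, here is a minimal sketch (not part of the article's code) in which a timer scheduled for 0 ms cannot fire while a synchronous loop keeps the thread busy. The 200 ms duration and the variable names are arbitrary choices made for illustration.

```javascript
const start = Date.now();
let timerFired = false;

// Ask the event loop to run this callback as soon as possible.
setTimeout(() => {
    timerFired = true;
    console.log(`timer fired after ${Date.now() - start} ms`);
}, 0);

// Keep the single thread busy for about 200 ms.
while (Date.now() - start < 200) {
    // busy-wait: nothing else can run during this loop
}

// The callback has not run yet: it is still waiting for the thread.
const firedDuringLoop = timerFired;
console.log(`timer fired during the loop: ${firedDuringLoop}`);
```

The timer callback only runs once the loop has finished, so it fires roughly 200 ms late even though it was scheduled with a delay of 0.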

I am not an expert on this issue; I am only sharing one way in which I dealt with it, so if anyone has something to add or corrections to make, I’m open to advice.

Running Fibonacci

In this article, I will use Fibonacci as my CPU intensive task (computing the Fibonacci number for inputs above 45 takes a noticeable amount of time). I am going to create a server that serves a simple response for any URL that does not match /fibo, and a Fibonacci result when the URL matches /fibo.
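
To get a feel for why this is expensive, the sketch below counts how many function calls the naive recursive Fibonacci makes. It uses the same definition as the code in this article (fibo(0) and fibo(1) both return 1); the call counter and the helper name countCalls are additions for illustration only.

```javascript
let calls = 0;

function fibo(n) {
    calls++; // count every invocation, including the recursive ones
    if (n < 2)
        return 1;
    return fibo(n - 2) + fibo(n - 1);
}

// Returns the result together with the number of calls it took.
function countCalls(n) {
    calls = 0;
    const value = fibo(n);
    return { n, value, calls };
}

console.log(countCalls(10)); // { n: 10, value: 89, calls: 177 }
console.log(countCalls(20)); // { n: 20, value: 10946, calls: 21891 }
console.log(countCalls(30)); // { n: 30, value: 1346269, calls: 2692537 }
```

The number of calls grows by a factor of about 1.6 each time n increases by one, which is why inputs in the mid-40s keep the thread busy for a long time.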

I will not use any npm module in this article, only core Node modules.

The Server

The server for this article would only return two types of response:

  • A Fibonacci number for the req.headers.fibo value when the URL route is equal to /fibo
  • A hello world string for any URL route that does not equal /fibo

Let’s run the fibo normally

First, to show how Fibonacci blocks the event loop, I will create a server that computes Fibonacci in the same process that serves the simple hello world response.

Create a file called fibo_in_server.js. This file returns the Fibonacci number of the number passed in req.headers.fibo when the URL route is equal to /fibo, and returns hello world for any other URL.

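        const http = require("http");

        // Naive recursive Fibonacci (with fibo(0) = fibo(1) = 1).
        // Deliberately slow: it stands in for any CPU intensive task.
        function fibo(n) {
            if (n < 2)
                return 1;
            else   return fibo(n - 2) + fibo(n - 1);
        }

        const server = http.createServer((req, res) => {
            "use strict";
            if (req.url === '/fibo') {
                // The number to compute is read from the "fibo" request header.
                let num = parseInt(req.headers.fibo, 10);
                if (Number.isNaN(num)) {
                    // Without this check a missing header would overflow the call stack.
                    res.statusCode = 400;
                    return res.end('missing or invalid fibo header');
                }
                console.log(num)
                res.end(`${fibo(num)}`) // runs synchronously and blocks the event loop
            } else {
                res.end('hello world');
            }
        });

        server.listen(8000, () => console.log("running on port 8000"));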

We can run the above code and check the response. When req.url is not /fibo, the response is hello world; when req.url is /fibo, the response is the Fibonacci number of the value passed in the fibo header.

I’m using the Postman Chrome extension for requesting the server.

If we send a number like 45 to the server, the request blocks the event loop until the Fibonacci number has been computed. Any request for the hello world string has to wait until the long-running Fibonacci is done.

This is not good for users who want to get only a simple response, because they have to wait for the Fibonacci response to be completed.

In this article, what I am going to do is look at some ways to fix this problem. I am not a Pro Super NodeJs Guru User, but I can give some methods of dealing with this problem.

Methods of dealing with this problem

  • running Fibonacci in another Nodejs process
  • using method 1 with a batch queue to process the Fibonacci
  • using method 2 with a pool to manage the processes

Method 1: Running in another process

What we can do is run the Fibonacci function in another Nodejs process. This would prevent the event loop from getting blocked by the Fibonacci function.

To create another process we use the child_process module. I am going to create a file, fibonacci_runner.js, that runs as the child process, and another file called server_method1.js, the parent process.

The server_method1.js file serves the response to the client. When a request to /fibo is made, the server gives the work to its child process, fibonacci_runner.js, to handle. This prevents the event loop on the server from getting blocked, so smaller requests can still be handled promptly.

Here is the code for fibonacci_runner.js

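// Runs as a child process: receives a number from the parent, computes its
// Fibonacci value and sends it back together with the event name it was given.
process.on("message", (msg) => {
    "use strict";
    const num = parseInt(msg.num, 10);
    // An invalid number would recurse until the call stack overflows, so skip it.
    const value = Number.isNaN(num) ? null : fibo(num);
    process.send({value: value, event: msg.event})
});

function fibo(n) {
    if (n < 2)
        return 1;
    else   return fibo(n - 2) + fibo(n - 1)
}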

And here is the code for server_method1.js:

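const http = require("http");
const {fork} = require('child_process');
const {EventEmitter} = require('events');

// Start fibonacci_runner.js as a child process.
const child = fork(`${__dirname}/fibonacci_runner.js`);

// Routes each result from the child back to the request that asked for it.
const event = new EventEmitter();
let nextId = 0;

const server = http.createServer(function(req, res){

    if (req.url === '/fibo') {
        const num = parseInt(req.headers.fibo, 10);
        if (Number.isNaN(num)) {
            res.statusCode = 400;
            return res.end('missing or invalid fibo header');
        }
        // A counter gives every request a unique event name
        // (a random number could, in rare cases, collide).
        const id = `fibo-${nextId++}`;

        event.once(id, (value) => { // called when the child sends the result back
            res.end(`${value}`)
        });
        child.send({num: num, event: id});  // send the number to fibonacci_runner.js
    } else {
        res.end('hello world');
    }
});

// Forward each result from the child to the listener waiting on its event name.
child.on("message", (msg) => event.emit(msg.event, msg.value));

server.listen(8000, () => console.log("running on port 8000"));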

Now if we visit the URL route /fibo with a value >= 45 in the req.headers.fibo value, it won’t block the request for the hello world. Better than what we had before.
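
The same request/response matching can also be wrapped in a Promise. The following is only a sketch of that idea, not the code from this article: the helper name fiboInChild, the pending map and the id field are made up for illustration, and the child script is written to a temporary file so the example runs on its own (in the article’s setup that role is played by fibonacci_runner.js).

```javascript
const { fork } = require("child_process");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write a tiny child script to a temp file so this sketch is self-contained.
const childSource = `
function fibo(n) { return n < 2 ? 1 : fibo(n - 2) + fibo(n - 1); }
process.on("message", (msg) => {
    process.send({ id: msg.id, value: fibo(msg.num) });
});
`;
const childFile = path.join(os.tmpdir(), `fibo_child_${process.pid}.js`);
fs.writeFileSync(childFile, childSource);

const child = fork(childFile);
const pending = new Map(); // request id -> resolve function of its Promise
let nextId = 0;

// Each reply carries the id of the request it answers.
child.on("message", (msg) => {
    const resolve = pending.get(msg.id);
    pending.delete(msg.id);
    if (resolve) resolve(msg.value);
});

// Sends one job to the child and resolves with its result.
function fiboInChild(num) {
    return new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        child.send({ id, num });
    });
}

const demo = Promise.all([fiboInChild(10), fiboInChild(20)]).then((values) => {
    console.log(values); // [ 89, 10946 ]
    child.kill();              // stop the child so the script can exit
    fs.unlinkSync(childFile);  // remove the temporary file
    return values;
});
```

Inside a request handler this would read as fiboInChild(num).then((value) => res.end(`${value}`)), while the parent’s event loop stays free to answer other requests.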

The next step is to reduce the amount of computation the fibonacci_runner does. One way of doing this is to use a batch queue and/or a cache (note: there are other ways of doing this as well).

In this article, I am going to discuss the batch queue alone.

You can check out these articles to learn more about caching:

https://community.risingstack.com/redis-node-js-introduction-to-caching/amp/
https://goenning.net/2016/02/10/simple-server-side-cache-for-expressjs/

Method 2: Batching queue

“When dealing with asynchronous operations, the most basic level of caching can be achieved by batching together a set of invocations to the same API. The idea is very simple: if I am invoking an asynchronous function while there is still another one pending, we can attach the callback to the already running operation, instead of creating a brand new request.” (from “Nodejs Design Patterns”)

Following the definition above, we want to batch requests with the same req.headers.fibo value together, instead of starting a new Fibonacci computation while one with the same req.headers.fibo value is still pending.
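
Stripped of the HTTP server and the child process, the batching idea can be sketched as follows. Everything here is a stand-in for illustration: startWork only counts how often the expensive job would be started, and finishWork plays the role of the child’s reply arriving.

```javascript
const pending = {}; // key -> callbacks waiting for that key's result
let started = 0;    // how many times the expensive work was actually started

// Hypothetical stand-in for "send the job to the child process".
function startWork(key) {
    started++;
}

function batched(key, cb) {
    if (pending[key]) {
        pending[key].push(cb); // a job for this key is running: piggyback on it
    } else {
        pending[key] = [cb];   // first request for this key: start the job
        startWork(key);
    }
}

// Simulates the reply for `key` arriving with `value`.
function finishWork(key, value) {
    const callbacks = pending[key] || [];
    delete pending[key]; // later requests for this key start a fresh job
    callbacks.forEach((cb) => cb(value));
}

const results = [];
batched(39, (v) => results.push(v));
batched(39, (v) => results.push(v));
batched(39, (v) => results.push(v));
finishWork(39, 102334155);

console.log(started); // 1
console.log(results); // [ 102334155, 102334155, 102334155 ]
```

Three requests for the same number start the work only once, and all three callbacks receive the single result.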

I am still going to use the fibonacci_runner.js to run the Fibonacci operation, but I’m going to create a new file, server_method2.js, that has
an asyncBatching function that sits between the fibonacci_runner.js and the call to process the req.headers.fibo.

Here is the code for server_method2.js

const http = require("http");
const {fork} = require('child_process');
const child = fork(`${__dirname}/fibonacci_runner.js`);
// Maps each number being computed to the callbacks waiting for its result.
let Queue = {};

function asyncBatching(num, cb) {
    if (Queue[num]) {
        // A computation for this number is already pending: just wait for it.
        Queue[num].push(cb)
    } else {
        // First request for this number: start a queue and ask the child to compute it.
        Queue[num] = [cb];
        child.send({num: num, event: num})
    }
}

const server = http.createServer(function (req, res) {

    if (req.url === '/fibo') {
        const num = parseInt(req.headers.fibo, 10)
        if (Number.isNaN(num)) {
            res.statusCode = 400;
            return res.end('missing or invalid fibo header');
        }
        asyncBatching(num,(value)=>res.end(`${value}`))
    } else {
        res.end('hello world');
    }
});

child.on("message", (msg) =>{
    "use strict";
    let queue = [...Queue[msg.event]];
    Queue[msg.event] = null;  // empty the queue before running the callbacks
    queue.forEach(cb=>cb(msg.value))
    console.log(`done with ${msg.event}`)
});

server.listen(8000, () => console.log("running on port 8000"));

I will use ApacheBench (ab) to run this test:

$ ab -n 10 -c 10 -H 'fibo: 39' http://localhost:8000/fibo

It takes 3.196 seconds on my machine for method2 and 32.161 seconds for method1. This means method2 responds about n times faster than method1, where n is the number of concurrent users sending the same req.headers.fibo value (10 in this test).
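
As a rough check of that claim, the arithmetic can be written out. This assumes both figures are total times, in the same unit, for the same 10 requests, and that the single child process works through method1’s jobs one after another.

```javascript
const requests = 10;         // from `ab -n 10 -c 10`
const method1Total = 32.161; // measured total for method1
const method2Total = 3.196;  // measured total for method2

// method1 computes the same number 10 times in a row,
// method2 computes it once and shares the result.
const speedup = method1Total / method2Total;
const perComputation = method1Total / requests;

console.log(speedup.toFixed(2));        // 10.06
console.log(perComputation.toFixed(3)); // 3.216
```

The ratio is close to the number of requests, and method1’s time per computation is close to method2’s total, which is what one would expect if batching collapses the 10 identical jobs into one.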

To improve method2 further we can use a cache to save the Fibonacci values, but I am not going to touch caching in this article :(.

What I am going to do here is improve on method2 by increasing the number of child processes. I am going to use a pool that manages the distribution of work among the child processes.

Method 3: Pooling and managing multiple processes

Creating multiple child processes to handle the Fibonacci operation lets several computations run at the same time, so requests for different numbers no longer wait behind each other. Keep in mind that every process uses system resources. Creating too many processes is bad; just create enough.
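
How many is “enough” depends on the machine. One common rule of thumb, which is an assumption here and not something measured in this article, is to base the pool size on the number of CPU cores and leave one core for the server’s own event loop. The helper name suggestedPoolSize is made up for this sketch.

```javascript
const os = require("os");

function suggestedPoolSize() {
    // os.availableParallelism() only exists in newer Node versions,
    // so fall back to counting the entries of os.cpus().
    const cores = typeof os.availableParallelism === "function"
        ? os.availableParallelism()
        : os.cpus().length;
    // Leave one core for the parent process, but never go below one worker.
    return Math.max(1, cores - 1);
}

console.log(`suggested pool size: ${suggestedPoolSize()}`);
```

The returned value could be passed as the number of processes when creating the pool; treat it as a starting point to tune, not a guarantee.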

The Pool is responsible for handling child processes. First, let’s create a Pool file, Pool.js, that exports a Pool class.

Code for Pool.js file:

const child = require('child_process');

class Pool {
    constructor(file, maxPool, messageCb) {
        this.pool = [];     // idle workers
        this.active = [];   // workers currently running a job
        this.waiting = [];  // jobs waiting for a free worker
        this.maxPool = maxPool;

        let releaseWorker = (function (worker) {
            //move the worker back to the pool array
            this.active = this.active.filter(w => worker !== w);
            this.pool.push(worker);
            //if there is work to be done, assign it
            if (this.waiting.length > 0) {
                this.assignWork(this.waiting.shift())
            }
        }).bind(this);

        for (let i = 0; i < maxPool; i++) {
            let worker = child.fork(file);
            // a message from a worker means its job is done
            worker.on("message", (...param) => {
                messageCb(...param);
                releaseWorker(worker)
            });
            this.pool.push(worker)
        }
    }

    assignWork(msg) {
        if (this.pool.length > 0) {
            // an idle worker is available: hand it the work
            let worker = this.pool.pop();
            worker.send(msg);
            this.active.push(worker)
        } else {
            // every worker is busy: queue the work until one is released
            this.waiting.push(msg);
        }
    }

}

module.exports = Pool;

The Pool class

As said before, the Pool is responsible for handling the child processes. It has only one method, the assignWork method. The assignWork method assigns work to a worker (child process) to handle. If all the workers are busy, the work is queued and handed to a worker as soon as one is free.

The Pool object takes three arguments on creation:

  • the file to run as the child process
  • the number of processes to create
  • the function to call when the workers send a message back
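
How the waiting queue behaves when every worker is busy can be sketched without starting real processes. The TinyPool class below is a simplified model written for illustration: workers are just names, and release() stands in for a worker sending its result back.

```javascript
class TinyPool {
    constructor(size) {
        this.idle = [];        // workers with nothing to do
        this.busy = new Map(); // worker name -> job it is running
        this.waiting = [];     // jobs queued until a worker is free
        for (let i = 0; i < size; i++) this.idle.push(`worker-${i}`);
    }

    assignWork(job) {
        if (this.idle.length === 0) {
            this.waiting.push(job); // all workers busy: queue the job
            return;
        }
        const worker = this.idle.pop();
        this.busy.set(worker, job);
    }

    // Called when a worker reports that its job is done.
    release(worker) {
        this.busy.delete(worker);
        this.idle.push(worker);
        if (this.waiting.length > 0) this.assignWork(this.waiting.shift());
    }
}

const pool = new TinyPool(2);
pool.assignWork("fibo 40"); // goes to worker-1
pool.assignWork("fibo 41"); // goes to worker-0
pool.assignWork("fibo 42"); // no idle worker left, so it waits
console.log(pool.waiting);  // [ 'fibo 42' ]

pool.release("worker-1");   // worker-1 finishes and picks up the queued job
console.log(pool.waiting);  // []
console.log(pool.busy.get("worker-1")); // fibo 42
```

With two workers, the third job waits until one of them is released and is then started immediately.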

Now let’s create a server_method3.js file that makes use of the Pool object.

The code for server_method3.js:

const http = require("http");
// Maps each number being computed to the callbacks waiting for its result.
let Queue = {};
const Pool = require("./Pool");

// Two workers; the callback runs whenever a worker sends back a result.
let Pooler = new Pool(`${__dirname}/fibonacci_runner.js`,2, (msg) => {
    "use strict";
    let queue = [...Queue[msg.event]];
    Queue[msg.event] = null;  // empty the queue before running the callbacks
    queue.forEach(cb => cb(msg.value));
    console.log(`done with ${msg.event}`)
});

//responsible for batching
function asyncBatching(num, cb) {
    if (Queue[num]) {
        Queue[num].push(cb)
    } else {
        Queue[num] = [cb];
        Pooler.assignWork({num: num, event: num})
    }
}

const server = http.createServer(function (req, res) {

    if (req.url === '/fibo') {
        const num = parseInt(req.headers.fibo, 10);
        if (Number.isNaN(num)) {
            res.statusCode = 400;
            return res.end('missing or invalid fibo header');
        }
        asyncBatching(num, (value) => res.end(`${value}`))
    } else {
        res.end('hello world');
    }
});


server.listen(8000, () => console.log("running on port 8000"));

server_method3.js runs more than one child process, so we can run multiple Fibonacci operations at the same time, instead of waiting for one to finish before starting the next.

The number of Fibonacci computations we can run at the same time depends on the number passed as the second parameter to the Pool constructor.

Note: limit the number of processes you spawn.

Conclusion

Running heavy tasks on the Node event loop is a bad idea. Remember to pass the task to another process to handle, be it Nodejs or not (you can start a C++ program to handle very heavy operations).
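
Handing work to a program that is not a forked Node script can be sketched with execFile from the core child_process module. In this sketch the “external program” is simply the Node binary running a one-line script, standing in for a compiled C++ executable; the helper name runExternal is made up for illustration.

```javascript
const { execFile } = require("child_process");

// Runs `program` with `args` and resolves with its trimmed standard output.
function runExternal(program, args) {
    return new Promise((resolve, reject) => {
        execFile(program, args, (error, stdout) => {
            if (error) return reject(error);
            resolve(stdout.trim());
        });
    });
}

// Stand-in for a compiled program: a one-line script that prints fibo(10).
const script =
    "const f = (n) => (n < 2 ? 1 : f(n - 2) + f(n - 1)); console.log(f(10));";

const external = runExternal(process.execPath, ["-e", script]);
external.then((output) => console.log(`external program printed ${output}`));
```

The parent only waits for the program’s output asynchronously, so its event loop stays free while the other process does the heavy work.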

Remember to always keep the event loop from getting blocked by any operation.

Read this article for more about the Event Loop.

Badewa Kayode, peace out :).
