So I recently had to add a new feature to an existing app. The new feature did some data-heavy work, such as processing large documents whose content had to be saved into a database.
Naturally, I queued the data from the file, consumed the queue in a forked child process, and saved the data to the database from that child process. To report progress to the client, I decided to use Socket.IO to fire events. This approach presented two problems. First, the processing was fast and the Socket.IO instance didn't capture most of the events. Second, it wasn't clear how to share the same Socket.IO instance between the parent and the child.
The approach I later settled on was to use Redis Pub/Sub: the child processes publish events, and the main process listens for them and forwards them to the client. The approach works, scales well, and performs really well.
I will assume you have an existing Node.js app and that the data has already been queued. We need to install the following:
- Socket.IO
- Redis Node.js client (I use https://www.npmjs.com/package/redis)
Both can be installed using npm.
npm i -S socket.io redis
Though it is out of scope for this article, I wrote a RabbitMQ helper that I use in my apps.
The feature required processing different queues that held different types of information, but they all required the same underlying action: saving into the database. So I wrote a base child process, and each specific child process extends it.
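To make the base/specific split concrete, here is a minimal sketch of what such a base child process could look like. It is not the code from the app: `BaseWorker`, `InvoiceWorker`, the `progress` channel, and the payload shape are all hypothetical names chosen for illustration. The `publisher` is assumed to be a connected node-redis client (v4 or later, where `publish` returns a promise), and the database write is replaced by an in-memory array.

```javascript
// Hypothetical base class: shared "save, then report progress" logic.
class BaseWorker {
  // `publisher` is assumed to be a connected node-redis v4+ client.
  // `channel` is the Pub/Sub channel the main process subscribes to.
  constructor(publisher, channel = 'progress') {
    this.publisher = publisher;
    this.channel = channel;
    this.processed = 0;
  }

  // Each specific worker maps a queue message to a database row.
  transform(message) {
    throw new Error('transform() must be implemented by the subclass');
  }

  // Each specific worker supplies its own database write.
  async save(row) {
    throw new Error('save() must be implemented by the subclass');
  }

  // Shared flow: transform, save, then publish a progress event.
  async handle(message, total) {
    const row = this.transform(message);
    await this.save(row);
    this.processed += 1;
    const payload = {
      worker: this.constructor.name,
      processed: this.processed,
      total,
    };
    // Redis messages are strings, so the payload is serialized as JSON.
    await this.publisher.publish(this.channel, JSON.stringify(payload));
  }
}

// Hypothetical specific worker for one queue type.
class InvoiceWorker extends BaseWorker {
  constructor(publisher, channel) {
    super(publisher, channel);
    this.rows = []; // stand-in for a real database table
  }

  transform(message) {
    return { id: message.id, amount: Number(message.amount) };
  }

  async save(row) {
    this.rows.push(row); // a real worker would run an INSERT here
  }
}
```

With this shape, adding a new queue type only means writing another subclass with its own `transform` and `save`; the progress reporting stays in one place.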
The main (parent) process forks the child processes whenever it starts. Starting a few child processes isn’t very difficult, but with many of them it becomes tedious to track down the path to each one and start them one after the other. So I like to use a glob to find all the child processes.
For that, I will use an npm module called glob.
npm i -S glob
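As a rough sketch of the discovery step, the snippet below finds every script in a directory and forks each one. It assumes glob v9 or later (where `globSync` is a named export; older versions expose `glob.sync` instead) and a hypothetical layout in which all child scripts sit in one directory. The function names are illustrative.

```javascript
const { fork } = require('child_process');

// Return absolute paths of all child scripts in `dir`.
// glob is required lazily so forkAll() below can be used without it.
// Assumes glob v9+, where `globSync` is a named export.
function findChildScripts(dir) {
  const { globSync } = require('glob');
  return globSync('*.js', { cwd: dir, absolute: true });
}

// Fork one process per script and return the resulting handles.
// `forkFn` defaults to child_process.fork and can be swapped out in tests.
function forkAll(scripts, forkFn = fork) {
  return scripts.map((script) => forkFn(script));
}

// Typical use in the main process (hypothetical directory name):
//   const children = forkAll(findChildScripts(`${__dirname}/children`));
```

Dropping a new script into that directory is then enough for it to be picked up the next time the main process starts.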
The code for the main process looks like this.
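A minimal sketch of the relay part of such a main process follows. It is an illustration under assumptions, not the exact code from the app: it assumes node-redis v4 or later (where `subscribe` takes a listener callback) and Socket.IO v3 or later, and the `progress` channel and event names are hypothetical.

```javascript
// Forward every message published on a Redis channel to all Socket.IO clients.
// `subscriber` is assumed to be a dedicated, connected node-redis v4+ client
// (a client in subscriber mode cannot issue other commands).
// `io` is assumed to be a Socket.IO server instance.
function relayProgress(subscriber, io, channel = 'progress', eventName = 'progress') {
  return subscriber.subscribe(channel, (message) => {
    let payload;
    try {
      payload = JSON.parse(message);
    } catch (err) {
      return; // ignore messages that are not valid JSON
    }
    io.emit(eventName, payload); // broadcast to every connected client
  });
}

// Wiring in a real app might look like this (needs `redis` and `socket.io`
// installed; `httpServer` is a hypothetical existing HTTP server):
//
//   const { createClient } = require('redis');
//   const { Server } = require('socket.io');
//   const io = new Server(httpServer);
//   const subscriber = createClient();
//   await subscriber.connect();
//   await relayProgress(subscriber, io);
```

Because only the main process touches Socket.IO, the child processes never need a reference to the socket server; they only publish to Redis.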
And that is it. Please leave your comments and opinions. Enjoy!