DEV Community

Cover image for Ultimate guide to concurrent logging in Node.js
Gajus Kuizinas
Gajus Kuizinas

Posted on

Ultimate guide to concurrent logging in Node.js

Here is a problem that everyone will recognise: logs produced by concurrent tasks are useless without their initiation context.

If you need an example, consider a program that executes some code that produces logs after receiving an event. It could be an incoming HTTP request, a job queue, etc., e.g.

process.on('task', (task) => {
  executeTask(task);
});

const executeTask = async (task) => {
  try {
    // [..]

    console.log('task completed');
  } catch (error) {
    console.log('task failed');
  }
};

Enter fullscreen mode Exit fullscreen mode

As it is, this program will produce logs that look like this:

task completed
task completed
task failed
task completed
task failed
Enter fullscreen mode Exit fullscreen mode

The point is that, if we want to produce meaningful logs, then we have to somehow associate task with each log message.

const executeTask = async (task) => {
  try {
    await download(task.url);

    console.log({task}, 'task completed');
  } catch (error) {
    console.log({task}, 'task failed');
  }
};

Enter fullscreen mode Exit fullscreen mode

The problem is that, in order to achieve this, you have to pass-down task object down to every single function that produces logs. It is easy enough to add task to console.log in our example, but usually functions that produce codes are deeply nested, or they are third-party modules and there is no way to pass additional context to the logger.

const download = (url) => {
  if ([..]) {
    return console.error('invalid url');
  }

  if ([..]) {
    return console.error('unsupported protocol');
  }

  // [..]
};

Enter fullscreen mode Exit fullscreen mode

As there are no conventions for passing-down logger context, you will end up with a log trail that lists random events without providing the necessary context required to associate those events with the asynchronous task that it belongs to.

invalid url
task completed {id:6, url: [..]}
task completed {id:4, url: [..]}
unsupported protocol
task completed {id:5, url: [..]}
task completed {id:3, url: [..]}
task failed {id:2, url: [..]}
task completed
task failed {id:1, url: [..]}
Enter fullscreen mode Exit fullscreen mode

The above log would not be sufficient to identify which task failed for which reason. Luckily, there is a solution.

Using Node.js domains

Using Node.js Domain, we can add context to a chain of asynchronous calls without passing them down explicitly (think of it as a closure or Context in React), e.g.

import domain from 'domain';

process.on('task', (task) => {
  domain
    .create()
    .run(() => {
      process.domain.context = {
        task,
      };

      executeTask(task);
    });
});

const download = (url) => {
  if ([..]) {
    return console.error(process.domain.context, 'invalid url');
  }

  if ([..]) {
    return console.error(process.domain.context, 'unsupported protocol');
  }

  // [..]
};

const executeTask = async (task) => {
  try {
    await download(task.url);

    console.log({task}, 'task completed');
  } catch (error) {
    console.log({task}, 'task failed');
  }
};

Enter fullscreen mode Exit fullscreen mode

This way, each log message is associated with the asynchronous context that initiated the chain of the asynchronous calls.

invalid url {id:1, url: [..]}
task completed {id:6, url: [..]}
task completed {id:4, url: [..]}
unsupported protocol {id:2, url: [..]}
task completed {id:5, url: [..]}
task completed {id:3, url: [..]}
task failed {id:2, url: [..]}
task completed {id:2, url: [..]}
task failed {id:1, url: [..]}
Enter fullscreen mode Exit fullscreen mode

In theory, domains can be even nested, i.e. a domain that is aware of a parent domain could be used to associate a log message with the entire chain of asynchronous calls that led to the log message. Node.js does not provide this functionality out of the box. However, Node.js can be monkey-patched to explicitly bind parent domain to the active domain, e.g.

const domain = require('domain');

const originalCreate = domain.create;

domain.create = (...args) => {
  const parentDomain = process.domain || null;

  const nextDomain = originalCreate(...args);

  nextDomain.parentDomain = parentDomain;

  return nextDomain;
};

Enter fullscreen mode Exit fullscreen mode

After which parentDomain becomes a reference to the parent domain:

const d0 = domain.create();

d0.run(() => {
  const d1 = domain.create();

  d1.run(() => {
    d1.parentDomain === d0;
  });
});

Enter fullscreen mode Exit fullscreen mode

In order to use the logger that I am about to introduce, you will need to patch Node.js using domain-parent.

Domain deprecation

Everyone is pointing out the obvious deprecation notice.

Domains are implemented using async hooks. Domain module has been pending deprecation since 2015-02-28. It has a deprecation status of documentation-only deprecation. However, loads of popular NPM modules and Node.js internals depend on domain module a lot. It will be long time (if ever) before they get Runtime deprecation and then a long time before they are assigned End-of-Life deprecation. Until that time it safe to use domain module.

Roarr

Finally, this all adds up to an introduction to a convention driven, context-aware logger that has been already developed and ready to us: Roarr.

Roarr adopt method creates a domain that efficiently binds user-defined context to all log messages that will produced during the asynchronous call. In order to associate context with your asynchronous code, simply execute your routine using Roarr adopt method, e.g.

import Logger from 'roarr';

process.on('task', (task) => {
  Logger
    .adopt(
      () => {
        return executeTask(task);
      },
      // Here we define the context that we want to associate with
      // all the Roarr loggers that will be called when executing
      // `executeTask` with the current `task` value. 
      {
        task,
      }
    );
});

const download = (url) => {
  if ([..]) {
    return log.error('invalid url');
  }

  if ([..]) {
    return log.error('unsupported protocol');
  }

  // [..]
};

const executeTask = async (task) => {
  try {
    await download(task.url);

    log.info('task completed');
  } catch (error) {
    log.info('task failed');
  }
};

Enter fullscreen mode Exit fullscreen mode

The above would produce log equivalent to:

invalid url {id:1, url: [..]}
task completed {id:6, url: [..]}
task completed {id:4, url: [..]}
unsupported protocol {id:2, url: [..]}
task completed {id:5, url: [..]}
task completed {id:3, url: [..]}
task failed {id:2, url: [..]}
task completed {id:2, url: [..]}
task failed {id:1, url: [..]}

Enter fullscreen mode Exit fullscreen mode

Of course, in order for all logs to include their asynchronous context, all of the dependencies would need to be using Roarr logger (or they would have to read logger context from process.domain.roarr.context). However, unlike the other loggers, Roarr does not have configuration and can be used in distributable packages and top-level programs alike. As a result, there are already thousands of packages using Roarr.

In order for you to start using Roarr, read what makes Roarr the perfect logger from Node.js and have a look at example libraries that implement the logger (Slonik is a good start).

To recap:

  1. Logs without context in a concurrent execution environment are meaningless.
  2. Domains can be used to associated context with asynchronous call chains.
  3. Roarr leverages domains to inherit context that describes the asynchronous call chain that led to the log message. This allows Roarr to produce logs that describe full execution context even when multiple tasks are executed concurrently.

Give Roarr a try. It even has a neat CLI program with a built-in pretty-print and filtering.

Top comments (3)

Collapse
 
victorioberra profile image
Victorio Berra

Glad to see there's already a solution out there in the form of an easy to use npm module. Lots of other languages like dotnet core have similar logging context features.

Collapse
 
cedpoilly profile image
Cedric

Insightful ๐ŸŽŠ

I have a question/suggestion for the longer-term:
Is it possible to refactor & use the Promise API for chaining asynchronous events? - I think it should solve the issue of deeply nested contexts.

@gajus thoughts?

Collapse
 
gajus profile image
Gajus Kuizinas

The problem that Domains solve is inheritance from any asynchronous context, e.g.

Suppose that your application uses Slonik to run PostgreSQL queries, and you have updatePost function that is run concurrently:

const createPool = require('slonik').createPool;
const sql = require('slonik').sql;

const updatePost = async (pool, postId) => {
  await pool.query(sql`SELECT ${postId}`);
};

(async () => {
  const pool = createPool('postgres://');

  await Promise.all([
    updatePost(pool, 1),
    updatePost(pool, 2),
    updatePost(pool, 3),
  ]);
})();

The above program would produce logs:

[2020-02-01T01:43:31.710Z] DEBUG (20) (@slonik): created a new client connection
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77955
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:43:31.710Z] DEBUG (20) (@slonik): client is checked out from the pool
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77955
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:43:31.712Z] DEBUG (20) (@slonik): created a new client connection
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77953
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:43:31.712Z] DEBUG (20) (@slonik): client is checked out from the pool
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77953
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:43:31.712Z] DEBUG (20) (@slonik): created a new client connection
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77954
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:43:31.712Z] DEBUG (20) (@slonik): client is checked out from the pool
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77954
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:43:36.725Z] DEBUG (20) (@slonik): client connection is closed and removed from the client pool
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77955
stats:
  idleConnectionCount:  2
  totalConnectionCount: 2
  waitingRequestCount:  0

[2020-02-01T01:43:36.725Z] DEBUG (20) (@slonik): client connection is closed and removed from the client pool
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77954
stats:
  idleConnectionCount:  1
  totalConnectionCount: 1
  waitingRequestCount:  0

[2020-02-01T01:43:36.726Z] DEBUG (20) (@slonik): client connection is closed and removed from the client pool
poolId:    01DZZ6NN2S2P4TD0CM8YJZKPK3
processId: 77953
stats:
  idleConnectionCount:  0
  totalConnectionCount: 0
  waitingRequestCount:  0

If there was an issue with one of the query executions, the above logs would not be particularly useful, and there is no way to pass additional details to the logger.

Let's add Roarr.adopt:

const createPool = require('./src').createPool;
const sql = require('./src').sql;
const Roarr = require('roarr').default;

const updatePost = async (pool, postId) => {
  await Roarr.adopt(
    async () => {
      await pool.query(sql`SELECT ${postId}`);
    },
    {
      postId,
    },
  );
};

(async () => {
  const pool = createPool('postgres://');

  await Promise.all([
    updatePost(pool, 1),
    updatePost(pool, 2),
    updatePost(pool, 3),
  ]);
})();

The only change we did was wrap updatePost body in Roarr.adopt and now the logs contain meaningful context information (the postId):

[2020-02-01T01:49:41.046Z] DEBUG (20) (@slonik): created a new client connection
postId:    2
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78010
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:49:41.046Z] DEBUG (20) (@slonik): client is checked out from the pool
postId:    2
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78010
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:49:41.048Z] DEBUG (20) (@slonik): created a new client connection
postId:    1
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78011
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:49:41.048Z] DEBUG (20) (@slonik): client is checked out from the pool
postId:    1
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78011
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:49:41.048Z] DEBUG (20) (@slonik): created a new client connection
postId:    3
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78012
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:49:41.049Z] DEBUG (20) (@slonik): client is checked out from the pool
postId:    3
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78012
stats:
  idleConnectionCount:  0
  totalConnectionCount: 3
  waitingRequestCount:  0

[2020-02-01T01:49:46.065Z] DEBUG (20) (@slonik): client connection is closed and removed from the client pool
postId:    2
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78010
stats:
  idleConnectionCount:  2
  totalConnectionCount: 2
  waitingRequestCount:  0

[2020-02-01T01:49:46.067Z] DEBUG (20) (@slonik): client connection is closed and removed from the client pool
postId:    1
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78011
stats:
  idleConnectionCount:  1
  totalConnectionCount: 1
  waitingRequestCount:  0

[2020-02-01T01:49:46.068Z] DEBUG (20) (@slonik): client connection is closed and removed from the client pool
postId:    3
poolId:    01DZZ70XRJDS3W02FR36JZEJK5
processId: 78012
stats:
  idleConnectionCount:  0
  totalConnectionCount: 0
  waitingRequestCount:  0

In a real-world scenario, Roarr.adopt would wrap large parts of your codebase whenever there is additional context information that needs to be added to the logs of descended operations. The added context information will appear with all the logs regardless of however deep in your code those asynchronous calls are made.