If you ever wrote back-end applications in Node.js, you know that tracing the same HTTP request through log entries is a problem. Usually your logs look something like this:
[07/Nov/2018:15:48:11 +0000] User sign-up: starting request validation
[07/Nov/2018:15:48:11 +0000] User sign-up: starting request validation
[07/Nov/2018:15:48:12 +0000] User sign-up: request validation success
[07/Nov/2018:15:48:13 +0000] User sign-up: request validation failed. Reason:
...
Here, log entries are mixed up and there is no way to determine which of them belong to the same request. While you would probably prefer to see something like this:
[07/Nov/2018:15:48:11 +0000] [request-id:550e8400-e29b-41d4-a716-446655440000] User sign-up: starting request validation
[07/Nov/2018:15:48:11 +0000] [request-id:340b4357-c11d-31d4-b439-329584783999] User sign-up: starting request validation
[07/Nov/2018:15:48:12 +0000] [request-id:550e8400-e29b-41d4-a716-446655440000] User sign-up: request validation success
[07/Nov/2018:15:48:13 +0000] [request-id:340b4357-c11d-31d4-b439-329584783999] User sign-up: request validation failed. Reason:
...
Notice the [request-id:*] part that contains request identifiers here. These identifiers will allow you to filter log entries that belong to the same request. Moreover, if your application is composed of microservices communicating with each other over HTTP, request identifiers may be sent in an HTTP header and used for tracing request chains across logs generated by all microservices on the chain. It’s difficult to overestimate how valuable this feature is for diagnostic and monitoring purposes.
As a developer, you would probably want to configure your web framework and/or logger library in a single place and get request ids generated and printed to logs automatically. But unfortunately, it may be a problem in Node.js world.
In this article we are going to discuss the problem and one of possible solutions.
OK, but is it Even a Problem?
In many other languages and platforms, like JVM and Java Servlet containers, where HTTP servers are built around a multi-threaded architecture and blocking I/O, it’s not a problem. If we put an identifier of the thread that processes HTTP request into logs, it already may serve as a natural filter parameter for tracing a particular request. This solution is far from being ideal, but it can be enhanced further with using Thread-local storage (TLS). TLS is basically a way to store and retrieve key-value pairs in a context associated with the current thread. In our case, it may be used to store ids (and any other diagnostic data) generated for each new request. Many logging libraries have features built around TLS. As an example, check out docs for SLF4J’s Mapped Diagnostic Context.
Due to the asynchronous nature of Node.js, which is based on the event loop, there is simply nothing like the thread-local storage, as your JS code is being processed on a single thread. Without this or similar API, you would have to drag a context object, containing the request id, all across your request handling calls.
Let’s see how it might look like in a simple Express application:
In this imaginary app we had to pass the req object into the fakeDbAccess function, so that we would be able to output the request id to logs. Think of how redundant and error-prone this approach would be in real-world apps that usually have a way more routes and modules.
Luckily, folks from the Node.js community were thinking of alternatives to TLS a long ago. And the most popular of those alternatives is called Continuation-local storage (CLS).
CLS to the Rescue!
The very first implementation of CLS is continuation-local-storage library. It has the following definition of CLS:
Continuation-local storage works like thread-local storage in threaded programming, but is based on chains of Node-style callbacks instead of threads.
If you check the API of the library, you may find it a bit more complicated than, for example, Java’s TLS. But at the core, it’s quite the same. It allows you to associate a context object with a chain of calls and obtain it later.
The initial library was based on using process.addAsyncListener API, which was available until Node.js v0.12, and its polyfill for node v0.12+. The polyfill does a lot of monkey patching which is aimed on wrapping built-in node APIs. That’s the main reason why you should not consider using the original library with modern versions of Node.js.
Fortunately, there is a subsequent version of the CLS library (or a fork to be more precise), which is called cls-hooked. In Node.js >= 8.2.1 it’s using async_hooks, a node’s built-in API. And even though the API is still experimental, this option is much better than the one with a polyfill. If you want to know more about Async Hooks API, check out this post.
Now, when we have the right tool, we know how to approach our initial problem, which is tracing request ids in app logs.
How About Out-of-the-box solutions for my Belowed Express/Koa/another-web-framework?
As you already know, if you want to have request ids in your Node.js app, you may use cls-hooked and integrate it with the web framework that you are using. But probably you would want to use a library that would be doing this stuff.
Recently I was in search of such library and failed to find a good match for the task. Yes, there are several libraries that integrate web frameworks, e.g. Express, with CLS. On the other hand, there are libraries that provide request id generation middlewares. But I did not find a library that would combine both CLS and request ids in order to solve the request id tracing problem.
So, ladies and gentlemen, please meet cls-rtracer, a small library that solves a not-so-small problem. It provides middlewares for Express and Koa that implement CLS-based request id generation and allows to obtain request ids anywhere on your call chain.
The integration with cls-rtracer basically requires two steps. The first one — attach the middleware in the appropriate place. The second — configure your logging library.
That’s how it might look like in case of an Express-based app:
When being run, this app generates a console output similar to the following:
2018-12-06T10:49:41.564Z: The app is listening on 3000
2018-12-06T10:49:49.018Z [request-id:f2fe1a9e-f107-4271-9e7a-e163f87cb2a5]: Starting request handling
2018-12-06T10:49:49.020Z [request-id:f2fe1a9e-f107-4271-9e7a-e163f87cb2a5]: Logs from fakeDbAccess
2018-12-06T10:49:53.773Z [request-id:cd3a33a9-32cb-453b-a0f0-e36c65ff411e]: Starting request handling
2018-12-06T10:49:53.774Z [request-id:cd3a33a9-32cb-453b-a0f0-e36c65ff411e]: Logs from fakeDbAccess
2018-12-06T10:49:54.908Z [request-id:8b352536-d714-4838-a372-a8e2cfcb4f53]: Starting request handling
2018-12-06T10:49:54.910Z [request-id:8b352536-d714-4838-a372-a8e2cfcb4f53]: Logs from fakeDbAccess
Notice that the integration itself includes attaching Express middleware generated by rTracer.expressMiddleware() function call and obtaining request id with rTracer.id() call. So, it’s as simple as it can be. You may also find an example for Koa apps here.
Hopefully, cls-rtracer will help you to solve the request id tracing problem and make diagnosing your Node.js apps a bit easier.
P.S. I must notice that there is a certain performance impact of using Async Hooks. However, it should not be critical for most applications. Check out this comment and the follow-up (thanks Marek Kajda!).
P.P.S. Feel free to request support for additional web frameworks and report found issues.
Top comments (0)