Ayman Osman

Posted on Apr 17, 2022

Structured (JSON) logs in Elixir

#elixir #logging

The desire to produce structured logs is common. In this article, I will survey the major approaches one could take to achieve this goal using the built-in Elixir Logger.

As an example, I will use JSON as the "structured" format, as it seems to be the most common. The same points apply to any structured format you choose.

First, a quick recap of the important pieces of the Elixir Logger.

Log events are produced via the API functions such as Logger.info and Logger.warn. A log event is a "message", which is usually a string, paired with "metadata", which is a list of key-value pairs.

> Logger.info("200 OK", request_id: 42)

Log events are consumed by "backends".

import Config

config :logger,
  backends: [:console, OtherBackend]

Each backend is independent, making its own decisions about which log events to filter out and how to format them.

Finally, there is the configuration for each backend, with format and metadata being the only relevant ones for this article.

import Config

config :logger, :console,
  format: # describes the output format
  level: # sets the minimum log level to output
  metadata: # what metadata keys (if any) will be included in the output

Let's quickly see what kind of behaviour we get out-of-the-box.

With the following configuration:

import Config

config :logger, :console,
  metadata: [:request_id]

this is the kind of output you can expect to see:

> Logger.info("200 OK", request_id: 42)

12:00 request_id=42 [info] 200 OK

The output is determined by the format configuration parameter, which defaults to $time $metadata[$level] $message.

You may be wondering what happens if we provide a map instead of a string for the message argument. Here is what it looks like:

> Logger.info(%{status: 200, request_id: 42})

12:00 [info]  [request_id: 42, status: 200]

I'll explain what is going on here at the end of the article, but, for now, I will ignore this because it won't turn out to be usable.

Now let's look at the different approaches to structured logging.

Serializing into `$message` (the non-approach)

One approach that may occur to you is to forget about how the logging framework works and serialize messages into JSON yourself, like so:

> Logger.info(encode(%{status: 200, request_id: 42}))

12:00 [info] {"status": 200,"request_id":42}

Almost there, now you just need to get rid of the extra stuff in the way of your JSON formatted log.

import Config

config :logger, :console
  format: "$message\n"

> Logger.info(encode(%{status: 200, request_id: 42}))

{"status": 200,"request_id":42}

There we go. We now have JSON logs. But we lost the timestamp and log level. So we add them back.

> Logger.info(encode(%{timestamp: "12:00", level: :info, status: 200, request_id: 42}))

{"timestamp": "12:00","level": "info","status": 200,"request_id":42}

And it seems we have accomplished our task.

Obviously, I think this is a bad approach. Here are some of the reasons why:

You have to remember to include "level" and "timestamp" fields every time you produce a log event.

Something the logger is designed to do for you. If you think that you can solve that problem by wrapping the Elixir logger, then you should know that that is also a bad idea.

But most importantly,

it does not play well with third-party libraries.

Any other library that produces log events in a sane way,
i.e. by calling Logger.info("some message", ... metadata ...) will not participate in this scheme. So you will end up with a mixture of JSON formatted logs produced by your application and raw strings produced by other applications.

{"level":"info","timestamp":"12:00","message":"did something"}
Raw log from third-party library
{"level":"info","timestamp":"12:01","message":"did something else"}

Custom formatter - the most straightforward approach

In Elixir's logging framework there is a somewhat informal notion of a "custom formatter", this is a configuration parameter that lets the user of a logging backend customize the way log events are serialized. By convention, this piece of configuration is specified by the format key.

The simplest form of this parameter is a string.

import Config

config :logger, :console,
  format: "[$level] $message"

Which lets you specify simple formats, but this parameter can also be a {module, function} tuple which lets you take complete control over formatting.

import Config

config :logger, :console
  format: {MyApp.Logger.Formatter, :format}

Once you do that, it is just a matter of implementing this module and function.

defmodule MyApp.Logger.Formatter do
  def format(level, message, timestamp, metadata) do
    # encode the log event
  end
end

Now we have complete control. We also have to provide the metadata keys we wish to receive, as usual.

import Config

config :logger, :console
  format: {MyApp.Logger.Formatter, :format},
  metadata: [:request_id]

defmodule MyApp.Logger.Formatter do
  def format(level, message, timestamp, metadata) do
    log_event =
      to_json(
        %{
          timestamp: format_timestamp(timestamp),
          level: level,
          message: :erlang.iolist_to_binary(message)
        },
        metadata
      )

    encode(log_event)
  rescue
    _ -> "could not format: #{inspect({level, message, metadata})}"
  end

  defp to_json(...) # not every Erlang term is valid json, convert here
  defp format_timestamp(...)
end

We get the output we expect.

> Logger.info("nice stuff", request_id: 42)

{"timestamp":"12:00","level":"info","message":"nice stuff","request_id":42}

And with this, all third-party log events will also be formatted properly.

This approach will work for any backend which follows the convention of exposing a format configuration parameter.

A new backend - is overkill

The third and final approach that we will consider is creating a new backend. As I will show, it is no better than the previous approach. First, let's see a simplified implementation of a backend based on the built-in :console backend.

We tell the :logger application to send logs to our new backend.

import Config

config :logger,
  backends: [MyApp.Logger.Backends.Console]

And implement it as follows:

defmodule MyApp.Logger.Backends.Console do
  @behaviour :gen_event

  def init(_) do
    level = :info
    format = Logger.Formatter.compile("$time $metadata[$level] $message\n")
    metadata = [:request_id]
    device = Process.whereis(:user)

    {:ok, %{level: level, format: format, metadata: metadata, device: device}}
  end

  def handle_event({level, _group_leader, {Logger, message, timestamp, metadata}}, state) do
    %{level: min_level, format: format, metadata: keys, device: device} = state

    cond do
      # Filter out logs that are lower than the configured level
      Logger.compare_levels(level, min_level) == :lt ->
        {:ok, state}

      true ->
        :ok =
          :io.put_chars(
            device,
            Logger.Formatter.format(
              format,
              level,
              message,
              timestamp,
              Keyword.take(metadata, keys)
            )
          )

        {:ok, state}
    end
  end
end

Let's go over that in detail.

A backend is a gen_event, so we declare that we implement that behaviour.

We implement the callback init/1, which returns {:ok, state}, just like a GenServer. Normally, we would fetch configuration for the backend here and do any setup work which is required for that particular backend. For example, a file based backend may open a file and store the file handle. In this example, I locate the :user process and store its pid. The :user process is a special device which is responsible for output to the standard I/O streams, i.e. the console. I also "compile" the format template for later use.

Finally, we implement the handle_event/2 callback, which is the heart of the backend. It receives a log event, drops it if it is lower than the configured minimum log level, otherwise, it formats and sends it to the output device. You can see that this particular backend delegates formatting completely to another module: Logger.Formatter.

The Logger.Formatter module, given a format string, works in the following way:

> fmt = Logger.Formatter.compile("[$level] $message\n")
["[", :level, "] ", :message, "\n"]

> output = Logger.Formatter.format(fmt, :info, "200 OK", :no_timestamp, :no_metadata)
["[", "info", "] ", "200 OK", "\n"]

> :io.put_chars(:user, output)
[info] 200 OK
:ok

but when given a {module, function} tuple, it will simply delegate to that module.

> Logger.Formatter.compile({MyMod, :my_fun})
{MyMod, :my_fun}

> Logger.Formatter.format({MyMod, :my_fun}, level, message, timestamp, metadata)
# Just calls MyMod.my_fun(level, message, timestamp, metadata)

Seeing what a backend consists of you should be able to see that formatting happens entirely in Logger.Formatter.format/5. Which gets us back to where we were in the second approach.

Which is why I conclude:

Creating a new backend just to format log events is overkill.

Because, as the above code shows, it is no more flexible than a formatting module.

So when should you create a new backend? I'll tentatively propose a rule and my reasoning for it.

A logger backend is tasked with receiving log events and sending them to an output device. Only the backend is responsible for sending output to the device, therefore this suggests it has the following responsibilities:

managing the I/O device
and efficient use of the I/O device (i.e. buffering output)

In particular, the backend is not responsible for formatting output, that is left up to the Logger.Formatter module (or a custom formatter).

Which leads to the following rule:

Only create a new logger backend when you need to manage a new type of I/O device.

Each device may have particular needs in terms of setup and efficient use, and that can be encapsulated by the backend.

Examples of I/O devices that may warrant their own backends:

the standard I/O streams, i.e. the console
a file
a set of files with log rotation
a socket (for remote logging)

That's all I have to say about the different approaches to structured logging using the built-in Elixir logger.

Now I'd like to look at a potential future approach.

Reports - a future approach?

Earlier in the article we saw that the message argument to the Logger API functions need not be a string.

> Logger.info(%{status: 200, request_id: 42})

12:00 [info]  [request_id: 42, status: 200]

This may look promising until you realise that you have no control over how this message is formatted.[^1]

defmodule CustomFormatter do
  def format(level, message, timestamp, metadata) do
   # "message" is already formated into "[request_id: 42, status: 200]"
  end
end

So what is going on?

To understand this we need to look at how Elixir's logger is implemented on top of the Erlang logger.

The Erlang logger has its own design which differs in significant ways from the Elixir logger. To keep this article short, I will not explain the Erlang logger in detail, but poke around its API enough to give you an idea of the pieces.

The Erlang logger is organised into "handlers". When Elixir starts up, it registers the Logger module as a handler (replacing the default Erlang handler).

> :logger.get_handler_ids
[Logger]

You can take a look at the configuration:

> :logger.get_handler_config Logger
{:ok,
 %{
   ... elided ...,
   id: Logger,
   module: Logger.Handler
 }}

A "handler" configures a callback module to implement its behaviour, which is the Logger.Handler module, in this case. The callback that handles a log event is log/2 and what it receives is a little different to what a "backend" receives.

An Erlang "log event" has the following type:

@type log_event :: %{level: level(),
                     msg: {:report, map()}
                        | {:string, :unicode.chardata()},
                     meta: metadata()}

For example, a log event could be:

%{level: :info,
  msg: {:string, "200 OK"},
  meta: %{request_id: 42}}

%{level: :info,
  msg: {:report, %{status: 200, request_id: 42}},
  meta: %{}}

The name Erlang gives a structured log message is a "report". A report is just a map. The line of code responsible for formatting this report is here.

It turns into the following call:

> Kernel.inspect(Map.to_list(%{status: 200, request_id: 42}))
"[request_id: 42, status: 200]"

The Elixir logger's handler implementation will always render a report to a string before passing it to a backend. It is this fact that makes reports unusable for structured logging today.

I assume this was done for backwards-compatibility, after all, the Elixir Logger predates the Erlang logger.

One could imagine a change to the Elixir logger to allow backends to receive Erlang-style log events and bring the two systems closer together.

[^1] There is the report_cb callback, but it does not help.