DEV Community


Discussion on: Ratlog.js – JavaScript Application Logging for Rats, Humans and Machines

Jilles van Gurp

A few questions:

  • How does this interact with centralized log aggregators? If you do microservices/cloud deployments without this, you are doing it wrong and you are running blind as a bat.
  • Does it have something like a Mapped Diagnostic Context? Look it up; this stuff is awesome, and you'll wonder why you never used it before. E.g. all our application logs could include fields that indicate request method, path, implementation method, user id (if applicable), host, user agent, etc., meaning we can slice and dice our centralized logs across microservices using these fields. If I get a user complaining about some weird error, I can dig out exactly what they did and when from millions of log messages across many microservices and applications.
  • How does this play with stuff that has its own notion of logging? If you can't control how third-party stuff logs, your logs are going to be a mess.

In the Java world there are many logging frameworks, but they tend to play nice with each other by abstracting and encapsulating each other's APIs. E.g. I can use logback to send logs to logstash/Elasticsearch in JSON format (using one of several formatter plugins), with the fields from my Mapped Diagnostic Context added. Meanwhile, parts of my application logging via log4j, commons-logging, or Java's built-in logging get picked up by slf4j and redirected to logback, and their various MDC implementations play nice with each other. I can do something similar with log4j 2.0 and several other frameworks. All this means is that I have a few jar files on the classpath, and loggers declared using any of the above frameworks end up being picked up by logback when it writes the logs. We have production and dev configs for this, so we get human-readable output when running tests and output optimized for logstash when running in production.
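For readers outside the Java world, the bridging idea above can be sketched in JavaScript. This is an illustrative toy, not a real library: `makeBackend`, `makeAppLogger` and `bridgeThirdParty` are all hypothetical names.

```javascript
// One backend that every log source ultimately feeds into.
function makeBackend (out) {
  return entry => out(JSON.stringify(entry))
}

// Our own application-facing logging API.
function makeAppLogger (backend) {
  return (message, fields = {}) =>
    backend({ source: 'app', message, ...fields })
}

// Adapter for a hypothetical third-party logger that only knows how
// to hand log lines to a callback -- the slf4j-style bridge.
function bridgeThirdParty (backend, registerHandler) {
  registerHandler((level, text) =>
    backend({ source: 'third-party', level, message: text }))
}

const lines = []
const backend = makeBackend(line => lines.push(line))

const log = makeAppLogger(backend)
log('request handled', { path: '/users' })

// Simulate a third-party library emitting a log line.
bridgeThirdParty(backend, handler => handler('warn', 'connection slow'))

console.log(lines.join('\n'))
```

Everything ends up in one stream with a common shape, regardless of which API produced it, which is the property the jar-file setup above achieves.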

jorin Author • Edited

Hi @jillesvangurp, great questions!

  • As you can probably tell, the main use case of Ratlog is to have a format which is directly consumable by humans, so it might not be the best fit for a cloud-based setup.
    If you don't look at the logs directly, you can also output JSON and query it through some central service such as Elasticsearch.
    But if you'd like to have both, it is entirely possible to forward Ratlog logs to a central system and process them further. The format is not only human-readable but also easy to parse by machines.
    I already tried to mention this here, but I think I need to elaborate on it to be clearer about the goal of Ratlog and about which problems it might not be the best solution for.

  • Ratlog has more structure than plain-text output, but still very little compared to most data formats. Using Ratlog's fields you could store information coming from a Mapped Diagnostic Context, but Ratlog doesn't have any opinions about what to collect by itself. You can implement that in libraries built on top of it. I think this is definitely an important part of logging, and I'm thinking about creating some sort of common style guide for naming tags and fields and which ones to collect for which sort of service (see #6).

  • In the systems I get to work with, I am mostly lucky enough that we can control which libraries to use and decide where their output goes. As I tried to mention here, libraries should not write logs themselves but leave all output to the application logic.
    When working with existing systems or with external tools, it gets very difficult to enforce any kind of common log format. I wouldn't expect sshd, Nginx, Postgres and my own application code to output logs in the same format. Personally I don't know much about the Java ecosystem; maybe the tooling there already agrees on common logging solutions - which is great. But then, as explained in the first point above, this is not the problem scope Ratlog is trying to solve.

  • The way you explain the state of things in the Java ecosystem sounds pretty sophisticated, and it seems like logging is mostly a solved problem there. In other environments like Node.js it is still the Wild West and tools don't work together nicely.
    The Ratlog.js library is a very simple logging solution for situations where all you want is simple logs that are easy to understand.
    However, the Ratlog format itself is independent from the JavaScript library. Since, as you mention, logback already gives you human-readable output, Ratlog might not be all that useful in that scenario, but you can see it as an alternative output format for that functionality, one intended to be read by humans.
    I guess in Java it wouldn't make sense to implement a Ratlog library from scratch; instead one could configure, for example, logback to output existing logs in the Ratlog format.
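The MDC-style context from the second point can be layered on top of any fields-based logger. A minimal sketch, assuming only that the underlying logger takes `(message, fields)` arguments; `withContext` is a hypothetical helper, not part of the Ratlog.js API:

```javascript
// Derive a child logger that automatically merges context fields
// (request method, path, user id, ...) into every log call.
function withContext (baseLog, context) {
  return (message, fields = {}) =>
    baseLog(message, { ...context, ...fields })
}

// A plain fields-based logger, just for illustration.
const baseLog = (message, fields) =>
  console.log(message, JSON.stringify(fields))

// Per-request logger carrying MDC-style context.
const reqLog = withContext(baseLog, {
  method: 'GET',
  path: '/users/42',
  userId: '42'
})

reqLog('request started')
reqLog('query executed', { durationMs: 12 })
```

Every line logged through `reqLog` carries the request context without the call sites having to repeat it, which is the property that makes slicing centralized logs by field possible.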

Thanks for all the input!

Jilles van Gurp

Great. I wrote the original reply from the point of view of taking care of centralized logging in a polyglot environment (including Node.js) and being quite depressed by how people seem to be stuck reinventing logging the wrong way.

The way you need to think is in terms of many independent components that each generate logs that need to be captured somehow. You use tools like Elastic Beats, Logstash, syslog, etc. to collect logs from different sources and send them to remote systems where they get aggregated, analyzed and stored. Each application typically runs in a Docker container, and Docker comes with its own ecosystem of logging tools (some of it quite crappy, beware). The bare minimum is simply capturing stdout and pumping it to some remote aggregator, annotated with some metadata about where it came from. Logstash, for example, comes with a lot of functionality to help you make sense of all the crap you get from legacy systems doing whatever on stdout (e.g. nginx access logs, MySQL logs, systemd, syslog) and pick that apart. A lot is lost in translation this way. The best approach is to not lose anything and output in a format that your remote aggregator understands, which is typically JSON or some well-defined line-based format like syslog.

What you built is basically a custom log formatter. What happened in the Java world is that the logging APIs you use in your code were separated from log output generation pretty early on. That's a good idea that needs adopting.

So, it is easy to switch any Java application to whatever output format you need, and people do that all the time. Most other languages seem not to do that, so you end up with basically some hardcoded syntactic sugar around puts in Ruby or console.log in JavaScript, and log output that cannot be customized and lacks usable context. The best you can hope for is doing some application-specific parsing of the output to grab things like timestamps and log levels. Garbage in, garbage out, basically.

So, if you want to facilitate your stuff being used in a cloud based environment, you want some option to output json instead of human readable stuff and be able to switch between both. It's OK to output that to stdout or stderr. In a dockerized environment that is both acceptable and easy to deal with. You get the best of both worlds.
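The switching described here can be sketched in a few lines. This is a toy, not Ratlog.js code: the formatter names are invented, and the human-readable format only loosely imitates tagged, field-based log lines (check the Ratlog spec for the real grammar).

```javascript
// Human-readable formatter: tags in brackets, fields appended.
const formatHuman = entry =>
  `[${(entry.tags || []).join('|')}] ${entry.message}` +
  Object.entries(entry.fields || {})
    .map(([k, v]) => ` | ${k}: ${v}`)
    .join('')

// Machine-readable formatter for a log aggregator.
const formatJson = entry => JSON.stringify(entry)

// Switch formats via the environment, as suggested above.
const format = process.env.NODE_ENV === 'production' ? formatJson : formatHuman

// Either way, the output just goes to stdout.
const log = entry => process.stdout.write(format(entry) + '\n')

log({ tags: ['http'], message: 'request finished', fields: { status: 200 } })
```

In development you read the lines directly; in production the same stdout stream is captured by the container runtime and shipped to the aggregator as JSON.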

jorin Author

I guess I need to be more clear when describing the reasoning behind and the goal of Ratlog.
The setup you are describing is definitely the right choice for cloud-scale infrastructure, and there JSON output is a good choice for further processing and aggregation. That's why winston and bunyan do things the way they do. They are definitely a better choice for logging in a big system with centralized logging.
They also provide support for other formatting options than JSON so one could implement a pretty-printed version - maybe even using Ratlog.

Ratlog is a custom format (mostly) for being consumed directly by humans. It doesn't need processing to be readable. Not all software is cloud-scale. I saw the need for Ratlog because we used these complex logging tools and in the end we were reading JSON manually. We don't want to have to set up a centralized logging system when running a simple system on a single machine.

The other issue Ratlog solves for us is giving us the right context for logs using tags and fields - kinda like what you described with Mapped Diagnostic Context.

You are definitely right that this is still missing in the JS ecosystem. It would be nice if logging libraries in cloud-based scenarios also gave you more tools to add context to logs in a standard way.
That's also an interesting problem, but not one Ratlog is trying to solve.

Jilles van Gurp

The point is that Ratlog is about output, not about the log API itself. These should be independent, so why not separate them, like other frameworks do? Then you'd have one way to log and multiple ways to target either human-readable or machine-readable output. That would solve a real problem in the Node.js world, where most logging frameworks are naive and simplistic. I'd love to get better logging from Node.js.

jorin Author

Just stopping by to let you know I appreciate the feedback you gave!

While I'd like to keep Ratlog.js as simple as possible for the simple use case, I realize the value of Ratlog's logging semantics separate from its output format.
I extended the API in a way that allows using those semantics independently of the output format and that should make it easy to use Ratlog on top of other logging frameworks, libraries and services:

const ratlog = require('ratlog')

// Pass a custom handler to take over output yourself -- here each
// entry is written to stdout as JSON instead of the Ratlog format.
const log = ratlog.logger(entry => {
  process.stdout.write(JSON.stringify(entry) + '\n')
})

log('log')

const debug = log.tag('debug')

debug('debugging only')

You can find more in the API docs.