Sematext

Linux Logging Tutorial: What Are Linux Logs, How to View, Search and Centralize Them

By Radu Gheorghe. Originally published at sematext.com.

TL;DR note: if you want the bzip2 -9 version of this post, scroll down to the very last section for some quick pointers. If you want to learn a bit about Linux system logs, please continue, as we'll talk about all these and more:

  • What are Linux logs and who generates them
  • Important types of Linux logs and their typical location
  • How to read and search logs, whether they're written by journald or syslog
  • How to centralize logs of many servers in one location. Spoiler alert: the easiest way is to send all system logs to Sematext Cloud in three commands, so you can build actionable dashboards.

Short Recap: What Are Linux Logs?

Linux logs are timestamped records of what the server, kernel, services, and applications running on it are doing. They often carry other structured data, such as a hostname, which makes them a valuable analysis and troubleshooting tool for admins facing performance issues. You can read more about logs and why you should monitor them in our complete guide to log management. Here's an example of an SSH log from the /var/log/auth.log file:

May 5 08:57:27 ubuntu-bionic sshd[5544]: pam_unix(sshd:session): session opened for user vagrant by (uid=0)

Notice how the log contains a few fields, like the timestamp, the hostname, the process writing the log and its PID, before the message itself. In Linux, logs come from different sources, mainly:

  • Systemd journal. Most Linux distros have systemd to manage services (like SSH above). Systemd catches the output of these services (i.e., logs like the one above) and writes them to the journal. The journal is written in a binary format, so you'll use journalctl to explore it, like:
    $ journalctl
    ...
    May 05 08:57:27 ubuntu-bionic sshd[5544]: pam_unix(sshd:session): session opened for user vagrant by (uid=0)
    ...
  • Syslog. When there's no systemd, processes like SSH can write to a UNIX socket (e.g., /dev/log) in the syslog message format. A syslog daemon (e.g., rsyslog) then picks the message, parses it and writes it to various destinations. By default, it writes to files in /var/log, which is how we got the earlier message from /var/log/auth.log.
  • The Linux kernel writes its own logs to a ring buffer. Systemd or the syslog daemon can read logs from this buffer, then write to the journal or flat files (typically /var/log/kern.log). You can also see kernel logs directly via dmesg:
    $ dmesg -T
    ...
    [Tue May 5 08:41:31 2020] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    ...
  • Audit logs. These are a special case of kernel messages designed for auditing actions such as file access. You'd typically have a service to listen for such security logs, like auditd. By default, auditd writes audit messages to /var/log/audit/audit.log
  • Application logs. Non-system applications tend to write to /var/log as well. Here are some popular examples:
    • Apache HTTPD logs are typically written to /var/log/httpd or /var/log/apache2. HTTP access logs would be in /var/log/httpd/access_log or /var/log/apache2/access.log
    • MySQL logs typically go to /var/log/mysql.log or /var/log/mysqld.log
    • Older Linux versions would record boot logs via bootlogd to /var/log/boot or /var/log/boot.log. Systemd now takes care of this: you can view boot-related logs via journalctl -b. Distros without systemd have a syslog daemon reading from the kernel ring buffer, which normally has all the boot messages. So you can find your boot/reboot logs in /var/log/messages or /var/log/syslog
    • Last but not least, you may have your own apps using a logging library to write to a specific file

These sources can interact with each other: journald can forward all its messages to syslog. Applications can write to syslog or the journal. It's Linux, where everything is configurable. But for now, we'll focus on the defaults: where can you typically find different types of logs in most modern distributions?
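To make the fields of the SSH example above concrete, here is a minimal sketch that splits that line with sed. It assumes the traditional "Mon D HH:MM:SS host program[pid]: message" layout. SYSLOG_RE and syslog_field are names made up for this illustration, and lines in other formats (no PID, ISO timestamps) simply won't match.

```shell
#!/bin/sh
# Hypothetical helper for illustration: extract one field from a classic syslog line.
# Capture groups: 1=timestamp 2=host 3=program 4=pid 5=message
SYSLOG_RE='^([A-Z][a-z]{2} +[0-9]+ [0-9:]{8}) ([^ ]+) ([^ []+)\[([0-9]+)\]: (.*)$'

# usage: syslog_field <group-number> <line>
syslog_field() {
    # sed prints only the requested capture group, and only if the line matches
    printf '%s\n' "$2" | sed -n -E "s/$SYSLOG_RE/\\$1/p"
}

line='May 5 08:57:27 ubuntu-bionic sshd[5544]: pam_unix(sshd:session): session opened for user vagrant by (uid=0)'
echo "timestamp: $(syslog_field 1 "$line")"
echo "host:      $(syslog_field 2 "$line")"
echo "program:   $(syslog_field 3 "$line")"
echo "pid:       $(syslog_field 4 "$line")"
echo "message:   $(syslog_field 5 "$line")"
```

A syslog daemon does this kind of parsing for you, which is one reason to let it handle the logs rather than re-parsing flat files later.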

Log Files Location: Where Are They Stored?

Typically, you'll find Linux server logs in the /var/log directory. This is where syslog daemons are normally configured to write. It's also where most applications (e.g., Apache HTTPD) write by default. For the systemd journal, the default persistent location is /var/log/journal (if that directory doesn't exist, journald keeps the journal in /run/log/journal, which doesn't survive a reboot). Either way, you can't view the files directly because they're binary. So how do you view them?

How to Check Linux Logs

If your Linux distro uses Systemd (and most modern distros do), then all your system logs are in the journal. You can view them with journalctl, and you can find the most important journalctl commands here. If your distribution writes to local files via syslog, you can view them with standard text processing tools, such as cat, less or grep:

# grep "error" /var/log/syslog | tail
Mar 31 09:48:02 ubuntu-bionic rsyslogd: unexpected GnuTLS error -53 - this could be caused by a broken connection. GnuTLS reports: Error in the push function. [v8.2002.0 try https://www.rsyslog.com/e/2078 ]
...
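Keep in mind that grep is case-sensitive by default, so the command above would miss "Error" or "ERROR". As a rough sketch, the function below counts case-insensitive matches per program. errors_by_program is a made-up name, and it assumes the classic syslog layout where the fifth whitespace-separated field is "program[pid]:" or "program:".

```shell
#!/bin/sh
# Hypothetical helper: count lines mentioning "error" (any case) per program.
# usage: errors_by_program <syslog-style-file>
errors_by_program() {
    grep -i 'error' "$1" \
        | awk '{
                  prog = $5
                  sub(/\[[0-9]+\]:$|:$/, "", prog)   # strip "[pid]:" or ":"
                  count[prog]++
              }
              END { for (p in count) print count[p], p }' \
        | sort -k1,1nr -k2,2                          # busiest program first
}

# Example (needs read access to the file):
#   errors_by_program /var/log/syslog
```

This is only a quick triage aid: a message that merely contains the word "error" in a path or URL is counted too.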

If you're using auditd to manage audit logs, you can check them in /var/log/audit/audit.log by default, but you can also search them with ausearch. That said, you're better off shipping these security logs to a central location, especially if you have multiple servers. For this task, a tool like Auditbeat might work better than auditd. We wrote a separate tutorial on centralizing audit logs with Auditbeat, but in the next section we'll focus on centralizing Linux system logs in general.

Centralizing Linux Logs

System logs can be in two places: systemd's journal or plain text files written by a syslog daemon. Some distributions (e.g., Ubuntu) have both: journald is set up to forward to syslog. This is done by setting ForwardToSyslog=Yes in journald.conf.
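If you're not sure whether your journal forwards to syslog, a quick check like the sketch below can help. forward_to_syslog_setting is a hypothetical helper name. It only reads the one file you point it at, so drop-in files (for example under /etc/systemd/journald.conf.d/) and the compiled-in default, which differs between distributions, are outside its view.

```shell
#!/bin/sh
# Hypothetical helper: print the last uncommented ForwardToSyslog value
# found in a journald.conf-style file. Prints nothing if the option is
# absent or commented out, in which case journald uses its built-in default.
# usage: forward_to_syslog_setting <path-to-journald.conf>
forward_to_syslog_setting() {
    sed -n -E 's/^[[:space:]]*ForwardToSyslog[[:space:]]*=[[:space:]]*([^[:space:]]+).*$/\1/p' "$1" \
        | tail -n 1
}

# Example:
#   forward_to_syslog_setting /etc/systemd/journald.conf
```

An empty result doesn't mean forwarding is off. It only means this file doesn't set the option explicitly.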

Centralizing Logs via Journald

Our recommendation is to use journal-upload to centralize logs if the distribution has systemd. You can check this by running journalctl: if the command isn't found, you don't have the journal. As promised earlier, you can centralize your system logs to Sematext Cloud with three commands:

  1. Install journal-upload. On Ubuntu, this works via sudo apt-get install systemd-journal-remote
  2. Configure journal-upload. In /etc/systemd/journal-upload.conf, set URL=http://logsene-journald-receiver.sematext.com:80/YOUR_LOGS_TOKEN
  3. Start journal-upload now and on every boot: systemctl enable systemd-journal-upload && systemctl start systemd-journal-upload
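The three steps above could be scripted roughly as follows. This is a sketch under a few assumptions: it targets Ubuntu, must run as root, and overwrites any existing journal-upload.conf. write_upload_conf and setup_journal_upload are names invented for this example. The [Upload] section header is where journal-upload.conf expects the URL option.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal journal-upload configuration.
# usage: write_upload_conf <logs-token> <config-path>
write_upload_conf() {
    printf '[Upload]\nURL=http://logsene-journald-receiver.sematext.com:80/%s\n' "$1" > "$2"
}

# Hypothetical wrapper for the three steps; each step runs only if the
# previous one succeeded.
# usage: setup_journal_upload <logs-token>
setup_journal_upload() {
    apt-get install -y systemd-journal-remote &&
    write_upload_conf "$1" /etc/systemd/journal-upload.conf &&
    systemctl enable systemd-journal-upload &&
    systemctl start systemd-journal-upload
}

# Example (as root):
#   setup_journal_upload YOUR_LOGS_TOKEN
```

If something goes wrong, journalctl -u systemd-journal-upload is a good first place to look, since the uploader logs to the journal like any other service.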

Alternatively, you can use Logagent's journal-upload input to gather journal entries from one or more machines, before shipping them to a central location. That central location can be Sematext Cloud, a local ELK stack or something else. If you want to learn more about journald and journalctl, as well as the options you have around centralizing the journal, have a look at our complete guide to journald.

Centralizing Logs via syslog

There are a few scenarios in which centralizing Linux logs with syslog might make sense:

  • Your Linux distribution doesn't have journald. This means system logs go directly to your syslog daemon
  • You want to use your syslog daemon to collect and parse application logs as well. An example is described in our tutorial for Apache logs with rsyslog and Elasticsearch.
  • You want to forward journal entries to syslog (i.e., by setting ForwardToSyslog=Yes in journald.conf), so you can use a syslog protocol as a transport. However, this approach will lose some of journald's structured data: journald only forwards syslog-specific fields.
  • Similar to the above, except that you'd configure the syslog daemon to read from the journal (like journalctl does). This approach doesn't lose structured data, but is more error prone (e.g., in case of journal corruption) and adds more overhead.

In all situations listed above, data will go through your syslog daemon. From there, you can send it to any of the supported destinations. Most Linux distributions come with rsyslog installed. To forward data to another syslog server via TCP, you can add this line in your /etc/rsyslog.conf:

*.* @@logsene-syslog-receiver.sematext.com

This particular line will forward data to Sematext Cloud's syslog endpoint, but you can replace logsene-syslog-receiver.sematext.com with the host name of your own syslog server. The double @@ selects TCP; a single @ would send over UDP instead.

Some syslog daemons can output data to Elasticsearch via HTTP/HTTPS. rsyslog is one of them and so is syslog-ng. For example, if you use rsyslog on Ubuntu, you'll install the Elasticsearch output module first:

sudo apt-get install rsyslog-elasticsearch

Then, in the configuration file, you need two elements:

  1. A template that formats your syslog messages as JSON, for Elasticsearch to consume
template(name="LogseneFormat" type="list" option.json="on") {
 constant(value="{")
 constant(value="\"@timestamp\":\"")
 property(name="timereported" dateFormat="rfc3339")
 constant(value="\",\"message\":\"")
 property(name="msg")
 constant(value="\",\"host\":\"")
 property(name="hostname")
 constant(value="\",\"severity\":\"")
 property(name="syslogseverity-text")
 constant(value="\",\"facility\":\"")
 property(name="syslogfacility-text")
 constant(value="\",\"syslog-tag\":\"")
 property(name="syslogtag")
 constant(value="\",\"source\":\"")
 property(name="programname")
 constant(value="\"}")
}
  2. An action that forwards data to Elasticsearch, using the template specified above
module(load="omelasticsearch")
action(type="omelasticsearch"
 template="LogseneFormat" # the template that you defined earlier
 searchIndex="LOGSENE_APP_TOKEN_GOES_HERE"
 server="logsene-receiver.sematext.com"
 serverport="443"
 usehttps="on"
 bulkmode="on"
 queue.dequeuebatchsize="100" # how many messages to send at once
 action.resumeretrycount="-1") # buffer messages if connection fails

The above example shows how to send messages to Sematext Cloud's Elasticsearch API, but you can adjust the action element to point it to your local Elasticsearch:

  • searchIndex would be your own rolling index alias
  • server would be the hostname of an Elasticsearch node
  • serverport can be 9200 or a custom port Elasticsearch listens to
  • usehttps="off" would send data over plain HTTP
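To get a feel for the document that the LogseneFormat template sends, here is a rough shell approximation that builds JSON with the same keys. It is not how rsyslog does it internally: json_escape and syslog_to_json are made-up helpers, and the escaping is simplified to backslashes and double quotes (no control characters).

```shell
#!/bin/sh
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# Simplified on purpose: tabs, newlines and other control characters are not handled.
json_escape() {
    printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build a document with the same keys, in the same
# order, as the LogseneFormat template.
# usage: syslog_to_json <timestamp> <message> <host> <severity> <facility> <tag> <program>
syslog_to_json() {
    printf '{"@timestamp":"%s","message":"%s","host":"%s","severity":"%s","facility":"%s","syslog-tag":"%s","source":"%s"}\n' \
        "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")" \
        "$(json_escape "$4")" "$(json_escape "$5")" "$(json_escape "$6")" \
        "$(json_escape "$7")"
}

syslog_to_json '2020-05-07T15:33:11+00:00' 'this is a test' \
    ubuntu-bionic notice user 'test-user:' test-user
```

Seeing the shape of the document helps when you define index mappings or build dashboards: @timestamp, severity and facility are the fields you'll most likely filter on.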

Whether you use a syslog protocol, the Elasticsearch API or something else, it's better to forward syslog directly from the syslog daemon than to tail individual files from /var/log using a different log shipper. Tailing files will add overhead and miss some of the metadata, such as facility or severity. Which is not to say that files in /var/log are useless. You'll need them in two scenarios:

  • Logs of applications that write directly to /var/log. For example, HTTP logs, FTP logs, MySQL logs and so on. You can tail such files with a log shipper. We have tutorials on parsing Apache logs with rsyslog and with Logstash.
  • Process system logs with UNIX text tools like grep. Here, different log files contain different kinds of data. We'll look at the typical configuration in the next section.

What Are the Most Important Log Files You Should Monitor

By default, some distributions write system logs to syslog (either directly or from the journal). The syslog daemon writes these logs to files under /var/log. Typically that syslog daemon is rsyslog, though syslog-ng works in a similar fashion. In this section, we'll look at the important log files and:

  • what kind of information you'll find in them
  • how rsyslog is configured to write there (in case you want to change the configuration)
  • how to view the same information with journalctl, in case it doesn't forward to syslog

/var/log/syslog or /var/log/messages

This is the “catch-all” of syslog. For example:

# logger "this is a test"
# tail -1 /var/log/syslog
May 7 15:33:11 ubuntu-bionic test-user: this is a test

Typically, you'll find all messages here (error logs, informational messages, and every other severity), as this line from /etc/rsyslog.conf suggests:

*.* /var/log/syslog

The only exception is the stop action. For example, you may find something like this:

:msg,contains,"[UFW " /var/log/ufw.log
& stop

In plain English, this block says:

  • If the msg property of this message contains "[UFW "
  • Then write to /var/log/ufw.log (the file output module is implied)
  • The ampersand (&) attaches another action to the same filter, so messages that matched aren't processed any further (stop)

So if the /var/log/syslog action comes later, it won't write UFW messages there. If there's nothing in /var/log/syslog or /var/log/messages, you probably have journald set up not to forward to syslog. The same data (and more) can be viewed via journalctl with no parameters. By default, journalctl pages data through less, but if you want to filter through grep you'll need to disable paging:

# journalctl --no-pager | grep "this is a test"
May 07 15:33:11 ubuntu-bionic test-user[7526]: this is a test
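If the UFW rule from this section still feels abstract, the sketch below mimics it on a plain file with awk. It's only an illustration of the "write, then stop" logic: route_ufw is a hypothetical name, and unlike rsyslog it matches against the whole line rather than just the msg property.

```shell
#!/bin/sh
# Hypothetical helper: lines containing "[UFW " go to one file and are not
# processed further; everything else reaches the catch-all file.
# usage: route_ufw <input-file> <ufw-file> <catch-all-file>
route_ufw() {
    awk -v ufw="$2" -v rest="$3" '
        index($0, "[UFW ") { print > ufw; next }   # matched: write, then stop
        { print > rest }                            # not matched: catch-all
    ' "$1"
}

# Example:
#   route_ufw /var/log/syslog /tmp/ufw.log /tmp/everything-else.log
```

Swapping the two awk rules would send every line to the catch-all file, which is the same reason rule order matters in rsyslog.conf.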

/var/log/kern.log or /var/log/dmesg

This is where kernel messages go by default:

Apr 17 16:47:28 ubuntu-bionic kernel: [ 0.004000] console [tty1] enabled

It's really down to filtering syslog messages by the kern facility:

kern.* /var/log/kern.log

If you don't have syslog (or the file is missing) and you have journald, you can show kernel messages in journalctl:

# journalctl -k
...
Apr 17 16:47:28 ubuntu-bionic kernel: console [tty1] enabled
...

/var/log/auth.log or /var/log/secure

This is where you find authentication messages, generated by services like sshd:

May 7 15:03:09 ubuntu-bionic sshd[1202]: pam_unix(sshd:session): session closed for user vagrant

This is another filter by facility, this time by two values (auth and authpriv):

auth,authpriv.* /var/log/auth.log

You can do such filters in journalctl as well, except that you have to provide numeric facility codes (4 for auth, 10 for authpriv):

# journalctl SYSLOG_FACILITY=4 SYSLOG_FACILITY=10
...
May 7 15:03:09 ubuntu-bionic sshd[1202]: pam_unix(sshd:session): session closed for user vagrant
...

/var/log/cron.log or /var/log/cron

This is where your cron messages go (i.e., jobs that run regularly):

May 06 08:19:01 localhost.localdomain anacron[1142]: Job `cron.daily' started

Yet another facility filter:

cron.* /var/log/cron

With journalctl, you'd do:

# journalctl SYSLOG_FACILITY=9

/var/log/mail.log or /var/log/maillog

Email daemons such as Postfix typically log to syslog in the mail facility, just like cron logs to the cron facility. Then, rsyslog puts these logs in a different file:

mail.* /var/log/mail.log

If you're using journald, you can still view mail logs with:

# journalctl SYSLOG_FACILITY=2

Because journald exposes the syslog API, everything that normally goes to syslog ends up in the journal.
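Since journalctl wants numbers while rsyslog rules use names, a small lookup can save you a trip to the man page. The sketch below uses the facility numbering from Linux's syslog.h. facility_code and journalctl_for are hypothetical helper names, and journalctl_for only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: map a syslog facility name to its numeric code.
facility_code() {
    case "$1" in
        kern) echo 0 ;;  user) echo 1 ;;  mail) echo 2 ;;      daemon) echo 3 ;;
        auth) echo 4 ;;  syslog) echo 5 ;; lpr) echo 6 ;;      news) echo 7 ;;
        uucp) echo 8 ;;  cron) echo 9 ;;  authpriv) echo 10 ;; ftp) echo 11 ;;
        local[0-7]) echo $(( 16 + ${1#local} )) ;;   # local0..local7 -> 16..23
        *) echo "unknown facility: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the journalctl command for one or more facilities.
# Repeating the same field makes journalctl match any of the given values.
journalctl_for() {
    cmd="journalctl"
    for name in "$@"; do
        code=$(facility_code "$name") || return 1
        cmd="$cmd SYSLOG_FACILITY=$code"
    done
    echo "$cmd"
}

journalctl_for auth authpriv
journalctl_for cron
journalctl_for mail
```

The three calls print the same filters used in the sections above for auth, cron and mail logs.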

TL;DR Takeaways

Let's summarize the actionables here:


Discussion

Just as a meta-heads up, this syndication on DEV is already out-ranking the original on your site for "linux logging," "what are linux logs," and "linux logging tutorial." The canonical link is a suggestion (which Google sometimes ignores), rather than a command, and this is a case of "suggestion discarded."

This is a good post -- it could probably rank for all of those searches from the original on sematext.com if you gave it enough time.

Anyway, feel free to consider or ignore as you prefer, of course. I'm just throwing it out there because I oversee a lot of organic post creation and subsequent syndication and it surprises a lot of people to learn that canonical linking doesn't always work as intended.