When running an application in a local development environment, the de facto tactic to see what it’s doing is, of course, to attach a debugger. You can step through the code line by line and inspect every variable along the way. It’s a no-brainer. But what about higher environments like QA, Staging or Production where you cannot attach a debugger? How do we gain insight into what an application is doing in these environments? One word: Log!
Log Levels
Logs aren’t just for capturing errors. A robust logging strategy makes thorough use of all of the different log levels your logging framework has to offer. As a .NET developer, I've used several different logging frameworks over the course of my career, each of which offers the same six log levels, but the naming differs a bit between them. For consistency's sake, we’re going to use the names of the Microsoft log levels in this article.
Microsoft | Serilog | lognet | Description |
---|---|---|---|
Critical | Fatal | FATAL | Logs that describe an unrecoverable application or system crash, or a catastrophic failure that requires immediate attention. |
Error | Error | ERROR | Logs that highlight when the current flow of execution is stopped due to a failure. These should indicate a failure in the current activity, not an application-wide failure. |
Warning | Warning | WARN | Logs that highlight an abnormal or unexpected event in the application flow, but do not otherwise cause the application execution to stop. |
Information | Information | INFO | Logs that track the general flow of the application. These logs should have long-term value. |
Debug | Debug | DEBUG | Logs that are used for interactive investigation during development. These logs should primarily contain information useful for debugging and have no long-term value. |
Trace | Verbose | ALL | Logs that contain the most detailed messages. These messages may contain sensitive application data. These messages are disabled by default and should never be enabled in a production environment. |
Best Practices
It’s very common for the only logging in an application to be Error
logs from within try/catch blocks. While this is a good thing to do, if that’s all you’re doing, then you’re going to have very limited visibility into the execution of your application once it leaves your local development environment. Consider making better use of the other log levels according to the best practices described below.
Set the Current Log Level Via External Configuration
Setting the current log level should always be an operational task; never hard coded. External configurations allow operators to toggle verbose logging on demand via environment variables, configuration files, etc.
Default to the Information Log Level
When choosing a global default log level (e.g. in web.config
or appsettings.json
) for your application, Information
is usually a good choice. This means that Critical
, Error
, Warning
and Information
log entries will be written while Debug
and Trace
log entries will be suppressed. You might then want to provide a different default log level per environment (e.g. in web.{env}.config
or appsettings.{env}.json
); perhaps Trace
in Development and Debug
in QA.
Log Catastrophic Failures Using the Critical Log Level
As developers, we're are usually pretty good about logging run-of-the-mill errors: catch exception, log an error message, carry on. But catastrophic failures are a different animal.
A situation might arise where an application encounters an unrecoverable error during its start-up routine and throws an exception. You can catch the exception, log it using the Critical
log level instead of Error
and exit cleanly. In this flow, the log entry should make it out of the application since the application decided to end its own life as oppose to dying in a sudden accident.
Other catastrophic failures aren’t so easily handled. Maybe an exception is thrown during startup that is outside of your reach or Kubernetes decides to kill a container from the outside due to it hitting a memory limit. In the former case, any log entries that might be written are like people trying to escape a burning building; some might make it out before the building collapses on them, others might not. In the latter case, the application itself is unaware of the failure. One millisecond it is healthy and executing and the next millisecond its entire execution environment has been nuked. There isn’t the opportunity to even attempt to write a log entry. Be aware of these potential cases and work with your operations team to diagnose them when they occur.
Use the Warning and Information Log Levels Judiciously
Critical
and Error
are pretty easy to use appropriately (i.e. when things go off the rails) but Warning
and Information
are not so clear.
Warning
should be reserved for logs that are not errors—they don’t impede execution—but are not entirely normal behavior. Examples might be a cache miss on an expensive object that really should have been in the cache, a piece of code that completed but took longer than expected, a failed login attempt, or an access control violation. The key though is that the warning should be actionable. You shouldn’t clutter the logs with warnings that you don’t intend to do anything about. Only use them if you plan to do something if/when that warning appears consistently in the logs.
Information
should be used when the application is executing normally but you want to communicate something to the operator. The most common of these might be application life-cycle events, like Application started
or Application stopped
. In the absence of a monitoring tool (like New Relic, AppDynamics or DataDog) it might be appropriate to log application events at the Information
log level, but if you have these tools, you should consider writing application events there rather than to your logs.
Don’t use either of these log levels so often that your application is writing copious logs when things are going smoothly. A healthy application’s logs should be mostly silent; perhaps only infrequent Information
log entries to confirm that it is still there.
Step Through Code Using the Debug Log Level
When your application is misbehaving is when you’ll want the ability to get enhanced visibility into what it’s doing. Your first inclination might be to try to reproduce the bug in a local development environment with a debugger attached, but this isn’t always possible. You might not know enough about the bug to reproduce it locally. It might be unique to the configuration or data of a particular environment. “It works on my machine.” Right? So what do we do in these cases? Well, if you’ve made good use of the Debug
log level when you wrote your code, a simple configuration change can get you more detailed logging from the environment where the issue is occurring.
But what does “good use of the Debug
log level” look like? To answer that, lets think about how we use a debugger. When you’re debugging an application, the first thing you might do is put a bunch of strategically placed breakpoints in the code to see which ones are hit, in which order, how many times, etc. You’d normally do this ad hoc after a bug has been discovered, but what if you did it proactively and methodically during development? You could log a Debug
entry at each place you’d potentially put a breakpoint with your debugger. Later, when your application misbehaves in a higher environment, you can ask operations to switch the log level to Debug
, see the log entries, and follow along in the source code to see what your application is doing out in the wild. Think of it like leaving a map for your future self.
These log entries should just be mile markers so you can identify which lines of code your application is executing. Don’t put any data, like the values of variables and parameters, in these logs. That should be reserved for the Trace
log level.
Inspect Variables Using the Trace Log Level
Trace
is the most detailed log level and should only be turned on as a last resort. When stepping through the code with the Debug
log level isn’t enough, use Trace
to write the values of variables, parameters and settings to the logs for inspection. This mimics the behavior of inspecting the contents of variables with the debugger attached. It should only be done as a last resort for two reasons: 1) if you’re being thorough, this will drastically increase the amount of log data your application will produce, which has storage, cost and performance implications and 2) there could be sensitive data in the log entries written using this log level (more about this in the next section).
Never Log Personal Identifying Information or Secrets
Never—and I really mean never—write personal identifying information, credentials or other secrets to any log level (including Trace
). Seriously, don’t do it!
Personal Identifying Information
Regarding PII, writing this information to logs makes your life that much harder when dealing with GDPR, CCPA and whatever other privacy laws pop up in the future. You don’t want to have to worry about purging PII from your logs because your log retention is greater than the 30-day deadline for complying with GDPR requests.
PII you should avoid logging includes, but is not limited to:
- First Name
- Last Name
- Username
- Gender
- Birthday
- Mailing/Billing Addresses
- Email Addresses
- Phone Numbers
- Social Security Number
- Credit Card Numbers
Secrets
Regarding secrets, writing this information to logs is a big security faux pas. Logs are often openly available to a broad audience in a technology organization so anything written there is visible to lots of people. This means that if anything shows up in the logs that isn’t meant to be broadcast to everyone, the admin of that system has to change it in order to maintain security.
Secrets you should avoid logging include, but are not limited to:
- Usernames
- Passwords
- API Keys
- Encryption Keys
- Connection Strings
- Magic URLs
Example
public class Goku
{
private readonly ILogger _logger;
public Goku(ILogger<Goku> logger)
{
_logger = logger;
}
public int PowerUp(int desiredPower)
{
_logger.LogDebug($"BEGIN {MethodInfo.GetCurrentMethod()}");
_logger.LogTrace("desiredPower = {desiredPower}", desiredPower);
try
{
var power = 0;
var warned = false;
for (; power < desiredPower; power += 100)
{
_logger.LogDebug("Charging ...");
_logger.LogTrace("power = {power}", power);
_logger.LogTrace("warned = {warned}", warned);
if (power > 9000 && !warned)
{
_logger.LogWarning("It's over 9000!");
warned = true;
}
}
_logger.LogDebug("Power up complete.");
_logger.LogTrace("power = {power}", power);
}
catch(Exception ex)
{
_logger.LogError(ex, "Failed to complete power up.");
}
_logger.LogDebug($"END {MethodInfo.GetCurrentMethod()}");
return power;
}
}
Top comments (2)
Great article! As knowing which log level to use can be tricky (and not an exact science 😅) at some point my team decided to limit to only 2 levels:
debug
anderror
witherror
logs being sent via Slack/Email to make sure they don't go silent.Most people think of logging as text based, but structured logging to a database is incredibly useful. It allows you to track down specific instances of something failing by searching with parameters you suspect.
You can also save a lot of space by simply storing the hash of log message template, instead of the merged string.