Top 3 Ways to Log Errors to Simplify Debugging For Systems Engineers

How often do you find yourself spending hours trying to triage a failure in production?

Debugging is challenging because we often don't know what we're looking for, and to top it off, we have no way to pick up the trail of errors.

The most obvious first step is to ensure all services are logging errors to a standard logging tool.

Logs are your guardian angels when it comes to finding root causes of failures.

Let's understand how to befriend these angels and streamline debugging.

Here's a sample scenario.

A user wants to open an online bank account.

The backend function calls are -

1. (api) POST /accounts
2. (api) CreateAccountHandler()
   2.1 (internal) AccountsSvc.InitiateCreateAccount()
   2.2 (internal) AccountsSvc.ValidateUser()
       2.2.1 (validator) ValidateID()
   2.3 (internal) AccountsSvc.ValidateBankDetails()
       2.3.1 (validator) ValidateID()
   2.4 (internal) AccountsSvc.CreateAccount()
       2.4.1 (database) Store.SaveAccountDetails()
             (writer) Data.Write()
3. (serialization) Response.Serialize()

#1 Relative Messaging

Steps 2.2.1 and 2.3.1 call the same ValidateID() function.
If ValidateID() fails, its own log message cannot tell us which caller triggered it, so every level should log what it was attempting when the failure happened.
Use this template at all appropriate levels -

Failed to perform X while attempting to do Y

So, a failure in ValidateID() at 2.2.1 would result in errors logged from the bottommost to the topmost caller -

Error Log Trace
- (validator/ValidateID) Failed to validate the ID.
- (internal/ValidateUser) Failed to validate the ID while attempting to validate user details.
- (internal/InitiateCreateAccount) Failed to validate user details while attempting to create a new account.
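Here is a minimal Python sketch of the template, assuming the standard `logging` module. The snake_case functions mirror the calls in the scenario, and the nesting follows the error log trace above. The `ValidationError` type and the `accounts` logger name are made up for illustration.

```python
import logging

logger = logging.getLogger("accounts")  # hypothetical logger name


class ValidationError(Exception):
    """Hypothetical error type raised when a validation step fails."""


def validate_id(id_number: str) -> None:
    # (validator) Bottommost level: log only what failed here.
    if not id_number.isdigit():
        logger.error("(validator/ValidateID) Failed to validate the ID.")
        raise ValidationError("Failed to validate the ID")


def validate_user(user: dict) -> None:
    # (internal) "Failed to perform X while attempting to do Y"
    try:
        validate_id(user["id_number"])
    except ValidationError as exc:
        logger.error(
            "(internal/ValidateUser) Failed to validate the ID "
            "while attempting to validate user details."
        )
        raise ValidationError("Failed to validate user details") from exc


def initiate_create_account(user: dict) -> None:
    # (internal) Topmost caller in the trace adds its own X-while-Y line.
    try:
        validate_user(user)
    except ValidationError as exc:
        logger.error(
            "(internal/InitiateCreateAccount) Failed to validate user details "
            "while attempting to create a new account."
        )
        raise ValidationError("Failed to create a new account") from exc
```

Calling `initiate_create_account({"id_number": "abc"})` logs the three lines from bottommost to topmost caller and then raises, so the log shows which path reached ValidateID().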

#2 Log Across Microservices

Imagine two microservices are involved in this example.

API Service accepts incoming HTTP requests.
Internal Service performs the CreateAccount operations.

If all microservices log a shared context, correlating the API service's errors with the internal service's errors becomes easy.

The shared context is metadata that helps filter loglines - a request ID, a user ID or an account ID.

Think of each logline as a vertex and the shared context as the edge that connects the vertices.
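One way to sketch this in Python is with `logging.LoggerAdapter`, which stamps the same fields on every logline. The service names, the `request_id` field, and the function names below are assumptions for illustration. In a real deployment the request ID would travel between services in a request header, not a function argument.

```python
import logging
import uuid

# Every logline carries the shared context in the same place, so it is easy to filter on.
FORMAT = "%(levelname)s service=%(name)s request_id=%(request_id)s %(message)s"


def build_logger(service: str, request_id: str) -> logging.LoggerAdapter:
    """Return a logger that attaches request_id to every logline it emits."""
    return logging.LoggerAdapter(logging.getLogger(service), {"request_id": request_id})


def internal_create_account(request_id: str) -> None:
    # Internal service: reuses the request ID received from the caller.
    log = build_logger("internal", request_id)
    log.error("Failed to save account details while attempting to create a new account.")


def api_create_account_handler() -> str:
    # API service: generates the request ID once, at the edge.
    request_id = str(uuid.uuid4())
    log = build_logger("api", request_id)
    internal_create_account(request_id)
    log.error("Failed to create a new account while attempting to handle POST /accounts.")
    return request_id


if __name__ == "__main__":
    logging.basicConfig(format=FORMAT)
    api_create_account_handler()
```

Filtering the combined logs on a single `request_id` value then returns the loglines from both services for that one request.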

#3 Avoid Pseudo Errors

We retry operations to deal with intermittent failures such as a flaky network.
Use the correct log level to separate intermittent failures from final failures.

A simple way to do this is by using the "warn" level for intermittent failures and the "error" level for the final failure.
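A minimal Python sketch of this idea, assuming the standard `logging` module. The helper name `call_with_retries`, the logger name, and the choice of `ConnectionError` as the retryable failure are hypothetical.

```python
import logging
import time

logger = logging.getLogger("accounts.store")  # hypothetical logger name


def call_with_retries(operation, attempts: int = 3, delay_seconds: float = 0.0):
    """Run operation; log retried failures as warnings and the last one as an error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError as exc:
            if attempt < attempts:
                # Intermittent failure: another attempt follows, so warn only.
                logger.warning("Attempt %d/%d failed, retrying: %s", attempt, attempts, exc)
                time.sleep(delay_seconds)
            else:
                # Final failure: no attempts left, so this is the real error.
                logger.error("Failed after %d attempts: %s", attempts, exc)
                raise
```

With this split, an alert or search on the "error" level only matches operations that actually failed, while the "warn" lines remain available to spot a flaky dependency.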

I hope these simple improvements streamline debugging failures.

Want to add more best practices to this list? Let me know in the comments!
