Shabd Saran

Posted on Aug 2, 2023

Contextualising identifiers in a system

#systemdesign #architecture #watercooler

Introduction

At least two unique identifiers exist even in the most rudimentary authentication systems: user ID and session ID. It is hard to keep track of different identifiers in a system. These identifiers are logged to files, reported to Sentry, sent to the frontend, passed around to developers in Slack in bug reports, and, most importantly, shared with non-programmers in the team.

In no time, the user.Id is propagated as just 507f1f77bcf86cd799439011, which is then, for example, shared on Slack by customer support to report a bug. The developers then have to make sense out of this random ID and try to map it to an actual database record. The backtracking becomes a nuisance when the context is unclear.

The missing jigsaw piece

The identifier may propagate to a log file, Slack message, Sentry report, CMS dashboard or Google Analytics. The root cause of the problem is the loss of context from the identifier. The problem seems trivial if the system is not large enough. In systems with myriad unique identifiers, each representing an essential entity, issues related to backtracking or context retention can escalate rapidly.

But what if the context is attached to the identifier itself? That will solve the problem. The identifier may reach the end of the universe, but the relevant context will travel with it.

That is where identifier prefixing comes in.

Expounding the argument

Most modern systems have identifiers like TXN_507f1f77bcf86cd799439011. Here, the prefix “TX_” contextualises the identifier. For example, the given identifier can be guessed as a transaction ID without additional information.

The idea is to attach meaningful prefixes to critical identifiers. That will ensure these identifiers easily differentiate from the noise outside the system. The prefix is carrying relevant information about the identifier itself.

Getting down to business

Here is an example to demonstrate the implementation.

A program uses UUID strings to assign unique IDs to its database records. The program MUST respect the UUID formatting during retrieval and modification of the database documents. But it may ignore the formatting elsewhere.

In such a system, all the entity modules can implement toUUID and fromUUID functions. Role of these functions, implemented internally, will be to convert a (prefix)_ID to ID and vice-versa, respectively. These functions can translate prefixed identifiers to UUID formatted strings before the database query construction. Attachment of prefix to the returning value of identifiers from the entity modules can also be guaranteed.

The type of the IDs returned from the modules may only be asserted as a string (as opposed to a valid UUID). It is one of the caveats of the implementation: only the entity module can assert the UUID formatting. Everyone else may treat the ID as any other string.

The prefixed identifier may now be sent to Sentry or logged to AWS S3, but it will always carry the context of the identifier with it. Even the non-programmers will recognise a transaction ID from the CMS dashboard. What a relief!

Conclusion

The idea is simple enough to explain without code. It just need two functions to convert an ID to a prefixed ID and vice-versa. But the value addition is massive.

Identifier contextualising is not a new concept. People have been doing it for years now.

Top comments (5)

Anubhav Singhal • Aug 2 '23

Is there a better way than prefix?
In vast system, prefix context might also become confusing jargon. How about the UUID encryption contains a 2nd layer info about the context which can be resolved by platforms to get context?

Shabd Saran • Aug 2 '23

That's an interesting approach. Hadn't thought of it. Regardless, the suggested approach also takes in account non-programmers. It will be easier for them to recognise an identifier with a prefix in a third-party ecosystem, e.g. the Google Analytics dashboard.

kailash chandak • Aug 2 '23

I believe its better to handle at different level rather the actual value itself.
How about having structured logs(for logging system) with key=value approach to handle this.

Shabd Saran • Aug 2 '23

That's an approach generally taken by systems that want to retain the context in logs. However, it's difficult to predict the exact scope of an identifier. For example, what if the user ID ends up in the Google Analytics dashboard? We might not always have the freedom to tag context of an identifier especially when it propagates outside the system.

Anubhav Singhal • Aug 9 '23

agreed.

DEV Community

Contextualising identifiers in a system

Introduction

The missing jigsaw piece

Expounding the argument

Getting down to business

Conclusion

Top comments (5)

Read next

LangGraph State Machines: Managing Complex Agent Task Flows in Production

3-Tier Architecture in Azure Cloud: A Comprehensive Deployment Guide

Building a tiny type-safe typescript ECS (Entity-component-system)

Should web designers learn JavaScript or CSS?