Perhaps this has happened to you. You’re working on a project, developing some fancy software where users can do all sorts of cool things. As part of this work, your software needs to be able to recognize your users. Perhaps there’s something your users store for later, some content they create, or you simply need to know who paid and who didn’t.
That’s simple, right? You authenticate the users with a federated login provider, and the only extra work is to store the link between the user ID and whatever resource in your software you want tied to that user. It’s a simple data model: each user has access to a single data blob. This is the user and this is the resource they own. You store it in a database, problem solved.
Sometimes the story ends here. More often than not though, your product gets new features and the number of things you need to keep track of for each user keeps growing. Perhaps there are different types of content each user can create? Or different features they should have access to based on what they pay? It may even be something as simple as listing all the documents or pictures a user owns in your software. Your database needs indexing now. You’re still tracking the link between a user ID and a resource, except the ‘resource’ is now a list. It’s still simple, but you have to start thinking about performance and optimizing how you fetch data. Oh well, nothing too complicated.
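To make this stage concrete, here is a minimal sketch of the "user ID plus a list of resources" model using Python's built-in sqlite3 module. The table name, column names, and helper functions are all made up for illustration; your own schema will look different.

```python
import sqlite3


def open_ownership_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical ownership table."""
    conn = sqlite3.connect(":memory:")
    # The composite primary key doubles as an index on user_id, so listing
    # one user's resources does not require scanning the whole table.
    conn.execute(
        "CREATE TABLE user_resources ("
        " user_id TEXT NOT NULL,"
        " resource_id TEXT NOT NULL,"
        " PRIMARY KEY (user_id, resource_id))"
    )
    return conn


def add_resource(conn: sqlite3.Connection, user_id: str, resource_id: str) -> None:
    """Record that a user owns a resource."""
    conn.execute(
        "INSERT INTO user_resources (user_id, resource_id) VALUES (?, ?)",
        (user_id, resource_id),
    )


def list_resources(conn: sqlite3.Connection, user_id: str) -> list[str]:
    """Return every resource owned by the given user, sorted by ID."""
    rows = conn.execute(
        "SELECT resource_id FROM user_resources"
        " WHERE user_id = ? ORDER BY resource_id",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Note that this model can only answer one question: "what does this user own?" That limitation is what the next stage runs into.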
Now imagine that you have some sort of interaction between different users. Maybe there’s a whole company using your software, with administrators and regular users. Or perhaps users want to share their content with others. Or there’s some way for more than one person to collaborate on the same resource. Your old data structure, where each user ID was linked to a list of resources, is no longer sufficient. Now you not only need to keep track of the users and their respective resources, you also need to know who can perform which actions on a given resource. Your old tables may still work, but you need a different strategy for indexing. You also definitely need more tables and more ways to access the data. We’re starting to talk about many-to-many relationships. Things quickly get messy. It’s still all manageable, probably.
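One way to picture the many-to-many stage is a table of (user, resource, action) grants, indexed in both directions. The sketch below is only an illustration under that assumption: the `grants` table, the index name, and the helper functions are hypothetical and not taken from any particular product.

```python
import sqlite3


def open_grant_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical grants table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE grants (
            user_id     TEXT NOT NULL,
            resource_id TEXT NOT NULL,
            action      TEXT NOT NULL,
            PRIMARY KEY (user_id, resource_id, action)
        );
        -- Second access path: "who can do X to this resource?"
        CREATE INDEX grants_by_resource ON grants (resource_id, action);
        """
    )
    return conn


def grant(conn: sqlite3.Connection, user_id: str, resource_id: str, action: str) -> None:
    """Allow a user to perform an action on a resource (idempotent)."""
    conn.execute(
        "INSERT OR IGNORE INTO grants (user_id, resource_id, action)"
        " VALUES (?, ?, ?)",
        (user_id, resource_id, action),
    )


def is_allowed(conn: sqlite3.Connection, user_id: str, resource_id: str, action: str) -> bool:
    """Check whether an exact (user, resource, action) grant exists."""
    row = conn.execute(
        "SELECT 1 FROM grants"
        " WHERE user_id = ? AND resource_id = ? AND action = ?",
        (user_id, resource_id, action),
    ).fetchone()
    return row is not None


def users_who_can(conn: sqlite3.Connection, resource_id: str, action: str) -> list[str]:
    """List every user allowed to perform the action on the resource."""
    rows = conn.execute(
        "SELECT user_id FROM grants"
        " WHERE resource_id = ? AND action = ? ORDER BY user_id",
        (resource_id, action),
    )
    return [row[0] for row in rows]
```

Even in this toy version there are already two access paths to keep fast: starting from the user and starting from the resource. Each new question you want to ask tends to add another.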
If your project is complete, congratulations. More likely, though, you’ll keep adding features to your product, and you’ll discover that the more things your users can do, the more roles they can play and the more resource types they can engage with. This means your simple data model gets more and more complicated. Indexing becomes a first-class concern, and you need to think more often about the speed of data fetching. What started as a single database table is now multiple tables with relationships that may need to change as you work on new features. What a headache.
Speaking of change, it’s unlikely that your original vision of resource hierarchy and user permissions will hold true forever. As you add more and more features and as requirements change, you keep tweaking your roles and the types of data you’re storing. Perhaps your notion of resources has changed and now you have sub-resources? Or you need to deal with user or role groupings? One day, your old queries no longer work. Your old tables can’t simply be extended to hold the additional complexity; they have to be redesigned. You have a data migration on your hands.
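To see why sub-resources and groupings break the earlier queries, consider this rough in-memory sketch. It assumes a grant on a parent resource also applies to its children, and that a user inherits grants from the groups they belong to. The `AccessModel` class and its fields are hypothetical, and a real system would have to make many more decisions than this one does.

```python
from dataclasses import dataclass, field


@dataclass
class AccessModel:
    # resource ID -> parent resource ID (sub-resources point at their parent)
    parents: dict[str, str] = field(default_factory=dict)
    # user ID -> names of the groups the user belongs to
    groups: dict[str, set[str]] = field(default_factory=dict)
    # (subject, resource, action); a subject is a user ID or a group name.
    # Simplification: users and groups share one namespace here.
    grants: set[tuple[str, str, str]] = field(default_factory=set)

    def subjects_for(self, user_id: str) -> set[str]:
        """The user plus every group the user is a member of."""
        return {user_id} | self.groups.get(user_id, set())

    def ancestors(self, resource_id: str) -> list[str]:
        """The resource followed by its parents, nearest first."""
        chain: list[str] = []
        seen: set[str] = set()
        current = resource_id
        # The 'seen' set guards against accidental cycles in the hierarchy.
        while current is not None and current not in seen:
            seen.add(current)
            chain.append(current)
            current = self.parents.get(current)
        return chain

    def is_allowed(self, user_id: str, resource_id: str, action: str) -> bool:
        """True if any of the user's subjects holds a grant on the
        resource or on one of its ancestors."""
        subjects = self.subjects_for(user_id)
        return any(
            (subject, resource, action) in self.grants
            for resource in self.ancestors(resource_id)
            for subject in subjects
        )
```

A single permission check is no longer one row lookup. It now walks a hierarchy and expands group membership, which is exactly the kind of change that forces old tables and queries to be redesigned.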
What’s more, your indexes have to be rebuilt. Things start to get slower and slower. You make some changes and it turns out that you accidentally let one user see another user’s private resources. Luckily, you discover that bug right away and spend the evening fixing it — phew, hopefully no actual users found out!
Over time, your product grows. You get more and more users. Whenever you fail to quickly and correctly resolve which resources should be available to which user, many people get annoyed. Whenever your authorization code fails for some reason, your whole application is severely affected. What started as a simple bit of code is now a critical, ever-growing component that you have to keep in top shape. Perhaps it’s time to have a dedicated team working on it?
I’ve been through this journey myself, more than once. Every time, I looked back and cursed myself for ever thinking that authorization was a simple problem. It may start simple, but it rarely stays that way. It usually ends in a headache and lots of wasted hours.
It may seem like a good idea to build it all yourself. But while your application is unique, the authorization problems it faces rarely are. This is one of those areas where it’s better to use something off the shelf than to reinvent the wheel.
Originally published at https://authress.io on May 29, 2020.