The dozen or so articles and threads I've read over the years on implementing multi-tenancy in SaaS platforms fail to cover anything outside of the database itself. When it comes to a cross-tenant data leak, chances are it won't be an issue at the database level that gets you.
Cache Layer
Most architectures have a caching layer (e.g. Redis) in use for all sorts of purposes. Chances are you are not deploying a Redis or Memcached server for each tenant, so you're probably using the common practice of putting the unique tenant id in the key, maybe at the beginning, such as "1234:users" for tracking the users of each tenant.
What happens if the variable used for the tenant is wrong? Such as a name in JS that evaluates to undefined: ${teantId}:users instead of ${tenantId}:users. Subtle typo, but as long as the misspelled name resolves to undefined (a mistyped object property, say, rather than an undeclared variable, which would throw a ReferenceError), it will work and produce undefined:users as the key that all your tenants will access. Well, there's a cross-tenant leak that your days/weeks of obsessing over your database design didn't capture.
Or maybe you have the correct variables, but in the wrong order: ${accountId}:${tenantId}:users instead of ${tenantId}:${accountId}:users. Good luck catching either of these in a code review!
A basic way to prevent these issues is to have the tenantId passed in as a param to all cache layer requests. Don't access Redis directly. Instead, create an interface that abstracts Redis away, forces the tenantId to be passed in, validates the value, and builds the key in one place.
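Here is a minimal sketch of what such an interface could look like. Everything in it is hypothetical: the TenantCache and MemoryStore names, the allowed-character pattern for tenant ids, and the assumption that the underlying store exposes async get/set. The in-memory store is only a stand-in so the example runs without a Redis server.

```javascript
// Hypothetical wrapper: every cache call must name its tenant explicitly.
// `store` is any object with async get(key) / set(key, value); a real
// Redis client would need to be adapted to this shape.
class TenantCache {
  constructor(store) {
    this.store = store;
  }

  // Reject anything that is not a non-empty string of safe characters.
  // The pattern is an assumption; adjust it to your own id format.
  static assertTenantId(tenantId) {
    if (typeof tenantId !== "string" || !/^[A-Za-z0-9_-]+$/.test(tenantId)) {
      throw new TypeError(`Invalid tenantId: ${String(tenantId)}`);
    }
    // Catch values that were already stringified upstream.
    if (tenantId === "undefined" || tenantId === "null") {
      throw new TypeError(`Suspicious tenantId: ${tenantId}`);
    }
  }

  // The only place a key is built, so the segment order cannot drift.
  key(tenantId, name) {
    TenantCache.assertTenantId(tenantId);
    return `${tenantId}:${name}`;
  }

  async get(tenantId, name) {
    return this.store.get(this.key(tenantId, name));
  }

  async set(tenantId, name, value) {
    return this.store.set(this.key(tenantId, name), value);
  }
}

// In-memory stand-in for the real cache server.
class MemoryStore {
  constructor() {
    this.data = new Map();
  }
  async get(key) {
    return this.data.has(key) ? this.data.get(key) : null;
  }
  async set(key, value) {
    this.data.set(key, value);
  }
}
```

With this in place, a call like cache.get(tenantId, "users") throws when tenantId is undefined instead of quietly reading undefined:users. It won't catch someone passing a valid id for the wrong tenant, but it does remove the typo and ordering failure modes from every call site.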
External Services
You can't just obsess over a leak in your code... you need to guard against an external service having its own leak as well! Yes, this happens, even with major cloud providers. Oh the stories some of us can tell.
Anyway, if the response from the external call provides the tenant identifier somewhere, check that it is the one you expected. If you just take the response as is, you may have polluted your database with someone else's data, and that isn't gonna be easy to clean up.
Bonus points to anyone who goes back to their own code and makes sure their API includes a field in the body or headers that identifies the tenant the response is for.
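The response check described above could be sketched like this. The tenantId field name, the TenantMismatchError class, and the assertResponseTenant helper are all made up for illustration; a real service may put the identifier in a different body field or in a header. Treating a missing identifier as a failure is my own choice here, not something every integration can afford.

```javascript
// Hypothetical guard: compare the tenant named in an external response
// with the tenant the request was made for, before anything is persisted.
class TenantMismatchError extends Error {
  constructor(expected, actual) {
    super(`Tenant mismatch: expected ${expected}, got ${String(actual)}`);
    this.name = "TenantMismatchError";
    this.expected = expected;
    this.actual = actual;
  }
}

// `response.tenantId` is an assumed field name; use whatever the
// external service actually returns.
function assertResponseTenant(expectedTenantId, response) {
  const actual = response == null ? undefined : response.tenantId;
  // A missing or non-string identifier is rejected as well (an assumption).
  if (typeof actual !== "string" || actual !== expectedTenantId) {
    throw new TenantMismatchError(expectedTenantId, actual);
  }
  return response;
}

// Usage: only data that passes the check gets anywhere near the database.
const invoice = assertResponseTenant("1234", { tenantId: "1234", total: 42 });
```

Throwing a dedicated error type makes it easy to hook an alert onto this one failure, since a mismatch here means somebody has a leak, whether it's you or the provider.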
Wrap up
Be paranoid! Everyone is out to get you. And chances are, the place you obsessed the most about multi-tenancy issues isn't gonna be the place that ultimately burns you. Put guards and alerts in your code if there's even a hint of a leak. And remember, if you don't store data then it can't be leaked. :-)