This post was first published on my blog.
It is no coincidence that designers of software systems are called architects. Software architects have borrowed quite a lot of concepts from civil engineering, and multi-tenancy is one of them. Much like in civil engineering, software multi-tenancy optimizes resource utilization while still offering flexibility of customization.
Let us understand the concept from civil engineering first.
I was born in an independent house. Small house, yet independent. We styled it whatever way we wanted it. We partitioned the house into two and had a cow shed on one side and a living space on the other. We didn't have to seek permission from our neighbors to do that. We had the freedom to do whatever we wanted within our plot of land. That freedom had a cost. If we had any issue with any utilities (say water) we had to handle it ourselves.
When I went to college, I lived in a hostel. Each of us had an independent room. We could bring our own furniture. Some of us brought the bare minimum to get through four years, while a few rich kids brought luxurious furniture. The college took care of the maintenance and repairs. While we were allowed to bring our own furniture, we couldn't modify the rooms, not even the color of the walls. The rooms did not represent us.
After a few years in a job, I bought a flat. Each of the 200 owners has the same layout, but each of us has filled our home with our own style of furniture. Some of them have customized even further. They have a sound-proof room for audio recording! Like in our hostels, the maintenance and repair are handled by a common agency.
Now let us understand the concept from a software engineering perspective, with examples from the software industry.
Say that I want to host my own email and document server. After much searching, I choose owncloud [1]. I install it on my own server. I create one account for myself and I use it. This is a single-user, single-tenant system. I'm the user as well as the tenant. I can make whatever modification I want on the system.
Want to host it on a separate domain? Check.
Want to change the colors in the system? Check.
Want to pick and choose modules? Check.
As I get used to this system, I want to share the calendar with my wife and family. I create individual accounts for each of them. Now the system becomes multi-user, but it is still single-tenant. Each user can customize a few features, like the calendar name, but they can't pick and choose functionality. If my sister wants to sync this calendar with her Gmail calendar, she can't. Why? Because if I enable it, it will be enabled for everyone, and I don't want to confuse my father with a new option. There is no way to enable it for only one user or a set of users.
Now my colleagues Martin and Bob are impressed with how I manage my family schedule and they want to do the same. I could install owncloud on their own servers. They would get the freedom to customize it the way they want. But they don't want the headache of managing and monitoring their servers, so they ask me to manage it for them. They also tell me they don't want to share their data with anyone else, and they ask if I could set up independent data storage for them. It doesn't stop there. Martin wants only the calendar facility and wants to access the system via schedules.martinfamily.com. Bob, on the other hand, wants to access the system from familyroom.bob.com. As indicated by the URL, he wants to store family documents in addition to using the calendar. He wants AWS S3 to be his data store. I can set up all of these because owncloud supports multi-tenancy. They are pleased with the setup and invite all their family members to use the system. They decide to pay me for my service. I charge them according to usage instead of a flat rate. (I wish I could end the story like the fairy tales: I lived happily ever after in Hawaii :-) ). This setup is now a multi-user, multi-tenant system.
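To make the story concrete, here is a minimal sketch of how a shared application might tell Martin's requests from Bob's: it looks up the tenant from the hostname and checks which modules that tenant subscribed to. The `Tenant` class, the `TENANTS` registry, and the `"local-disk"` storage label for Martin are my own assumptions for illustration; this is not how owncloud does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    name: str
    modules: frozenset  # features this tenant has subscribed to
    storage: str        # hypothetical label for the tenant's data store


# Hypothetical registry. The hostnames come from the story; Martin's
# storage backend is an assumption, since the story does not name one.
TENANTS = {
    "schedules.martinfamily.com": Tenant("martin", frozenset({"calendar"}), "local-disk"),
    "familyroom.bob.com": Tenant("bob", frozenset({"calendar", "documents"}), "s3"),
}


def resolve_tenant(host: str) -> Tenant:
    """Look up the tenant from the Host header of an incoming request."""
    hostname = host.split(":")[0].lower()  # drop an optional port
    try:
        return TENANTS[hostname]
    except KeyError:
        raise LookupError(f"no tenant configured for {hostname!r}") from None


def can_use(tenant: Tenant, module: str) -> bool:
    """A module is available only if this tenant subscribed to it."""
    return module in tenant.modules
```

With this in place, a request to familyroom.bob.com resolves to Bob's tenant and may use the documents module, while a request to schedules.martinfamily.com may not.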
As you can see, not all web applications have to be multi-user systems, and not all multi-user web applications have to be multi-tenant systems. All solutions are contextual. An architect has to select the right approach for a given set of requirements.
Architecting a multi-tenant system is more complicated than architecting a multi-user system. There are two models of architecting a multi-tenant system: instance replication and data-segregation.
In the instance replication model, the system spins up a new instance for every tenant. This is easier to start with, but hard to scale. It becomes a nightmare when hundreds of tenants sign up.
In the data-segregation model, the application is shared between tenants, but the data of each tenant is stored in a separate data store. Separate data stores could be separate databases or separate schemas within the same database.
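A minimal sketch of the separate-databases flavor of data segregation, assuming SQLite and one database file per tenant. The file naming scheme and the `events` table are hypothetical; a production system would more likely use separate databases or schemas on a database server.

```python
import sqlite3
from pathlib import Path


def tenant_db_path(data_dir: Path, tenant: str) -> Path:
    """Each tenant gets its own database file (hypothetical naming scheme)."""
    if not tenant.isalnum():
        raise ValueError(f"invalid tenant name: {tenant!r}")
    return data_dir / f"{tenant}.sqlite3"


def connect_for_tenant(data_dir: Path, tenant: str) -> sqlite3.Connection:
    """Open the tenant's own database, creating the shared schema if needed."""
    conn = sqlite3.connect(tenant_db_path(data_dir, tenant))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn


def add_event(conn: sqlite3.Connection, title: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO events (title) VALUES (?)", (title,))


def list_events(conn: sqlite3.Connection) -> list[str]:
    rows = conn.execute("SELECT title FROM events ORDER BY id")
    return [row[0] for row in rows]
```

The application code is the same for every tenant; only the connection differs, so one tenant's queries cannot reach another tenant's rows.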
To support such data-segregation, there has to be an additional management layer in the system architecture to provision a separate data-store every time a tenant signs up. Multi-tenancy includes customized UI for each tenant, selective subscription of services, and metered billing. This management layer is responsible for all of these functions.
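The responsibilities of such a management layer can be sketched as a toy class. Everything here is an assumption for illustration: the `TenantManager` name, the in-memory dictionaries, and the flat price per usage unit. A real system would create an actual database or schema at sign-up and persist the usage records.

```python
class TenantManager:
    """Toy management layer: provisioning, subscriptions, and usage metering."""

    def __init__(self, rate_per_unit: float):
        self.rate_per_unit = rate_per_unit  # hypothetical price per usage unit
        self.stores = {}    # tenant -> data-store identifier
        self.services = {}  # tenant -> set of subscribed services
        self.usage = {}     # tenant -> metered usage units

    def sign_up(self, tenant: str, services: set) -> None:
        """Provision a data store and record the selected services."""
        if tenant in self.stores:
            raise ValueError(f"tenant {tenant!r} already exists")
        # Placeholder: a real system would create a database or schema here.
        self.stores[tenant] = f"db_{tenant}"
        self.services[tenant] = set(services)
        self.usage[tenant] = 0

    def record_usage(self, tenant: str, service: str, units: int = 1) -> None:
        """Meter usage, but only for services the tenant subscribed to."""
        if service not in self.services.get(tenant, set()):
            raise PermissionError(f"{tenant!r} is not subscribed to {service!r}")
        self.usage[tenant] += units

    def invoice(self, tenant: str) -> float:
        """Bill according to usage instead of a flat rate."""
        return self.usage[tenant] * self.rate_per_unit
```

For example, after `sign_up("martin", {"calendar"})`, calendar usage is metered and billed, while an attempt to record usage of a documents service is rejected.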
By now it should be obvious that multi-tenant systems are not appropriate for every web application. A multi-user architecture is sufficient for a B2C (business-to-customer) web application. A multi-tenant architecture should be considered for a B2B (business-to-business) application.
Even though a multi-tenant architecture is a complex one, data privacy regulations like GDPR will force most B2B SaaS applications to become multi-tenant systems. If you are building a new B2B SaaS application, you had better start with a multi-tenant architecture.
-
[1] I have no connection with owncloud. This is only an example.
Top comments (6)
There's a third model for data segregation: a single schema in a single database, with data identified as belonging to one or another tenant by a key column. Usually the key can be traced back to a table representing all tenants in the system. If you set up foreign keys appropriately, you only need the tenant id in tables immediately related to your tenants, since rows in tables further out are related to rows which are related to tenants. There can be good reasons to add the tenant id field where it's not strictly needed, such as avoiding poorly-performing joins.
The shared-tables approach is by far the simplest and lightest in terms of infrastructure. You don't have to worry about staging new data storage for new tenants, and you don't have to juggle connections to query data for tenant A instead of tenant B: just ensure you're filtering for the correct tenant id. You may not even need that management layer.
It's not all roses, though; you have to be really careful about managing your tenant ids to ensure that nobody sees anyone else's stuff. In certain industries like health or finance, standards or regulations dictate stricter segregation of tenant data. And everyone's data living in the same tables makes backups, restores, and exports an all-or-nothing proposition; I had to develop a tool to let me work with discrete tenant data sets.
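A minimal sketch of the shared-tables approach described in this comment, assuming SQLite; the `tenants` and `events` tables and the helper names are hypothetical. Every query on tenant-owned data filters on the tenant id.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(
        """
        CREATE TABLE tenants (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE events (
            id        INTEGER PRIMARY KEY,
            tenant_id INTEGER NOT NULL REFERENCES tenants(id),
            title     TEXT NOT NULL
        );
        """
    )


def add_tenant(conn: sqlite3.Connection, name: str) -> int:
    with conn:
        cur = conn.execute("INSERT INTO tenants (name) VALUES (?)", (name,))
    return cur.lastrowid


def add_event(conn: sqlite3.Connection, tenant_id: int, title: str) -> None:
    with conn:
        conn.execute(
            "INSERT INTO events (tenant_id, title) VALUES (?, ?)",
            (tenant_id, title),
        )


def events_for(conn: sqlite3.Connection, tenant_id: int) -> list[str]:
    """Always filter by tenant id so one tenant never sees another's rows."""
    rows = conn.execute(
        "SELECT title FROM events WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [row[0] for row in rows]
```

The isolation here depends entirely on every query carrying the `WHERE tenant_id = ?` filter, which is the care the comment warns about.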
There's one other caveat I forgot to mention: it's difficult if you let tenants manage installations and/or dictate the pace of upgrades they receive, which, unless you're selling only service licenses, you probably are. That was the original job I built arachne for: pulling tenant data out of shared storage when they wanted to go on-premise instead. Had I anticipated that scenario, I might have done things differently on that one.
This is the golden way to do it. With DB-level partitioning and globally unique table keys, you can even move data between tenant groups in different databases. Yes, a little bit advanced, but great stuff :-).
This is actually the way .NET does it by default. I'm surprised the original author didn't include it.
Good article!
Explains a complex subject in a simple way.