At work, we have a multi-tenancy setup where for performance purposes we split tenants across multiple identically-structured schemas. While giving each tenant their own schema would help with isolation, having too many separate database objects would make migrations and using pg_dump too resource intensive. Instead, we use 100 numerically named schemas to randomly distribute tenants across the schema space.
This works great as an easy way to kick the can of application performance down the road. We have a
master database which contains a record of all tenants and which schema their data is contained in. The app simply looks up that tenant and uses its corresponding schema.
The difficulty when performing queries across the entire database, either for analytics purposes, or to find a particular object without knowing which tenant it belongs to. This article by Dave Clark is a great starting point, but it wasn't quite doing it for me.
I've been able to combine his "application generated union query" and "union view" approaches into something that works really well for my use case of ad-hoc database exploration of just a handful of tables at a time, and doesn't require anything except pure SQL. My approach doesn't rely on periodic view refreshing, excessive load on the database, or wasting space on materialized views.
I didn't figure this out on my own; because postgres is generally statically typed, it's tricky to get it to do things in a generic or dynamic way. If you want to see my creative process, check out this Stack Overflow question.
Here's the code:
CREATE OR REPLACE FUNCTION app_union(tbl anyelement) RETURNS setof anyelement AS $$ BEGIN return query execute string_agg( distinct format('select * from %I.%I', schema, pg_typeof(tbl)::text), ' union all ' ) from tenants; END $$ LANGUAGE plpgsql; select * from app_union(null::my_table);
Simple, right? Breaking it down, there are two Postgres features that make this sort of thing possible: polymorphic types, and
record types are statically typed. This means that when returning a row or set of rows from a function, the caller has to provide the full row type definition of the result set. This looks something like
select my_function('table_name') t(id uuid, name text...). This is incredibly laborious to write out for ad-hoc queries.
Turns out, you can simply tell postgres to chill by using polymorphic types. However, the type still needs to be something postgres can infer, which is why we pass a null row of the table we want to the function instead of the name of the table as text.
We then build the query by using
pg_typeof to get the record type of the row we passed in, and casting that type to text.
Yes, I did promise this wouldn't hammer your database at the beginning, didn't I? Well, it turns out with dynamic sql, adding a where clause to the function is trivial. Here's my final version:
CREATE OR REPLACE FUNCTION app_union(tbl anyelement, extra text) RETURNS setof anyelement AS $$ BEGIN return query execute string_agg( distinct format('select * from %I.%I %s', schema, pg_typeof(tbl)::text, extra), ' union all ' ) from tenants; END $$ LANGUAGE plpgsql;
To solve my performance issues, I simply added an
extra text parameter to my function. This lets me pass any arbitrary sql into the function to go after the
from in every sub-query. So, for example
select * from app_union(null:my_table, 'where n > 1') lets me apply the filter at the original table level rather than pulling all 12 million records and applying the filter to the union query.
I wanted to get a list of all tenants active in the last 7 days. With
app_union, it's a breeze:
select name, created, last_activity from tenant join ( select tenant, max(created) as last_activity from app_union(null::event, 'where created > now() - interval ''7'' day') group by tenant ) e on tenant = tenant.id;
This query took about 10 seconds on my production server with 2.5MM event records. This is fairly hefty, but not bad for a one-off analytics query!
Thanks for reading — here's hoping this proves useful in your database wrangling adventures.