When developing a B2B SaaS application, maintaining strict data isolation while ensuring developer productivity is paramount. Schema-based multi-tenancy in NestJS with TypeORM and PostgreSQL provides a balanced approach to achieve this, ensuring high security without compromising efficiency. This blog will guide you through setting up a schema-based multi-tenancy system with automatically generated migrations, ensuring your database schema stays in sync with your entity definitions at all times.
## Why Schema-Based Multi-Tenancy?

### The Problem

Traditional multi-tenancy approaches, such as the pooling strategy (using a `tenantId` key for scoping requests), often complicate queries and increase the risk of data leaks due to developer oversight. Furthermore, as the number of tenants grows, database performance can degrade, and operations like data extraction or restoration for individual tenants become cumbersome.
### The Solution
Schema-based multi-tenancy partitions data by creating a separate schema for each tenant within the same database instance. This setup inherently scopes queries to the tenant’s schema, improving security and simplifying development.
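To make the difference concrete, here is a minimal illustration (not part of the application code below) contrasting the two query styles:

```typescript
// Pooled multi-tenancy: every query must remember the tenant filter, and a
// forgotten WHERE clause silently leaks data across tenants.
const pooledQuery = 'SELECT * FROM cats WHERE tenant_id = $1';

// Schema-based multi-tenancy: the connection's schema scopes the query, so
// an unqualified table name can only ever touch the tenant's own data.
function schemaScopedQuery(schema: string): string {
  return `SELECT * FROM "${schema}".cats`;
}
```

With schema scoping, the tenant boundary is enforced by the connection itself rather than by the discipline of each individual query.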
## Setting Up the Groundwork

### Repository Structure
Organize your repository with clear distinctions between public and tenant-specific modules. Here’s a simplified structure:
```text
src/
  modules/
    public/
      entities/
      migrations/
    tenant/
      entities/
      migrations/
    tenancy/
      tenancy.module.ts
      tenancy.middleware.ts
      tenancy.utils.ts
```
### TypeORM Configuration
Configure TypeORM for both public and tenant schemas. Public entities and migrations are straightforward, while tenant-specific configurations must ensure the correct schema is targeted.
```typescript
// public-orm.config.ts
import { DataSourceOptions } from 'typeorm';

export const publicConfig: DataSourceOptions = {
  type: 'postgres',
  host: process.env.DB_HOST,
  port: +process.env.DB_PORT,
  username: process.env.DB_USERNAME,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
  entities: [__dirname + '/../modules/public/entities/*.entity{.ts,.js}'],
  migrations: [__dirname + '/../modules/public/migrations/*{.ts,.js}'],
  synchronize: false,
};
```
```typescript
// tenant-orm.config.ts
import { DataSourceOptions } from 'typeorm';
import { publicConfig } from './public-orm.config';

export const tenantConfig: DataSourceOptions = {
  ...publicConfig,
  entities: [__dirname + '/../modules/tenant/entities/*.entity{.ts,.js}'],
  migrations: [__dirname + '/../modules/tenant/migrations/*{.ts,.js}'],
};
```
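A quick illustration (standalone, not part of the app code) of why this layering works: object spread copies the base keys, and keys written afterwards win. That is how `tenantConfig` swaps the entity and migration globs while inheriting everything else, and how a per-tenant `schema` gets layered on top later.

```typescript
// Later keys override earlier ones when spreading, so each layer only
// declares what it changes.
const base = { type: 'postgres', entities: ['public/*.entity{.ts,.js}'], synchronize: false };
const tenantOverrides = { ...base, entities: ['tenant/*.entity{.ts,.js}'] };
const scoped = { ...tenantOverrides, schema: 'tenant_42' };
```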
### Tenancy Utils

We also set up a simple cache for our `DataSource` instances, which saves us from instantiating a new connection on every incoming request.
```typescript
// tenancy.utils.ts
import { DataSource, DataSourceOptions } from 'typeorm';
import { MAX_CONNECTION_POOL_SIZE } from '../../env';
import { tenantConfig } from '../../tenant-orm.config';

export const tenantConnections: { [schemaName: string]: DataSource } = {};

export async function getTenantConnection(tenantId: string): Promise<DataSource> {
  const connectionName = `tenant_${tenantId}`;
  if (tenantConnections[connectionName]) {
    return tenantConnections[connectionName];
  }
  const dataSource = new DataSource({
    ...tenantConfig,
    name: connectionName,
    schema: connectionName,
    poolSize: MAX_CONNECTION_POOL_SIZE,
  } as DataSourceOptions);
  await dataSource.initialize();
  tenantConnections[connectionName] = dataSource;
  return dataSource;
}
```
## Creating Tenants

Creating a tenant involves generating a new schema, running tenant-specific migrations, and setting up tenant-specific connections. Here's an overview of the process in the `TenantsService`:
```typescript
// tenants.service.ts
import { Injectable } from '@nestjs/common';
import { InjectDataSource } from '@nestjs/typeorm';
import { DataSource, DataSourceOptions } from 'typeorm';
import { tenantConfig } from '../../tenant-orm.config';
import { CreateTenantDto } from './dto/create-tenant.dto';
import { Tenant } from './entities/tenant.entity';

@Injectable()
export class TenantsService {
  constructor(@InjectDataSource() private dataSource: DataSource) {}

  async createTenant(tenantDto: CreateTenantDto): Promise<Tenant> {
    const tenant = new Tenant();
    tenant.name = tenantDto.name;
    await this.dataSource.getRepository(Tenant).save(tenant);

    // Quote the identifier; schemaName is derived from our own generated ID,
    // never from raw user input.
    const schemaName = `tenant_${tenant.id}`;
    await this.dataSource.query(`CREATE SCHEMA "${schemaName}"`);

    // Run migrations for the new schema
    await this.runMigrations(schemaName);
    return tenant;
  }

  private async runMigrations(schemaName: string) {
    // Use the tenant config (tenant entities and migrations), not the default
    // data source options, and point it at the freshly created schema.
    const tenantDataSource = new DataSource({
      ...tenantConfig,
      schema: schemaName,
    } as DataSourceOptions);
    await tenantDataSource.initialize();
    await tenantDataSource.runMigrations();
    await tenantDataSource.destroy();
  }
}
```
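Since DDL statements like `CREATE SCHEMA` cannot take bind parameters, the schema name is interpolated directly into SQL. Here it is derived from our own generated tenant ID, but it costs little to validate it anyway before interpolating. A small guard along these lines (the `isSafeSchemaName` helper is hypothetical, not part of the service above) rules out injection entirely:

```typescript
// DDL cannot use parameter placeholders, so the schema name ends up inside
// raw SQL. Restricting it to a strict `tenant_<suffix>` shape, and to
// PostgreSQL's 63-byte identifier limit, makes the interpolation safe.
function isSafeSchemaName(name: string): boolean {
  return /^tenant_[a-z0-9_]+$/i.test(name) && name.length <= 63;
}
```

Call it right before the `CREATE SCHEMA` query and throw if it returns `false`.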
## Handling Requests

### Middleware for Tenant Identification

A middleware extracts the tenant ID from the request header and attaches it to the request object for downstream processing.
```typescript
// tenancy.middleware.ts
import { Injectable, NestMiddleware } from '@nestjs/common';
import { Request, Response, NextFunction } from 'express';

@Injectable()
export class TenancyMiddleware implements NestMiddleware {
  use(req: Request, res: Response, next: NextFunction) {
    // Express types header values as string | string[] | undefined;
    // reject both missing and repeated headers.
    const tenantId = req.headers['x-tenant-id'];
    if (!tenantId || Array.isArray(tenantId)) {
      return res.status(400).send('Tenant ID is missing');
    }
    req['tenantId'] = tenantId;
    next();
  }
}
```
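The header-handling edge cases can be isolated into a pure function, which also makes them easy to unit test. This helper is hypothetical (the middleware above inlines the checks), but captures the same rules:

```typescript
// Normalize an HTTP header value: reject missing, repeated, or blank
// headers so downstream code always sees exactly one non-empty tenant ID.
function extractTenantId(header: string | string[] | undefined): string | null {
  if (typeof header !== 'string') return null; // missing, or sent more than once
  const trimmed = header.trim();
  return trimmed.length > 0 ? trimmed : null;
}
```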
## Dynamic Connections

### Dependency Injection

In your NestJS application, set up dependency injection to provide tenant-specific database connections. This ensures that the correct connection is used based on the tenant ID.
```typescript
// tenancy.module.ts
import { Global, Module, Scope } from '@nestjs/common';
import { REQUEST } from '@nestjs/core';
import { Request } from 'express';
import { CONNECTION } from './tenancy.symbols';
import { getTenantConnection } from './tenancy.utils';

/**
 * Note that because of Scope Hierarchy, all injectors of this
 * provider will be request-scoped by default. Hence there is
 * no need for example to specify that a consuming tenant-level
 * service is itself request-scoped.
 * https://docs.nestjs.com/fundamentals/injection-scopes#scope-hierarchy
 */
const connectionFactory = {
  provide: CONNECTION,
  scope: Scope.REQUEST,
  useFactory: async (request: Request) => {
    const { tenantId } = request as Request & { tenantId?: string };
    if (tenantId) {
      // Return the cached, schema-scoped DataSource directly. Creating a
      // QueryRunner per request without releasing it would leak pool
      // connections until the server hangs.
      return getTenantConnection(tenantId);
    }
    return null;
  },
  inject: [REQUEST],
};

@Global()
@Module({
  providers: [connectionFactory],
  exports: [CONNECTION],
})
export class TenancyModule {}
```
Services that handle tenant-specific operations then simply receive the tenant-scoped connection through the `CONNECTION` token:
```typescript
// cats.service.ts
import { Inject, Injectable, Scope } from '@nestjs/common';
import { DataSource, Repository } from 'typeorm';
import { CONNECTION } from '../tenancy/tenancy.symbols';
import { CreateCatDto } from './dto/create-cat.dto';
import { Cat } from './entities/cat.entity';

@Injectable({ scope: Scope.REQUEST })
export class CatsService {
  private catRepository: Repository<Cat>;

  constructor(@Inject(CONNECTION) private readonly connection: DataSource) {
    this.catRepository = connection.getRepository(Cat);
  }

  async createCat(catDto: CreateCatDto): Promise<Cat> {
    const cat = new Cat();
    cat.name = catDto.name;
    return this.catRepository.save(cat);
  }

  async getAllCats(): Promise<Cat[]> {
    return this.catRepository.find();
  }
}
```
## Automatic Migration Generation

This is the real secret sauce of peak DX: it saves hours of hair-pulling over inconsistencies caused by manually written migrations. Automating migration generation for both public and tenant schemas keeps the database and the entity definitions in sync. Below is a script that generates migrations for public and tenant entities:
```typescript
// generateMigrations.ts
import * as fs from 'fs';
import * as path from 'path';
import { DataSource, DataSourceOptions } from 'typeorm';
import { camelCase } from 'typeorm/util/StringUtils';
import { publicConfig } from './public-orm.config';
import { tenantConfig } from './tenant-orm.config';

const defaultSchema = 'tenant_default';

// Usage: generateMigrations <dev|test|prod> <public|tenanted>
if (process.argv.length < 4) {
  console.log('Usage: generateMigrations <dev|test|prod> <public|tenanted>');
  process.exit(1);
}

const environment = process.argv[2];
const type = process.argv[3];

if (['dev', 'development'].includes(environment)) {
  process.env.NODE_ENV = 'development';
} else if (['test', 'testing'].includes(environment)) {
  process.env.NODE_ENV = 'test';
} else if (['prod', 'production'].includes(environment)) {
  process.env.NODE_ENV = 'production';
} else {
  console.log('Please choose either dev/development, test/testing, or prod/production');
  process.exit(1);
}

let dataSource: DataSource;
if (type === 'public') {
  dataSource = new DataSource(publicConfig as DataSourceOptions);
} else if (type === 'tenanted') {
  // Diff against a reference schema; the generated SQL is parameterized
  // below so it can run against any tenant schema.
  dataSource = new DataSource({ ...tenantConfig, schema: defaultSchema } as DataSourceOptions);
} else {
  console.log('Please choose either public or tenanted');
  process.exit(1);
}

generateMigrations()
  .then(({ upSqls, downSqls }) => {
    console.log('Migration generated successfully');
    const timestamp = new Date().getTime();
    const fileContent = getTemplate('migration', timestamp, upSqls, downSqls.reverse());
    const filePath = path.join(__dirname, `../migrations/${type}/${timestamp}-migration.ts`);
    fs.writeFileSync(filePath, fileContent);
    process.exit(0);
  })
  .catch((error) => {
    console.log(error);
    process.exit(1);
  });

function queryParams(parameters: any[] | undefined): string {
  if (!parameters || !parameters.length) {
    return '';
  }
  return `, ${JSON.stringify(parameters)}`;
}

async function generateMigrations() {
  await dataSource.initialize();
  // Diff the entity definitions against the live schema without applying anything.
  const logs = await dataSource.driver.createSchemaBuilder().log();
  const upSqls: string[] = [];
  const downSqls: string[] = [];
  logs.upQueries.forEach((upQuery) => {
    upSqls.push(
      `await queryRunner.query(\`${upQuery.query.replace(/`/g, '\\`').replace(new RegExp(defaultSchema, 'g'), '${schema}')}\`${queryParams(upQuery.parameters)});`
    );
  });
  logs.downQueries.forEach((downQuery) => {
    downSqls.push(
      `await queryRunner.query(\`${downQuery.query.replace(/`/g, '\\`').replace(new RegExp(defaultSchema, 'g'), '${schema}')}\`${queryParams(downQuery.parameters)});`
    );
  });
  return { upSqls, downSqls };
}

function getTemplate(name: string, timestamp: number, upSqls: string[], downSqls: string[]): string {
  const migrationName = `${camelCase(name, true)}${timestamp}`;
  return `import { MigrationInterface, QueryRunner } from 'typeorm';
import { PostgresConnectionOptions } from 'typeorm/driver/postgres/PostgresConnectionOptions';

export class ${migrationName} implements MigrationInterface {
  name = '${migrationName}';

  public async up(queryRunner: QueryRunner): Promise<void> {
    const { schema } = queryRunner.connection.options as PostgresConnectionOptions;
    ${upSqls.join('\n    ')}
  }

  public async down(queryRunner: QueryRunner): Promise<void> {
    const { schema } = queryRunner.connection.options as PostgresConnectionOptions;
    ${downSqls.join('\n    ')}
  }
}
`;
}
```
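The core trick of the script is mechanical: every SQL statement TypeORM generates against `tenant_default` is turned into a template string in which the schema is substituted when the migration actually runs. Isolated as a pure function (a sketch; the script above inlines this logic), the rewrite looks like:

```typescript
// Rewrite a generated SQL statement so it can be embedded in a template
// literal: escape backticks, then replace every occurrence of the reference
// schema with a ${schema} placeholder resolved at migration run time.
function rewriteForSchema(query: string, defaultSchema: string): string {
  return query
    .replace(/`/g, '\\`')
    .replace(new RegExp(defaultSchema, 'g'), '${schema}');
}
```

Because each tenant's `DataSource` carries its own `schema` option, the same generated migration file runs correctly against every tenant schema.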
Finally, add an npm script to `package.json`:

```json
"migration:generate": "ts-node src/generateMigrations.ts"
```

and run it with the environment and target type as arguments:

```sh
bun run migration:generate dev tenanted
```
While auto-generated migrations are a godsend, always double-check the generated changes manually and write appropriate data migrations where necessary. TypeORM CAN NOT be blindly trusted.
## Conclusion
Schema-based multi-tenancy, combined with automated migration generation, streamlines the development process while ensuring data security and isolation. By dynamically establishing tenant-specific connections and automating migration processes, we reduce the risk of human error and maintain consistency between the codebase and database schema.
This approach allows for scalable and maintainable multi-tenant applications, providing a robust foundation for B2B SaaS solutions. With the setup detailed above, you’re equipped to handle tenant-specific data management efficiently and securely.
## Credits
This blog post was inspired by Thomas van den Berg’s article on schema-based multi-tenancy with NestJS, TypeORM, and PostgreSQL. Thomas’s insights and detailed explanations provided a solid foundation for understanding and implementing schema-based multi-tenancy, which has greatly contributed to the development of this guide.