Design a multitenant application on Rails 6 with horizontal sharding

#rails #programming #webdev #database

One of the most common design patterns for multitenant architectures is to associate every tenant with a unique subdomain on your root domain. For example, if your application runs on example.com, a tenant named marvel would access the system at marvel.example.com, and so on.
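
To make the mapping concrete, here is a minimal sketch of pulling the tenant subdomain out of a request host. The helper name tenant_subdomain and the ROOT_DOMAIN constant are assumptions for illustration, not part of the setup described in this article.

```ruby
# Assumed root domain, used for illustration only.
ROOT_DOMAIN = 'example.com'

# Returns the tenant subdomain for a host such as "marvel.example.com",
# or nil when the host is the bare root domain, a nested subdomain,
# or belongs to a different domain.
def tenant_subdomain(host, root_domain: ROOT_DOMAIN)
  hostname = host.to_s.downcase.split(':').first # drop an optional port
  return nil if hostname.nil?

  suffix = ".#{root_domain}"
  return nil unless hostname.end_with?(suffix)

  subdomain = hostname.delete_suffix(suffix)
  return nil if subdomain.empty? || subdomain.include?('.')

  subdomain
end
```

Calling tenant_subdomain('marvel.example.com:3000') yields 'marvel', while the bare root domain yields nil.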

This pattern has its own advantages (easier and faster DNS resolution when running on a multi-pod setup) and disadvantages (DNS updates for every tenant creation). Instead of debating those, we will delve into how to implement this architecture in a Rails application using the multiple-database support introduced in Rails 6.0 and the horizontal sharding support added in Rails 6.1.

To begin with, we will need a Tenant model. Since tenants are identified by subdomains, it makes sense to have a subdomain column in the table along with any other attributes the application requires. Each tenant belongs to a shard, and all of that tenant's data resides on that shard, so we will need a Shard model as well.

We can begin by setting up the required database configurations first:

# config/database.yml

default: &default
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000

development:
  default:
    <<: *default
    database: primary_db
  default_replica:
    <<: *default
    database: primary_db_replica
    replica: true
  shard1:
    <<: *default
    database: shard1_db
  shard1_replica:
    <<: *default
    database: shard1_db_replica
    replica: true

Next, we define the required models.

# app/models/application_record.rb

# frozen_string_literal: true
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  db_keys = Rails.application.config.database_configuration[Rails.env].keys

  db_configs = db_keys.each_with_object({}) do |key, configs|
    # key = default, db_key = default
    # key = default_replica, db_key = default
    db_key = key.gsub('_replica', '')
    role = key.eql?(db_key) ? :writing : :reading

    db_key = db_key.to_sym
    configs[db_key] ||= {}

    configs[db_key][role] = key.to_sym
  end

  # connects_to shards: {
  #   default: { writing: :default, reading: :default_replica },
  #   shard1: { writing: :shard1, reading: :shard1_replica }
  # }
  connects_to shards: db_configs
end
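
The grouping step is easier to follow outside Rails. The sketch below applies the same idea to a plain array of configuration keys; group_by_shard is a hypothetical helper name, and the keys are the ones from the database.yml above.

```ruby
# Groups database config keys into the hash shape that `connects_to shards:`
# expects. A key ending in "_replica" is treated as the reading role of the
# shard named by the rest of the key; any other key is the writing role.
def group_by_shard(config_keys)
  config_keys.each_with_object({}) do |key, shards|
    shard_name = key.delete_suffix('_replica')
    role = key == shard_name ? :writing : :reading

    shards[shard_name.to_sym] ||= {}
    shards[shard_name.to_sym][role] = key.to_sym
  end
end

keys = %w[default default_replica shard1 shard1_replica]
grouped = group_by_shard(keys)
```

Here grouped maps :default and :shard1 to their respective writing and reading connection names.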

# app/models/global_record.rb

# frozen_string_literal: true
class GlobalRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to database: { writing: :default, reading: :default_replica }
end

# app/models/tenant.rb

# frozen_string_literal: true
class Tenant < ApplicationRecord
  include ActsAsCurrent

  validates :subdomain, format: { with: DOMAIN_REGEX }
  # other DSL

  after_commit :set_shard, on: :create

  private

  def set_shard
    Shard.create!(tenant_id: self.id, domain: subdomain)
  end
end

# app/models/shard.rb

# frozen_string_literal: true
class Shard < GlobalRecord
  include ActsAsCurrent

  validates :domain, format: { with: DOMAIN_REGEX }
  validates :tenant_id, presence: true

  before_create :set_current_shard

  private

  def set_current_shard
    self.shard = APP_CONFIGS[:current_shard] # e.g. shard1
  end
end
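
Both models include ActsAsCurrent, which is not shown in this article. As a rough idea of what such a concern might look like, here is a plain-Ruby sketch that stores the current record per thread. Everything in it (the method names other than make_current, the storage key, and the DemoShard class) is an assumption made for illustration.

```ruby
# Hypothetical sketch of an ActsAsCurrent-style module (not the article's code).
# It keeps one "current" instance per class in thread/fiber-local storage.
module ActsAsCurrent
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Returns the instance last marked as current on this thread, or nil.
    def current
      Thread.current[current_key]
    end

    # Clears the current instance, e.g. at the end of a request or job.
    def reset_current
      Thread.current[current_key] = nil
    end

    def current_key
      :"current_#{name.downcase}"
    end
  end

  # Marks this instance as the current one for its class.
  def make_current
    Thread.current[self.class.current_key] = self
  end
end

# Stand-in class for demonstration; a real model would be an ActiveRecord class.
DemoShard = Struct.new(:shard) do
  include ActsAsCurrent
end

demo = DemoShard.new(:shard1)
demo.make_current
```

After make_current, DemoShard.current returns the same object until reset_current is called. Rails also ships ActiveSupport::CurrentAttributes, which covers a similar need.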

With multitenant architectures, there is always a global context and a tenant-specific context. We separate the two through the abstract classes ApplicationRecord and GlobalRecord, which also abstract the database connections and set up the required isolation.

We can also leverage the BelongsToTenant pattern for all models that belong to a tenant and inherit from ApplicationRecord.

By default, all models that inherit from ActiveRecord connect to the default shard with the writing role unless they are wrapped in a connected_to block. Hence, models that inherit from GlobalRecord do not require any explicit connection handling.

We can also define a proxy class to abstract out all application specific connection handling logic:

# app/proxies/database_proxy.rb

# frozen_string_literal: true
class DatabaseProxy
  class << self
    def on_shard(shard:, &block)
      _connect_to_(role: :writing, shard: shard, &block)
    end

    def on_replica(shard:, &block)
      _connect_to_(role: :reading, shard: shard, &block)
    end

    def on_global_replica(&block)
      _connect_to_(klass: GlobalRecord, role: :reading, &block)
    end

    # for regular executions, since Global only connects to default shard,
    # no explicit connection switching is required.
    # def on_global(&block)
    #   _connect_to_(klass: GlobalRecord, role: :writing, &block)
    # end

    private

    def _connect_to_(klass: ApplicationRecord, role: :writing, shard: :default, &block)
      klass.connected_to(role: role, shard: shard) do
        block.call
      end
    end
  end
end

With this setup in place, we can now write both application and background middlewares that handle shard selection and tenant isolation on a per request or job basis.

# lib/middlewares/multitenancy.rb

# frozen_string_literal: true
module Middlewares
  # selects the shard and tenant based on the request's subdomain
  class Multitenancy
    def initialize(app)
      @app = app
    end

    def call(env)
      domain = env['HTTP_HOST']

      shard = Shard.find_by(domain: domain)
      return @app.call(env) unless shard

      shard.make_current
      DatabaseProxy.on_shard(shard: shard.shard) do
        tenant = Tenant.find_by(subdomain: domain)

        tenant&.make_current
        @app.call(env)
      end
    end
  end
end

# config/application.rb
require_relative '../lib/middlewares/multitenancy'

config.middleware.insert_after Rails::Rack::Logger, Middlewares::Multitenancy
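
The same idea carries over to background jobs. The sketch below is one possible shape, not the implementation used here: it mirrors the call(worker, job, queue) signature of Sidekiq server middleware, and assumes each job payload carries a hypothetical tenant_domain key. The shard lookup and the connection switch are injected as callables so the example runs without Rails.

```ruby
# Hypothetical job middleware sketch. It has no Sidekiq dependency; it only
# follows the same `call(worker, job, queue)` shape and yields to the job.
class TenantJobMiddleware
  # find_shard: callable mapping a tenant domain to a shard name (or nil)
  # switch:     callable that runs the given block against a shard
  def initialize(find_shard:, switch:)
    @find_shard = find_shard
    @switch = switch
  end

  def call(_worker, job, _queue, &block)
    shard = @find_shard.call(job['tenant_domain'])
    # Without a known shard, run the job on the default connection.
    return block.call unless shard

    @switch.call(shard, &block)
  end
end

# Demo wiring with in-memory stand-ins for the lookup and the switch.
visited = []
middleware = TenantJobMiddleware.new(
  find_shard: ->(domain) { { 'marvel' => :shard1 }[domain] },
  switch: ->(shard, &job_block) { visited << shard; job_block.call }
)
result = middleware.call(nil, { 'tenant_domain' => 'marvel' }, 'default') { :done }
```

In the application, find_shard could wrap Shard.find_by(domain: ...) and switch could wrap DatabaseProxy.on_shard.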

For anyone building new products on the web, Ruby on Rails has never been a better choice to kickstart your next big unicorn.

Top comments (6)

Jarrodsz • Jul 20 '22

Awesome writeup!

How about creating tenants in an admin environment and automatically creating the database for each tenant? How would you run migrations on these databases?

Any thoughts on automating the above, and on generating the database.yml dynamically from code or loading the list of tenant databases at runtime?

That way one could automate the tenant creation.

Ritikesh • Jul 31 '22

Thanks @jarrodsz! Rails does not support runtime reloading of Database configurations as far as I know. You'd probably have to write some patches to do that. I'd instead recommend automating the following in order -

1. Creation of the tenant, not yet accessible, controlled by some flag/switch.
2. Adding the tenant to the database configs.
3. Triggering a deployment with the newly generated configs.
4. Updating the state of the tenant and making it accessible.
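
As a rough sketch of the config step, the per-shard entries could be generated from a list of shard names. The shard_configs helper and the _db naming scheme are assumptions that simply follow the database.yml from the article.

```ruby
require 'yaml'

# Builds writer and replica entries for each shard name, following the
# "<name>_db" / "<name>_db_replica" naming used in the article's database.yml.
def shard_configs(shard_names, adapter: 'sqlite3')
  shard_names.each_with_object({}) do |name, configs|
    configs[name] = { 'adapter' => adapter, 'database' => "#{name}_db" }
    configs["#{name}_replica"] = {
      'adapter' => adapter,
      'database' => "#{name}_db_replica",
      'replica' => true
    }
  end
end

# Render the generated entries as a YAML fragment for one environment.
generated_yaml = { 'development' => shard_configs(%w[shard1 shard2]) }.to_yaml
```

The resulting fragment would still need to be merged with the default (primary) entries before a deployment picks it up.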

Taranjyot Singh • Jul 30 '21 • Edited

Hi Ritikesh, Thank you for the above article. I'm working on migrating our rails app to multi-DB architecture. Is the read_replica switched automatically or you have to explicitly call DatabaseProxy.on_replica for read operations application wide?

Ritikesh • Sep 4 '21

Hi Taranjyot,

Thanks for reading. To answer your question, Rails allows for a default connection switching option for web requests as well - you can read more here. However, we have explicit connection switching logic where required. For example, in controller context, we have a private method called around_replica and use it as an around_action for actions (e.g. DB-backed searches, GET requests, etc.) where we want reads to go to the replica.

  # application_controller.rb
  private
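  def around_replica(&block)
    # on_replica requires a shard; Shard.current is assumed to come from ActsAsCurrent
    DatabaseProxy.on_replica(shard: Shard.current.shard, &block)
  end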

  # some_controller.rb
  around_action :around_replica, only: %i[show index]