Hopsoft

Optimizing Rails connections

This is the 3rd post in a series that starts here.

Have you ever experienced a Rails app that suddenly and inexplicably delivers dismal response times or becomes completely unresponsive? It can be incredibly confusing, especially when nothing in the environment has changed recently.

Let's look at one of the most common reasons this might happen.

Datastore Connections

Correctly configuring Rails connections can be complicated. Here's a short video that explains how easy it is to get wrong.

Note that this problem isn't isolated to your primary database, either. It's possible to misconfigure any datastore that has limited connections: Postgres, Redis, Memcached, and so on.

There are a few aspects of this problem to consider.

Connection Limits

Datastores have an upper limit to how many active connections they can reasonably support. System resources typically govern these limits. For example, PostgreSQL defaults to 100 connections while Redis defaults to 10,000. It's possible to override default connection limits. Just be sure you know what you're doing.

Hosting providers impose further limits. Consider that Heroku's Standard 0 Postgres plan caps at 120 connections and their Premium 0 Redis plan caps at 40.

Connection Pools

Establishing a connection is an expensive and slow operation. Connection pooling reduces overall latency by reusing pre-established connections rather than setting up a new connection for every operation.

The improved efficiency gained from connection pooling enables applications to serve more concurrent requests than otherwise possible based on datastore connection limits alone. Pooling is a force multiplier.
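To see why reuse matters, here's a minimal, hand-rolled sketch of a pool. It's a toy illustration only; `TinyPool` and its connector block are invented names for this example, and real apps should reach for a gem like connection_pool instead:

```ruby
# Toy connection pool: illustration only, not production code.
# The "connection" is a plain Object standing in for an expensive
# datastore connection. Real apps should use a gem like connection_pool.
class TinyPool
  def initialize(size:, &connector)
    @connections = SizedQueue.new(size)
    size.times { @connections.push(connector.call) }
  end

  # Check a connection out, yield it, then always return it to the pool.
  def with
    conn = @connections.pop
    yield conn
  ensure
    @connections.push(conn) if conn
  end
end

established = 0
pool = TinyPool.new(size: 2) { established += 1; Object.new }

# Ten threads do work, but only two connections are ever established.
10.times.map do
  Thread.new { pool.with { |conn| conn.object_id } }
end.each(&:join)

established # => 2
```

Ten units of work share two connections; the datastore never sees more than two, which is exactly the multiplier effect described above.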

Deployment Topology

The deployment configuration plays a significant role. Consider a database with a limit of 40 connections. We'd be at capacity with 2 servers configured to use 20 connections each.

This problem compounds as the deployment becomes more complex. And the challenge isn't limited to the number of servers. We also need to account for the number of processes running per server and the number of threads active in each process.

The Variables

Typical Rails applications use the following environment variables.

  • WEB_CONCURRENCY - the number of processes
  • RAILS_MAX_THREADS - the number of threads

These are descriptive names, but I think we can do better. Given that Rails applications do more than serve web requests, let's consider some more suitable names before looking at formulas.

  • SERVERS - the number of servers/dynos
  • PROCESSES - the number of processes
  • THREADS - the number of threads

We also need to factor in external systems that have access to the database. This adds another variable to consider.

  • EXTERNAL_CONNECTIONS

Those are all the variables; now on to the formula.

The Formula

The formula is quite simple.

(SERVERS * PROCESSES * THREADS) + EXTERNAL_CONNECTIONS

Let's look at a standard Rails deployment.

  • 2 web servers, 2 processes, 8 threads
  • 3 worker servers, 4 processes, 16 threads
  • 10 external connections

Standard Rails Deployment

Here's the equation for the number of connections.

(2 * 2 * 8) + (3 * 4 * 16) + 10 = 234

It's vital to ensure that you don't configure more connections than are available. You may need to increase the provisioned hardware or reduce the number of connections your application uses.
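Scripting the arithmetic makes it easy to re-check as the topology changes. Here's a small, hypothetical helper (`required_connections` is an invented name) that applies the formula above to each tier:

```ruby
# Hypothetical helper: totals required connections across deployment tiers.
# Each tier is [servers, processes, threads], per the formula above.
def required_connections(tiers, external_connections: 0)
  tiers.sum { |servers, processes, threads| servers * processes * threads } +
    external_connections
end

# The standard deployment from the example:
#   2 web servers x 2 processes x 8 threads
#   3 worker servers x 4 processes x 16 threads
#   10 external connections
required_connections([[2, 2, 8], [3, 4, 16]], external_connections: 10) # => 234
```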

Configuration

Here's an example of configuring Rails for the deployment described above.

# config/database.yml
default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV["THREADS"] %>
  # it's generally a good idea to configure timeouts
  connect_timeout: 1
  variables:
    statement_timeout: 5s
    lock_timeout: 2s
# config/initializers/sidekiq.rb
settings = {
  url: ENV["REDIS_QUEUE_URL"], 
  size: ENV["THREADS"].to_i + 2, # pool size
  # it's generally a good idea to configure timeouts
  connect_timeout: 0.2,
  read_timeout: 0.5,
  write_timeout: 0.5
}

Sidekiq.configure_server do |config|
  config.redis = settings
end

Sidekiq.configure_client do |config|
  config.redis = settings
end
# config/environments/production.rb
Rails.application.configure do
  config.cache_store = :redis_cache_store, {
    url: ENV["REDIS_CACHE_URL"],
    size: ENV["THREADS"].to_i + 2, # pool size
    # it's generally a good idea to configure timeouts
    connect_timeout: 0.2,
    read_timeout: 0.5,
    write_timeout: 0.5
  }
end
# config/puma.rb
workers ENV["PROCESSES"].to_i
max_threads_count = ENV["THREADS"].to_i
min_threads_count = max_threads_count
threads min_threads_count, max_threads_count
# Procfile
web: PROCESSES=2 THREADS=8 bin/start-pgbouncer bundle exec puma -C config/puma.rb
worker: PROCESSES=4 THREADS=16 bin/start-pgbouncer bundle exec sidekiq -C config/sidekiq.yml

The formula and configuration look simple, but the complexity becomes apparent as the deployment grows more sophisticated. For example, consider a deployment with multiple worker pools of varying sizes backed by Redis instances with different connection limits. Can you think of other deployment topologies? How does the math work out?

Advanced Options

Stacked Pooling

It's possible to stack connection pooling with tools like PgBouncer, which helps ensure your Rails application doesn't exhaust database connections. Heroku supports this with a buildpack.

Sidekiq Swarm

The number of processes and threads can be hard to determine when using tools like Sidekiq Swarm, which forks multiple Sidekiq processes from a single command. Make sure you know exactly how many processes and threads each forked worker adds to the math.
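If I understand Swarm's model correctly, the fork count typically comes from its own environment variable (SIDEKIQ_COUNT), so the worker-tier math becomes a sketch like the following, with purely illustrative numbers:

```ruby
# Illustrative numbers only -- substitute your own deployment's values.
servers     = 3   # worker servers/dynos
swarm_count = 4   # e.g. SIDEKIQ_COUNT: processes forked per server by Swarm
threads     = 16  # Sidekiq concurrency per forked process

servers * swarm_count * threads # => 192
```

Each fork carries its own thread pool, so forgetting to multiply by the fork count undercounts your connection demand by a factor of `swarm_count`.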
