Ruby on Rails has a useful little feature called counter_cache. In a scenario where you have a model Author that has a has_many relationship with a model Book, Rails will keep the column authors.books_count filled with the respective number of books per author whenever you manipulate the database through Active Record (AR) models.
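For the classic case this is just one option on the association plus an integer column. A minimal sketch of that standard setup (model and column names here match the example above):

# models/book.rb
class Book < ActiveRecord::Base
  # Rails keeps authors.books_count in sync on create/destroy
  # (the authors table needs an integer books_count column, default 0)
  belongs_to :author, counter_cache: true
end

# models/author.rb
class Author < ActiveRecord::Base
  has_many :books
end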
The problem
That is all well and good for simple scenarios, but mine was more complicated than that. Imagine the following models:
# models/visit.rb
class Visit < ActiveRecord::Base
  has_many :sells

  def amount
    sells.sum("(sells.price * sells.count) * (1 - sells.discount / 100)")
  end
end

# models/sell.rb
class Sell < ActiveRecord::Base
  belongs_to :visit
end
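For reference, that sum call runs as a single aggregate query, roughly (hypothetical visit id):

SELECT SUM((sells.price * sells.count) * (1 - sells.discount / 100))
FROM sells
WHERE sells.visit_id = 1;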
- The attribute amount is read every time a request for a Visit is made, and requests for Visits can come in big batches. So having that value cached somewhere makes sense from a performance perspective.
- I need the computed values to be immediately correct, so offloading the computation to an ActiveJob was not an option.
- There are many more attributes on my models that need these types of calculations, and they are usually even more complex than this example.
- Rails' native counter_cache can only keep track of the count() of the records. For more complex calculations there is the counter_culture gem, but all of the problems from the next point apply to it as well.
- It is perfectly possible to use before_save from AR to save calculated fields (a naive version is sketched after this list), but I had a few concerns:
  - Some of the calculation logic is fairly complex and I was not sure whether everything could be expressed in AR in a way that needs just one SQL query.
  - Even if it were possible to do everything in one SQL query, it is still a round trip from the database to the application server and back. And all the calculations live in a transaction, so they will slow down any other operations on rows that are part of the transaction.
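To make that last point concrete, here is a minimal sketch of the callback approach, assuming visits already has an amount column. It uses after_save / after_destroy on Sell, since the cached field lives on the parent, and it already misses the case where a sell moves to a different visit:

# models/sell.rb
class Sell < ActiveRecord::Base
  belongs_to :visit

  after_save :recalculate_visit_amount
  after_destroy :recalculate_visit_amount

  private

  # Extra round trips inside the surrounding transaction:
  # one SELECT for the aggregate, one UPDATE on visits.
  def recalculate_visit_amount
    visit.update_column(
      :amount,
      visit.sells.sum("(sells.price * sells.count) * (1 - sells.discount / 100)")
    )
  end
end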
Offloading work to the database
I made the decision to offload the work to (PostgreSQL) triggers. The obvious disadvantages of this solution are:
- Only the database knows that the data has changed. So if your application writes and then immediately reads, you should check that the read actually returns fresh data.
- If there is more logic on the application side of things, then you might need to duplicate it in your database as well. For example, the "soft delete" gem paranoia always leaves deleted records in place and just adds a flag saying the record should be ignored. Your database naturally does not know about this (see the snippet after this list).
- I personally still consider SQL procedures/triggers dark magic, so debugging them is not a nice experience for me.
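For instance, if sells were soft-deleted with paranoia, the trigger's aggregate would have to re-implement the gem's default scope itself. A sketch, assuming paranoia's conventional deleted_at column and a hypothetical visit id:

SELECT COALESCE(SUM((sells.price * sells.count) * (1 - sells.discount / 100)), 0)
FROM sells
WHERE sells.visit_id = 1
  AND sells.deleted_at IS NULL;  -- the rows paranoia hides from the application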
The problem of C(reate) U(pdate) D(elete)
In PostgreSQL, depending on the trigger operation (TG_OP), the following variables are available:
- NEW for the Create operation
- OLD and NEW for the Update operation
- OLD for the Delete operation
These variables hold the respective state of the row on which the operation happened. This is especially important for the Update operation. Imagine a scenario where the code changes the visit_id of a Sell. Then the amount needs to be recalculated for both the OLD and the NEW record, or more precisely for their respective visits.
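For example (hypothetical ids), moving a sell from one visit to another touches two parents:

-- Move sell 42 from visit 1 to visit 2
UPDATE sells SET visit_id = 2 WHERE id = 42;
-- Inside the trigger: OLD.visit_id = 1, NEW.visit_id = 2,
-- so the cached amount of both visits must be recalculated.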
The idea of how to solve this in an elegant way plus the proper syntax in (Postgre)SQL is the main topic of this post.
For a good way to deal with triggers in Rails, I would recommend a great gem - hair_trigger. The gem works as follows: you simply store the SQL of the trigger in the respective model, and every time this code changes, hair_trigger will generate a migration that drops the old trigger and creates an updated one.
With this approach you can easily see the "additional logic" of your models / tables, and you have the privilege of not caring about how the triggers get loaded into the database.
Show me the code
# models/visit.rb
class Visit < ActiveRecord::Base
  has_many :sells

  def amount
    sells.sum("(sells.price * sells.count) * (1 - sells.discount / 100)")
  end
end

# models/sell.rb
class Sell < ActiveRecord::Base
  belongs_to :visit

  trigger.after(:insert, :update, :delete)
         .name("after_sells")
         .declare("var_curr sells; var_amount decimal; var_curr_arr sells[];") do
    <<-SQL
      -- Collect the affected row state(s) depending on the operation
      CASE TG_OP
      WHEN 'DELETE' THEN
        var_curr_arr := ARRAY[OLD];
      WHEN 'UPDATE' THEN
        var_curr_arr := ARRAY[OLD, NEW];
      WHEN 'INSERT' THEN
        var_curr_arr := ARRAY[NEW];
      END CASE;

      -- Recalculate the cached amount for every visit touched by this row
      FOREACH var_curr IN ARRAY var_curr_arr
      LOOP
        var_amount := (SELECT COALESCE(SUM((sells.price * sells.count) * (1 - sells.discount / 100)), 0) FROM sells WHERE sells.visit_id = var_curr.visit_id);
        UPDATE visits SET amount = var_amount WHERE visits.id = var_curr.visit_id;
      END LOOP;
    SQL
  end
end
Here's what's happening there:
- Based on the TG_OP operation, we create an array var_curr_arr filled with the row state(s) that changed.
- Then we FOREACH over this array. Realistically it holds at most two elements, but the loop lets us DRY our code.
- And finally, inside the LOOP, we do the respective update(s).
After this we just need to let hair_trigger generate a new database migration based on the updated code of our trigger:
rake db:generate_trigger_migration
And then simply run Rails migrations:
rake db:migrate
Now you can go ahead, create / update / delete a bunch of sells, and see whether the amount for the respective visit(s) has been updated.
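A quick console check might look like this. It's a sketch with made-up values, assuming price and discount are decimal columns (with an integer discount column, discount / 100 would truncate to 0 in SQL). Reading the raw attribute sidesteps the amount method defined on Visit:

visit = Visit.create!
visit.sells.create!(price: 100, count: 2, discount: 10) # (100 * 2) * 0.9 = 180
visit.sells.create!(price: 50, count: 1, discount: 0)   # 50

# The trigger has already written the cached value:
visit.reload.read_attribute(:amount) # => 230.0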
Notes
- Splash photo credits: Crissy Jarvis.
- The COALESCE(..., 0) trick is there to make sure we don't hit a problem with NULL: SUM() over zero matching rows returns NULL, not 0.
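A minimal illustration (hypothetical visit id with no sells):

SELECT SUM(sells.price) FROM sells WHERE sells.visit_id = -1;              -- NULL
SELECT COALESCE(SUM(sells.price), 0) FROM sells WHERE sells.visit_id = -1; -- 0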