Ruby on Rails has a useful little feature called counter_cache. In a scenario where you have a model Author that has a has_many relationship with a model Book, Rails will keep the column authors.books_count filled with the respective number of books per author whenever you manipulate the database through Active Record (AR) models.
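For the classic case this is just one option on the association plus an integer column. A minimal sketch of that standard setup (model and column names here match the example above):

# models/book.rb
class Book < ActiveRecord::Base
  # Rails keeps authors.books_count in sync on create/destroy
  # (the authors table needs an integer books_count column, default 0)
  belongs_to :author, counter_cache: true
end

# models/author.rb
class Author < ActiveRecord::Base
  has_many :books
end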
The problem
That is all well and good for simple scenarios, but mine was more complicated than that. Imagine the following models:
# models/visit.rb
class Visit < ActiveRecord::Base
  has_many :sells

  def amount
    sells.sum("(sells.price * sells.count) * (1 - sells.discount / 100)")
  end
end

# models/sell.rb
class Sell < ActiveRecord::Base
  belongs_to :visit
end
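For reference, that sum call runs as a single aggregate query, roughly (hypothetical visit id):

SELECT SUM((sells.price * sells.count) * (1 - sells.discount / 100))
FROM sells
WHERE sells.visit_id = 1;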
- The attribute amount is read every time a request for a Visit is made, and requests for Visits can come in big batches. So having that value cached somewhere makes sense from a performance perspective.
- I need the computed values to be immediately correct, so offloading the computation to an ActiveJob was not an option.
- There are many more attributes on my models that need these types of calculations, and they are usually even more complex than this example.
- Rails' native counter_cache can only keep track of the count() of the records. For more complex calculations there is the counter_culture gem, but all of the problems from the next point apply to it as well.
- It is perfectly possible to use before_save from AR to save calculated fields (a naive version is sketched after this list), but I had a few concerns:
  - Some of the calculation logic is fairly complex and I was not sure whether everything could be expressed in AR in a way that needs just one SQL query.
  - Even if it were possible to do everything in one SQL query, it is still a round trip from the database to the application server and back. And all the calculations live in a transaction, so they will slow down any other operations on rows that are part of the transaction.
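To make that last point concrete, here is a minimal sketch of the callback approach, assuming visits already has an amount column. It uses after_save / after_destroy on Sell, since the cached field lives on the parent, and it already misses the case where a sell moves to a different visit:

# models/sell.rb
class Sell < ActiveRecord::Base
  belongs_to :visit

  after_save :recalculate_visit_amount
  after_destroy :recalculate_visit_amount

  private

  # Extra round trips inside the surrounding transaction:
  # one SELECT for the aggregate, one UPDATE on visits.
  def recalculate_visit_amount
    visit.update_column(
      :amount,
      visit.sells.sum("(sells.price * sells.count) * (1 - sells.discount / 100)")
    )
  end
end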
Offloading work to the database
I made the decision to offload the work to (PostgreSQL) triggers. The obvious disadvantages of this solution are:
- Only the database knows that the data has changed. So if your application writes and then immediately reads, you should check that the read actually returns fresh data.
- If there is more logic on the application side of things, then you might need to duplicate it in your database as well. For example, the "soft delete" gem paranoia always leaves deleted records in place and just adds a flag saying the record should be ignored. Your database naturally does not know about this (see the snippet after this list).
- I personally still consider SQL procedures/triggers dark magic, so debugging them is not a nice experience for me.
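For instance, if sells were soft-deleted with paranoia, the trigger's aggregate would have to re-implement the gem's default scope itself. A sketch, assuming paranoia's conventional deleted_at column and a hypothetical visit id:

SELECT COALESCE(SUM((sells.price * sells.count) * (1 - sells.discount / 100)), 0)
FROM sells
WHERE sells.visit_id = 1
  AND sells.deleted_at IS NULL;  -- the rows paranoia hides from the application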
The problem of C(reate) U(pdate) D(elete)
In PostgreSQL, depending on the trigger operation (TG_OP), the following variables are available:
- NEW for the Create operation
- OLD and NEW for the Update operation
- OLD for the Delete operation
These variables hold the respective state of the row on which the operation happened. This is especially important for the Update operation. Imagine a scenario where the code changes the visit_id of a Sell. Then the amount needs to be recalculated for both the OLD and the NEW record, or more precisely for their respective visits.
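For example (hypothetical ids), moving a sell from one visit to another touches two parents:

-- Move sell 42 from visit 1 to visit 2
UPDATE sells SET visit_id = 2 WHERE id = 42;
-- Inside the trigger: OLD.visit_id = 1, NEW.visit_id = 2,
-- so the cached amount of both visits must be recalculated.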
The idea of how to solve this in an elegant way plus the proper syntax in (Postgre)SQL is the main topic of this post.
For a good way to deal with triggers in Rails, I would recommend a great gem - hair_trigger. The gem works as follows: you simply store the SQL of the trigger in the respective model, and every time this code changes, hair_trigger will generate a migration that drops the old trigger and creates an updated one.
With this approach you can easily see the "additional logic" of your models / tables, and you have the privilege of not caring about how the triggers get loaded into the database.
Show me the code
# models/visit.rb
class Visit < ActiveRecord::Base
  has_many :sells

  def amount
    sells.sum("(sells.price * sells.count) * (1 - sells.discount / 100)")
  end
end

# models/sell.rb
class Sell < ActiveRecord::Base
  belongs_to :visit

  trigger.after(:insert, :update, :delete)
         .name("after_sells")
         .declare("var_curr sells; var_amount decimal; var_curr_arr sells[];") do
    <<-SQL
      -- Collect the affected row state(s) depending on the operation
      CASE TG_OP
      WHEN 'DELETE' THEN
        var_curr_arr := ARRAY[OLD];
      WHEN 'UPDATE' THEN
        var_curr_arr := ARRAY[OLD, NEW];
      WHEN 'INSERT' THEN
        var_curr_arr := ARRAY[NEW];
      END CASE;

      -- Recalculate the cached amount for every visit touched by this row
      FOREACH var_curr IN ARRAY var_curr_arr
      LOOP
        var_amount := (SELECT COALESCE(SUM((sells.price * sells.count) * (1 - sells.discount / 100)), 0) FROM sells WHERE sells.visit_id = var_curr.visit_id);
        UPDATE visits SET amount = var_amount WHERE visits.id = var_curr.visit_id;
      END LOOP;
    SQL
  end
end
Here's what's happening there:
- Based on the TG_OP operation, we create an array var_curr_arr filled with the row state(s) that changed.
- Then we FOREACH over this array. Realistically it holds at most two elements, but the loop lets us DRY our code.
- And finally, inside the LOOP, we do the respective update(s).
After this we just need to let hair_trigger generate a new database migration based on the updated code of our trigger:
rake db:generate_trigger_migration
And then simply run Rails migrations:
rake db:migrate
Now you can go ahead, create / update / delete a bunch of sells, and see whether the amount for the respective visit(s) has been updated.
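A quick console check might look like this. It's a sketch with made-up values, assuming price and discount are decimal columns (with an integer discount column, discount / 100 would truncate to 0 in SQL). Reading the raw attribute sidesteps the amount method defined on Visit:

visit = Visit.create!
visit.sells.create!(price: 100, count: 2, discount: 10) # (100 * 2) * 0.9 = 180
visit.sells.create!(price: 50, count: 1, discount: 0)   # 50

# The trigger has already written the cached value:
visit.reload.read_attribute(:amount) # => 230.0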
Notes
- Splash photo credits: Crissy Jarvis.
- The COALESCE(..., 0) trick is there to make sure we don't hit a problem with NULL: SUM() over zero matching rows returns NULL, not 0.
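A minimal illustration (hypothetical visit id with no sells):

SELECT SUM(sells.price) FROM sells WHERE sells.visit_id = -1;              -- NULL
SELECT COALESCE(SUM(sells.price), 0) FROM sells WHERE sells.visit_id = -1; -- 0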