Using SQL with ActiveRecord for performance gains

#rails #ruby

While working on a side project, I had a need to count items by day of the month. While this can likely be achieved in Ruby, I reached for SQL instead.

I have a table called journal_entries. It has the following columns.

   Column   |              Type              | Collation | Nullable |                   Default
------------+--------------------------------+-----------+----------+---------------------------------------------
 id         | bigint                         |           | not null | nextval('journal_entries_id_seq'::regclass)
 journal_id | bigint                         |           |          |
 body       | text                           |           | not null |
 created_at | timestamp(6) without time zone |           | not null |
 updated_at | timestamp(6) without time zone |           | not null |
 metadata   | jsonb                          |           | not null | '"{}"'::jsonb

For a given journal, I wanted to count the number of journal entries per day. To do that, I ended up with a method called entries_by_day.

def entries_by_day
  journal
    .journal_entries
    .where(created_at: interval)
    .group("extract(day from created_at AT TIME ZONE '#{Time.zone.formatted_offset}')::int")
    .count
end

For the most part, this is a normal ActiveRecord query. We're querying for a journal's journal entries within a given interval. In our case, an interval would be something like July 1, 2020 to July 31, 2020.

Then, we use the group method which is pretty cool. This translates to a GROUP BY in SQL. To group the rows, we use the extract function provided by PostgreSQL (documentation).

What we're saying is this. Give me the day (1, 2, 3, ..., 31) based on the timestamp. If the timestamp is 2020-08-11 19:05:59.508883, extract would give us 11.

What we end up with is a Hash with the days as keys and the count as values.

{25=>3, 29=>2, 27=>2, 23=>2, 24=>2}

This has the advantage of offloading performance concerns to the database. PostgreSQL can typically do these types of operations faster.

If we were to attempt the same in Ruby, we would need to query for all of the records within the month of July. Then, we'd have to use Ruby's group_by method to finish the job.