First of all, this article is not about how much I love Rails, and secondly, it is not about how much I hate it. You can treat Rails in very different ways, and it only gets better if we change something in it; but it can just as easily get worse if our corrections are careless. Anyway, consider yourself warned, and I hope you get my point.
One of the basic assumptions of ActiveRecord is that the database is purely utilitarian and can easily be swapped out. So there you are, writing your models for MySQL, and suddenly it turns out to be just that simple to take MySQL and replace it with MongoDB. Well, maybe not that drastic, but say you have reasons to replace MySQL with PostgreSQL. Or vice versa, I have nothing against MySQL.
And this is where ActiveRecord claims to shine, since supposedly the switch is a piece of cake for it. After all, scopes, before/after callbacks and associations are abstract enough that you don't have to worry about how the queries are generated and can focus on the application logic alone. Instead of writing WHERE is_archived = TRUE, you simply write where(is_archived: true) and ActiveRecord does the rest for you.
But it's not nearly that simple! In practice, this abstraction layer turns out to be full of holes, like the broken trough from the tale of the Golden Fish. Many basic features simply cannot be expressed through it, such as comparing dates or working with arrays, and we end up with scopes forced to fall back on raw SQL like where("#{quoted_table_name}.finished_at >= ?", Date.current) or where("#{quoted_table_name}.other_ids <@ ARRAY[?]", ids). To which ActiveRecord gives a perfectly conscious and expected answer: don't do this. Use a habtm association instead of an array column, and if you need to compare dates, just live with it. And God forbid you forget quoted_table_name in such a scope: the first includes or joins will put everything in its place. It's easier to add it everywhere, all the time, so you don't lose the habit.
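For illustration, a minimal sketch of what such scopes might look like on a hypothetical Task model (the model and its columns are assumptions of this example):

class Task < ApplicationRecord
  # A plain hash condition: ActiveRecord can build this on its own.
  scope :archived, -> { where(is_archived: true) }

  # Date comparison: there is no hash syntax for >=, so we fall back to SQL;
  # quoted_table_name keeps the column unambiguous once includes/joins appear.
  scope :finished_since, ->(date) { where("#{quoted_table_name}.finished_at >= ?", date) }

  # PostgreSQL array operator <@: also impossible to express without raw SQL.
  scope :with_other_ids_among, ->(ids) { where("#{quoted_table_name}.other_ids <@ ARRAY[?]", ids) }
end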
And, of course, once you decide on this kind of interference in ActiveRecord's work, there is no going back. Not just the chance, but even the vague hope of a painless migration to another database is gone; you might as well print that code out and burn it. Which means there is no longer any reason not to use database-specific capabilities in your application. You're welcome to use them and to encourage everyone else to do the same!
And when it turns out that database-specific features account for more than half of the scopes in your models folder, it becomes quite obvious that ActiveRecord is just a convenient wrapper for gluing one labeled piece of SQL to another. Scopes like where(is_archived: true).joins(:sprint).merge(Sprint.archived) will keep working fine, and combining them is not much harder than boiling an egg, is it?
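As a sketch of how such a combination might be composed (Task, Sprint and their is_archived columns are assumptions of this example):

class Sprint < ApplicationRecord
  has_many :tasks
  scope :archived, -> { where(is_archived: true) }
end

class Task < ApplicationRecord
  belongs_to :sprint

  # Both conditions end up in one query; merge reuses Sprint.archived,
  # so the sprint-side condition is not duplicated by hand.
  scope :fully_archived, -> { where(is_archived: true).joins(:sprint).merge(Sprint.archived) }
end

# Task.fully_archived produces a single SELECT with an INNER JOIN on sprints
# and both is_archived conditions in the WHERE clause.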
The next stage is denormalization. Of course, denormalization has always been around and isn't going anywhere, but the job of taking care of it has been placed on the mighty shoulders of Rails and ActiveRecord, and, as you know, these two guys are not famous for being lightweight or ascetic in their resource requirements. Let's say counter_cache: true is the first step towards denormalization; after all, ActiveRecord won't just let you write COUNT(*) AS sprints_count that easily (you're not going to use the select method, are you?). And counter_cache is far from ideal: in some cases the cached count can drift away from the real one. Not critical, of course, but unpleasant. So it is the first candidate to move into the database and stop loading the already burdened head of the Ruby machine. Just a couple of triggers and it's done! When a row is deleted from or inserted into the child B-table, you recalculate the number of B-records stored on the corresponding row of the parent A-table, and that's it, right? And, of course, you do the same on update if the foreign_key has changed, because a query like UPDATE B SET a_id = $1 WHERE id = $2 breaks the counter_cache for both the old and the new parent record. The insert case looks like this:
CREATE OR REPLACE FUNCTION update_#{parent_table}_#{child_table}_counter_on_insert() RETURNS TRIGGER AS $$
BEGIN
  UPDATE #{parent_table}
    SET #{counter_column} = COALESCE((SELECT COUNT(id) FROM #{child_table} GROUP BY #{foreign_column} HAVING #{foreign_column} = NEW.#{foreign_column}), 0)
    WHERE (#{parent_table}.id = NEW.#{foreign_column});
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
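The function by itself does nothing until a trigger on the child table calls it. A minimal sketch of such an attachment inside a migration, assuming a hypothetical projects/sprints pair where projects.sprints_count is the counter (the delete and update cases need their own, analogous functions and triggers):

class AttachSprintsCounterTrigger < ActiveRecord::Migration[5.2]
  def up
    # Recalculate projects.sprints_count after every insert into sprints.
    execute <<~SQL
      CREATE TRIGGER update_projects_sprints_counter_on_insert
      AFTER INSERT ON sprints
      FOR EACH ROW EXECUTE PROCEDURE update_projects_sprints_counter_on_insert();
    SQL
  end

  def down
    execute 'DROP TRIGGER update_projects_sprints_counter_on_insert ON sprints;'
  end
end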
The next piece of database work is related to date and time. First, let's have the created_at and updated_at fields maintained by the database itself; fortunately, that part is much simpler. To begin with, let's set the default values:
change_column_default :table_name, :created_at, -> { 'CURRENT_TIMESTAMP' }
change_column_default :table_name, :updated_at, -> { 'CURRENT_TIMESTAMP' }
And to do this for every table at once, we can loop over all the tables that have these fields, excluding, of course, the schema_migrations and ar_internal_metadata tables:
(tables - %w(schema_migrations ar_internal_metadata)).each { ... }
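Spelled out, such a migration might look roughly like this (the column check is an assumption of this sketch; adapt it to your schema):

class MoveTimestampDefaultsToDatabase < ActiveRecord::Migration[5.2]
  def up
    (tables - %w(schema_migrations ar_internal_metadata)).each do |table|
      column_names = connection.columns(table).map(&:name)
      %w(created_at updated_at).each do |column|
        next unless column_names.include?(column)
        # Let PostgreSQL fill the timestamp in when a row is inserted.
        change_column_default table, column, -> { 'CURRENT_TIMESTAMP' }
      end
    end
  end
end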
That's it: the default values for these columns are now exactly what we need. Now it's time to make sure Rails no longer touches these fields. There is a framework configuration option responsible for that:
Rails.application.config.active_record.record_timestamps = false
So, the next step is to change the updated_at field whenever a record is updated. That's simple:
CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_update() RETURNS TRIGGER AS $$
BEGIN
  SELECT CURRENT_TIMESTAMP INTO NEW.updated_at;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
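Because this function modifies NEW, it has to be attached as a BEFORE UPDATE trigger. A sketch of the attachment, run inside a migration, for a hypothetical tasks table:

execute <<~SQL
  CREATE TRIGGER touch_for_tasks_on_update
  BEFORE UPDATE ON tasks
  FOR EACH ROW EXECUTE PROCEDURE touch_for_tasks_on_update();
SQL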
Now we need to get rid of touch: true in the models entirely. That thing is a lot like the target in a shooting gallery: it is also full of holes, and I won't even explain why, because you already know all those cases. Doing it ourselves is not much harder; we just need to update updated_at on the related table where necessary:
CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_update() RETURNS TRIGGER AS $$
BEGIN
  UPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.foreign_column_name;
  SELECT CURRENT_TIMESTAMP INTO NEW.updated_at;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Of course, a chain of such triggers firing one after another causes some extra work, but Postgres doesn't offer any sane mechanism for firing a trigger without actually changing the record. You could try writing SET title = title, but that is hardly any better than SET updated_at = CURRENT_TIMESTAMP.
Exactly the same kind of trigger serves for insertion, except that setting updated_at on the new row is not necessary there:
CREATE OR REPLACE FUNCTION touch_for_#{table_name}_on_insert() RETURNS TRIGGER AS $$
BEGIN
  UPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.foreign_column_name;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
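Since this insert variant does not modify NEW, it can safely be attached as an AFTER INSERT trigger; a sketch for the same hypothetical tasks table, again run inside a migration:

execute <<~SQL
  CREATE TRIGGER touch_for_tasks_on_insert
  AFTER INSERT ON tasks
  FOR EACH ROW EXECUTE PROCEDURE touch_for_tasks_on_insert();
SQL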
Certainly, you could try to make do with a single function by checking the current event right inside the trigger with something like IF TG_OP = 'UPDATE' THEN, but it is preferable to keep every trigger as simple as possible in order to reduce the probability of error.
You might want to automate the generation of such triggers, and then you will most likely need to find all the foreign-key relations between the current table and the rest. That is easy to do with this query:
SELECT ccu.table_name AS foreign_table_name, kcu.column_name AS column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage AS ccu
ON ccu.constraint_name = tc.constraint_name
WHERE constraint_type = 'FOREIGN KEY' AND tc.table_name = '#{table_name}'
ORDER BY ccu.table_name;
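If you go down the automation route, the wiring inside a migration might look roughly like this; create_touch_function_and_trigger here is a hypothetical helper that renders the SQL from the templates shown above:

foreign_keys = select_all(<<~SQL)
  SELECT ccu.table_name AS foreign_table_name, kcu.column_name AS column_name
  FROM information_schema.table_constraints AS tc
  JOIN information_schema.key_column_usage AS kcu
    ON tc.constraint_name = kcu.constraint_name
  JOIN information_schema.constraint_column_usage AS ccu
    ON ccu.constraint_name = tc.constraint_name
  WHERE constraint_type = 'FOREIGN KEY' AND tc.table_name = '#{table_name}'
  ORDER BY ccu.table_name;
SQL

foreign_keys.each do |row|
  create_touch_function_and_trigger(table_name,
                                    foreign_table: row['foreign_table_name'],
                                    foreign_column: row['column_name'])
end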
One more very useful tip: name all triggers and functions according to a template, so that you can verify the presence or absence of a required one with a single query. For example, this query finds all the touch-on-insert functions:
SELECT routine_name AS name
FROM information_schema.routines
WHERE
routine_name LIKE 'touch_for_%_on_insert' AND
routine_type = 'FUNCTION' AND
specific_schema = 'public';
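With that naming convention in place you can, for example, add a sanity check that every table expected to touch its parent actually has its function; a rough sketch (expected_tables is an assumption, the list you maintain yourself):

touch_insert_functions = ActiveRecord::Base.connection.select_values(<<~SQL)
  SELECT routine_name AS name
  FROM information_schema.routines
  WHERE
    routine_name LIKE 'touch_for_%_on_insert' AND
    routine_type = 'FUNCTION' AND
    specific_schema = 'public';
SQL

missing = expected_tables.map { |t| "touch_for_#{t}_on_insert" } - touch_insert_functions
raise "Missing touch functions: #{missing.join(', ')}" if missing.any?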
And the last, most terrible part remains. The thing is, Rails is not designed for even a marginally smart database; it simply does not expect anything to change on the database side except the ID field, and even that only when inserting a row. Therefore, there is no sane way to add RETURNING id, updated_at to update queries, and you will have to dive headlong into the Rails thicket.
The monkey patch did not come out particularly neat, but the main goal was to disturb the framework's normal operation as little as possible. Here is how it looks in the end:
module ActiveRecord
  module Persistence
    # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/persistence.rb#L729-L741
    def _create_record(attribute_names = self.attribute_names)
      attribute_names &= self.class.column_names
      attributes_values = attributes_with_values_for_create(attribute_names)
      an_id, *affected_rows = self.class._insert_record(attributes_values).dup
      self.id ||= an_id if self.class.primary_key
      Hash[ApplicationRecord.custom_returning_columns(self.class.quoted_table_name, :create).take(affected_rows.size).zip(affected_rows)].each do |column_name, value|
        public_send("#{column_name}=", self.class.attribute_types[column_name.to_s].deserialize(value)) if value
      end
      @new_record = false
      yield(self) if block_given?
      id
    end
    private :_create_record

    # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/persistence.rb#L710-L725
    def _update_record(attribute_names = self.attribute_names)
      attribute_names &= self.class.column_names
      attribute_names = attributes_for_update(attribute_names)

      if attribute_names.empty?
        affected_rows = []
        @_trigger_update_callback = true
      else
        affected_rows = _update_row(attribute_names)
        @_trigger_update_callback = affected_rows.any?
      end

      Hash[ApplicationRecord.custom_returning_columns(self.class.quoted_table_name, :update).take(affected_rows.size).zip(affected_rows)].each do |column_name, value|
        public_send("#{column_name}=", self.class.attribute_types[column_name.to_s].deserialize(value))
      end

      yield(self) if block_given?

      affected_rows.none? ? 0 : 1
    end
    private :_update_record
  end

  module ConnectionAdapters
    module PostgreSQL
      module DatabaseStatements
        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L93-L96
        def exec_update(sql, name = nil, binds = [])
          execute_and_clear(sql_with_returning(sql), name, binds) { |result| Array.wrap(result.values.first) }
        end

        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb#L147-L152
        def insert(arel, name = nil, pk = nil, _id_value = nil, sequence_name = nil, binds = [])
          sql, binds = to_sql_and_binds(arel, binds)
          exec_insert(sql, name, binds, pk, sequence_name).rows.first
        end
        alias create insert

        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L98-L111
        def sql_for_insert(sql, pk, id_value, sequence_name, binds) # :nodoc:
          table_ref = extract_table_ref_from_insert_sql(sql)

          if pk.nil?
            # Extract the table from the insert sql. Yuck.
            pk = primary_key(table_ref) if table_ref
          end

          returning_columns = quote_returning_column_names(table_ref, pk, :create)

          if returning_columns.any?
            sql = "#{sql} RETURNING #{returning_columns.join(', ')}"
          end

          super
        end

        # No source in original repo
        def quote_returning_column_names(table_ref, pk, action)
          returning_columns = []
          returning_columns << pk if suppress_composite_primary_key(pk)
          returning_columns += ApplicationRecord.custom_returning_columns(table_ref, action)
          returning_columns.map { |column| quote_column_name(column) }
        end

        # No source in original repo
        def sql_with_returning(sql)
          table_ref = extract_table_ref_from_update_sql(sql)
          returning_columns = quote_returning_column_names(table_ref, nil, :update)

          return sql if returning_columns.blank?
          "#{sql} RETURNING #{returning_columns.join(', ')}"
        end

        # No source in original repo
        def extract_table_ref_from_update_sql(sql)
          sql[/update\s("[A-Za-z0-9_."\[\]\s]+"|[A-Za-z0-9_."\[\]]+)\s*set/im]
          Regexp.last_match(1)&.strip
        end
      end
    end
  end
end
The most important thing here is the call to ApplicationRecord.custom_returning_columns, which tells the adapter which columns, besides id, we are interested in. That method looks something like this:
class << self
  def custom_returning_columns(table_ref, action)
    return [] if ['"schema_migrations"', '"ar_internal_metadata"'].include?(table_ref)

    res = []
    res << :created_at if action == :create
    res << :updated_at

    res += case table_ref
           when '"user_applications"'
             [:api_token]
           when '"users"'
             [:session_salt, :password_changed_at]
           # ...
           else
             []
           end

    res
  end
end
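The payoff is that columns maintained by the database come straight back into the in-memory object, with no extra SELECT or reload; a rough illustration, assuming a users table with the triggers above in place:

user = User.find(1)
user.update(name: 'New name')

# updated_at was set by the database trigger, and RETURNING brought
# the fresh value back into the object along with the other listed columns.
user.updated_at # the database-side timestamp, no user.reload required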
All the examples here are for PostgreSQL, so MySQL followers will have to reinvent this particular wheel on their own.
Instead of conclusions, let's just say that the aching head of Rails should now hurt a little less: routine chores like counter_cache and touch are sinking into oblivion. The next article will focus on something more global, such as stripping trailing spaces, validating data, cascading deletion, paranoid (soft) deletion, and RSpec for SQL. If this article proves interesting, of course.