loading...

Why is Rails ActiveRelation.update_all updating a different set of records?

andy profile image Andy Zhao (he/him) ・1 min read

I've got a weird case of ActiveRecord_Relation magic doing something unintended. Given the following:

comment = Comment.find(params[:id])
# select only the original commenter's comments that match two scenarios
related_comments = comment.where(something: params[:something)
  .union(comment.where(something_else: params[:something_else]))
  .where(user_id: comment.user_id)

related_comments.count
#=> 4
# This is to demonstrate how many records I'm intending to update.
# Using .length would work too, for example's sake.

related_comments.update_all(receive_notifications: true)
#=> SQL (3.9ms)  UPDATE "comments" SET "receive_notifications" = 't'
#=> WHERE "comments"."user_id" = $1  [["user_id", 11]] [sql_query]
#=> 12

From the looks of it, .update_all is running a single SQL query that is updating all comments with a user_id of 11. The .where(user_id: 11) was technically the last query, but shouldn't it be running a SQL query for only the relations of related_comments?

I have a workaround where I could do this, but it would require two SQL queries and seems overly cumbersome than necessary:

comment = Comment.find(params[:id])
# same as above
related_comment_ids = comment.where(something: params[:something)
  .union(comment.where(something_else: params[:something_else]))
  .where(user_id: comment.user_id).pluck(:id)
#=> [12, 14, 17, 18]

Comment.where(id: related_comment_ids).update_all(receive_notifications: true)
#=> SQL (5.7ms)  UPDATE "comments" SET "receive_notifications" = 't'
#=> WHERE "comments"."id" IN (18, 14, 17, 12) [sql_query]
#=> 4

Discussion

pic
Editor guide
Collapse
rhymes profile image
rhymes

You probably hit a limitation of ActiveRecord :-)

union is not part of ActiveRecord, which might be the reason why update_all is not able to work with it.

There's a lot of magic in it :D

PostgreSQL itself is able to update rows from unions if you project the same exact row types from both queries:

PracticalDeveloper_development> select count(*) from comments;
+---------+
| count   |
|---------|
| 30      |
+---------+
PracticalDeveloper_development> update comments set updated_at = now() where id in (select id from comments where id < 10 union select id from comments where id > 15);
UPDATE 24

In conclusion: I don't know what's going on 🧐

Collapse
andy profile image
Andy Zhao (he/him) Author

Haha yeah the magic is certainly mystifying me. Guess I'll put on my wizard robes and dive into the code.

Collapse
rhymes profile image
rhymes

Let us know if you find anything :D

Collapse
andy profile image
Andy Zhao (he/him) Author

Hmm interesting, I'll give this a try when I get a chance.

Re: what I'm trying to do. I'm trying to combine two Active record queries and then do a third where query to get only the commenter's comments. The final chained query should produce a single SQL query, and then I want to update_all a single collection/relation of records.

The where is called on the ActiveRecord relation and not an individual ActiveRecord record, which is how I can chain the where after doing union.

Collapse
phallstrom profile image
Philip Hallstrom

What is the SQL generated by related_comments.count. Count will change the query sometimes if Rails decides it doesn't need the join/includes. I'm guessing it's dropping your union -- or simply not doing it correctly. And that's what is leading to the difference in rows affected.

edit: however your second solution suggests i'm wrong about that. still, would be curious to see the SQL if you just run the query to select them and for the count.

Collapse
andy profile image
Andy Zhao (he/him) Author

Here's the SQL for related_comments.count:

(22.0ms)  SELECT COUNT(*) FROM ( (SELECT "comments".* FROM "comments"
WHERE (("comments"."ancestry" LIKE '12/%' OR "comments"."ancestry" = '12')
OR "comments"."id" = 12)) UNION (SELECT "comments".* FROM "comments" WHERE
1=0 ORDER BY COALESCE("comments"."ancestry", '') ASC) ) "comments" WHERE
"comments"."user_id" = $1  [["user_id", 11]] [sql_query]

And for the select:

(4.1ms)  SELECT "comments".* FROM ( (SELECT "comments".* FROM "comments"
WHERE (("comments"."ancestry" LIKE '12/%' OR "comments"."ancestry" = '12')
OR "comments"."id" = 12)) UNION (SELECT "comments".* FROM "comments" WHERE
1=0 ORDER BY COALESCE("comments"."ancestry", '') ASC) ) "comments" WHERE
"comments"."user_id" = $1  [["user_id", 11]] [sql_query]

FWIW the code I'm using does not run related_comments.count; that was to demonstrate how many records should be updated.

Collapse
coreyja profile image
Corey Alexander

I don't know what the magic/issue is with your first example, but I have an idea of how to make your workaround take a single query. I can't seem to get an example like this working with union(and on Rails 5.2 it gives a deprecation warning, about removing the delegation down into AREL) so I can't test this well myself.

But if you do NOT pluck the id, and simply pass a relation into the where clause for id it will do a sub-select for the update. I tried it with a simpler case and got the update_all to use a sub_select. So I think you should be able to do this in one query with:

comment = Comment.find(params[:id])

related_comments = comment.where(something: params[:something)
  .union(comment.where(something_else: params[:something_else]))
  .where(user_id: comment.user_id)


Comment.where(id: related_comments).update_all(receive_notifications: true)
andy profile image
Andy Zhao (he/him) Author

Ahh gotcha :) I'll keep note of that for next time!

Thread Thread
andy profile image
Andy Zhao (he/him) Author

You're totally right about not being able to call .where on comment; the example I used is wrong. I'm actually doing this:

comment.subtree.union(comment.ancestors).where(user_id: comment.user_id)

.subtree and .ancestors are methods from a gem, and I initially thought putting that in the example would be confusing. My workaround-example made it more confusing. 🙃

andy profile image
Andy Zhao (he/him) Author

The Ancestry gem. It's what we (dev.to) use for the comment tree.

Thread Thread
andy profile image
Andy Zhao (he/him) Author

@acflint running .or gives me an error since the built in subtree and ancestors methods have some extra SQL in there. :(

comment.subtree.or(c.ancestors)
#=> ArgumentError: Relation passed to #or must be
#=> structurally compatible. Incompatible values:
#=> [:order, :reordering]
Collapse
ben profile image
Ben Halpern

What happens if you tack .all onto the first related_comments = query?

Collapse
andy profile image
Andy Zhao (he/him) Author

.update_all still returns 12 :/