Preventing Useless Database Hits

Molly Struve on December 16, 2018

After reading the title of this post many of you are probably thinking... DUH, I already know how to do that. But let’s just hold up a minute,... [Read Full]
markdown guide
 

Nice post 😊 I didn't know about none. I wonder why AR queries the db with 1=0 instead of doing whatever .none does 🤔

 

Right? I went into the AR source code one weekend and tried to update it in such a way that none would be used when it saw an empty array. Broke a lot of specs and after 3 hours of trying to fix them I gave up. One day I will go back!

 

I feel that, lol. So vindicating when you finally get it after numerous independent attempts!

 

There're a couple of things I don't understand about this.

Why would ActiveRecord want to ensure that no records are returned with WHERE 1=0? Unless... do you want a "chainable relation" of foos to which you can later add items? That seems like a topsy-turvy way of working. What am I getting mixed up on?

If the service is smart enough to inject a condition if the query is expected to be empty, why isn't it smart enough to not bother running it in the first place?

Why isn't the query cached, since it's identical to the previous one and most databases will have a cheap memory cache for that sort of thing?

In short, I don't understand why Ruby would be asking the database for something it already knows is going to be empty.

 

Why would ActiveRecord want to ensure that no records are returned with WHERE 1=0?

Your assumption is correct, they do this so they can return an active record relation. I dove into the ActiveRecord codebase trying to find a way to update it so that it would not hit the database and would return a None Relation instead but I was never able to figure it out and get all the specs passing.

If the service is smart enough to inject a condition if the query is expected to be empty, why isn't it smart enough to not bother running it in the first place?

What we are working with when dealing with these empty arrays is a method that we expect to return an active record relation so that we can chain other things to it. This is why we want to use the none scope, then we don't have to place conditionals all over the code base when using the method since it handles that internally. Here is one of the methods we use it in

def connectors
  @connectors ||= begin
    connector_ids = vulns.select(:connector_id)
    if connector_ids.empty?
     Connector.none
    else
     Connector.with_connector_definition(connector_ids)
    end
  end
end

Then all over the codebase we can call connectors.something and never have to worry about it.

Why isn't the query cached, since it's identical to the previous one and most databases will have a cheap memory cache for that sort of thing?

I tried to simplify the concept for the example but in real life these queries are being executed across many workers so the only layer of cacheing available is at the database level. Even then, we dont want to have to make a round trip call to the database if we dont have too

I don't understand why Ruby would be asking the database for something it already knows is going to be empty.

It is not actually Ruby doing the asking, it is Rails. When we take Ruby frameworks and use them they can hide a lot of what is going on under the hood and if you are not careful you can end up with lots of these types of useless database hits. The frameworks are trying to be as user friendly as possible and in some cases that can cause a loss in performance optimizations

Hopefully that explanation makes sense. Let me know if you have any other questions!

 
 

Would love to see a post from a dev.to team member about performance optimizations you guys have done with this site to make it so fast. I'd imagine there are lots of neat tricks to share!

 

Love all the code examples and how this is pretty much a broken down white paper but more approachable. This makes it so much easier to learn from your experience! I highly recommend everyone check out Molly's RubyConf talk, she rocked it!

 

@molly I've watched your speech 2 days ago. Great stuff!

 

@molly do you have any resources about caching strategies? What approach did you use for cache invalidation?

 

Shoot, I some how totally missed this comment last month! Sorry for the late reply!

I don't really have any good resources to share since most of what we learned was through trial and error on the job. This is a big reason why I wanted to share our experiences so hopefully others can learn from them. Redis docs are also super awesome!

As for cache invalidation, that is always tricky. You really have to do what works best for your situation and use case. You want it short enough that basically no one really notices its cached, yet long enough to provide performance benefit. For somethings a 5 minute cache gives us huge gains while other caches have to be set for 24 hours to see any benefit. Take your best guess at what you think is a good number and then track your performance. Is it helping? How much is it helping? Is it noticeable at all to clients or customers? You will likely have to tweak your original guess but eventually you will find the sweet spot. We have had to tweak more than one invalidation timeframe getting to where we are now and I suspect we will be tweaking many more in the future.

 
 
 

Cool! Sometimes we think that making a Rails app perform better is hard, but you proved the opposite!

 

Last time I used WordPress a few years ago, it was making over 80 database calls JUST to render the generic default homepage.. That is just absolutely bonkers!

code of conduct - report abuse