DEV Community

loading...

Sanitizing SQL in Rails/ActiveRecord

nicklevenson
I'm a newer full stack engineer who is here to learn and write about my coding endeavors. Let's get in touch!
・3 min read

ActiveRecord methods are great for easily querying your database. However, at some point you'll probably have to implement custom sql into your application if the traditional ActiveRecord methods won't get you what you need. When writing custom SQL in Rails, it is important to sanitize it, making sure there's no chance for an injection attack. Sometimes, however, you will have to interpolate Ruby variables into your SQL queries, but regular interpolation can be insecure. So what's the solution?

I have recently been working on an application with a Rails backend, that essentially matches musicians with each other based on similar interests, locational range, etc. When I was working on my Rails application recently, I had to write a long custom SQL statement. I wanted to make a method that could find users based on range, similar tags, and any filters like genres or instruments (in relation to the current user).

The query needed to be dynamic - the method containing the query took in arguments and those arguments would affect what the query would be looking for. Here is some pseudo code of the method I wanted to create.

def similarly_tagged_users(range: nil, instruments: nil, genres: nil)
 #NO IDS - get a list of ids of users who the current user is 
 already connected with, has rejected, or is the current user 
 themselves.

 #FILTERED RANGE IDS - see if the user applied a mile radius 
 to apply to the query
 #get a list of ids of users who are in range of the current 
 user

 #OTHER IDS TO FILTER BY - see if the user applied filters 
 (list of instruments and genres)
  #get a list of ids of users who play these instruments or 
  identify with these genres

 #these will all be arrays of ids above. 

 #write a query to sort users by tags that match the current 
 user so users with the most similar tags appear first, as 
 well as not include any of the NO IDS, filter by RANGE IDS, 
 and OTHER IDS TO FILTER BY. Limit results to only 50 users. 

end

Enter fullscreen mode Exit fullscreen mode

Ok so we have our pseudo code. We know that we will have an array of ids of users to exclude in the query, an array of users that fit the range parameters, as well as an array of ids of users with instruments and genres that match the query. Here's where the sanitization comes in.

ActiveRecord has a sanitization method called sanitize_sql_array, that lets you pass an array into a SQL statement. It looks like: sanitize_sql_array(["name=? and group_id=?", "foo'bar", 4]). You pass an array into the method where the first index is the SQL query you want to write with the question mark placeholder, and the following indexes are variables you can pass in. With this method, any sql injections will be escaped, so you can safely pass variables to your query.

So for example, my users in range looks like this:

range_query = range ? conn.sanitize_sql_array(["u.id IN(?)", self.users_in_range(all_users, range)]) : "true"
Enter fullscreen mode Exit fullscreen mode

I am checking if there is range in the parameters of the method, and if there is, returning the sanitized sql that has executed the method that returns an array of user ids that are in range of a given radius. So that variable, range_query, can be accessed and put into the custom SQL query I would write later. One thing I should note is that you have to establish a connection with the ActiveRecord base to execute these methods. You may notice before the method is 'conn'. Before I use these methods I write: conn = ActiveRecord::Base so I can use conn as the base for the sanitization methods.

In the end, my sql query looked like this:

sql2 = <<~SQL
      SELECT u.*, COALESCE(matching_tag_counts.n, 0) AS similarity_score
      FROM users AS u
        LEFT OUTER JOIN (
          SELECT user_id, COUNT(*) AS n
          FROM usertags
          WHERE #{conn.sanitize_sql_array(["tag_id IN(?)", self.tag_ids])}
          GROUP BY user_id
        ) AS matching_tag_counts ON u.id=matching_tag_counts.user_id
        WHERE #{conn.sanitize_sql_array(["u.id NOT IN(?)", no_ids])}
        AND #{range_query}
        AND #{genre_instrument_query}
        ORDER BY similarity_score DESC
        LIMIT 50
    SQL
Enter fullscreen mode Exit fullscreen mode

You can see how I interpolated variables that have already been sanitized (range_query and genre_instrument_query). I also interpolated the sanitize_sql_array method in a couple areas. So this is how you can safely interpolate in a custom SQL queries.

Discussion (0)