DEV Community

Cover image for Deleting all data from a table in Ruby on Rails
Amirul Asyraf
Amirul Asyraf

Posted on

Deleting all data from a table in Ruby on Rails

If you need to refresh your database, you might find that you need to delete every row from a specific table during development.

That is slow 🐢. Freaking slow 🐄.

Glad Rails have many small tricks to solve everything 🌈 (but still not documented properly 😂).

There are several ways to achieve this. I will rank it from the slowest/scary/bad to the simplest and yet powerful solution.

Let's go !!!


3) Drop the whole database and bring it back up again

We can achieve this with rake:db:drop, then rake:db:setup.

Wuuu, drop the database then create a new one, load the schema, and initialize it with the seed data.

Slow & Expensive !!! 🐢

2) destroy_all

Destroys the records by instantiating each record and calling its #destroy method. Each object's callbacks are executed (including :dependent association options).

Records are instantiated and it invokes before_remove, after_remove , before_destroy and after_destroy callbacks.

Examples

# 1
Person.where(sex: "male").destroy_all # 😂

# 2
class Author < ActiveRecord::Base
  has_many :books
end

author.books.size # => 3
author.books
# => [
#       #<Book id: 1, name: "Sapiens", author_id: 1>,
#       #<Book id: 2, name: "The Artist's Way", author_id: 2>,
#       #<Book id: 3, name: "Remote", author_id: 3>
#    ]

author.books.destroy_all

author.books.size # => 0
author.books      # => []

Book.find(1) # => Couldn't find Book with id=1
Enter fullscreen mode Exit fullscreen mode

Note: Instantiation, callback execution, and deletion of each record can be time consuming when you're removing many records at once. It generates at least one SQL DELETE query per record (or possibly more, to enforce your callbacks).

1) delete_all 👑

If you want to delete many rows quickly, without concern for their associations or callbacks, use delete_all instead.

Boot up a console and call delete_all on your model:

% rails c
> Book.count
# => 1200 
> Book.delete_all
# => 1200
> Book.count
# => 0
Enter fullscreen mode Exit fullscreen mode

Deletes the records without instantiating the records first, and hence not calling the #destroy method nor invoking callbacks.

This is a single SQL DELETE statement that goes straight to the database, much more efficient than destroy_all.

You need to be careful on two things:

1- As .delete_all method does not instantiate any object hence does not provide any callback (before_* and after_destroy don't get triggered).

2- Be careful with relations, in particular :dependent rules defined on associations are not honored. Returns the number of rows affected.

Post.where(person_id: 5).where(category: ['Something', 'Else']).delete_all
Enter fullscreen mode Exit fullscreen mode

Both calls delete the affected posts all at once with a single DELETE statement.

If you need to destroy dependent associations or call your before_* or after_destroy callbacks, use the destroy_all method instead.

More on here

The end


resources:

1 2 3 4 5 6 7

Latest comments (2)

Collapse
 
djuber profile image
Daniel Uber

There's also a way to truncate the table, effectively replacing the table's rows with an empty table. This bypasses any database side integrity triggers like cascading deletes of foreign keys, it's usually faster than deleting rows from a very large table, and should be roughly equivalent in meaning to the "delete_all" command.

Books.connection.truncate("books")
Enter fullscreen mode Exit fullscreen mode

This won't work in Sqlite (TRUNCATE is not supported), and won't call any of the database side triggers either, and may or may not fail based on foreign key constraints. It's sometimes faster than dropping the database and recreating it (since you don't need to reload the schema or migrations, and any keys or index definitions are already present, just the data has been removed).

More importantly, this doesn't permit any scoping - author.books.destroy_all sends a WHERE author_id = ? constraint to the DELETE command, but TRUNCATE always indiscriminately removes all data from the table and resets it with an empty table.

There's probably a size below which delete_all is faster than truncate, I haven't tested and suspect the shape of the table (columns and indexes) will contribute to the timing.

The biggest reason not to use a command like truncate (apart from the rarity of needing to wipe out all of the data) is that it's non-transactional, there's no way to rollback a truncation once you've done it, you can't guarantee atomic actions be bundled together (truncate authors but only if I can also truncate books, otherwise neither), and if you use a transaction rollback strategy around your test cases to provide isolation truncate will not be safe if you also expect to have seeded data present for tests.

Collapse
 
asyraf profile image
Amirul Asyraf

Wow, new stuff I learned today. Thanks Daniel !!!