Selecting duplicate records might be useful in cases when you want to introduce new multi column index on existing data.
duplicates = ActiveRecord::Base.connection.execute('
SELECT items.id, items.name
FROM items
WHERE items.id IN (
SELECT
id
FROM (
SELECT
id,
ROW_NUMBER() OVER w AS rnum
FROM items
WINDOW w AS (
PARTITION BY entity_id, barcode
ORDER BY id
)
) t
WHERE t.rnum > 1
);
')
Our exported data set will have id
and name
, so that we can tell which rows we want to delete. And it will only show rows where entity_id
and barcode
would not be unique. That's an example, yours will be different.
After the cleanup, when the query above would return an empty result set we should be able to create following index without any issues:
add_index(:items, [:entity_id, :barcode])
Inspiration taken from the article Rails: migration: Adding a unique Index and deleting Duplicates
Image generated via Midjourney prompt: Heap of items where every object is represented twice; --ar 16:9
Top comments (1)
A comment, by request