DEV Community

Cover image for Adding sequence duplicate records in MySQL without a window function
sugawani
sugawani

Posted on • Edited on

Adding sequence duplicate records in MySQL without a window function

Introduction

*I'm learning English now. Please correct my English if I wrote something wrong.

For example, suppose you have a table that contains the following data.

id group_id name
1 1 hoge
2 1 hoge
3 2 fuga
4 2 fuga
5 2 fuga
6 2 piyo
7 3 foo
8 4 bar
9 4 bar

For each group_Id, add suffixes to duplicate names as follows.

id group_id name
1 1 hoge
2 1 hoge_2
3 2 fuga
4 2 fuga_2
5 2 fuga_3
6 2 piyo
7 3 foo
8 4 bar
9 4 bar_2

How to

If you are using MySQL8. Window function can be used.
but in this article, I will explain the queries that can be used with MySQL5.

All queries

For example, the table name is "test_group".

-- Adding sequence duplicate records
SET @group_id = NULL, @name = NULL, @seq = 0;
CREATE TEMPORARY TABLE tmp_test_group AS
SELECT
    id,
    group_id,
    name,
    @seq := IF(@group_id = group_id AND @name = name, @seq + 1, 1) AS seq,
    @group_id := group_id AS hoge,
    @name := name AS hoge2
FROM
    test_group
WHERE
    (group_id, name) IN (
        SELECT
            group_id, name
        FROM
            test_group
        GROUP BY
            group_id, name
        HAVING
            count(group_id) > 1 AND count(name) > 1
    )
ORDER BY
    group_id, name;

-- Add suffix to name
UPDATE
    tmp_test_group
SET
    name = CONCAT(name, '_', seq)
WHERE
    seq <> 1;

-- update origin table name
UPDATE
    group AS org
INNER JOIN
    tmp_test_group AS tmp ON org.id = tmp.id
SET
    org.name = tmp.name
Enter fullscreen mode Exit fullscreen mode

Explanation

SELECT duplicate records

SELECT
    group_id, name
FROM
    test_group
GROUP BY
    group_id, name
HAVING
    count(group_id) > 1 AND count(name) > 1
Enter fullscreen mode Exit fullscreen mode

Use GROUP BY and HAVING to get records with duplicate group_id and name.

Adding sequence duplicate records

SET @group_id = NULL, @name = NULL, @seq = 0;
CREATE TEMPORARY TABLE tmp_test_group AS
SELECT
    id,
    group_id,
    name,
    @seq := IF(@group_id = group_id AND @name = name, @seq + 1, 1) AS seq,
    @group_id := group_id AS hoge,
    @name := name AS hoge2
Enter fullscreen mode Exit fullscreen mode

Use SELECT and variables to check if the current record matches the previous one by group_id, name.
Here's how it works.

1. SELECT first record

id group_id name
1 1 hoge
  • @seq = 1
    • At this point, both @group_id and @name are NULL, so the IF condition evaluates to false, and 1 is assigned.
  • @group_id = 1
  • @name = 'hoge'

2. INSERT to tmp_test_group

id group_id name seq
1 1 hoge 1

3. SELECT next record

id group_id name
2 1 hoge
  • @seq = 2
    • At this point, @group_id = 1 and @name = 'hoge'.
    • Since the currently selected group_id, name match, the IF condition evaluates to true, and @seq + 1 is assigned.
  • @group_id = 1
  • @name = 'hoge'

4. INSERT to tmp_test_group

id group_id name seq
1 1 hoge 1
2 1 hoge 2

5. SELECT next record

id group_id name
3 2 fuga
  • @seq = 1
    • At this point, @group_id = 1 and @name = 'hoge
    • Since the currently selected group_id, name is not match, the IF condition evaluates to false, and 1 is assigned
  • @group_id = 2
  • @name = 'fuga'

6. INSERT to tmp_test_group

id group_id name seq
1 1 hoge 1
2 1 hoge 2
3 2 fuga 1

This process is repeated for each record until there are no more records.

💡Point💡
Always assign to @seq before @group_id, @name.
If you evaluate before @seq, you would lose last selected record.

Add suffix to name

UPDATE tmp_test_group
SET name = CONCAT(name, '_', seq)
WHERE seq <> 1;
Enter fullscreen mode Exit fullscreen mode

I want to create records named hoge, hoge_2, hoge_3..., so I am updating records where seq is greater than 1.

update origin table name

UPDATE
    group AS org
INNER JOIN
    tmp_test_group AS tmp ON org.id = tmp.id
SET
    org.name = tmp.name
Enter fullscreen mode Exit fullscreen mode

Update the original table's name by joining on id with the temporary table.

FYI

If you are using MySQL8, you can use Window function to write this more easily.

WITH tmp AS (
    SELECT
        id,
        group_id,
        name,
        ROW_NUMBER() OVER (PARTITION BY group_id, name ORDER BY id) AS seq
    FROM
        test_group
    WHERE (group_id, name) IN (
        SELECT
            group_id, name
        FROM
            test_group
        GROUP BY
            group_id, name
        HAVING
            count(group_id) > 1 AND count(name) > 1
    )
)
UPDATE
    test_group
JOIN
    tmp ON test_group.id = tmp.id
SET
    test_group.name = CONCAT(test_group.name, '_', tmp.seq)
WHERE
    tmp.seq > 1;
Enter fullscreen mode Exit fullscreen mode

Top comments (0)