DEV Community

Michael Z
Michael Z

Posted on • Edited on • Originally published at michaelzanggl.com

4 MySQL traps

Originally posted at michaelzanggl.com. Subscribe to my newsletter to never miss out on new content.

Note: Some of these might also exist in other database systems.

utf8 is not actually utf8

Ever ran into this SQL exception? Incorrect string value: โ€˜\xF0\x9F\x98\x83 <โ€ฆโ€™ for column...

You probably ran into the same trap as many others. Turns out mySQL's utf8 charset can only store around 6% of all possible unicode code points. This also excludes emoticons for example.

insert into emails(body) values ('๐Ÿฆ„๐Ÿฆ„๐Ÿฆ„');
Enter fullscreen mode Exit fullscreen mode

That's why you should use utf8mb4 instead.

Read more about it and how to migrate: https://mathiasbynens.be/notes/mysql-utf8mb4

comparing varchar_field with false or 0

Imagine the following contrived example

function normalizeEmail(email) {
    if (!validate(email)) {
        return false
    }

    return normalize(email)
}

// somewhere in the code
await User.where('email', normalizeEmail(email)).first()
Enter fullscreen mode Exit fullscreen mode

The imaginary ORM would execute the query select * from users where email = <prepared value> LIMIT 1

If the validation inside normalizeEmail was successful, the query would be select * from users where email = 'normalized.email@test.com' LIMIT 1.

If the validation was not successful, the query would be select * from users where email = false LIMIT 1.

Now go ahead and run something like select * from users where <varchar field> = false in your database system.

Since the fields are not comparable, MySQL will convert one to the other, make them match and return you all users. Our ORM will simply pick the first user and continue the logic with that. ๐Ÿ˜ฌ Pretty dangerous.

The same happens with field = 0

insert on duplicate key update creates primary key holes

Imagine we have a table statistics with the columns id(AI), fkid, title.

INSERT INTO statistics (fkid, title) VALUES (1, 'first');
Enter fullscreen mode Exit fullscreen mode

This will insert a new record with the id 1. Let's imagine there is some batch which always inserts or updates the title. It might execute the following queries over some time:

INSERT INTO statistics (fkid, title) VALUES (1, 'second') ON DUPLICATE KEY UPDATE title = 'second';
INSERT INTO statistics (fkid, title) VALUES (1, 'third') ON DUPLICATE KEY UPDATE title = 'third';
INSERT INTO statistics (fkid, title) VALUES (1, 'fourth') ON DUPLICATE KEY UPDATE title = 'fourth';
Enter fullscreen mode Exit fullscreen mode

Finally we want to insert a record with a new fkid.

INSERT INTO statistics (fkid, title) VALUES (100, 'first') ON DUPLICATE KEY UPDATE title = 'first';
Enter fullscreen mode Exit fullscreen mode

This inserts a new record in the table, but guess what the id is? You'd might expect it to be 2, but in reality, everytime insert on duplicate key update failed to insert and processed the update, it incremented the auto increment internally. That means the id of this record would be 5.

It's not too important that the IDs are really sequential, but you might wonder what's the cause of it.

Read more about it: https://stackoverflow.com/questions/38347110/prevent-innodb-auto-increment-on-duplicate-key

int(2) is not what you think it means

The length on INTs have no real meaning in MySQL, you can still insert values like 9999999. They only restrict how many characters should be displayed in the command line client. ๐Ÿคจ

Read more about it: https://stackoverflow.com/questions/5634104/what-is-the-size-of-column-of-int11-in-mysql-in-bytes

Are there any other traps you are aware of?

Top comments (2)

Collapse
 
faizan profile image
Faizan Akram Dar

Amazing

Collapse
 
mrscx profile image
MrScX

This is a really informative article. Thank a lot!