DEV Community

Cover image for Database Indexes, please be careful when using it!
Frastyawan Nym
Frastyawan Nym

Posted on

Database Indexes, please be careful when using it!

Lets do disclaimer before we start πŸ˜‰.

This is my current understanding of Database Index.
If there is any mistake that write that lead to misunderstanding, please let me know and I will update this post ASAP.

Let's begin πŸš€

What is database index?

Database index is a built-in tools that will help us for searching data inside database. Index will quickly locate record of our queries.

The problem

To help to understand, I will make up some use case. Example, we have a users table that have id, full_name & gender

users table

That users table will have 1000 rows. 2 of them is male and the rest (998) of them is female.
Say that we need to find our male user, we will create query something like this SELECT * FROM users WHERE gender = 'male'.
After we executed that query, we will have result of 2 users.

Behind the scene, database engine will search all of 1000 rows to find that 2 rows.
Yes, that sound inefficient. But that is how database engine works. 😡

What is the solution?

The solution is to use Database Index. πŸ’‘
With index, database engine will make some of algorithm that map our table based on the index we chose.

Okay, so we create an index on gender column inside users table.
Back again to our query, now if we execute SELECT * FROM users WHERE gender = 'male', we will get 2 users again as result (of course!).

The difference is, right now database engine will search on 2 rows to find that 2 rows.
Now that sound efficient❗

But wait...

After we know that, maybe there is a thinking like this:

"let's create index for all of our column, so we can get quick result every time we search something πŸ€”"

Thats right, you will get a quick query every search.
But please don't over-used it. Even though Index have upside, Index also have downside.

If we create an index, behind the scene database engine will write additional data to our database.

Lets back again to our users table. The calculation is like this:

2 index (1 for ID, 1 for gender) * 1000 rows = 2000
# PK is always get index
Enter fullscreen mode Exit fullscreen mode

Imagine if we put index to all of our column

4 index (ID, gender, full_name, created_at) * 1000 rows = 4000
# Thats only 4 columns.
# The more column we create, the more usage it will take when CREATE, UPDATE, DELETE, etc.
Enter fullscreen mode Exit fullscreen mode

the conclusion is: please use it wisely. πŸ˜‡

Bonus: How to create index

  • PostgreSql
CREATE TABLE "customers" (
  "id" SERIAL PRIMARY KEY,
  "gender" varchar,
  "full_name" varchar,
  "created_at" timestamp
);

CREATE INDEX ON "customers" ("gender");
Enter fullscreen mode Exit fullscreen mode
  • MySql
CREATE TABLE `customers` (
  `id` int PRIMARY KEY AUTO_INCREMENT,
  `gender` varchar(255),
  `full_name` varchar(255),
  `created_at` timestamp
);

CREATE INDEX `customers_index_0` ON `customers` (`gender`);
Enter fullscreen mode Exit fullscreen mode

Top comments (2)

Collapse
 
darkain profile image
Vincent Milum Jr

If you have low cardinality of your data, in PostgreSQL, you can use a partial index instead.

In your example, you have 2 items on one side, and 998 on the other. This is the perfect scenario for a partial index. This has the ability to store an index for the 2 items, while ignoring the other 998 rows. So instead of creating 1000 index entries, only 2 are created, keeping storage usage minimal, AND still retaining the performance benefits!

postgresql.org/docs/current/indexe...

Collapse
 
frasnym profile image
Frastyawan Nym

Thanks for the addition Vincent!

I didn't know before that we could create conditions on index