DEV Community

Mohmed Ishak

Posted on Jun 16, 2021

Build Relational Databases Like A Pro (MySQL, PostgreSQL, etc.)

#database #sql #mysql

Let's face it, building a database that works is not a challenge at all. The real challenge is to build a database that is fast, reliable, scalable, and well-engineered. In this article, let me show you some of the most important tips when building relational databases like MySQL, PostgreSQL, etc.

[1] Index Your Database

Consider a database table (customers) with the following header/columns:
| id | state | points |

// 10 million rows go here...

Here's how to query the database to get customers in 'NYC':

SELECT id
FROM customers
WHERE state = 'NYC';

The query will be slow because we've got 10 million rows and the DB engine needs to query all rows to get the relevant data. With indexing we can eliminate all irrelevant rows before querying the data so only rows with 'NYC' will be checked when 'NYC' is queried. Here's the same query with index:

CREATE INDEX idx_state ON customers (state);
-- convention is to prefix index with "idx" followed by underscore and then column name. 
SELECT id
FROM customers
WHERE state = 'NYC';

This query will be so much faster because only rows with 'NYC' will be queried, not all rows, thanks to indexing. Think of indexing as educating the database engine on which data is where and how it is stored so that when you query data next time, it gives the relevant data more efficiently without wasting resources/time.

I can talk so much about indexing but I don't want to bore you. If you want to learn more about indexing, Google these: "multiple-column indexing", "indexing and sort operation", "composite indexes", etc.

[2] Think in Present

You need to only store that data that you need or the business needs. Don't try to solve a problem which might or might not happen in future. If you need to store additional data, you can always create a new table and add relationship(s).

[3] Say No To Blobs

Blob stands for Binary Large Object and this basically means images, audios, and other large files converted to binary format. If I could impose a rule in the tech world, I'd forbid blob as a data in database tables, because it's incredibly bad to do so. Blobs increase the size of database like crazy and this impacts the performance.

[4] Find The Sweet Spot Between Normalizing / Denormalizing Data

If a table has way too much columns, it never means the table is well-engineered. If that's the case, consider splitting your table to two or more tables. In contrast, if there are a whole lot of tables with very few columns (say 1 to 2) and they're connected with relationships, consider combining them (and be okay with duplicating some data). A lot of beginners think that denormalizing data is something bad because it has the prefix "de" but I can assure you it's not true, lol.

[5] Select The Perfect Data Type

Just because a data type works well with your use case, it doesn't mean it's the best one. For example, to store age of users, use TINYINT data type (an unsigned TINYINT stores values from 0 to 255) instead of INT. INT data type stores up to 2 billion (or something like that). This takes more space and we don't need it at all. The age of users will never go beyond 255. If anyone does reach 254 years old, we still have 1 year to change the data type.

[6] Don't Use Relational Databases, They're Not Scalable

Just kidding, or am I? Just kidding... or am I?

Top comments (9)

Mahamed Belkheir • Jun 17 '21

jokes aside, people who say relational databases aren't scalable enough are either on the scale of major web service providers like Netflix, Discord etc, or extremely underestimating what a properly setup relational database can do

Davis Tibbz • Jun 19 '21

I mean there is a reason why YouTube, Paypal, Netflix are using MySQL. Instagram uses PostgreSQL

Mahamed Belkheir • Jun 19 '21

yep, usually multiple different databases are used at that scale, discord getting millions of DB writes a second made them choose use Mongo initially, then even that did not cope with the amount of writes they were getting, they then migrated to Cassandra, and eventually Scylla as their messages DB

Mohmed Ishak • Jun 19 '21

Yeah, relationally DBs force you to be extremely organized when designing them.

Mohmed Ishak • Jun 17 '21

Right 😃

Ivan Borshchov • Jun 17 '21

Full true, agree with all. My five cents: I even created a post about indexes where I shown on real open source experiments which show huge power of indexes hinty.io/devforth/sql-query-optimi...

Mohmed Ishak • Jun 17 '21

Damn, you wrote an entire crash course. 😆

Bobby Iliev • Jun 18 '21

This is a great article! 🚀

If you are an opensource enthusiast, feel free to contribute it as a chapter to this opensource eBook here:

bobbyiliev / introduction-to-sql

Free Introduction to SQL eBook

💡 Introduction to SQL

This is an open-source introduction to SQL guide that will help you learn the basics of SQL and start using relational databases for your SysOps, DevOps, and Dev projects. No matter if you are a DevOps/SysOps engineer, developer, or just a Linux enthusiast, you will most likely have to use SQL at some point in your career.

The guide is suitable for anyone working as a developer, system administrator, or a DevOps engineer and wants to learn the basics of SQL.

🚀 Download

To download a copy of the ebook use one of the following links:

📘 Chapters

About the book
Databases
MySQL
Tables
Basic Syntax
SELECT
WHERE
ORDER and GROUP BY
INSERT
UPDATE
DELETE
JOIN
The MySQL dump command
Conclusion

🌟 Sponsors

This book is made possible thanks to these fantastic companies!

💙 DigitalOcean

DigitalOcean is a cloud services platform delivering…

View on GitHub

Mohmed Ishak • Jun 18 '21

Thanks man.