DEV Community


Build Relational Databases Like A Pro (MySQL, PostgreSQL, etc.)

Mohmed Ishak
I'm Mohmed Ishak and I love to write short and concise tech articles when I'm free.
・3 min read

mohmed ishak
Let's face it, building a database that works is not a challenge at all. The real challenge is to build a database that is fast, reliable, scalable, and well-engineered. In this article, let me show you some of the most important tips when building relational databases like MySQL, PostgreSQL, etc.

[1] Index Your Database

Consider a database table (customers) with the following header/columns:
| id | state | points |

// 10 million rows go here...

Here's how to query the database to get customers in 'NYC':

FROM customers
WHERE state = 'NYC';
Enter fullscreen mode Exit fullscreen mode

The query will be slow because we've got 10 million rows and the DB engine needs to query all rows to get the relevant data. With indexing we can eliminate all irrelevant rows before querying the data so only rows with 'NYC' will be checked when 'NYC' is queried. Here's the same query with index:

CREATE INDEX idx_state ON customers (state);
-- convention is to prefix index with "idx" followed by underscore and then column name. 
FROM customers
WHERE state = 'NYC';
Enter fullscreen mode Exit fullscreen mode

This query will be so much faster because only rows with 'NYC' will be queried, not all rows, thanks to indexing. Think of indexing as educating the database engine on which data is where and how it is stored so that when you query data next time, it gives the relevant data more efficiently without wasting resources/time.

I can talk so much about indexing but I don't want to bore you. If you want to learn more about indexing, Google these: "multiple-column indexing", "indexing and sort operation", "composite indexes", etc.

[2] Think in Present

You need to only store that data that you need or the business needs. Don't try to solve a problem which might or might not happen in future. If you need to store additional data, you can always create a new table and add relationship(s).

[3] Say No To Blobs

Blob stands for Binary Large Object and this basically means images, audios, and other large files converted to binary format. If I could impose a rule in the tech world, I'd forbid blob as a data in database tables, because it's incredibly bad to do so. Blobs increase the size of database like crazy and this impacts the performance.

[4] Find The Sweet Spot Between Normalizing / Denormalizing Data

If a table has way too much columns, it never means the table is well-engineered. If that's the case, consider splitting your table to two or more tables. In contrast, if there are a whole lot of tables with very few columns (say 1 to 2) and they're connected with relationships, consider combining them (and be okay with duplicating some data). A lot of beginners think that denormalizing data is something bad because it has the prefix "de" but I can assure you it's not true, lol.

[5] Select The Perfect Data Type

Just because a data type works well with your use case, it doesn't mean it's the best one. For example, to store age of users, use TINYINT data type (an unsigned TINYINT stores values from 0 to 255) instead of INT. INT data type stores up to 2 billion (or something like that). This takes more space and we don't need it at all. The age of users will never go beyond 255. If anyone does reach 254 years old, we still have 1 year to change the data type.

[6] Don't Use Relational Databases, They're Not Scalable

Just kidding, or am I? Just kidding... or am I?

Discussion (9)

belkheir profile image
Mahamed Belkheir

jokes aside, people who say relational databases aren't scalable enough are either on the scale of major web service providers like Netflix, Discord etc, or extremely underestimating what a properly setup relational database can do

longwater1234 profile image
Davis Tibbz

I mean there is a reason why YouTube, Paypal, Netflix are using MySQL. Instagram uses PostgreSQL

belkheir profile image
Mahamed Belkheir

yep, usually multiple different databases are used at that scale, discord getting millions of DB writes a second made them choose use Mongo initially, then even that did not cope with the amount of writes they were getting, they then migrated to Cassandra, and eventually Scylla as their messages DB

ishakmohmed profile image
Mohmed Ishak Author

Yeah, relationally DBs force you to be extremely organized when designing them.

ishakmohmed profile image
Mohmed Ishak Author

Right πŸ˜ƒ

ivictbor profile image
Ivan Borshchov

Full true, agree with all. My five cents: I even created a post about indexes where I shown on real open source experiments which show huge power of indexes

ishakmohmed profile image
Mohmed Ishak Author

Damn, you wrote an entire crash course. πŸ˜†

bobbyiliev profile image
Bobby Iliev

This is a great article! πŸš€

If you are an opensource enthusiast, feel free to contribute it as a chapter to this opensource eBook here:

GitHub logo bobbyiliev / introduction-to-sql

Free Introduction to SQL eBook

πŸ’‘ Introduction to SQL

This is an open-source introduction to SQL guide that will help you learn the basics of SQL and start using relational databases for your SysOps, DevOps, and Dev projects. No matter if you are a DevOps/SysOps engineer, developer, or just a Linux enthusiast, you will most likely have to use SQL at some point in your career.

The guide is suitable for anyone working as a developer, system administrator, or a DevOps engineer and wants to learn the basics of SQL.

πŸš€ Download

To download a copy of the ebook use one of the following links:

πŸ“˜ Chapters

  • About the book
  • Databases
  • MySQL
  • Tables
  • Basic Syntax
  • JOIN
  • The MySQL dump command
  • Conclusion

🌟 Sponsors

This book is made possible thanks to these fantastic companies!

πŸ’™ DigitalOcean

DigitalOcean is a cloud services platform delivering…

ishakmohmed profile image
Mohmed Ishak Author

Thanks man.