DEV Community

Pranav Gowtam
Pranav Gowtam

Posted on

Normalization of a Database

Welcome back!

In this blog, I will be going over the normalization I implemented into my COVID-19 database to reduce redundancy and clean up the database in general.

So, the best place to start is to answer the overarching question: what is normalization? Also, what are these forms of normalization?

First, normalization is the process of taking tables within a database and breaking them down to reduce the redundancy or repetition of a specific quality. In the example below, in our database, we have a column in the table 'GeographicLocation' named 'Country'. However, since the table also implements the use of zip codes, the 'Country' column becomes irrelevant. Zipcodes lock the app into being US-only.

Alt Text

Normalization has different forms to it. The two I will be exploring in this blog post are 1NF and 2NF -- 1st and 2nd Normal Form.

1NF is the "base" level of all normal forms. Normal forms build off of each other. Each subsequent form must carry all of the qualities of the previous forms. So, 3NF has to have all of the properties in 1NF, 2NF, and the new ones in 3NF.

In the example below, I have my 'Behavior' table. 1NF has three properties that need to be true:

  1. Data is stored within tables and rows; can be uniquely identified by a primary key
  2. Said data is in its most reduced form
  3. There are no repeating/redundant columns

'Behavior' passes these criteria because all of the data in its table is properly stored and has the overarching 'BehaviorID' primary key, all of the data cannot be reduced any further, and there are no redundant columns present.

Alt Text

Moving on, we have the 2nd Normal Form. Looking at the example below, we see that we have a column named 'WorkplacePolicy' in the 'Occupation' table. While the column is not inherently wrong in terms of formatting, it is wrong in terms of placement.

Alt Text

Looking at all of the other attributes on the table, they all directly relate to the primary key. In this case, it is 'CompanyID'. 'WorkplacePolicy' doesn't directly correlate to an individual's ID; rather, it correlates to an entire company. If the policy changes, the change may not sync up to the other values in other columns, which is known as an "anomaly".

To fix this, I created a separate table called 'WorkplacePolicy' with different Boolean values to make it more fleshed out and able to operate independently. I also created a new column named 'CompanyName' to act as a new foreign key in 'Occupation'.

Alt Text

Alt Text

So, I hope this gave a better overview for normalization, 1NF, and 2NF than my video did -- the purpose of my video was mainly to show off the changes I made to my database, while this post was to really go into the technical component of it all.

If you haven't watched my video, you should! I go over the changes above and also how to properly use the alter command.

With that said, thank you for reading -- I hope you have a great day!

~ Incerah

Top comments (0)