DEV Community

AdityaPratapBhuyan
AdityaPratapBhuyan

Posted on

Understanding Database Normalization: A Comprehensive Guide

DB Normalization

Normalisation is essential to database administration because it guarantees data economy, scalability, and integrity. Database Normal Forms are a collection of guidelines that control how data is arranged in relational databases to maximise efficiency and reduce dependencies and redundancies. From First Normal Form (1NF) to Sixth Normal Form (6NF), we shall examine the nuances of each Normal Form in this article, including thorough justifications and instructive examples.

First Normal Form (1NF)

The First Normal Form (1NF) is the fundamental building block of database normalization. To meet the requirements of 1NF, a relation must have:

  • Atomic Values: Each attribute or field within a relation must hold atomic values, meaning they cannot be further divided.
  • Unique Column Names: Every column in a relation must have a unique name to avoid ambiguity.
  • No Duplicate Rows: Each row in a relation must be unique, with no duplicate tuples.

Example:

Consider the following table representing student information:

Student_ID Name Courses
001 John Math, Physics
002 Alice Chemistry, Math
003 Bob Physics, Biology

To convert this table into 1NF, we need to ensure atomicity and eliminate repeating groups. One way to achieve this is by creating separate rows for each course taken by a student:

Student_ID Name Course
001 John Math
001 John Physics
002 Alice Chemistry
002 Alice Math
003 Bob Physics
003 Bob Biology

Second Normal Form (2NF)

Second Normal Form (2NF) builds upon 1NF by addressing partial dependencies within relations. A relation is in 2NF if it meets the following criteria:

  • It is in 1NF.
  • All non-key attributes are fully functionally dependent on the primary key.

Example:

Consider a table that records orders and their corresponding products:

Order_ID Product_ID Product_Name Unit_Price
1001 001 Laptop $800
1001 002 Mouse $20
1002 001 Laptop $800
1003 003 Keyboard $50

In this table, Order_ID serves as the primary key, and Product_ID is a partial key. To achieve 2NF, we need to separate the product information into a separate table:

Third Normal Form (3NF)

Third Normal Form (3NF) further refines the normalization process by eliminating transitive dependencies. A relation is in 3NF if it satisfies the following conditions:

  • It is in 2NF.
  • There are no transitive dependencies; that is, no non-key attribute depends on another non-key attribute.

Example:

Consider a table that stores information about employees, including their department and location:

Employee_ID Employee_Name Department Location
001 John Marketing New York
002 Alice HR Los Angeles
003 Bob Marketing New York

In this table, both Department and Location are non-key attributes. However, Location depends on Department, creating a transitive dependency. To normalize this table to 3NF, we split it into two:

Boyce-Codd Normal Form (BCNF)

Boyce-Codd Normal Form (BCNF) is an extension of 3NF, addressing certain anomalies that may arise in relations with multiple candidate keys. A relation is in BCNF if, for every non-trivial functional dependency X β†’ Y, X is a superkey.

Example:

Consider a table representing courses and their instructors:

Course_ID Instructor_ID Instructor_Name Course_Name
001 101 John Math
002 102 Alice Physics
001 103 Bob Math

In this table, {Course_ID, Instructor_ID} is a composite primary key. However, Instructor_Name depends only on Instructor_ID, violating BCNF. To normalize this table, we separate the Instructor information:

Fifth Normal Form (5NF)

Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), addresses multi-valued dependencies within relations. A relation is in 5NF if it satisfies the following conditions:

  • It is in 4NF.
  • All join dependencies are implied by the candidate keys.

Example:

Consider a table that represents the relationship between authors and their published books:

Author_ID Book_ID Author_Name Book_Title
101 001 John Book1
101 002 John Book2
102 001 Alice Book1
103 003 Bob Book3

In this table, {Author_ID, Book_ID} forms a composite primary key. However, there is a multi-valued dependency between Author_ID and Book_Title. To normalize this table to 5NF, we split it into two:

Sixth Normal Form (6NF)

Sixth Normal Form (6NF), also known as Domain-Key Normal Form (DK/NF), deals with cases where dependencies exist between attributes and subsets of the keys. A relation is in 6NF if it meets the following criteria:

  • It is in 5NF.
  • There are no non-trivial join dependencies involving subsets of the candidate keys.

Example:

Consider a table representing sales data for products:

Product_ID Product_Name Region Sales
001 Laptop East $500
001 Laptop West $700
002 Mouse East $100
002 Mouse West $150

In this table, {Product_ID, Region} is a composite key. However, there is a non-trivial join dependency between Region and Sales, as Sales depend only on Region. To normalize this table to 6NF, we separate the Region and Sales information.

Conclusion

To sum up, database normalisation is an essential step in creating relational databases that are effective and easy to maintain. Database designers can minimise redundancy, stop data abnormalities, and improve query efficiency by following the guidelines of Normal Forms. Comprehending and utilising the many Normal Forms, ranging from 1NF to 6NF, equips database experts to develop resilient and expandable database structures that satisfy the dynamic requirements of contemporary applications.

Top comments (0)