Different Normal Forms in Database Design
In database design, normalization is the process of organizing data to minimize redundancy and undesirable dependencies, which improves data integrity. It involves dividing large tables into smaller, more focused ones and establishing relationships between them. This protects the database against insertion, update, and deletion anomalies.
The different normal forms represent specific levels of normalization. Each normal form builds upon the previous one and has its own set of rules. Below is an explanation of the most common normal forms:
1. First Normal Form (1NF)
1NF is the most basic level of normalization. It requires that every column hold atomic (indivisible) values and that the table contain no repeating groups.
Rules of 1NF:
- Each table cell should contain a single value (atomicity).
- Each record (row) must be unique.
- Each column should contain values of a single type (e.g., all integers, all strings).
- No repeating groups of columns or multiple values in a single column.
Example of 1NF:
Before 1NF:
OrderID | Products | Quantities |
---|---|---|
1 | Apple, Banana | 2, 3 |
2 | Orange | 5 |
After converting to 1NF:
OrderID | Product | Quantity |
---|---|---|
1 | Apple | 2 |
1 | Banana | 3 |
2 | Orange | 5 |
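The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name `to_1nf` and the tuple layout are assumptions made for the example, and the comma-separated input mirrors the "Before 1NF" table.

```python
def to_1nf(rows):
    """Split comma-separated Products/Quantities into one row per product."""
    atomic = []
    for order_id, products, quantities in rows:
        names = [p.strip() for p in products.split(",")]
        counts = [int(q) for q in quantities.split(",")]
        if len(names) != len(counts):
            # A mismatch means the packed columns cannot be paired up safely.
            raise ValueError(f"order {order_id}: products and quantities do not line up")
        atomic.extend((order_id, name, count) for name, count in zip(names, counts))
    return atomic


# Rows of the "Before 1NF" table: (OrderID, Products, Quantities)
unnormalized = [(1, "Apple, Banana", "2, 3"), (2, "Orange", "5")]
normalized = to_1nf(unnormalized)
print(normalized)
```

Each resulting tuple holds exactly one product and one quantity, so the pair (OrderID, Product) can serve as a key.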
2. Second Normal Form (2NF)
2NF builds on 1NF by eliminating partial dependencies. A partial dependency occurs when a non-prime attribute (a column that is not part of any candidate key) depends on only part of a composite key. A table whose keys are all single columns therefore cannot have partial dependencies. To achieve 2NF, the table must first meet the requirements of 1NF.
Rules of 2NF:
- The table must be in 1NF.
- Every non-prime attribute must be fully functionally dependent on the entire primary key (eliminate partial dependencies).
Example of 2NF:
Before 2NF (Partial Dependency):
OrderID | Product | CustomerName | Price |
---|---|---|---|
1 | Apple | John | 10 |
1 | Banana | John | 5 |
2 | Orange | Jane | 8 |
Here, CustomerName depends only on OrderID and not on the full primary key (OrderID, Product). To remove this, we split the table.
After 2NF:
Tables:
- Orders (OrderID, CustomerName)
- OrderDetails (OrderID, Product, Price)
Orders table:
OrderID | CustomerName |
---|---|
1 | John |
2 | Jane |
OrderDetails table:
OrderID | Product | Price |
---|---|---|
1 | Apple | 10 |
1 | Banana | 5 |
2 | Orange | 8 |
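One way to spot a partial dependency is to test candidate functional dependencies against sample rows. The sketch below is a minimal illustration; the helper `fd_holds` and the dictionary layout are assumptions for this example. Note that sample data can only refute a dependency, never prove it, since the real rule comes from the business domain.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the "Before 2NF" table
order_rows = [
    {"OrderID": 1, "Product": "Apple", "CustomerName": "John", "Price": 10},
    {"OrderID": 1, "Product": "Banana", "CustomerName": "John", "Price": 5},
    {"OrderID": 2, "Product": "Orange", "CustomerName": "Jane", "Price": 8},
]

# CustomerName is consistent per OrderID: a partial dependency on the key.
partial = fd_holds(order_rows, ["OrderID"], ["CustomerName"])
# Price is not determined by OrderID alone; it needs the full key.
price_needs_full_key = not fd_holds(order_rows, ["OrderID"], ["Price"])
print(partial, price_needs_full_key)
```

Because CustomerName follows OrderID alone while Price does not, CustomerName is the column that moves into the separate Orders table.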
3. Third Normal Form (3NF)
3NF builds on 2NF and addresses transitive dependencies, which occur when a non-prime attribute depends on another non-prime attribute. A non-prime attribute should depend only on the primary key. A table is in 3NF if it is in 2NF and all transitive dependencies are removed.
Rules of 3NF:
- The table must be in 2NF.
- No non-prime attribute should depend on another non-prime attribute (remove transitive dependencies).
Example of 3NF:
Before 3NF (Transitive Dependency):
OrderID | Product | Category | Supplier |
---|---|---|---|
1 | Apple | Fruit | XYZ |
2 | Carrot | Vegetable | ABC |
Here, Supplier depends on Category, and Category depends on OrderID, so Supplier depends on the key only transitively. To resolve this, we split the table.
After 3NF:
Tables:
- Orders (OrderID, Product, Category)
- Category (Category, Supplier)
Orders table:
OrderID | Product | Category |
---|---|---|
1 | Apple | Fruit |
2 | Carrot | Vegetable |
Category table:
Category | Supplier |
---|---|
Fruit | XYZ |
Vegetable | ABC |
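The 3NF design can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types, the foreign key, and the replacement supplier name `NewCo` are hypothetical and appear only to illustrate the idea. It shows that a supplier change touches a single row, and that a join still returns the original four-column view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE Category (Category TEXT PRIMARY KEY, Supplier TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Orders (
        OrderID  INTEGER PRIMARY KEY,
        Product  TEXT NOT NULL,
        Category TEXT NOT NULL REFERENCES Category(Category)
    )
""")

conn.executemany("INSERT INTO Category VALUES (?, ?)",
                 [("Fruit", "XYZ"), ("Vegetable", "ABC")])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(1, "Apple", "Fruit"), (2, "Carrot", "Vegetable")])

# Hypothetical change: the Fruit supplier is replaced. Only one row is updated.
conn.execute("UPDATE Category SET Supplier = ? WHERE Category = ?", ("NewCo", "Fruit"))

# Joining the two tables rebuilds the pre-3NF view.
rows = conn.execute("""
    SELECT o.OrderID, o.Product, o.Category, c.Supplier
    FROM Orders AS o
    JOIN Category AS c ON c.Category = o.Category
    ORDER BY o.OrderID
""").fetchall()
print(rows)
```

In the unnormalized table, the same change would have to be repeated on every order row in that category, which is where update anomalies come from.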
4. Boyce-Codd Normal Form (BCNF)
BCNF is a stricter version of 3NF. A table is in BCNF if:
- It is in 3NF.
- For every non-trivial functional dependency X → Y, the left-hand side X is a superkey (a set of attributes that contains a candidate key).
In simpler terms, BCNF addresses situations where a table is in 3NF but still has a dependency whose determinant is not a superkey.
Rules of BCNF:
- The table must be in 3NF.
- Every determinant of a non-trivial functional dependency must be a superkey (a candidate key or a superset of one).
Example of BCNF:
Before BCNF:
CourseID | Instructor | Room |
---|---|---|
101 | Dr. Smith | A1 |
102 | Dr. Smith | A1 |
101 | Dr. Johnson | A2 |
Here, the key is (CourseID, Instructor), and each instructor always teaches in the same room, so Instructor determines Room. Because Instructor alone is not a superkey, this dependency violates BCNF. To achieve BCNF, we separate the dependencies into different tables.
After BCNF:
Tables:
- Courses (CourseID, Instructor)
- Rooms (Instructor, Room)
Courses table:
CourseID | Instructor |
---|---|
101 | Dr. Smith |
102 | Dr. Smith |
101 | Dr. Johnson |
Rooms table:
Instructor | Room |
---|---|
Dr. Smith | A1 |
Dr. Johnson | A2 |
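The BCNF rule can be checked mechanically with the standard attribute-closure algorithm: a determinant is a superkey exactly when its closure contains every attribute of the table. The sketch below assumes the single dependency Instructor → Room from the example; the function names `closure` and `bcnf_violations` are chosen for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we also get the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != all_attrs
    ]


fds = [({"Instructor"}, {"Room"})]

# Original table: Instructor does not determine CourseID, so it is not a superkey.
violations = bcnf_violations({"CourseID", "Instructor", "Room"}, fds)
# Decomposed Rooms table: Instructor now determines every attribute.
rooms_violations = bcnf_violations({"Instructor", "Room"}, fds)
print(violations, rooms_violations)
```

The same check returns no violations for Courses (CourseID, Instructor), because no non-trivial dependency applies within that table.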
5. Fourth Normal Form (4NF)
4NF addresses multi-valued dependencies, which occur when one attribute determines a set of values of another attribute, and that set is independent of the table's remaining attributes. A table is in 4NF if:
- It is in BCNF.
- It has no non-trivial multi-valued dependencies, except those whose determinant is a superkey.
Example of 4NF:
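Before 4NF (Multi-valued Dependency):
StudentID | Subject | Hobby |
---|---|---|
1 | Math | Painting |
1 | Math | Cycling |
1 | Science | Painting |
1 | Science | Cycling |
Here, a student's subjects and hobbies are independent of each other, so every subject has to be paired with every hobby, and each fact is stored more than once.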
After 4NF:
Tables:
- Students (StudentID, Subject)
- StudentsHobbies (StudentID, Hobby)
Students table:
StudentID | Subject |
---|---|
1 | Math |
1 | Science |
StudentsHobbies table:
StudentID | Hobby |
---|---|
1 | Painting |
1 | Cycling |
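A short sketch of why the split helps: the two 4NF tables store each fact once, and joining them on StudentID yields every subject and hobby combination for a student. The helper `join_on_student` and the extra hobby `Chess` are hypothetical and used only for illustration.

```python
# Rows of the two 4NF tables
students = [(1, "Math"), (1, "Science")]
students_hobbies = [(1, "Painting"), (1, "Cycling")]


def join_on_student(subject_rows, hobby_rows):
    """Pair each subject with each hobby of the same student."""
    return sorted(
        (sid, subject, hobby)
        for sid, subject in subject_rows
        for hid, hobby in hobby_rows
        if sid == hid
    )


combined = join_on_student(students, students_hobbies)
print(combined)

# Hypothetical new hobby: one inserted row in the 4NF design...
more_hobbies = students_hobbies + [(1, "Chess")]
# ...corresponds to one extra row per subject in a single combined table.
print(len(join_on_student(students, more_hobbies)))
```

With the combined table, adding a hobby would require one new row for each of the student's subjects, and forgetting one of them would leave the data inconsistent.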
Conclusion
In database design, normalization is a fundamental process for organizing data efficiently. The normal forms covered here (1NF, 2NF, 3NF, BCNF, and 4NF) reduce redundancy, protect data integrity, and make data easier to manage. Each normal form builds on the previous one by eliminating a specific type of dependency or anomaly. While normalization improves data quality, it adds joins, so it must be balanced against performance; in read-heavy workloads, selective denormalization is sometimes the better choice.
Hi, I'm Abhay Singh Kathayat!
I am a full-stack developer with expertise in both front-end and back-end technologies. I work with a variety of programming languages and frameworks to build efficient, scalable, and user-friendly applications.