Pranav Bakare

Normalization in SQL

Normalization in SQL is a process used to organize a database into tables and columns in order to reduce redundancy and undesirable dependencies. The goal is to store data efficiently, minimize duplication, and preserve data integrity. The process is divided into normal forms (NF), each of which addresses specific issues with data structure and relationships.


Normalization

Normalization is the process of minimizing redundancy in a relation or set of relations. Redundancy in a relation can cause insertion, deletion, and update anomalies, so reducing it keeps the data consistent. Normal forms are the criteria used to eliminate or reduce this redundancy in database tables.
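As a quick sketch of those anomalies (the table, column, and data names below are made up for illustration): when the same fact is repeated across rows, a single change has to be applied everywhere, and missing even one row leaves the data inconsistent.

```sql
-- Hypothetical wide table: department_name is repeated on every employee row
CREATE TABLE employee_flat (
    employee_id     INT PRIMARY KEY,
    employee_name   VARCHAR(100),
    department_id   INT,
    department_name VARCHAR(100)
);

-- Update anomaly: renaming a department means updating many rows;
-- if any row is missed, the table now disagrees with itself
UPDATE employee_flat
SET department_name = 'Data Platform'
WHERE department_id = 10;
```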


1. First Normal Form (1NF)

A table is in 1NF if:

  • 1. All the values in a column are atomic (indivisible).
  • 2. Each entry in a column must contain only one value (no multiple values or sets).
  • 3. There should be no repeating groups of data (e.g., arrays or lists within a single column).
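A minimal sketch, with made-up table and column names: the first version packs several phone numbers into one column, violating 1NF; the decomposed version stores one atomic value per row.

```sql
-- Not in 1NF: phone_numbers holds a comma-separated list of values
CREATE TABLE student_contact (
    student_id    INT PRIMARY KEY,
    student_name  VARCHAR(100),
    phone_numbers VARCHAR(200)   -- e.g. '99990001, 99990002'
);

-- In 1NF: each phone number is a single atomic value in its own row
CREATE TABLE student (
    student_id   INT PRIMARY KEY,
    student_name VARCHAR(100)
);

CREATE TABLE student_phone (
    student_id   INT REFERENCES student (student_id),
    phone_number VARCHAR(20),
    PRIMARY KEY (student_id, phone_number)
);
```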

2. Second Normal Form (2NF)

A table is in 2NF if:

  • 1. It is already in 1NF.
  • 2. All non-key attributes are fully dependent on the entire primary key (no partial dependencies).
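A sketch of a partial dependency, again with hypothetical names: in order_item_bad the key is (order_id, product_id), but product_name depends only on product_id. Moving product_name into its own table removes the partial dependency.

```sql
-- Not in 2NF: product_name depends only on product_id,
-- which is just part of the composite primary key (partial dependency)
CREATE TABLE order_item_bad (
    order_id     INT,
    product_id   INT,
    product_name VARCHAR(100),
    quantity     INT,
    PRIMARY KEY (order_id, product_id)
);

-- In 2NF: product_name lives in a table keyed by product_id alone
CREATE TABLE product (
    product_id   INT PRIMARY KEY,
    product_name VARCHAR(100)
);

CREATE TABLE order_item (
    order_id   INT,
    product_id INT REFERENCES product (product_id),
    quantity   INT,
    PRIMARY KEY (order_id, product_id)
);
```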

3. Third Normal Form (3NF)

A table is in 3NF if:

  • 1. It is already in 2NF.
  • 2. There are no transitive dependencies, meaning no non-key attribute depends on another non-key attribute.
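A sketch of a transitive dependency (hypothetical names): department_name depends on department_id, which in turn depends on the key employee_id. Splitting out a department table breaks the chain.

```sql
-- Not in 3NF: employee_id -> department_id -> department_name
-- (department_name depends on a non-key attribute)
CREATE TABLE employee_bad (
    employee_id     INT PRIMARY KEY,
    employee_name   VARCHAR(100),
    department_id   INT,
    department_name VARCHAR(100)
);

-- In 3NF: department_name is stored once, keyed by department_id
CREATE TABLE department (
    department_id   INT PRIMARY KEY,
    department_name VARCHAR(100)
);

CREATE TABLE employee (
    employee_id   INT PRIMARY KEY,
    employee_name VARCHAR(100),
    department_id INT REFERENCES department (department_id)
);
```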

4. Boyce-Codd Normal Form (BCNF)

A table is in BCNF if:

  • 1. It is already in 3NF.
  • 2. Every determinant must be a candidate key. A determinant is any attribute (or set of attributes) on which some other attribute is functionally dependent.

BCNF is a stricter version of 3NF. It handles situations where a 3NF table still has anomalies due to a non-candidate key acting as a determinant.
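A textbook-style sketch with hypothetical names: assume each professor teaches exactly one subject, so professor determines subject, yet professor is not a candidate key of the enrollment table. Decomposing so that every determinant is a candidate key gives BCNF.

```sql
-- In 3NF but not BCNF: professor -> subject, and professor
-- is not a candidate key of this table
CREATE TABLE enrollment_bad (
    student_id INT,
    subject    VARCHAR(50),
    professor  VARCHAR(100),
    PRIMARY KEY (student_id, subject)
);

-- BCNF decomposition: in each table, every determinant is a candidate key
CREATE TABLE professor_subject (
    professor VARCHAR(100) PRIMARY KEY,   -- assumes one subject per professor
    subject   VARCHAR(50)
);

CREATE TABLE enrollment (
    student_id INT,
    professor  VARCHAR(100) REFERENCES professor_subject (professor),
    PRIMARY KEY (student_id, professor)
);
```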


Why Normalize?

Data Integrity: By eliminating redundancy, normalization reduces the risk of update, insertion, and deletion anomalies.

Efficiency: Normalized databases use storage more efficiently because each fact is stored only once.

Scalability: As the data grows, normalized structures make it easier to maintain and query large datasets.


Trade-offs:

Performance: Normalization often results in more tables, which can lead to more complex joins and potentially slower queries. In read-heavy systems, denormalization is sometimes applied to optimize read operations.

Complexity: Higher normal forms can result in many small tables, making queries more complex to write and maintain.

In most real-world cases, databases are normalized up to 3NF, with some denormalization applied for performance reasons in read-heavy systems.
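To make the performance trade-off concrete, reusing the hypothetical employee and department tables from the 3NF sketch: the normalized design needs a join for a common read, while a denormalized reporting copy (here called employee_report, also hypothetical) answers it from a single table at the cost of duplicated, harder-to-maintain data.

```sql
-- Normalized read: requires a join between employee and department
SELECT e.employee_name, d.department_name
FROM employee e
JOIN department d ON d.department_id = e.department_id;

-- Denormalized read: single table, no join, but department_name is
-- duplicated on every row and must be kept in sync by the application
SELECT employee_name, department_name
FROM employee_report;
```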
