DEV Community

Kanhaiya Singhal
Kanhaiya Singhal

Posted on

Normalization database

Normalization:--
The normalization rules are designed to prevent update anomalies and data inconsistencies.
Normalization is process used to improve the data integrity of the database by reducing or eliminating modification anomalies that can occur when the same fact is stored in multiple locations within the database.
In short form, the process of normalization attempts to reduce redundant data that causes unnecessary updates.
Normalization provides a way to improve performance after reducing locking contention and improves multiple-user concurrency because need fewer updates.
It becomes very easily to work with normalized database and reduces development costs.
By placing columns in the correct tables, it increases usability.
Normalization becomes database to less complex and easy to modify so increases extensibility.
First Normal Form (INF):-
It means the data is in an entity format.
• Every unit of data is represented within scalar attributes.
Every attribute must contain one unit of data, and each unit of data must fill one attribute.
• All data must be represented in unique attributes.
Each attribute must have a unique name and a unique purpose. An entity should have no repeating attributes.
• All data must be represented within unique tuples.
According to first normal form, each entity must have one primary key and reduces redundant or duplicate data.
For Example:--
Consider we have one table named “Customer” having lot of columns “Customer”, “PhoneNumber1”,”PhoneNumber2”,”PhoneNumber3”.
CustomerName PhoneNumber1 PhoneNumer2 PhoneNumber3
A 1 5 8
B 2 6 9
C 3 7 11

In this table there are one customer name column, multiple phone number columns which represent only customer name column. That means, data is inconstiten and duplicacy , one unit of data respresents multiple attribute so it violates first normal form.
CustomerID CustomerName

1 A
2 B
3 C
CustomerID PhoneNumber
1 1

1 5
1 8
2 2
2 6
2 9
Second Normal Forms:--+
It caused the dependency issue.
If the attribute depends on one of the primary key attributes but not the other is partial dependency, which violates the second normal form.
It says, attribute should not be partially dependent on the key.
For Examples:--
Look into following table structure:-
RegionalOfficeID RegionalOfficeName StoreName PhoneNumber
1 Southeast StoreOne 828-555-1212
2 Southeast StoreTwo 345-666-4567
3 North Store Three 890-678-5688

Table structure in second normal
RegionalOfficeID RegionalOfficeName
1 Southeast
2 North

RegionalOffice PhoneNumber
1 23423423
2 45645346

Third Normal Forms:-
Third Norma Form checks for transitive dependencies. A dependency is transient when one attribute is dependent on other attribute, which is dependent on the primary key.
Third normal form says, if one attribute depends on the key, but should not depend on the non-key attribute.
The second normal form is violated when an attribute depends on part of the key. The third normal form is violated when the attribute does depend on the key but also depends on another non-key attribute.
For Example:-
Without thirdnormalform:--
RegionalOfficeID RegionalOfficeName phone LMANAger dateofhire
With third normal form:--
Regionalofficeid regionalofficename Lmanager
Lmanaerid lmanaer dateofhire

Fourth Normal Forms:--
Fourth [4] and fifth [5] normal forms deal with multi-valued facts.
The multi-valued fact may correspond to a many-to-many relationship, as with employees and skills, or to a many-to-one relationship, as with the children of an employee (assuming only one parent is an employee).
By "many-to-many" we mean that an employee may have several skills, and a skill may belong to several employees.
In a sense, fourth and fifth normal forms are also about composite keys. These normal forms attempt to minimize the number of fields involved in a composite key, as suggested by the examples to follow.
Under fourth normal form, a record type should not contain two or more independent multi-valued facts about an entity. In addition, the record must satisfy third normal form.
Consider employees, skills, and languages, where an employee may have several skills and several languages. We have here two many-to-many relationships, one between employees and skills, and one between employees and languages.

Under fourth normal form, these two relationships should not be represented in a single record such as

| EMPLOYEE | SKILL | LANGUAGE |

Instead, they should be represented in the two records


| EMPLOYEE | SKILL | | EMPLOYEE | LANGUAGE |
==================== =======================
Other problems caused by violating fourth normal form are similar in spirit to those mentioned earlier for violations of second or third normal form. They take different variations depending on the chosen maintenance policy:
• If there are repetitions, then updates have to be done in multiple records, and they could become inconsistent.
• Insertion of a new skill may involve looking for a record with a blank skill, or inserting a new record with a possibly blank language, or inserting multiple records pairing the new skill with some or all of the languages.
• Deletion of a skill may involve blanking out the skill field in one or more records (perhaps with a check that this doesn't leave two records with the same language and a blank skill), or deleting one or more records, coupled with a check that the last mention of some language hasn't also been deleted.
Fifth Normal Form :--
Fifth Normal Form provides a way or method for designing complex relationships that involve multiple entities.
Fifth normal form deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy.
The following example illustrates a case in which the rule about agents, companies, and products is satisfied, and which clearly requires all three record types in the normalized form.

Any two of the record types taken alone will imply something untrue.

AGENT COMPANY PRODUCT
Smith Ford car
Smith Ford truck
Smith GM car
Smith GM truck
Jones Ford car
Jones Ford truck
Brown Ford car
Brown GM car
Brown Totota car
Brown Totota bus


| AGENT | COMPANY | | COMPANY | PRODUCT | | AGENT | PRODUCT |
|-------+---------| |---------+---------| |-------+---------|
| Smith | Ford | | Ford | car | | Smith | car | Fifth
| Smith | GM | | Ford | truck | | Smith | truck | Normal
| Jones | Ford | | GM | car | | Jones | car | Form
| Brown | Ford | | GM | truck | | Jones | truck |
| Brown | GM | | Toyota | car | | Brown | car |
| Brown | Toyota | | Toyota | bus | | Brown | bus |


Observe that:
• Jones sells cars and GM makes cars, but Jones does not represent GM.
• Brown represents Ford and Ford makes trucks, but Brown does not sell trucks.
• Brown represents Ford and Brown sells buses, but Ford does not make buses.

Fourth and fifth normal forms both deal with combinations of multivalued facts.
One difference is that the facts dealt with under fifth normal form are not independent, in the sense discussed earlier. Another difference is that, although fourth normal form can deal with more than two multivalued facts, it only recognizes them in pairwise groups.
Boyce-Codd Normal Form (BCNF):--
It occurs between third and fourth Normal Form.
It decides that in such a case every attribute must describe every candidate key. If attribute describes on the candidate keys but not another candidate key, the entity violates BCNF.

Top comments (0)