Introduction
So we have just been hired as a database administrator for a car dealership. Currently they operate with only one table, which is below. I have also included some entries that exist in this table.
Initial Database Schema
CustomerName
is the name of the customer (e.g. Alan Smith).
Purchase
is what the customer bought (e.g. Corolla).
Address
is the address of the customer (e.g. 2 college road).
Newsfeed
is what kind of marketing information the customer wishes to receive (e.g. a booklet on Toyota car offers).
Supplier
is the brand which supplies the car the customer purchases (e.g. Toyota)
SupplierEmail
is the contact email address for the Supplier
(e.g. toyota@toyota.inc)
Price
is the selling price of the car
Example Entries
1st Normalisation Form
For the database to be in the 1st Normalisation Form we must satisfy the following conditions:
- Each cell to be single valued
- Entries in the column(s) is/are to be of the same type
- Rows are to be uniquely identified
The current iteration of the table has many flaws. For example when you view the above sample data we cannot determine whether “Alan Smith” is the two different people or one person who bought two cars. We also have comma-separated values and this is very messy to deal with.
The first thing we will need to do is add or assign a column as the primary key and no field in the current instance of the database will satisfy the conditions of a primary key. So we will add a CustomerID field which will solve the person identification issue. Once we have made that change our table schema will look as follows:
And now when we look below at the data we now see that it was in fact two /different/ people named “Alan Smith” who purchased a car from us. As a result of this change we have also removed the issue of comma-separated values and each cell is now contains a single value.
2nd Normalisation Form
For the database to be in the 2nd Normalisation Form we must satisfy the following conditions:
- All attributes (columns that are non-key) must be dependent on the key
We must look at each column in the table and ask ourselves, is this dependent on the CustomerID
? When we take a closer look we can see that the CustomerName
, Address
and Newsfeed
are dependant. Just ask yourself, “Do I need the primary key to associate this attribute with the correct entry?”. If the answer is yes, then keep it in the table.
Now when your answer is no then you must separate this attribute out into another table. When we separate out our values we now have the following two tables:
Note: for clarity sake, from now on I will highlight the primary key in orange
The issue now is that we have lost our transactions! Well, for that we will want to create a junction (or associative) table. Now for anyone who doesn’t know what a junction table is don’t worry. It is just the name given to a table that maps two or more tables together by referencing the primary keys of each table. The entries are known as compound keys. See what it looks like below for more clarity.
3rd Normalisation Form
For the database to be in the 3rd Normalisation Form we must satisfy the following conditions:
- All fields (columns) can be determined only by the key in the table (and not any other column).
Now if we look at the table on the right (the purchases table) you will notice that the Supplier
and the SupplierEmail
will always be the same (i.e. the Toyota contact email will never be matched to the Renault supplier). As you add more sales of different cars you will notice that you would get duplicate data for these fields.
Now let’s say we got a new rep for Renault and she contacted us to say that we are to update the email address for Renault to their address. We would have to go through all the entries in the Purchases database and update the SupplierEmail
field. This is not acceptable to us! What we will do is pull out the Supplier
and SupplierEmail
and place it in a different table. We will then reference this table via the use of a foreign key. A foreign key is simply an entry that references the primary key of another table.
Note: the foreign key is in blue, which now references the new supplier table
We have removed the redundancy and can update the contact information in one place.
4th Normalisation Form
For the database to be in the 4th Normalisation Form we must satisfy the following conditions:
- No multi-valued dependencies
So a multi-valued dependency exists in a table when two attributes in the table are independent entries, but are both dependent on a third attribute.
This can be seen above, CustomerName
, Address
and Newsfeed
all depend on the CustomerID
. Now what happens when a customer wants to unsubscribe from receiving the monthly or quarterly brochures from their preferred car brand(s)? Do we place a /null/ in the Newsfeed
column? No, this would be inefficient and just a mess to maintain. Instead we will move the Newsfeed
attribute to a new table.
Final Database Schema
So in the end we arrive at this structure. These tables all adhere to the various conditions and will result in better data management and maintenance for you (the database administrator).
Thanks for taking the time to read this, let me know what thought of it! Or inform me of any changes I could make to the post.
Top comments (0)