Contents
Concept of Normalization
Normalization is a design technique to present a database in normal form for avoiding undesirable things such as repetition of information, inability to represent the information, loss of information, etc. Normalization improves the performance of the database by reducing redundancy. The essence of normalization is to split your data into several tables that will be connected to each other based on the data within them. Relational databases operate on tables of data. These tables must be carefully defined to obtain the advantage of the database approach. The process of determining the proper tables for the database is called normalization.
Normalization is also considered as the process of organizing the data in a database to reduce the redundancies, which includes creating tables and establishing the relationship between those tables using rules designed to protect the data and to make the database flexible. Let’s see the simple table.
Unnormalized Student Info |
Student-id |
Firstname |
Lastname |
Class |
Subject |
Marks |
Sec |
Roll |
Normalize table personal info |
Personal-id |
Firstname |
Lastname |
Class |
Sec |
Subject table |
Subject-id |
Subject |
Roll |
Marks table |
Subject-id |
Marks |
In the above tables, the first table is not in normalized form. Those we want to enter the marks we have to enter all the information so there is a chance of redundancy of data and it seems to be ineffective so the table is split into other three tables to make data-independent and the table is only dependent on keys. The above normalization helps us to make sure of following.
- Dependence between the data is identified.
- Redundancy in the database is minimized.
- The data model is making it more flexible, and easier to maintain.
Some Basic Rules of Normalization
The purpose of Normalization is to remove the various anomalies which are presented in the relation. Normalization is required so that,
- There should be no redundant data values in the relation.
- Each attribute must have a data value in a row.
- Each relation should be self Contained.
- When a value of an attribute is updated, the other data values should not be changed.
The following are the major advantages of the Normalization process
- Normalization reduces data redundancy i.e. data duplication.
- It improves the faster sorting/ indexing technique.
- It simplifies the table structure.
- It improves the performance of the database system.
- It avoids the loss of information.
Types of Normalization
There are different forms that are used to decomposite the complex records of the database into simple records without losing and deleting the original one. The most common normal forms are described below.
a. 1NF (First Normal Form)
When the table has no repeating group of data then it is said to be in first normal form. That means for each cell in the table (one row and one column), can be only one value. This value should be atomic in the sense that it can’t be decomposed into smaller pieces. In the first normal form, all the repeating information has to be removed into smaller pieces. In the first normal form, all the repeating information has to be removed.
The above table is not in normal form because most of the attributes are in repeated form therefore to make it in the first normal form we break the table in the following way.
b. 2NF (Second Normal Form)
The relation is said to be in second normal if it is in first normal form and each attribute is functionally dependent on the entire primary key. The Second Normal form is essential to remove data inconsistencies at the time of modification of the database. For this purpose, we need to split the table.
c. 3NF (Third Normal Form)
A table is said to be in 3NF when it is in second normal form and every non-key attribute is functionally dependent on just the primary key. The logic, analysis, and elements of designing for the third normal form (3NF) are similar to those used in deriving 2NF.