Illogically or inconsistently stored data can cause a number of problems. In a relational database, a logical and efficient design is just as critical. A poorly designed database may provide erroneous information, may be difficult to use, or may even ill to work properly.
and update irregularity, that generates the inconsistent data).
Basically, normalisation is the process of efficiently organising data in a database. There are two main objectives of the normalization process: eliminate redundant data (storing the same data in more than one table) and ensure data dependencies make sense (only storing related data in a table).
Both of these are valuable goals as they reduce the amount of space a database consumes and ensure that data is logically stored.
The process of designing a relational database includes making sure that a table contains only data directly related to the primary key, that each data field contains only one item of data, and that redundant
(duplicated and unnecessary) data is eliminated. The task of a database designer is to structure the data in a way that eliminates unnecessary duplication(s) and provides a rapid search path to all necessary information. This process of specifying and defining tables, keys, columns, and relationships in order
to create an efficient database is called normalization.
Normalisation is part of successful database design.
Without normalisation, database systems can be inaccurate, slow, and inefficient and they might not produce the data you expect.
We use the normalization process to design efficient and functional databases. By normalizing, we store data where it logically and uniquely belongs. The normalization process involves a few steps and each step is called a form. Forms range from the first normal form (1NF) to fifth normal form (5NF). There is also one higher level, called domain key normal form (DK/NF) and BCNF.
When normalising a database you should achieve four goals:
- Arranging data into logical groups such that each group describes a small part of the whole
- Minimizing the amount of duplicated data stored in a database
- Building a database in which you can access and manipulate the data quickly and efficiently without compromising the integrity of the data storage
- Organising the data such that, when you modify it, you make the changes in only one place
A properly normalised design allows you to:
- Use storage space efficiently
- Eliminate redundant data
- Reduce or eliminate inconsistent data
- Ease the database maintenance burden
- Repetition of information
- Inability to represent certain information
- Loss of information
- Difficulty to maintain information
- Eliminating insertion, update and delete anomalies
- Establishing functional dependencies
- Removing transitive dependencies
- Reducing non-key data redundancy
Un-normalised data
Un-normalised data have repeating groups, inconsistent data, delete and insert anomalies.The following table is un-normalized:
Un-normalised data | ||
Project | Color | Size |
P1 | Red | Large |
P1 | Green | Large |
P1 | Red | Medium |
P1 | Green | Medium |
P1 | Red | Small |
P1 | Green | Small |
P2 | Black | Large |
P2 | Black | Small |
1NF meets the following criteria:
- ELIMINATE REPEATING GROUPS (make a separate table for each set of related attributes and give each table a primary key).
- A table is in 1NF if there are no duplicated rows in the table.
- Each table contains all atomic data items, no repeating groups, and a designated primary key (no duplicated rows)
Project Table | ||
ProjectID | Project | Color |
1 | P1 | Red |
2 | P1 | Green |
3 | P2 | Black |
Detail Table | ||
DetailId | ProjectId | Size |
1 | 1 | Large |
2 | 2 | Large |
3 | 1 | Medium |
4 | 2 | Medium |
5 | 1 | Small |
6 | 2 | Small |
7 | 3 | Large |
8 | 3 | Small |
Second Normal Form (2NF)
ELIMINATE REDUNDANT DATA (if an attribute depends on only part of multi-valued key, remove it to a separate table) . A table is in 2NF if it met all database requirements for 1NF, and if each non-key attribute is fully functionally dependent on the whole primary key;
2NF meets the following criteria:
- Each table contains all atomic data items, no repeating groups, and a designated primary key (no duplicated rows)
- Each table has all non-primary key attributes fully functionally dependant on the whole primary key
Project Table | ||
ProjectID | Project | Color |
1 | P1 | Red |
2 | P1 | Green |
3 | P2 | Black |
Size Table | |
SizeID | Size |
1 | Large |
2 | Medium |
3 | Small |
Detail Table | ||
DetailId | ProjectId | SizeID |
1 | 1 | 1 |
2 | 2 | 1 |
3 | 1 | 2 |
4 | 2 | 2 |
5 | 1 | 3 |
6 | 2 | 3 |
7 | 3 | 1 |
8 | 3 | 3 |
Third Normal Form (3NF)
ELIMINATE COLUMNS NOT DEPENDANT ON KEY (if attributes do not contribute to a description of the key remove them to a separate table) . A table is in 3NF if it met all database requirements for both 1NF and 2NF, and if all transitive dependencies are eliminated (each column must depend directly on the primary key; all attributes that are not dependant upon the primary key must be eliminated (e.g. attributes that can be derived from data contained in other fields and tables must be removed)).
3NF meets the following criteria:
- Each table contains all-atomic data items, no repeating groups, and a designated primary key
- Each table has all non-primary key attributes fully functionally dependant on the whole primary key
- All transitive dependencies are removed from each table
Project Table | |
ProjectID | Project |
1 | P1 |
2 | P2 |
Color Table | |
ColorID | Color |
1 | Red |
2 | Green |
3 | Black |
Size Table | |
SizeID | Size |
1 | Large |
2 | Medium |
3 | Small |
Detail Table | |||
DetailId | ProjectId | ColorId | SizeId |
1 | 1 | 1 | 1 |
2 | 1 | 2 | 1 |
3 | 1 | 1 | 2 |
4 | 1 | 2 | 2 |
5 | 1 | 1 | 3 |
6 | 1 | 2 | 3 |
7 | 2 | 3 | 1 |
8 | 2 | 3 | 3 |
Fourth Normal Form (4NF)
ISOLATE INDEPENDENT MULTIPLE RELATIONSHIPS (no table may contain two or more 1:M or M:M relationships that are not directly related) . This applies to only 1:M and M:M relationships. To be in Fourth Normal Form (4NF).
Project Table | |
ProjectID | Project |
1 | P1 |
2 | P2 |
Color Table | |
ColorID | Color |
1 | Red |
2 | Green |
3 | Black |
Size Table | |
SizeID | Size |
1 | Large |
2 | Medium |
3 | Small |
DetailProjectColor | ||
DPCID | ProjectId | ColorId |
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 3 |
DetailProjectSize | ||
DPSID | DPCID | SizeId |
1 | 1 | 1 |
2 | 2 | 1 |
3 | 1 | 2 |
4 | 2 | 2 |
5 | 1 | 3 |
6 | 2 | 3 |
7 | 3 | 1 |
8 | 3 | 3 |
Fifth Normal Form (5NF)
The normal forms discussed so far required that the given relation R if not in the given normal form be decomposed in two relations to meet the requirements of the normal form. In some rare cases, a relation can have problems like redundant information and update anomalies because of it but cannot be decomposed in two relations to remove the problems. In such cases it may be possible to decompose the relation in three or more relations using the 5NF.
The fifth normal form deals with join-dependencies which is a generalisation of the MVD. The aim of fifth normal form is to have relations that cannot be decomposed further. A relation in 5NF cannot be constructed from several smaller relations.
A relation R is in 5NF (or project-join normal form, PJNF) if for all join dependencies at least one of the following holds.
(a) (R1, R2, ..., Rn) is a trivial join-dependency (that is, one of Ri is R) (b) Every Ri is a candidate key for R.
An example of 5NF can be provided by the example below.
Project Table | |
ProjectID | Project |
1 | P1 |
2 | P2 |
Color Table | |
ColorID | Color |
1 | Red |
2 | Green |
3 | Black |
Size Table | |
SizeID | Size |
1 | Large |
2 | Medium |
3 | Small |
DetailProjectColor | ||
DPCID | ProjectId | ColorId |
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 3 |
DetailProjectSize | ||
DPSID | ProjectId | SizeId |
1 | 1 | 1 |
2 | 1 | 2 |
3 | 1 | 3 |
4 | 2 | 1 |
5 | 2 | 3 |
Detail Table | ||
DetailId | DPCID | DPSID |
1 | 1 | 1 |
2 | 2 | 1 |
3 | 1 | 2 |
4 | 2 | 2 |
5 | 1 | 3 |
6 | 2 | 3 |
7 | 3 | 4 |
8 | 3 | 5 |
==============================
Maulik Bhatt |Senior QA Engineer
WebMingle Technology
Accelerated by knowledge. Driven by values.
www.webMingle.in
Please post about BCNF Normalization
ReplyDelete