Database normalization is a technique to organize the contents of databases. Without normalization, database systems can be inaccurate, slow, and inefficient. The community of database professionals developed a series of guidelines for the normalization of databases. Each ‘level’ of normalization is referred to as a form, and there are 5 forms, total. First normal form is the lowest level of normalization, up to fifth normal form, which is the highest level of normalization.
- First Normal Form (1NF): The most basic level of data normalization, first normal form requires the elimination of all duplicate columns in a table, and also requires the creation of separate tables for related data, and identification of each table with a primary key attribute.
- Second Normal Form (2NF): Meets all the requirements of first normal form, and creates relationships between tables using foreign keys.
- Third Normal Form (3NF): Meets all the requirements of second and first normal forms, and removes all columns that are not dependent upon the primary key. Third normal form also removes all derived attributes, such as age.
- Fourth Normal Form (4NF): Fourth normal form adds one additional requirement, which is the removal of any multi-valued dependencies in relationships.
- Fifth Normal Form (5NF): Fifth normal form is a rarer form of normalization, in which case join dependencies are implied by candidate keys (possibly primary key values).
In the reality of database development, getting to 3NF is the most important jump. 4NF and 5NF are a bit more of a luxury (and sometimes a nuisance) in database development, and are rarely seen in practice. If you’re struggling with the concepts, or remembering the first three forms, there is a simple relation. “The key, the whole key, and nothing but the key.”, which relates to 1NF, 2NF, and 3NF.