Database normalization is the process of organizing the attributes and tables of a relational database to minimize redundancy and dependency. There are several normal forms (1NF, 2NF, 3NF, BCNF, 4NF, 5NF), each with its own set of rules to achieve progressively higher levels of normalization. Let’s go through each one with examples:
-
First Normal Form (1NF):
- Rule: Eliminate repeating groups and ensure each column contains atomic values.
- Example: Consider a table
Studentswith columnsStudentID,Name, andSubjectswhereSubjectsis a comma-separated list. To normalize, create a separate tableStudent_Subjectswith columnsStudentIDandSubject.
-
Second Normal Form (2NF):
- Rule: Meet the requirements of 1NF and ensure all non-key attributes are fully functional dependent on the primary key.
- Example: Building on the previous example, if
Subjectsdepends only onStudentID, it should be moved to theStudent_Subjectstable.
-
Third Normal Form (3NF):
- Rule: Meet the requirements of 2NF and eliminate transitive dependencies.
- Example: If there’s a table
Bookswith columnsISBN,Title,Author, andAuthor_Address, whereAuthor_Addressdepends onAuthor, moveAuthor_Addressto a new tableAuthorswith columnsAuthorandAuthor_Address.
-
Boyce-Codd Normal Form (BCNF):
- Rule: Every determinant must be a candidate key.
- Example: If we have a table
Orderswith columnsOrderID,ProductID, andQuantity, whereProductIDdeterminesQuantity, it violates BCNF. To comply, separateProductIDandQuantityinto a new table whereProductIDis the primary key.
-
Fourth Normal Form (4NF):
- Rule: Eliminate multi-valued dependencies.
- Example: Consider a table
Employeeswith columnsEmployeeID,ProjectID, andResponsibilities, whereResponsibilitiesis multi-valued. To normalize, create a new tableEmployee_Projectswith columnsEmployeeID,ProjectID, andResponsibility.
-
Fifth Normal Form (5NF):
- Rule: Decompose tables to eliminate join dependencies.
- Example: If there’s a table
Employee_Projectswith columnsEmployeeID,ProjectID, andResponsibility, andResponsibilitydeterminesEmployeeID, split it into two tables:Employee_ResponsibilitiesandProject_Responsibilities.
In essence, normalization ensures efficient database design by reducing redundancy and dependency, leading to better data integrity and query performance. Each normal form builds upon the previous one, with the ultimate goal of eliminating data anomalies and improving overall database structure.