What is Normalization in DBMS?
What is Normalization?
- Normalization is the process of organizing the data in the database.
- Normalization is used to minimize the redundancy from a relation or set of relations. It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies.
Let’s break this down
what are Anomalies?
Anomalies in the context of databases are unexpected or undesirable issues that arise when a database is poorly designed or not properly normalized. These issues can lead to inconsistent, incorrect, or inefficient data handling.
what is Redundancy?
Redundancy refers to the unnecessary repetition of data within a database. For example, if a student’s address is stored in multiple rows of a table, any update to their address must be made in every occurrence. This wastes storage space and can lead to inconsistencies.
Consider an Example:
Imagine a table Student_Info
with the following structure:
Analysis:
In this case:
- Address (
123 Main St.
) is repeated three times for the same student (Student_ID = 101
). - This repetition consumes unnecessary storage because:
- The
Address
attribute occupies memory for every duplicate entry. - The same data is stored multiple times, even though it doesn’t add new information.
Solution with Normalization:
In a normalized design, you separate the address into its own table, like this:
Now:
- The
Address
is stored only once, saving storage space. - Updating
123 Main St.
requires only one change in theAddress
table, ensuring consistency and efficiency.
By reducing redundancy, you optimize storage and improve database performance.
What is Relation or Set of Relations?
- Relation: In the context of databases, a relation refers to a table. It contains rows (records) and columns (attributes).
- Set of Relations: A collection of interrelated tables in a database.
What are Undesirable Characteristics?
These are problems or issues that arise when a database is not well-structured. They include the following anomalies:
a. Insertion Anomaly
- Occurs when you cannot insert data into a database because some other data is missing.
- Example: In a table where a student’s course enrollment is stored, you might not be able to add a new course unless a student is already enrolled, even if the course exists independently. (detailed explanation for this example is given in next section, don’t worry:)
b. Update Anomaly
- Happens when updating data in one place requires multiple updates in other places, leading to inconsistencies.
- Example: If a lecturer’s name is stored in multiple rows, updating their name in one row but not in others creates inconsistency.
c. Deletion Anomaly
- Arises when deleting certain data unintentionally leads to the loss of other valuable data.
- Example: In a table that stores student enrollments, deleting the last enrollment for a student might also delete their contact information if stored in the same table.
Scenario: Employee and Department Data
Imagine a table called Employee_Department
that stores information about employees and the departments they work in:
╔═══════════════╦═════════════════╦══════════════════╗
║ Employee_ID ║ Employee_Name ║ Department_Name ║
╠═══════════════╬═════════════════╬══════════════════╣
║ 101 ║ Alice ║ HR ║
║ 102 ║ Bob ║ Finance ║
╚═══════════════╩═════════════════╩══════════════════╝
Problem: Adding a New Department
If the database design does not allow a department to exist without an associated employee, you cannot add a new department unless at least one employee is assigned to it.
For example:
- If you want to add a new department called “Marketing”, but no employee has been assigned to it yet, the current table structure won’t allow you to insert this data.
This is because the table is designed to store both employee and department information together, and it lacks the flexibility to store departments independently.
Solution: Normalization
To fix this, we can normalize the database by splitting the data into two separate tables:
1. Department Table
╔═══════════════╦══════════════════╗
║ Department_ID ║ Department_Name ║
╠═══════════════╬══════════════════╣
║ 1 ║ HR ║
║ 2 ║ Finance ║
║ 3 ║ Marketing ║
╚═══════════════╩══════════════════╝
2. Employee Table
╔═══════════════╦═════════════════╦══════════════════╗
║ Employee_ID ║ Employee_Name ║ Department_ID ║
╠═══════════════╬═════════════════╬══════════════════╣
║ 101 ║ Alice ║ 1 ║
║ 102 ║ Bob ║ 2 ║
╚═══════════════╩═════════════════╩══════════════════╝
Now, you can add the new department “Marketing” to the Department
table without requiring an employee to be assigned to it. This avoids the Insertion Anomaly.