- You should be familiar with 2NF
The Third Normal Form (3NF) is a database normalization form that builds on the Second Normal Form. The primary goal of 3NF is to eliminate transitive dependencies.
Key Concepts
- A
Transitive dependencyoccurs when one attribute in a database indirectly relies on another through a third attribute, causing redundancy. For example, if A depends on B (A -> B) and B depends on C (B -> C), then A is transitively dependent on C (A -> C). - A
Functional dependency(X -> Y) in a relationRis considered trivial if the set of attributesYis a subset of (or equal to) the set of attributesX. - A
Super keyis a candidate key or a superset of a candidate key.
The 3NF Rule
A relation is in 3NF if it is in 2NF and at least one of the following conditions holds in every non-trivial function dependency X -> Y:
Xis a super key.Yis a prime attribute.
Example
We have a table course_instructors with the following schema:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const courseInstructors = pgTable("course_instructors", {
course: varchar("course", { length: 255 }).notNull(),
semester: varchar("semester", { length: 255 }).notNull(),
instructor: varchar("instructor", { length: 255 }).notNull(),
instructorEmail: varchar("instructor_email", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.course, t.semester] })
]);Functional Dependencies
course, semester -> instructor.instructor -> instructor_email.course, semester -> instructor_email.
Candidate keys
Only the pair of attributes course and semester can uniquely identify each row in the table. Therefore, the candidate key is course, semester.
2NF Analysis
The table is in 2NF because all non-key attributes are fully functionally dependent on the entire candidate key (course, semester).
3NF Analysis
The table is not in 3NF because of the transitive dependency instructor -> instructor_email. The attribute instructor_email is dependent on instructor, which is not a super key.
This leads to:
- Redundancy: as the same instructor’s email can appear multiple times for different courses and semesters.
- Update anomalies: as changing an instructor’s email would require updating multiple rows.
3NF Decomposition
To eliminate the transitive dependency, we can decompose the table into two separate tables:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const instructors = pgTable("instructors", {
instructor: varchar("instructor", { length: 255 }).notNull().primaryKey(),
instructorEmail: varchar("instructor_email", { length: 255 }).notNull(),
});
export const courseInstructors = pgTable("course_instructors", {
course: varchar("course", { length: 255 }).notNull(),
semester: varchar("semester", { length: 255 }).notNull(),
instructor: varchar("instructor", { length: 255 }).notNull().references(() => instructors.instructor, { onDelete: "cascade", onUpdate: "cascade" }),
}, (t) => [
primaryKey({ columns: [t.course, t.semester] })
]);With this decomposition, we have eliminated the transitive dependency. The instructor_email is now stored in a separate table, and we have achieved 3NF.