You should also add foreign key constraints on the relevant columns in the agent_item_types, agent_suppliers, and supplier_item_types tables. These constraints should reference the primary key columns in the respective tables (or equivalent tables defining agents, suppliers, and item types).
- You should be familiar with 4NF
The Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJ/NF) is a level of database normalization where a table is already in 4NF, and every non-trivial join dependency in that table is implied by the candidate keys. The goal is to eliminate redundancy caused by join dependencies and to ensure that relations are decomposed into smaller components without any loss of data.
In practice, 5NF is rarely used in database design.
Key concepts
- A
Join Dependency(JD) is a constraint that specifies that tableRcan be split into several smaller tablesR1, R2,..., Rkand by performing a natural join on these tables, the original tableRcan be reconstructed without any loss of information and no false information (spurious rows) is created during the process. - A
Lossless/Non-Loss Decompositionis a decomposition when all the sub-relations do the natural join and the obtained table is equal to the original table. - A
Natural Joinis a join operation that is used to combine two relations based on all common attributes. - A JD denoted as
*(R1, R2, ..., Rk)on a relationRis considered trivial if at least one of the componentsRi(whereiis between1andk) is equal to the set of all attributes of the original relationR. In simple terms: A join dependency is trivial if one of the parts you’re joining already is the whole original table.
Example
We have a table agent_inventory with the following schema:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const agentInventory = pgTable("agent_inventory", {
agent: varchar("agent", { length: 255 }).notNull(),
supplier: varchar("supplier", { length: 255 }).notNull(),
itemType: varchar("item_type", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.agent, t.supplier, t.itemType] }),
]);- An
Agentcan be authorized to source products from multipleSuppliers. - An
Agentcan be authorized to offer multipleItem types. - A
Suppliercan manufacture multipleItem types. - The core business constraint: An
Agentmust offer a specificItem typefrom a specificSupplierif and only if the following three conditions are all true:- The
Agentis authorized to source products from thatSupplier. - The
Agentis authorized to offer thatItem type. - The
Supplieris known to manufacture thatItem type.
- The
Illustrative Consequence: An Agent has certain Suppliers and certain Item types in their repertoire. If supplier S1 and supplier S2 are in their repertoire, and item type I is in their repertoire, then (assuming supplier S1 and supplier S2 both manufacture item type I), the agent must offer items of item type I those manufactured by supplier S1 and those manufactured by supplier S2.
You cannot have all the component relationships true without the corresponding combined record existing in the agent_inventory table. It’s this enforced combination based on the component parts that leads to the Join Dependency.
Candidate Keys
agent, supplier, item_type is the only candidate key. All three attributes are required to uniquely identify a specific assignment row.
Join Dependency
*( {agent, supplier}, {agent, item_type}, {supplier, item_type} ) is a non-trivial join dependency. This means that the table can be decomposed into smaller tables without losing any information.
4NF Analysis
The table is in 4NF because there are no non-trivial multivalued dependencies. For example, Helen Fox sources from GearUp and ProServe, and she offers Webcam, Scanners and Docks. However, she doesn’t offer GearUp Docks, nor does she offer ProServe Scanners. The allowed combinations are specific and don’t show the independence required for MVDs based on single attributes.
5NF Analysis
The table is not in 5NF because it contains non-trivial join dependency. This join dependency is not implied by the candidate key agent, supplier, item_type because none of the components in the JD ({agent, supplier}, {agent, item_type}, {supplier, item_type}) are superkeys of the original agent_inventory table.
This leads to:
- Redundancy.
- Deletion Anomalies: Changing a single underlying fact (e.g., a supplier stops making an item type) may require updating multiple rows in the
agent_inventorytable. For instance, ifProServestopped makingDocks, rows for bothChris DayandHelen Foxwould need deletion. - Insertion Anomalies: If underlying facts change such that the rule dictates a new row should exist (e.g., if
Chris Daystarts handlingGearUpsuppliers, andGearUpmakesScanners, whichChris Dayalready handles, the rule implies (Chris Day,GearUp,Scanner) must be inserted), the single-table structure doesn’t automatically enforce this insertion based on the component facts.
5NF Decomposition
We need to split the table into three tables: agent_suppliers, agent_item_types, and supplier_item_types:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const agentSuppliers = pgTable("agent_suppliers", {
agent: varchar("agent", { length: 255 }).notNull(),
supplier: varchar("supplier", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.agent, t.supplier] }),
]);
export const agentItemTypes = pgTable("agent_item_types", {
agent: varchar("agent", { length: 255 }).notNull(),
itemType: varchar("item_type", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.agent, t.itemType] }),
]);
export const supplierItemTypes = pgTable("supplier_item_types", {
supplier: varchar("supplier", { length: 255 }).notNull(),
itemType: varchar("item_type", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.supplier, t.itemType] }),
]);To retrieve the data in a format equivalent to the original agent_inventory table we can execute this query:
const db = drizzle(...);
const results = await db
.select({
agent: agentSuppliers.agent,
supplier: agentSuppliers.supplier,
itemType: agentItemTypes.itemType,
})
.from(agentSuppliers)
.innerJoin(
agentItemTypes,
eq(agentSuppliers.agent, agentItemTypes.agent)
)
.innerJoin(
supplierItemTypes,
and(
eq(agentSuppliers.supplier, supplierItemTypes.supplier),
eq(agentItemTypes.itemType, supplierItemTypes.itemType)
)
);SELECT
"agent_suppliers"."agent",
"agent_suppliers"."supplier",
"agent_item_types"."item_type"
FROM
"agent_suppliers"
INNER JOIN
"agent_item_types"
ON "agent_suppliers"."agent" = "agent_item_types"."agent"
INNER JOIN
"supplier_item_types"
ON "agent_suppliers"."supplier" = "supplier_item_types"."supplier"
AND "agent_item_types"."item_type" = "supplier_item_types"."item_type";Decomposing the table into agent_suppliers, agent_item_types, and supplier_item_types achieves 5NF. Each table now represents a single fundamental relationship from the core business rule. This eliminates the specific join dependency that caused redundancy and update anomalies in the original table.