First, Second, and Third Normal Form
Functional Dependence
Column A depends on column B means each B value is related to at most one A value.
- Columns A and B may be simple or composite
- denoted as ‘A → B’
Functional dependence is when one column depends on another.
- reflects business rules
- e.g., each student receives one letter grade in a course
- indicates the Grade column depends on the composite column (StudentID, CourseCode)
Multivalued dependence and join dependence entail dependencies between three or more columns.
- are complex and uncommon
Normal Forms
Redundancy is the repetition of related values in a table.
- causes database management problems
- when related values are updated, all copies must be changed
- makes queries slow and complex
- if copies are not updated uniformly,
- copies become inconsistent and the correct version is uncertain
- when related values are updated, all copies must be changed
Normal forms are rules for designing tables with less redundancy.
- are numbered first through fifth
- Boyce-Codd is an improved version of third normal form
- six normal forms comprise a sequence
- with each successive normal form allowing less redundancy
Normal forms
- redundancy occurs when a dependence is on a column that is not unique
- Boyce-Codd normal form eliminates all dependencies on non-unique columns
- is the most important normal form
- Boyce-Codd normal form eliminates all dependencies on non-unique columns
- fourth normal form eliminates multivalued dependencies
- fifth normal form eliminate join dependencies
First Normal Form
- Every cell of a table contains exactly one value
A table is in first normal form (1NF) when the table has a primary key and every cell contains exactly one value.
- two corollaries:
- every non-key column depends on the primary key
- includes partial dependence
- each PK value is unique
- each non-key cell contains exactly one value (atomic)
- so each PK value is related to exactly one non-key value
- has no duplicate rows
- every row contains a different PK value and therefore every row is different
- every non-key column depends on the primary key
- in practice, databases allow tables with duplicate rows and no PK
- these are usually temporary tables
- when data is moved to a permanent table, duplicate rows are removed and a PK is created
Alternative definitions of first normal form
- first normal form is commonly defined in several ways:
- the table has a primary key
- every non-key column depends on the primary key
- the table cannot have duplicate rows
- every cell contains exactly one value
- first 3 are equivalent
- last definition is true of any relational table
- allows for duplicate rows and no primary key
Second Normal Form
A table is in second normal form (2NF) when all non-key columns depend on the whole primary key.
- non-key column cannot depend on part of a composite primary key
- table with a simple primary key is automatically in second normal form
- table in second normal form is also in first normal form by definition
Third Normal Form
- Redundancy can occur in a second normal form table when a non-key column depends on another non-key column
A table is in third normal form when all non-key columns depend on the key, the whole key, and nothing but the key.
- informal definition
- accurate when the primary key is the only unique column
- is also in first and second normal forms, by definition
- no transitive dependencies
A candidate key is a simple or composite column that is unique and minimal.
- minimal means all columns are necessary for uniqueness
- can contain
NULLvalues - table may have several candidate keys
- designer designates one candidate key as the primary key
A super key is an attribute or set of attributes used to uniquely identify all attributes in a relation.
- does not have to be minimal
- can contain extra columns/attributes
- can include redundant attributes/columns
- can contain
NULLvalues - is not necessarily a candidate key
- candidate key is a minimal super key
A non-key column is a column that is not contained in a candidate key.
A table is in third normal form if, whenever a non-key column A depends on column B, then B is unique.
- columns A and B may be simple or composite
- although B is unique, B is not necessarily minimal
- and therefore is not necessarily a candidate key
