Intuition behind 2NF in normalization of relational databases

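In 2NF, partial dependencies are not allowed: no non-prime attribute should depend on a proper subset of the primary key (a proper subset, because a dependency on the whole key would be a full functional dependency). Got it. But why? What is wrong with a partial dependency? What would break if we left it as is? I searched the internet but could not find any reference material. I have the same question about BCNF and 3NF.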

William Kent's "A Simple Guide to Five Normal Forms in Relational Database Theory" is a good source of information. This is how he describes the problem with partial dependencies.

Consider the following inventory record:

    ---------------------------------------------------
    | PART | WAREHOUSE | QUANTITY | WAREHOUSE-ADDRESS |
    ====================-------------------------------

The key here consists of the PART and WAREHOUSE fields together, but WAREHOUSE-ADDRESS is a fact about the WAREHOUSE alone. The basic problems with this design are:

  • The warehouse address is repeated in every record that refers to a part stored in that warehouse.
  • If the address of the warehouse changes, every record referring to a part stored in that warehouse must be updated.
  • Because of the redundancy, the data might become inconsistent, with different records showing different addresses for the same warehouse.
  • If at some point in time there are no parts stored in the warehouse, there may be no record in which to keep the warehouse's address.
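
The problems Kent lists can be reproduced in a few lines. The following is a minimal sketch using Python's built-in `sqlite3` module; the table names, column names, and sample rows are made up for illustration and are not part of Kent's paper. It triggers the update and deletion anomalies on the single-table design, then shows one possible 2NF decomposition in which the address is stored once per warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Single-table design: the key is (part, warehouse), but the address
# depends on warehouse alone (a partial dependency).
cur.execute("""
    CREATE TABLE inventory (
        part TEXT,
        warehouse TEXT,
        quantity INTEGER,
        warehouse_address TEXT,
        PRIMARY KEY (part, warehouse)
    )
""")
cur.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        ("bolt", "W1", 100, "1 Main St"),  # hypothetical sample data
        ("nut", "W1", 250, "1 Main St"),
        ("gear", "W2", 40, "9 Dock Rd"),
    ],
)

# Update anomaly: changing the address in only one row leaves W1
# with two different addresses.
cur.execute(
    "UPDATE inventory SET warehouse_address = '5 New Ave' "
    "WHERE part = 'bolt' AND warehouse = 'W1'"
)
w1_addresses = sorted(
    row[0]
    for row in cur.execute(
        "SELECT DISTINCT warehouse_address FROM inventory WHERE warehouse = 'W1'"
    )
)
print(w1_addresses)  # ['1 Main St', '5 New Ave']

# Deletion anomaly: removing the last part stored in W2 also removes
# the only record of W2's address.
cur.execute("DELETE FROM inventory WHERE warehouse = 'W2'")
w2_rows = cur.execute(
    "SELECT COUNT(*) FROM inventory WHERE warehouse = 'W2'"
).fetchone()[0]
print(w2_rows)  # 0

# One possible 2NF decomposition: the address lives in its own table,
# keyed by warehouse, so it is stored exactly once.
cur.execute("CREATE TABLE warehouse (warehouse TEXT PRIMARY KEY, address TEXT)")
cur.execute("""
    CREATE TABLE stock (
        part TEXT,
        warehouse TEXT REFERENCES warehouse (warehouse),
        quantity INTEGER,
        PRIMARY KEY (part, warehouse)
    )
""")
cur.execute("INSERT INTO warehouse VALUES ('W2', '9 Dock Rd')")

# W2 holds no parts, yet its address is still on record.
w2_address = cur.execute(
    "SELECT address FROM warehouse WHERE warehouse = 'W2'"
).fetchone()[0]
print(w2_address)  # 9 Dock Rd
```

In the decomposed schema an address change is a single-row update on `warehouse`, so the inconsistency shown in the first half cannot arise.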

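By the way, about "no non-prime attribute should depend on a subset of the primary key": what you should say is more like "no non-prime attribute should depend on a proper subset of any candidate key". Most articles and books on relational theory simplify their explanations by assuming there is only one candidate key. But the normal forms are defined in terms of every candidate key.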
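
To make the "any candidate key" wording concrete, here is a rough sketch of a 2NF check that looks at every candidate key rather than only a primary key. The helper names `closure` and `partial_dependencies` are hypothetical, the functional dependencies are supplied by hand, and only non-empty proper subsets of each key are examined, so treat it as an illustration of the definition rather than a complete normalization tool.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(candidate_keys, fds):
    """Find (key_subset, attribute) pairs that violate 2NF."""
    # Prime attributes are those that appear in at least one candidate key.
    prime = set().union(*candidate_keys)
    found = set()
    for key in candidate_keys:
        # Non-empty proper subsets of this candidate key.
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                for attr in closure(subset, fds) - prime:
                    found.add((subset, attr))
    return found


# Kent's inventory record, with its dependencies written out by hand.
fds = [
    (frozenset({"PART", "WAREHOUSE"}), frozenset({"QUANTITY"})),
    (frozenset({"WAREHOUSE"}), frozenset({"WAREHOUSE-ADDRESS"})),
]
keys = [frozenset({"PART", "WAREHOUSE"})]

violations = partial_dependencies(keys, fds)
print(violations)  # {(('WAREHOUSE',), 'WAREHOUSE-ADDRESS')}
```

If the relation had a second candidate key, it would simply be added to `keys`; its attributes would then count as prime, and its proper subsets would be checked as well.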