Determine dependencies between columns in a table

Question

I have a table in denormalized form, like this:

Col1               Col2      Col3            Col4 Col5 
Paris              France    Europe          1     4
Paris              France    Europe          2     5
Paris              France    Europe          3     6
Washington D.C.    USA       North America   8     9
Washington D.C.    USA       North America   7     7
... 
many more rows
...

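To normalize it, I need to understand the structure of the data.

Presumably, there is a logical dependency from Col3 to Col2 and from Col2 to Col1: Paris is the capital of France, and France is a country in Europe.

How can I prove this with an SQL query? Basically, I need to show that combinations such as "Paris - France - Europe" and "Washington D.C. - USA - North America" exist, but that combinations such as "Paris - USA - Europe" or "Washington D.C. - USA - Europe" never do. In fact, the query should also confirm the dependency if I find something like "Berlin - Germany - Africa" in my database, as long as I do not also find "Berlin - Germany - Europe".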

Answer 1

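You can't.

An SQL query can disprove a dependency, because you only need a single counterexample. Proving a dependency, however, means showing that it can never be violated, and the current database contents are only one sample.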
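As a rough sketch of what searching for a counterexample could look like, the query below self-joins the table and reports every Col2 value that appears with two different Col3 values. The table name t is borrowed from the other answer; the inline sample rows, including the "Berlin - Germany - Africa" row from the question, are assumptions made purely for illustration.

```sql
-- Hypothetical sample data standing in for the real table t.
WITH t (col1, col2, col3) AS (
    SELECT 'Paris',  'France',  'Europe' UNION ALL
    SELECT 'Berlin', 'Germany', 'Africa' UNION ALL
    SELECT 'Berlin', 'Germany', 'Europe'
)
-- Pair up rows that agree on col2 but disagree on col3.
-- The "<" comparison reports each conflicting pair only once.
SELECT DISTINCT a.col2, a.col3 AS col3_a, b.col3 AS col3_b
FROM t a
JOIN t b
  ON a.col2 = b.col2
 AND a.col3 < b.col3;
-- With the sample rows above this returns one row: Germany | Africa | Europe
```

An empty result only says that the current rows contain no counterexample; it does not show that one can never be inserted later.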

Answer 2

You can use aggregation:

select col3, count(*), count(distinct col2)
from t
group by col3;

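You would expect the second count, count(distinct col2), to be 1 in every row. To get all the examples with more than one value in col2, add having count(distinct col2) > 1.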
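Put together, the filtered version might look like the sketch below. The inline sample rows and the column aliases row_count and distinct_col2 are assumptions added for illustration, not part of the original table.

```sql
-- Hypothetical sample data standing in for the real table t.
WITH t (col1, col2, col3) AS (
    SELECT 'Paris',           'France',  'Europe'        UNION ALL
    SELECT 'Berlin',          'Germany', 'Europe'        UNION ALL
    SELECT 'Washington D.C.', 'USA',     'North America'
)
-- Keep only the col3 groups that contain more than one distinct col2.
SELECT col3,
       COUNT(*)             AS row_count,
       COUNT(DISTINCT col2) AS distinct_col2
FROM t
GROUP BY col3
HAVING COUNT(DISTINCT col2) > 1;
-- With the sample rows above this returns one row: Europe | 2 | 2
```

Grouped this way, the query tests whether col3 determines col2. If the dependency in the real data runs the other way (a country determines its continent), the roles would presumably need to be swapped: group by col2 and count distinct col3.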

Of course, cities do share names. For example, Paris is also a fairly well-known city in Texas.
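To see how a shared city name would surface in the data, a sketch like the following lists the full rows behind each ambiguous Col1 value so they can be inspected by hand. The sample rows, including the Paris row for the USA, are hypothetical.

```sql
-- Hypothetical sample data standing in for the real table t.
WITH t (col1, col2, col3) AS (
    SELECT 'Paris',           'France', 'Europe'        UNION ALL
    SELECT 'Paris',           'USA',    'North America' UNION ALL
    SELECT 'Washington D.C.', 'USA',    'North America'
)
-- Show every distinct row whose col1 maps to more than one col2.
SELECT DISTINCT col1, col2, col3
FROM t
WHERE col1 IN (
    SELECT col1
    FROM t
    GROUP BY col1
    HAVING COUNT(DISTINCT col2) > 1
)
ORDER BY col1, col2;
-- With the sample rows above this returns two rows:
--   Paris | France | Europe
--   Paris | USA    | North America
```

If rows like these turn up, Col1 alone does not determine Col2, and a wider determinant may be needed when normalizing.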


sql

database-normalization