Find the Cardinality of Natural Join

Question

|X| denotes the number of tuples in X.
Bold letters denote keys in a relation.

Consider the relations R(A, B) and S(A, C), where R has a foreign key on A that references S.
|R ✶ S| (where "✶" denotes natural join) is:
The options are:
1. |R|
2. |S|
3. |R|.|S|
4. max(|R|, |S|)
5. min(|R|, |S|)

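My understanding of the cardinality of a natural join is that if the two relations have no attributes in common, the natural join behaves like a cross product, and the cardinality is r * s. But I don't understand how key constraints play a role in determining the cardinality. Could someone explain?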
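To make that intuition concrete, here is a minimal Python sketch of a natural join over rows stored as dicts. The `natural_join` helper and the sample rows are hypothetical, written only for illustration. It pairs rows that agree on every attribute name the two relations share, so when nothing is shared every pair qualifies and the result has r * s rows.

```python
def natural_join(r, s):
    """Join two lists of dict rows on all attribute names they share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # shared attribute names
    return [
        {**x, **y}
        for x in r
        for y in s
        if all(x[a] == y[a] for a in common)
    ]

# No shared attributes: every pair of rows matches (cross product).
r = [{"A": 1}, {"A": 2}]
s = [{"C": "x"}, {"C": "y"}, {"C": "z"}]
print(len(natural_join(r, s)))  # 6 == len(r) * len(s)

# Shared attribute A: only rows that agree on A are paired.
r2 = [{"A": 1, "B": 10}, {"A": 2, "B": 20}]
s2 = [{"A": 1, "C": "x"}, {"A": 3, "C": "z"}]
print(len(natural_join(r2, s2)))  # 1
```

Once the relations share an attribute, the count depends on how many rows agree on it, which is where keys and foreign keys come in.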

Answer 1

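Assume the bold A in each schema indicates that it is a key, and assume the foreign key constraint holds, that is, the A value in every row of R does correspond to an A value in S:

Each row in R naturally joins on A to exactly one row in S.
There may be rows in S that do not join to any row in R (because no foreign key constraint enforces that direction).
So the cardinality of the joined relation is the cardinality of R, answer 1.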
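The argument above can be checked with a small sketch using Python's standard `sqlite3` module. The table names `r` and `s`, the column types, and the sample rows are assumptions made for illustration. A is declared as the key of both tables, and R.A references S.A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE s (a INTEGER PRIMARY KEY, c TEXT);
    CREATE TABLE r (a INTEGER PRIMARY KEY REFERENCES s(a), b TEXT);
    INSERT INTO s (a, c) VALUES (1, 'c1'), (2, 'c2'), (3, 'c3');
    INSERT INTO r (a, b) VALUES (1, 'b1'), (3, 'b3');
""")

def count(sql):
    # Run a COUNT(*) query and return the single integer it yields.
    return conn.execute(sql).fetchone()[0]

size_r = count("SELECT COUNT(*) FROM r")
size_s = count("SELECT COUNT(*) FROM s")
size_join = count("SELECT COUNT(*) FROM r NATURAL JOIN s")
print(size_r, size_s, size_join)  # 2 3 2
```

The row of `s` with a = 2 has no partner in `r`, so the join has |R| rows rather than |S|.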

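Does a schema like this have a practical use in real life? Consider S holding customer names in C, keyed by customer number in A. R holds dates of birth in B, also keyed by customer number in A. Every customer must have a name; indeed every customer (person) must have a d.o.b., but we don't need to record it unless/until they purchase an age-restricted item.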
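A rough sketch of that customer schema, again with Python's `sqlite3`. The table names `customer_name` and `customer_dob`, the column names, and the sample data are all made up for illustration. It shows the foreign key rejecting a date of birth for an unknown customer number, and the natural join listing only customers whose d.o.b. has been recorded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE customer_name (customer_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE customer_dob (
        customer_no INTEGER PRIMARY KEY REFERENCES customer_name(customer_no),
        dob TEXT NOT NULL
    );
    INSERT INTO customer_name VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO customer_dob VALUES (2, '1990-05-17');
""")

# A d.o.b. for an unknown customer number violates the foreign key.
try:
    conn.execute("INSERT INTO customer_dob VALUES (9, '2001-01-01')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute(
    "SELECT customer_no, name, dob FROM customer_name NATURAL JOIN customer_dob"
).fetchall()
print(rejected)  # True
print(rows)      # [(2, 'Bob', '1990-05-17')]
```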

Answer 2

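There is absolutely not enough information to answer this question. A "natural" join can return almost any number of rows between 0 and |R| * |S|. The following are examples.

This example returns 12: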

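create table s1 (id int primary key);
create table r1 (s1_id int references s1(id));

insert into s1 (id) values (1), (2), (3);

insert into r1 (s1_id) values (1), (2), (2), (3);

-- no shared column names, so this is a cross product: 3 * 4 = 12
select count(*) from r1 natural join s1;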

This example returns 0:

create table s2 (id int primary key, x int default 2);
create table r2 (s2_id int references s2(id), x int default 1);

insert into s2 (id) values (1), (2), (3);

insert into r2 (s2_id) values (1), (2), (2), (3);

-- the only shared column is x, and 1 never equals 2: 0 rows
select count(*) from r2 natural join s2;

This example returns 4:

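create table s3 (id int primary key, y int default 2);
create table r3 (id int references s3(id), x int default 1);

insert into s3 (id) values (1), (2), (3);

insert into r3 (id) values (1), (2), (2), (3);

-- the shared column is id, and each r3 row matches one s3 row: 4 rows
select count(*) from r3 natural join s3;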

In all of these, r has a foreign key relationship to s, and a "natural" join is being used. Here is a db<>fiddle.
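The three counts differ because a natural join matches on every column name the two tables share, whether or not the foreign key uses that column. The following sketch reruns the three examples under SQLite through Python's `sqlite3` module (an assumption; the thread does not say which database the fiddle uses) and prints the shared column names next to each count. The `columns` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table s1 (id int primary key);
    create table r1 (s1_id int references s1(id));
    insert into s1 (id) values (1), (2), (3);
    insert into r1 (s1_id) values (1), (2), (2), (3);

    create table s2 (id int primary key, x int default 2);
    create table r2 (s2_id int references s2(id), x int default 1);
    insert into s2 (id) values (1), (2), (3);
    insert into r2 (s2_id) values (1), (2), (2), (3);

    create table s3 (id int primary key, y int default 2);
    create table r3 (id int references s3(id), x int default 1);
    insert into s3 (id) values (1), (2), (3);
    insert into r3 (id) values (1), (2), (2), (3);
""")

def columns(table):
    # Column names from SQLite's table_info pragma (field 1 is the name).
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

results = {}
for n in (1, 2, 3):
    r, s = f"r{n}", f"s{n}"
    shared = sorted(columns(r) & columns(s))
    count = conn.execute(f"select count(*) from {r} natural join {s}").fetchone()[0]
    results[n] = (shared, count)
    print(r, s, shared, count)
# r1 s1 [] 12
# r2 s2 ['x'] 0
# r3 s3 ['id'] 4
```

Only in the third example does the join column coincide with the foreign key column, which is the situation the original question describes.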

Even if you assume that "A" is the primary key and there are no other columns, the number of rows still varies:

-- returns 4
create table s5 (id int primary key);
create table r5 (id int references s5(id));

insert into s5 (id) values (1), (2), (3);
insert into r5 (id) values (1), (1), (2), (2);

select count(*) from r5 natural join s5;

Versus:

-- returns 0
create table s4 (id int primary key);
create table r4 (id int references s4(id));

insert into s4 (id) values (1), (2), (3);

insert into r4 (id) values (NULL), (NULL), (NULL), (NULL);

select count(*) from r4 natural join s4;
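The zero in the last example comes from how SQL treats NULL: a NULL foreign key value does not violate the constraint, but NULL never compares equal to anything, so those rows find no join partner. A minimal check under SQLite via Python's `sqlite3` (assuming SQLite behaves like the database used in the fiddle on this point):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK so the inserts are really checked
conn.executescript("""
    create table s4 (id int primary key);
    create table r4 (id int references s4(id));
    insert into s4 (id) values (1), (2), (3);
    insert into r4 (id) values (NULL), (NULL), (NULL), (NULL);
""")

# The four NULL rows were accepted despite the foreign key.
null_rows = conn.execute("select count(*) from r4 where id is null").fetchone()[0]
# None of them joins, because the join condition r4.id = s4.id is never true for NULL.
joined = conn.execute("select count(*) from r4 natural join s4").fetchone()[0]
# NULL = NULL evaluates to NULL, which Python receives as None.
null_eq = conn.execute("select NULL = NULL").fetchone()[0]
print(null_rows, joined, null_eq)  # 4 0 None
```

If A is also a key of R, as the bold A in the question suggests, and that key is declared NOT NULL, this case cannot arise.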

natural-join

relational-algebra

cardinality