Multiple tables with the same clustered index: can I omit the PK index on the satellite tables?

Question

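I have several tables that store call data, and they all share the same clustered index: start_time (DATETIME). The base table is "calls", and I also have "calls_participants" and "calls_other_data". Every table also has a call_id CHAR(36) column that identifies a call, so it should of course be indexed.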

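I have to store a lot of rows (1 billion) and want to save as much space and maintenance cost as possible, so my idea is to index the call_id column on the base table only. The other tables would then have no index other than the CLUSTERED start_time index. If I then have to access a row in the calls_other_data table by call_id, I would write this: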

SELECT cod.some_column
FROM calls_other_data cod
WHERE cod.start_time = (SELECT start_time 
                        FROM calls 
                        WHERE call_id = '36-chars-unique-value')
  AND cod.call_id = '36-chars-unique-value'
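
For reference, the design this query relies on could be sketched roughly as follows. This is only an illustration under assumptions: the index names, the dbo schema, the type of some_column, and the uniqueness of calls.call_id are not stated above and are chosen here for the example.

```sql
-- Sketch of the proposed design. Index names, the dbo schema and the
-- type of some_column are hypothetical.
CREATE TABLE dbo.calls (
    start_time DATETIME NOT NULL,
    call_id    CHAR(36) NOT NULL
);
CREATE CLUSTERED INDEX CIX_calls_start_time
    ON dbo.calls (start_time);
-- The only call_id index in the whole design (uniqueness is assumed).
CREATE UNIQUE NONCLUSTERED INDEX UX_calls_call_id
    ON dbo.calls (call_id);

CREATE TABLE dbo.calls_other_data (
    start_time  DATETIME     NOT NULL,
    call_id     CHAR(36)     NOT NULL,
    some_column VARCHAR(100) NULL
);
-- The satellite table gets the clustered index and nothing else.
CREATE CLUSTERED INDEX CIX_calls_other_data_start_time
    ON dbo.calls_other_data (start_time);
```

calls_participants would follow the same pattern as calls_other_data.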

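I would say this query performs exactly the same as it would with an index on calls_other_data.call_id, because the calls.call_id index can be used in the same way: the start_time value is automatically included in it, so SQL Server has to perform the same steps: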

Seek the index on (either table).call_id to get the start_time
Seek the clustered index on calls_other_data.start_time
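
Whether the two access paths really cost the same can be checked rather than assumed. A minimal sketch, assuming the schema described above plus a hypothetical nonclustered index on calls_other_data.call_id for the second query; SET STATISTICS IO reports the logical reads of each statement:

```sql
SET STATISTICS IO ON;

-- Variant A: proposed design, call_id is indexed on dbo.calls only.
SELECT cod.some_column
FROM dbo.calls_other_data AS cod
WHERE cod.start_time = (SELECT c.start_time
                        FROM dbo.calls AS c
                        WHERE c.call_id = '36-chars-unique-value')
  AND cod.call_id = '36-chars-unique-value';

-- Variant B: conventional design. Assumes a hypothetical nonclustered
-- index on dbo.calls_other_data (call_id) has been created.
SELECT cod.some_column
FROM dbo.calls_other_data AS cod
WHERE cod.call_id = '36-chars-unique-value';

SET STATISTICS IO OFF;
```

One difference to watch for in the results: in Variant A the clustered seek lands on every row that shares the same start_time, and call_id is then checked as a residual predicate, so the cost grows with the number of calls that start at the same instant. In Variant B the nonclustered index entry carries the full clustering key, so the lookup goes to a single row.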

I have never read about a design like this and would like to hear what others think of it :) Do you know of any drawbacks?

Obviously, if a row is missing from the calls table, it will be hard to look it up in the other tables, but I don't mind that.

Thanks :)

Answer 1

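I see what you mean. calls_other_data would still contain the call_id and start_time columns, just like the calls table, but the calls_other_data.call_id column would not be indexed, because an index carries a storage cost. That seems to be your idea.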

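One thing to note here: since your clustered index is not unique on any of the tables, SQL Server will make it unique by adding some extra data called a uniqueifier (a hidden 4-byte value on rows whose key duplicates another row's). So you already have extra storage here that you may not have accounted for, which makes your attempt to "optimize" storage moot.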
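
One way to avoid the uniqueifier is to declare the clustered index unique on a key that really is unique. The following is only a sketch, not something the question proposes: it assumes calls_other_data holds at most one row per call, that the existing clustered index has been dropped first, and the index name is hypothetical.

```sql
-- Assumes (start_time, call_id) is unique in this table and that the
-- table has no clustered index at this point. Index name is hypothetical.
CREATE UNIQUE CLUSTERED INDEX CIX_calls_other_data_start_time_call_id
    ON dbo.calls_other_data (start_time, call_id);
```

With this key, a query that filters on both start_time and call_id can seek straight to the row. The trade-off is a wider clustering key, which is repeated in the upper levels of the clustered index and in every nonclustered index on the table. It would not work as written for calls_participants, where one call has several rows.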

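I would advise against this approach. Storage is cheap, unique indexes help the optimizer a great deal, and indexing foreign key columns (or foreign-key-like columns, if you don't actually have any referential integrity) is a good rule of thumb.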
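
Applied to the tables in the question, that rule of thumb might look like the sketch below. The index and constraint names and the dbo schema are assumptions, and the foreign key is optional; it only works if calls.call_id is covered by a unique index or constraint.

```sql
-- Index the foreign-key-like column on each satellite table.
-- Index names are hypothetical.
CREATE NONCLUSTERED INDEX IX_calls_other_data_call_id
    ON dbo.calls_other_data (call_id);
CREATE NONCLUSTERED INDEX IX_calls_participants_call_id
    ON dbo.calls_participants (call_id);

-- Optional: enforce referential integrity. Requires a unique index or
-- a PRIMARY KEY / UNIQUE constraint on dbo.calls (call_id).
ALTER TABLE dbo.calls_other_data
    ADD CONSTRAINT FK_calls_other_data_calls
    FOREIGN KEY (call_id) REFERENCES dbo.calls (call_id);
```

With these indexes in place, a lookup by call_id no longer needs the subquery against calls.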


sql

sql-server

database-design