Fewer rows versus fewer columns

Question

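I am currently modeling a table schema for PostgreSQL that has a lot of columns and is intended to hold a lot of rows. I don't know whether it is faster to have more columns or to split the data into more rows.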

The schema looks like this (shortened):

CREATE TABLE child_table (
  PRIMARY KEY(id, position),
  id bigint REFERENCES parent_table(id) ON DELETE CASCADE,
  position integer,
  account_id bigint REFERENCES accounts(account_id) ON DELETE CASCADE,
  attribute_1 integer,
  attribute_2 integer,
  attribute_3 integer,
  -- about 60 more columns
);

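Exactly 10 rows of child_table are at maximum related to one row of parent_table. The order is given by the value in position, which ranges from 1 to 10. parent_table is intended to hold 650 million rows. With this schema I would end up with 6.5 billion rows in child_table.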

Is it smart to do so? Or is it better to model it this way, so that I only have 650 million rows:

CREATE TABLE child_table (
  PRIMARY KEY(id),
  id bigint,
  parent_id bigint REFERENCES other_table(id) ON DELETE CASCADE,
  account_id_1 bigint REFERENCES accounts(account_id) ON DELETE CASCADE,
  attribute_1_1 integer,
  attribute_1_2 integer,
  attribute_1_3 integer,
  account_id_2 bigint REFERENCES accounts(account_id) ON DELETE CASCADE,
  attribute_2_1 integer,
  attribute_2_2 integer,
  attribute_2_3 integer,
  -- [...]
);

Answer 1

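The number of columns and rows matters less than how well they are indexed. Indexes drastically reduce the number of rows that need to be searched. In a well-indexed table, the total number of rows is irrelevant. If you try to squash 10 rows into one row, you will make indexing much harder. It will also make it harder to write efficient queries that use those indexes.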
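As a rough sketch of the indexing point, assume that lookups by account_id are common (the question does not say which queries matter). The index names and the literal 42 are made up for illustration, and the two halves refer to the two alternative schemas from the question, so they are not meant to run against the same table:

```sql
-- Narrow layout (first schema in the question):
-- a single index covers the account of every position.
CREATE INDEX child_table_account_id_idx ON child_table (account_id);

SELECT id, position, attribute_1
FROM child_table
WHERE account_id = 42;

-- Wide layout (second schema in the question):
-- every account_id_N column needs its own index to be searchable.
CREATE INDEX child_table_account_id_1_idx ON child_table (account_id_1);
CREATE INDEX child_table_account_id_2_idx ON child_table (account_id_2);
-- ... and so on for the remaining account_id_N columns

-- The same lookup now has to test each of those columns in turn.
SELECT id
FROM child_table
WHERE account_id_1 = 42
   OR account_id_2 = 42
   -- OR account_id_3 = 42, and so on for the remaining columns
;
```

Running EXPLAIN on each query against realistic data would show whether the planner actually uses these indexes.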

Postgres has many different types of indexes to cover many different types of data and searches. You can even write your own (though that shouldn't be necessary).
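For illustration only, here are a few of the built-in index access methods applied to the first schema from the question. The index names are invented, and whether any of them helps depends on queries the question does not describe:

```sql
-- B-tree is the default and handles equality and range comparisons.
CREATE INDEX child_table_attr1_btree ON child_table (attribute_1);

-- Hash indexes handle equality comparisons only.
CREATE INDEX child_table_attr2_hash ON child_table USING hash (attribute_2);

-- BRIN indexes are very small and can work well when a column's values
-- correlate with the physical order of the rows, e.g. ids inserted in order.
CREATE INDEX child_table_id_brin ON child_table USING brin (id);
```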

Exactly 10 rows of child_table are at maximum related to one row of parent_table.

Avoid encoding business logic in your schema. Business logic changes all the time, especially arbitrary numbers like 10.

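One thing you might consider is reducing the number of attribute columns. 60 is a lot, especially if they are actually named attribute_1, attribute_2, and so on. Instead, if your attributes are not well defined, store them as a single JSON column with keys and values. Postgres's JSON operations are very efficient (provided you use the jsonb type) and offer a nice middle ground between a key/value store and a relational database.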
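A minimal sketch of what that could look like, as a variant of the first schema from the question. The column name attributes and the key "color" are assumptions made up for this example, not something the question specifies:

```sql
-- Hypothetical variant: the loosely defined attribute columns
-- are replaced by a single jsonb column.
CREATE TABLE child_table (
  id bigint REFERENCES parent_table(id) ON DELETE CASCADE,
  position integer,
  account_id bigint REFERENCES accounts(account_id) ON DELETE CASCADE,
  attributes jsonb NOT NULL DEFAULT '{}',
  PRIMARY KEY (id, position)
);

-- A GIN index supports containment (@>) and key-existence (?) searches.
CREATE INDEX child_table_attributes_gin ON child_table USING gin (attributes);

-- Find rows whose attributes contain a given key/value pair.
SELECT id, position
FROM child_table
WHERE attributes @> '{"color": 3}';

-- Extract a single value as text and cast it back to an integer.
SELECT (attributes ->> 'color')::integer AS color
FROM child_table
WHERE id = 1;
```

Attributes that are well defined and queried often are probably still better off as ordinary typed columns.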

Similarly, if any sets of attributes are simple lists (like address1, address2, address3), you can also consider using Postgres arrays.
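As a small sketch of the array idea, using the address example from above. The table name contacts, the column name address_lines, and the sample data are all hypothetical:

```sql
-- Hypothetical example: three address columns collapsed into one array.
CREATE TABLE contacts (
  id bigint PRIMARY KEY,
  address_lines text[] NOT NULL DEFAULT '{}'
);

INSERT INTO contacts (id, address_lines)
VALUES (1, ARRAY['12 Example Street', 'Unit 4', 'Springfield']);

-- Postgres arrays are 1-based: this returns the first address line.
SELECT address_lines[1] FROM contacts WHERE id = 1;

-- Find rows where any element of the array matches.
SELECT id FROM contacts WHERE 'Unit 4' = ANY (address_lines);
```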

Without specifics, I can't give better advice than that.

postgresql

postgresql-performance