How do I port a query with a GROUP BY clause to PostgreSQL?

Question

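I'm porting a simple expense database to Postgres and got stuck on a view that uses GROUP BY and multiple JOIN clauses. I think Postgres wants me to list all the tables in the GROUP BY clause.

The table definition is at the end. Note that the columns account_id, receiving_account_id and place_id may be NULL, and an operation can have 0 tags.

Original `CREATE` statement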

CREATE VIEW details AS SELECT
    op.id,
    op.name,
    c.name,
    CASE --amountsign
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN '+'
                ELSE '='
            END
        ELSE '-' 
    END || ' ' || printf("%.2f", op.amount) || ' zł' AS amount,
    CASE --account
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
            END
        ELSE ac.name
    END AS account,
    t.name AS type,
    CASE --date
        WHEN op.time IS NOT NULL THEN op.date || ' ' || op.time
        ELSE op.date
    END AS date,
    p.name AS place,
    GROUP_CONCAT(tag.name, ', ') AS tags
FROM operation op
LEFT JOIN category c ON op.category_id = c.id
LEFT JOIN type t ON op.type_id = t.id
LEFT JOIN account ac ON op.account_id = ac.id
LEFT JOIN account ac2 ON op.receiving_account_id = ac2.id
LEFT JOIN place p ON op.place_id = p.id
LEFT JOIN operation_tag ot ON op.id = ot.operation_id
LEFT JOIN tag ON ot.tag_id = tag.id
GROUP BY IFNULL (ot.operation_id, op.id)
ORDER BY date DESC

Current query in Postgres

I made some updates, and my current statement is:

BEGIN TRANSACTION;
CREATE VIEW details AS SELECT
    op.id,
    op.name,
    c.name,
    CASE --amountsign
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN '+'
                ELSE '='
            END
        ELSE '-' 
    END || ' ' || op.amount || ' zł' AS amount,
    CASE --account
        WHEN op.receiving_account_id IS NOT NULL THEN
            CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
            END
        ELSE ac.name
    END AS account,
    t.name AS type,
    CASE --date
        WHEN op.time IS NOT NULL THEN to_char(op.date, 'DD.MM.YY') || ' ' || op.time
        ELSE to_char(op.date, 'DD.MM.YY')
    END AS date,
    p.name AS place,
    STRING_AGG(tag.name, ', ') AS tags
FROM operation op
LEFT JOIN category c ON op.category_id = c.id
LEFT JOIN type t ON op.type_id = t.id
LEFT JOIN account ac ON op.account_id = ac.id
LEFT JOIN account ac2 ON op.receiving_account_id = ac2.id
LEFT JOIN place p ON op.place_id = p.id
LEFT JOIN operation_tag ot ON op.id = ot.operation_id
LEFT JOIN tag ON ot.tag_id = tag.id
GROUP BY COALESCE (ot.operation_id, op.id)
ORDER BY date DESC;
COMMIT;

With this I get Column 'x' must appear in GROUP BY clause errors. I kept adding the columns named in the errors until I had:

GROUP BY COALESCE(ot.operation_id, op.id), op.id, c.name, ac2.name, ac.name, t.name, p.name

When I add the p.name column, I get a Column 'p.name' is defined more than once error. How do I fix this?

Table definition

CREATE TABLE operation (
  id integer NOT NULL PRIMARY KEY,
  name character varying(64) NOT NULL,
  category_id integer NOT NULL,
  type_id integer NOT NULL,
  amount numeric(8,2) NOT NULL,
  date date NOT NULL,
  "time" time without time zone NOT NULL,
  place_id integer,
  account_id integer,
  receiving_account_id integer,
  CONSTRAINT categories_transactions FOREIGN KEY (category_id)
      REFERENCES category (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_accounts FOREIGN KEY (account_id)
      REFERENCES account (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_accounts_second FOREIGN KEY (receiving_account_id)
      REFERENCES account (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_places FOREIGN KEY (place_id)
      REFERENCES place (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT transactions_transaction_types FOREIGN KEY (type_id)
      REFERENCES type (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

Answer 1

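Most databases require you to group by every column that appears unaggregated in the select list. Unaggregated means not wrapped in an aggregate such as min, max or string_agg. So you need to group by op.id, op.name, c.name, op.receiving_account_id, and so on.

The reason for this requirement is that the database has to determine a single value per group. By adding a column to the group by clause, you confirm that every row in the group has the same value for it. For any other column, you have to say which value to use by wrapping it in an aggregate. The exception is MySQL (with the ONLY_FULL_GROUP_BY mode disabled), which simply picks an arbitrary value if you don't make a conscious choice.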
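To make the rule concrete, here is a minimal sketch for PostgreSQL. The temporary table demo_op and its rows are made up for illustration and are not part of the schema in the question.

```sql
-- Hypothetical demo table, not part of the original schema.
-- It deliberately has no primary key.
CREATE TEMP TABLE demo_op (
    id       integer,
    category text,
    amount   numeric(8,2)
);

INSERT INTO demo_op VALUES
    (1, 'food',  10.00),
    (2, 'food',   5.50),
    (3, 'rent', 900.00);

-- Rejected by Postgres: id is neither grouped nor aggregated,
-- so there is no single id value per category.
-- SELECT category, id, sum(amount) FROM demo_op GROUP BY category;

-- Accepted: every selected column is either listed in GROUP BY
-- or wrapped in an aggregate.
SELECT category
     , min(id)     AS first_id   -- explicit choice of which id to show
     , sum(amount) AS total
FROM   demo_op
GROUP  BY category
ORDER  BY category;
```

The second query returns one row per category, because the aggregate tells the database which id to report for each group.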

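If your group by only serves to build the tag list, you can move it into a subquery:

left join
        (
        select  ot.operation_id
        ,       string_agg(tag.name, ', ') as tags
        from    operation_tag ot
        join    tag
        on      tag.id = ot.tag_id
        group by
                ot.operation_id
        ) tl
on      tl.operation_id = op.id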

That way you avoid a very long group by in the outer query.

Answer 2

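As @Andomar already explained: most RDBMS require you to group by every column that appears unaggregated anywhere else in the query (including the SELECT list, but also the HAVING and ORDER BY clauses).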

PGError: ERROR: aggregates not allowed in WHERE clause on a AR query of an object and its has_many objects

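The SQL standard also defines that expressions in the GROUP BY clause cover functionally dependent expressions as well. Postgres (since version 9.1) implements this for primary keys: a PK column in GROUP BY covers all columns of the same table.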

PostgreSQL - GROUP BY clause
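A small sketch of that primary key rule, assuming PostgreSQL 9.1 or later. The tables demo_item and demo_item_tag are hypothetical and only stand in for operation and its tags.

```sql
-- Hypothetical demo tables, not part of the original schema.
CREATE TEMP TABLE demo_item (
    id   integer PRIMARY KEY,
    name text NOT NULL
);
CREATE TEMP TABLE demo_item_tag (
    item_id integer NOT NULL,
    tag     text    NOT NULL
);

INSERT INTO demo_item     VALUES (1, 'coffee'), (2, 'bus ticket');
INSERT INTO demo_item_tag VALUES (1, 'drink'), (1, 'daily');

-- i.name is not listed in GROUP BY, yet the query is accepted:
-- i.id is the primary key, so i.name is functionally dependent on it.
SELECT i.id
     , i.name
     , string_agg(t.tag, ', ') AS tags   -- NULL for items without tags
FROM   demo_item i
LEFT   JOIN demo_item_tag t ON t.item_id = i.id
GROUP  BY i.id
ORDER  BY i.id;
```

The rule only covers columns of the table that owns the primary key. Unaggregated columns from other joined tables still have to be listed in GROUP BY.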

So op.id covers the whole operation table, and this should work for your current query:

GROUP BY op.id, c.name, 5, t.name, p.name

5 is a positional reference to the SELECT list, which is also allowed in Postgres. It is just notational shorthand for repeating the long expression:

CASE
   WHEN op.receiving_account_id IS NOT NULL THEN
      CASE
         WHEN op.account_id IS NULL THEN ac2.name
         ELSE ac.name || ' -> ' || ac2.name
      END
   ELSE ac.name
END
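Positional references are easiest to see on a tiny query. The following is only a sketch; the VALUES list and the column name amount are invented for the example.

```sql
-- Hypothetical inline data supplied through a VALUES list.
SELECT CASE WHEN v.amount < 0 THEN 'expense' ELSE 'income' END AS kind
     , sum(v.amount) AS total
FROM  (VALUES (-10.00), (-5.50), (100.00)) AS v(amount)
GROUP  BY 1   -- shorthand for repeating the whole CASE expression
ORDER  BY 1;  -- positional references work in ORDER BY too
```

The number refers to the position in the SELECT list, so it has to be updated if columns are added or reordered in front of it.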

Concatenate multiple result rows of one column into one, group by another column
Select first row in each GROUP BY group?

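I gather from your names that you have an n:m relationship between operation and tag, implemented with operation_tag. None of the other joins seem to multiply rows, so it is more efficient to aggregate the tags separately, as @Andomar hinted, provided the logic is right.

This should work: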

SELECT op.id
     , op.name
     , c.name
     , CASE  -- amountsign
          WHEN op.receiving_account_id IS NOT NULL THEN
             CASE WHEN op.account_id IS NULL THEN '+' ELSE '=' END
          ELSE '-' 
       END || ' ' || op.amount || ' zł' AS amount
     , CASE  -- account
          WHEN op.receiving_account_id IS NOT NULL THEN
             CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
             END
          ELSE ac.name
       END AS account
     , t.name AS type
     , to_char(op.date, 'DD.MM.YY') || ' ' || op.time AS date  -- see below
     , p.name AS place
     , ot.tags
FROM   operation op
LEFT   JOIN category c   ON op.category_id = c.id
LEFT   JOIN type     t   ON op.type_id = t.id
LEFT   JOIN account  ac  ON op.account_id = ac.id
LEFT   JOIN account  ac2 ON op.receiving_account_id = ac2.id
LEFT   JOIN place    p   ON op.place_id = p.id
LEFT   JOIN (
   SELECT operation_id, string_agg(t.name, ', ') AS tags
   FROM   operation_tag ot
   LEFT   JOIN tag      t  ON t.id = ot.tag_id
   GROUP  BY 1
   ) ot ON op.id = ot.operation_id
ORDER  BY op.date DESC, op.time DESC;

Asides

You can replace:

CASE --date
   WHEN op.time IS NOT NULL THEN to_char(op.date, 'DD.MM.YY') || ' ' || op.time
   ELSE to_char(op.date, 'DD.MM.YY')
END AS date

with this shorter equivalent:

concat_ws(' ', to_char(op.date, 'DD.MM.YY'), op.time) AS date

But since both columns are defined NOT NULL, you can simplify further to:

to_char(op.date, 'DD.MM.YY') || ' ' || op.time AS date
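The reason concat_ws() can stand in for the CASE expression is that it skips NULL arguments, while || returns NULL as soon as one operand is NULL. A quick sketch with made-up values (the names d and t are placeholders for op.date and op.time):

```sql
-- Hypothetical sample rows; the second one has no time.
SELECT to_char(d, 'DD.MM.YY') || ' ' || t         AS with_pipes      -- NULL when t is NULL
     , concat_ws(' ', to_char(d, 'DD.MM.YY'), t)  AS with_concat_ws  -- date only when t is NULL
FROM  (VALUES (date '2013-05-01', time '14:30')
            , (date '2013-05-02', NULL)) AS v(d, t);
```

With NOT NULL columns the two expressions behave the same, which is why the plain || version is enough for the table as defined.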

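Be careful with your ORDER BY: you have at least one input column that is also named date. If you use the unqualified name, it refers to the output column, which is what you want (as clarified in the comments). Details: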

PostgreSQL: How to return rows with respect to a found row (relative results)?

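However, sorting by the text representation does not order the rows correctly along your timeline. Sort by the original values instead, as suggested in my query above.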
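To see the difference, compare the two orderings on a few invented dates. This is a sketch only; d is a placeholder for op.date.

```sql
-- Sorting by the formatted text compares strings character by character,
-- so '31.12.12' sorts above '15.01.13' although it is the oldest date.
SELECT to_char(d, 'DD.MM.YY') AS date_text
FROM  (VALUES (date '2012-12-31'), (date '2013-01-02'), (date '2013-01-15')) AS v(d)
ORDER  BY to_char(d, 'DD.MM.YY') DESC;

-- Sorting by the underlying date value gives the chronological order,
-- newest first, while still displaying the formatted text.
SELECT to_char(d, 'DD.MM.YY') AS date_text
FROM  (VALUES (date '2012-12-31'), (date '2013-01-02'), (date '2013-01-15')) AS v(d)
ORDER  BY d DESC;
```

In the view, qualifying the columns as op.date and op.time in ORDER BY has the same effect, since the qualified names can only refer to the input columns.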

Tags: sql, sqlite, postgresql, group-by, sql-view
