Multiple separate joins in PostgreSQL

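I'm using PostgreSQL 9.1 and I've run into a problem with multiple joins. I have two tables: one holds customers, the other holds the ways to communicate with them. It's a simple design: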

CREATE TABLE Customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE Communication (
    id_customer INTEGER NOT NULL REFERENCES Customers(id),
    type CHARACTER(1) NOT NULL,
    content TEXT NOT NULL
);

For simplicity, the Communication.type column holds a single letter: 'T' for a phone number or 'E' for an email address. Here are some sample values:

INSERT INTO Customers(id, name) VALUES (1, 'John'), (2, 'Patrick'), (3, 'Bill');

INSERT INTO Communication(id_customer, type, content) VALUES
    (1, 'T', '666 555 444'),
    (1, 'E', 'john@aa.com'),
    (1, 'E', 'doe@aa.com'),

    (2, 'T', '123456789'),
    (2, 'T', '987654321'),
    (2, 'T', '111111111'),
    (2, 'E', 'patrick@aa.com'),

    (3, 'T', '190'),
    (3, 'T', '90');

Now I'd like to list all customers together with all of their email addresses and phone numbers as PostgreSQL arrays. I started with the following query:

SELECT id, name, array_agg(phone.content), array_agg(email.content) FROM Customers
    LEFT JOIN Communication AS phone ON phone.type = 'T' AND phone.id_customer = id
    LEFT JOIN Communication AS email ON email.type = 'E' AND email.id_customer = id
    GROUP BY id;

The result is as follows:

 id |  name   |            array_agg            |                   array_agg
----+---------+---------------------------------+------------------------------------------------
  1 | John    | {"666 555 444","666 555 444"}   | {john@aa.com,doe@aa.com}
  2 | Patrick | {123456789,987654321,111111111} | {patrick@aa.com,patrick@aa.com,patrick@aa.com}
  3 | Bill    | {190,90}                        | {NULL,NULL}

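The main problem is that PostgreSQL, for some reason, seems to force both arrays to the same length and extends the shorter one when needed. As a result, it repeats the last element of the shorter array to match the length of the longer one. Why? And how can I prevent this?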

Here is a sample fiddle: http://sqlfiddle.com/#!15/28420/1

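Side note: I deliberately inserted one phone number containing spaces. If a string can be represented as a number (even though I explicitly declared the column as TEXT in the table definition), the result omits the quotes. Can I disable this behavior and always show the quotes? They're strings after all, and I have to parse the output properly afterwards. The simpler the better.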

Cheers

Move one of the aggregates into an outer query:

SELECT id, name, phones, array_agg(email.content)
FROM (  
    SELECT 
        id, name, array_agg(phone.content) phones
    FROM Customers
    LEFT JOIN Communication AS phone 
        ON phone.type = 'T' AND phone.id_customer = id
    GROUP BY 1
    ) phone
    LEFT JOIN Communication AS email 
        ON email.type = 'E' AND email.id_customer = id
GROUP BY 1, 2, 3;
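
For context on why the original query repeats values: each customer's phone rows are paired with each of that customer's email rows before aggregation, so array_agg sees one entry per joined row. A minimal sketch for inspecting those rows, assuming the tables and sample data from the question (the alias c and the column aliases phone and email are only illustrative):

-- List the joined rows without aggregating them.
-- Every phone row is combined with every email row of the same customer.
SELECT c.id, c.name, phone.content AS phone, email.content AS email
FROM Customers AS c
LEFT JOIN Communication AS phone
    ON phone.type = 'T' AND phone.id_customer = c.id
LEFT JOIN Communication AS email
    ON email.type = 'E' AND email.id_customer = c.id
ORDER BY c.id;

With the sample data this should give 2 rows for John (1 phone x 2 emails), 3 rows for Patrick (3 phones x 1 email) and 2 rows for Bill (2 phones, with NULL in the email column), which matches the array lengths shown in the question.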

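However, DISTINCT may be simpler (note that it also collapses values that are genuinely stored more than once for the same customer):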

SELECT 
    id, name, 
    array_agg(DISTINCT phone.content), 
    array_agg(DISTINCT email.content) 
FROM Customers
LEFT JOIN Communication AS phone 
    ON phone.type = 'T' AND phone.id_customer = id
LEFT JOIN Communication AS email 
    ON email.type = 'E' AND email.id_customer = id
GROUP BY id;

Both queries above work fine, but I like this one best (Postgres 9.4+):

SELECT 
    id, name, 
    array_agg(content) FILTER (WHERE type = 'T') phone,
    array_agg(content) FILTER (WHERE type = 'E') email
FROM Customers
LEFT JOIN Communication ON id_customer = id
GROUP BY id;

Note that you can define the order of the aggregated values, for example:

array_agg(content ORDER BY content) FILTER (WHERE type = 'T') phone
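
Since FILTER needs 9.4 and the question mentions 9.1, here is one possible alternative that avoids joining Communication twice: build each array with a correlated subquery and the ARRAY constructor. This is only a sketch under the question's schema; the alias c and the output names phones and emails are assumptions, not part of the original queries.

-- One subquery per array, so the phone and email rows never multiply each other.
SELECT
    c.id, c.name,
    ARRAY(SELECT content FROM Communication
          WHERE id_customer = c.id AND type = 'T'
          ORDER BY content) AS phones,
    ARRAY(SELECT content FROM Communication
          WHERE id_customer = c.id AND type = 'E'
          ORDER BY content) AS emails
FROM Customers AS c
ORDER BY c.id;

One difference from the aggregate versions: a customer without any email gets an empty array here rather than an array holding a NULL (as in the join-based queries) or a NULL value (as with FILTER), which may or may not be what the consuming code expects.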