Multiple separate joins in PostgreSQL
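I'm using PostgreSQL 9.1 and have run into a problem with multiple joins. I have two tables: one holds the customers, the other holds the ways to contact them. It's a simple design: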
CREATE TABLE Customers (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE Communication (
id_customer INTEGER NOT NULL REFERENCES Customers(id),
type CHARACTER(1) NOT NULL,
content TEXT NOT NULL
);
For simplicity, the Communication.type column holds a single letter: 'T' for a phone number or 'E' for an email address. Here are some sample values:
INSERT INTO Customers(id, name) VALUES (1, 'John'), (2, 'Patrick'), (3, 'Bill');
INSERT INTO Communication(id_customer, type, content) VALUES
(1, 'T', '666 555 444'),
(1, 'E', 'john@aa.com'),
(1, 'E', 'doe@aa.com'),
(2, 'T', '123456789'),
(2, 'T', '987654321'),
(2, 'T', '111111111'),
(2, 'E', 'patrick@aa.com'),
(3, 'T', '190'),
(3, 'T', '90');
Now I want to list all customers together with all of their email addresses and phone numbers as PostgreSQL arrays. I started with the following query:
SELECT id, name, array_agg(phone.content), array_agg(email.content) FROM Customers
LEFT JOIN Communication AS phone ON phone.type = 'T' AND phone.id_customer = id
LEFT JOIN Communication AS email ON email.type = 'E' AND email.id_customer = id
GROUP BY id;
The result is:
 id |  name   |            array_agg            |                   array_agg
----+---------+---------------------------------+------------------------------------------------
  1 | John    | {"666 555 444","666 555 444"}   | {john@aa.com,doe@aa.com}
  2 | Patrick | {123456789,987654321,111111111} | {patrick@aa.com,patrick@aa.com,patrick@aa.com}
  3 | Bill    | {190,90}                        | {NULL,NULL}
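The main problem is that PostgreSQL, for some reason, forces both arrays to have the same length and extends the shorter one when needed. As a result, it duplicates the last element of the shorter array to match the length of the longer one. Why? How can I prevent this?
Here is a sample fiddle: http://sqlfiddle.com/#!15/28420/1
Side note: I deliberately inserted one phone number that contains spaces. If a string can be represented as a number (even though I explicitly declared the column as TEXT in the table definition), the result omits the quotes. Can I disable this behavior and always show the quotes? They are strings after all, and I have to parse the result correctly later. The simpler, the better.
Cheers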
Move one of the aggregates to an outer query:
SELECT id, name, phones, array_agg(email.content)
FROM (
    SELECT
        id, name, array_agg(phone.content) phones
    FROM Customers
    LEFT JOIN Communication AS phone
        ON phone.type = 'T' AND phone.id_customer = id
    GROUP BY 1
) phone
LEFT JOIN Communication AS email
    ON email.type = 'E' AND email.id_customer = id
GROUP BY 1, 2, 3;
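For context on why splitting the aggregation helps: in the question's query, the two LEFT JOINs pair every phone row of a customer with every email row of that customer, so both array_agg calls see the same multiplied set of rows. A rough way to check this is to run the same joins without aggregation. The alias c and the column aliases below are only illustrative names, not part of the original query:

```sql
-- Same joins as in the question, but without GROUP BY / array_agg,
-- so the intermediate rows that feed the aggregates become visible.
SELECT c.id,
       c.name,
       phone.content AS phone,  -- illustrative column alias
       email.content AS email   -- illustrative column alias
FROM Customers AS c
LEFT JOIN Communication AS phone
    ON phone.type = 'T' AND phone.id_customer = c.id
LEFT JOIN Communication AS email
    ON email.type = 'E' AND email.id_customer = c.id
ORDER BY c.id;
```

With the sample data this should produce 2 rows for John (1 phone x 2 emails), 3 rows for Patrick (3 phones x 1 email) and 2 rows for Bill (2 phones, each paired with a NULL email), which matches the array lengths shown in the question.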
However, DISTINCT may be simpler:
SELECT
id, name,
array_agg(DISTINCT phone.content),
array_agg(DISTINCT email.content)
FROM Customers
LEFT JOIN Communication AS phone
ON phone.type = 'T' AND phone.id_customer = id
LEFT JOIN Communication AS email
ON email.type = 'E' AND email.id_customer = id
GROUP BY id;
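One caveat worth keeping in mind with the DISTINCT version: it removes every repeated value, including repeats that are actually stored in the table, not only those introduced by the joins. A small sketch with made-up inline data (the VALUES list and the names t, all_values and distinct_values are hypothetical) illustrates the difference:

```sql
-- Hypothetical data: the same value appears twice on purpose.
SELECT array_agg(content)          AS all_values,      -- keeps all three elements
       array_agg(DISTINCT content) AS distinct_values  -- keeps only two elements
FROM (VALUES ('123'), ('123'), ('456')) AS t(content);
```

If a customer can legitimately have the same contact value stored more than once and that must be preserved, the subquery form is probably the safer choice.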
Both of the queries above (the subquery version and the DISTINCT version) work fine, but I like this one best (Postgres 9.4+):
SELECT
id, name,
array_agg(content) FILTER (WHERE type = 'T') phone,
array_agg(content) FILTER (WHERE type = 'E') email
FROM Customers
LEFT JOIN Communication ON id_customer = id
GROUP BY id;
Note that you can define the order of the aggregated values, for example:
array_agg(content ORDER BY content) FILTER (WHERE type = 'T') phone
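Since the question mentions PostgreSQL 9.1, where FILTER is not available, another option that avoids joining the two contact types together is a pair of correlated subqueries wrapped in the ARRAY constructor. This is only a sketch under the question's schema; the aliases c and m and the column names phones and emails are illustrative:

```sql
-- Each ARRAY(...) subquery collects one contact type for the current
-- customer, so phone rows and email rows are never paired with each other.
SELECT c.id,
       c.name,
       ARRAY(SELECT m.content
             FROM Communication AS m
             WHERE m.id_customer = c.id AND m.type = 'T'
             ORDER BY m.content) AS phones,
       ARRAY(SELECT m.content
             FROM Communication AS m
             WHERE m.id_customer = c.id AND m.type = 'E'
             ORDER BY m.content) AS emails
FROM Customers AS c
ORDER BY c.id;
```

One difference from the join-based queries: for a customer with no rows of a given type (Bill's emails in the sample data), the ARRAY constructor yields an empty array {} rather than an array holding a NULL.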