Aggregate the same column in multiple different ways
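I'm trying to get an array of the categories associated with each product and, in another column, each product's top-level parent category. By my logic, that means looking at the same values as the category array, but selecting only where parent_id is NULL.
This should pull back just one value, and one record per ID.
I really don't know the best way to structure this query. What I have partly works, but it also shows NULL in the parent category column for categories that have a parent ID, and it produces a second record for each product because I'm forced to put that column in the GROUP BY. Basically, I don't think I'm doing this in the right or most efficient way.
Desired result: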
+----+----------------+------------------+------------------------------------------------+------------------+
| id | name | category_ids | category_names | parent_category |
+----+----------------+------------------+------------------------------------------------+------------------+
| 1 | Product Name 1 | {111,222,333} | {Electronics, computers, computer accessories} | Electronics |
+----+----------------+------------------+------------------------------------------------+------------------+
My current query (not ideal):
select p.id,
p.name,
array_agg(category_id) as category_ids,
regexp_replace(array_agg(c.name)::text,'"|''','','gi') as category_names,
c1.name as parent_category
from products p
join product_categorizations pc on pc.product_id = p.id
join categories c on pc.category_id = c.id
full outer join (
select name, id from categories
where parent_id is null and name is not null
) c1 on c.id = c1.id
group by 1,2,5;
+----+----------------+------------------+-----------------------------------+------------------+
| id | name | category_ids | category_names | parent_category |
+----+----------------+------------------+-----------------------------------+------------------+
| 1 | Product Name 1 | {111} | {Electronics} | Electronics |
+----+----------------+------------------+-----------------------------------+------------------+
| 1 | Product Name 1 | {222,333} | {computers, computer accessories} | NULL |
+----+----------------+------------------+-----------------------------------+------------------+
Replace the FULL JOIN with an aggregate FILTER clause:
SELECT p.id
, p.name
, array_agg(pc.category_id) AS category_ids
, string_agg(c.name, ', ') AS category_names -- regexp_replace .. ?
, min(c.name) FILTER (WHERE c.parent_id IS NULL) AS parent_category
FROM products p
JOIN product_categorizations pc ON pc.product_id = p.id
JOIN categories c ON pc.category_id = c.id
GROUP BY p.id;
See:
- Aggregate columns with additional (distinct) filters
(Why add AND name IS NOT NULL? min() ignores NULL values anyway.)
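To illustrate both points, here is a small self-contained check. The three category rows are made up to match the sample output in the question, and which category is the parent of which is an assumption. It also shows a CASE expression as an equivalent for PostgreSQL versions before 9.4, which lack the aggregate FILTER clause.

```sql
-- Hypothetical sample rows; the parent/child links are assumed.
WITH c(id, name, parent_id) AS (
   VALUES (111, 'Electronics'         , NULL::int)
        , (222, 'computers'           , 111)
        , (333, 'computer accessories', 222)
   )
SELECT min(name) FILTER (WHERE parent_id IS NULL)        AS with_filter  -- Postgres 9.4+
     , min(CASE WHEN parent_id IS NULL THEN name END)    AS with_case    -- older versions
FROM   c;
```

Both columns should come back as Electronics. In the CASE variant, rows with a parent evaluate to NULL, and min() skips them, which is the same reason the extra name IS NOT NULL condition is not needed.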
While aggregating all products, and with referential integrity enforced, this should be a bit faster:
SELECT p.name, pc.*
FROM products p
JOIN (
SELECT pc.product_id AS id
, array_agg(pc.category_id) AS category_ids
, string_agg(c.name, ', ') AS category_names
, min(c.name) FILTER (WHERE c.parent_id IS NULL) AS parent_category
FROM product_categorizations pc
JOIN categories c ON pc.category_id = c.id
GROUP BY 1
) pc USING (id);
The point is that products is joined only after aggregating rows.
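For anyone who wants to try this, below is a minimal sketch that sets up throwaway tables and runs the aggregate-first query. Table and column names come from the queries above; the data types, constraints, and the parent/child links between the sample categories are assumptions. The ORDER BY inside the aggregates is an addition, only there to make the element order deterministic.

```sql
-- Hypothetical minimal schema; types and constraints are assumptions.
-- TEMP tables hide same-named permanent tables for this session only.
CREATE TEMP TABLE products (
  id   int PRIMARY KEY
, name text NOT NULL
);

CREATE TEMP TABLE categories (
  id        int PRIMARY KEY
, name      text
, parent_id int REFERENCES categories(id)  -- NULL marks a top-level category
);

CREATE TEMP TABLE product_categorizations (
  product_id  int NOT NULL REFERENCES products(id)
, category_id int NOT NULL REFERENCES categories(id)
, PRIMARY KEY (product_id, category_id)
);

INSERT INTO products VALUES (1, 'Product Name 1');
INSERT INTO categories VALUES
  (111, 'Electronics'         , NULL)
, (222, 'computers'           , 111)
, (333, 'computer accessories', 222);
INSERT INTO product_categorizations VALUES (1, 111), (1, 222), (1, 333);

-- Aggregate per product first, then join to products.
SELECT p.name, pc.*
FROM   products p
JOIN  (
   SELECT pc.product_id AS id
        , array_agg(pc.category_id ORDER BY pc.category_id) AS category_ids
        , string_agg(c.name, ', ' ORDER BY c.id)            AS category_names
        , min(c.name) FILTER (WHERE c.parent_id IS NULL)    AS parent_category
   FROM   product_categorizations pc
   JOIN   categories c ON pc.category_id = c.id
   GROUP  BY 1
   ) pc USING (id);
```

With this sample data the query should return a single row for product 1, with category_ids {111,222,333} and parent_category Electronics. Note that the inner JOIN drops products that have no categories at all; a LEFT JOIN to the subquery would keep them, with NULL in the aggregated columns.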
Aside: "name" is not a very useful column name. Related:
- How to implement a many-to-many relationship in PostgreSQL?