Given a set of IDs, return the subset of orders with only those IDs

Question

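Given a set of product_ids, which order_ids contain only those product_ids?

For the example below, I only want order_ids whose products are some combination of (a, b, c). I have 2 tables, as shown below:

"transactions" table: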

order_id | product_id |
---------+-------------
    1    |    a       |
    1    |    b       |
    2    |    a       |
    2    |    X       |
    3    |    a       |
    3    |    b       |
    3    |    c       |
    ...  |    ...     |
    999  |    Y       |

"products" table:

product_id |
------------
     a     |
     b     |
     c     |
     d     |
     X     |
     Y     |
     ...   |
     ZZZ   |

The desired output has 2 order_ids, with this expected table output:

order_id |
----------
    1    |
    3    |

Note that although order_id == 2 has product_id == a, it should be removed because it also has product_id == X.

So it is not as simple as:

SELECT DISTINCT order_id
FROM transactions
WHERE product_id IN ('a', 'b', 'c')

Answer 1

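We need to define the opposite of your requirement and filter on that. So: which orders have at least one transaction that is not in a, b, c? We count such transactions within each order group, filter out the orders where COUNT > 0, and return only the orders where COUNT = 0.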

SELECT order_id
FROM transactions
GROUP BY order_id
-- COUNT ignores the NULL that CASE returns for allowed products
HAVING COUNT(CASE WHEN product_id NOT IN ('a', 'b', 'c') THEN 1 END) = 0
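
To try this against the sample rows from the question, here is a minimal self-contained sketch. It assumes PostgreSQL and that product_id is a text column; the inline VALUES list only stands in for the real transactions table.

```sql
-- Sample rows from the question, inlined as a CTE (assumption: product_id is text).
WITH transactions (order_id, product_id) AS (
   VALUES (1, 'a'), (1, 'b')
        , (2, 'a'), (2, 'X')
        , (3, 'a'), (3, 'b'), (3, 'c')
        , (999, 'Y')
)
SELECT order_id
FROM   transactions
GROUP  BY order_id
-- CASE yields NULL for allowed products and COUNT ignores NULLs,
-- so the count is the number of disallowed products per order.
HAVING COUNT(CASE WHEN product_id NOT IN ('a', 'b', 'c') THEN 1 END) = 0
ORDER  BY order_id;
-- returns order_id 1 and 3
```

Orders 2 and 999 drop out because each has at least one product outside the list.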

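If you have a, b, c as a list of products in another table, and you want to filter on that instead of hard-coding the values into the query, it gets slightly more complicated: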

SELECT order_id
FROM transactions AS t
LEFT JOIN listOfProducts AS l ON l.product_id = t.product_id
GROUP BY order_id
HAVING COUNT(CASE WHEN l.product_id IS NULL THEN 1 END) = 0
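
The join variant can be checked the same way. This sketch assumes PostgreSQL and text product_ids; both CTEs are illustrative stand-ins for the real tables, and listOfProducts is assumed to hold only the allowed products a, b, c.

```sql
WITH transactions (order_id, product_id) AS (
   VALUES (1, 'a'), (1, 'b')
        , (2, 'a'), (2, 'X')
        , (3, 'a'), (3, 'b'), (3, 'c')
        , (999, 'Y')
), listOfProducts (product_id) AS (
   VALUES ('a'), ('b'), ('c')          -- assumption: the allowed products only
)
SELECT t.order_id
FROM   transactions AS t
LEFT   JOIN listOfProducts AS l ON l.product_id = t.product_id
GROUP  BY t.order_id
-- An unmatched transaction leaves l.product_id NULL after the LEFT JOIN,
-- so the CASE counts transactions whose product is not in the list.
HAVING COUNT(CASE WHEN l.product_id IS NULL THEN 1 END) = 0
ORDER  BY t.order_id;
-- returns order_id 1 and 3
```

Duplicate entries in the list table would multiply joined rows, but they would not change which orders have zero unmatched transactions.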

Answer 2

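Typically, there is also an orders table to go with it, with exactly one row per order.

If we can further assume that every order always has at least one transaction, this does the job: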

SELECT o.id
FROM   orders o
WHERE  NOT EXISTS (
   SELECT FROM transactions  -- SELECT list can be empty for EXISTS test
   WHERE  order_id = o.id
   AND    product_id <> ALL ('{a,b,c}')
   );
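
If that assumption does not hold, an order without any transactions also passes the NOT EXISTS test. A hedged sketch of one way to guard against it, assuming PostgreSQL and text product_ids; the CTEs are illustrative sample data, and the extra EXISTS clause is an addition for illustration, not part of the query above:

```sql
WITH orders (id) AS (
   VALUES (1), (2), (3), (4)           -- order 4 has no transactions
), transactions (order_id, product_id) AS (
   VALUES (1, 'a'), (1, 'b')
        , (2, 'a'), (2, 'X')
        , (3, 'a'), (3, 'b'), (3, 'c')
)
SELECT o.id
FROM   orders o
WHERE  NOT EXISTS (                    -- no transaction outside the list
   SELECT FROM transactions t
   WHERE  t.order_id = o.id
   AND    t.product_id <> ALL ('{a,b,c}'::text[])
   )
AND    EXISTS (                        -- extra guard: at least one transaction
   SELECT FROM transactions t
   WHERE  t.order_id = o.id
   )
ORDER  BY o.id;
-- returns 1 and 3; without the EXISTS guard, order 4 would be returned as well
```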

This is good for very common product_ids or long lists.

For short lists or rare products, it will be faster to start with a positive selection. Like:

SELECT order_id
FROM  (
   SELECT DISTINCT order_id
   FROM   transactions
   WHERE  product_id = ANY ('{a,b,c}')
   ) t
WHERE  NOT EXISTS (
   SELECT FROM transactions
   WHERE  order_id = t.order_id
   AND    product_id <> ALL ('{a,b,c}')
   );

An index on (product_id) is essential for performance. Better yet, a multicolumn index on (product_id, order_id), plus another one on (order_id, product_id). See:

Is a composite index also good for queries on the first field?
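
As a sketch, the two multicolumn indexes could be created like this. The index names are hypothetical, and the statements assume the transactions table from the question already exists:

```sql
-- Hypothetical index names; table and column names follow the question.
-- Supports the positive selection: find orders by product_id.
CREATE INDEX transactions_product_order_idx ON transactions (product_id, order_id);

-- Supports the NOT EXISTS probe: look up the products of one order.
CREATE INDEX transactions_order_product_idx ON transactions (order_id, product_id);
```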

The manual about array literals:

https://www.postgresql.org/docs/current/arrays.html#ARRAYS-INPUT

About the ANY and ALL constructs:

IN vs ANY operator in PostgreSQL
https://www.postgresql.org/docs/current/functions-subquery.html
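
For orientation, a small sketch of how the array forms used above relate to IN and NOT IN. It assumes PostgreSQL; the explicit ::text[] casts are added here for clarity:

```sql
SELECT 'a' =  ANY ('{a,b,c}'::text[]) AS in_list      -- true,  like 'a' IN ('a', 'b', 'c')
     , 'X' <> ALL ('{a,b,c}'::text[]) AS not_in_list  -- true,  like 'X' NOT IN ('a', 'b', 'c')
     , 'a' <> ALL ('{a,b,c}'::text[]) AS a_not_in;    -- false, 'a' is in the list
```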

sql

postgresql

relational-division

amazon-redshift