SQL ALWAYS IN and/or NEVER IN filter
I have synthetic data such as:
A, B
-----
1, 1
1, 2
2, 1
2, 1
3, 2
3, 2
Based on the pseudocode below, I have two questions to answer.
Question 1
SELECT A
FROM table
WHERE B NEVER_IN('2')
>>>
A
-
2
Question 2
SELECT A
FROM table
WHERE B ALWAYS_IN('2')
>>>
A
-
3
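In both cases, A=1 is omitted because its B value is sometimes, rather than always or never, equal to 2.
Is there a direct way to do this? Also, I use ALWAYS_IN and NEVER_IN because in practice I need to know whether B always equals one of the elements of an array or never equals any of them.
What is the best way to accomplish this in SQL (using Presto)?
Here is my best attempt at the ALWAYS_IN case, which is very slow: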
WITH results AS (
    SELECT A, B
    FROM TABLE
),
possibly_good AS (
    SELECT A
    FROM results
    WHERE B IN (2)
),
bad AS (
    SELECT R.A
    FROM results R
    WHERE R.A NOT IN (
        SELECT P.A
        FROM possibly_good P
    )
),
good AS (
    SELECT P.A
    FROM possibly_good P
    WHERE P.A NOT IN (
        SELECT B.A
        FROM bad B
    )
)
SELECT * FROM good
You can use group by and array_agg and then check the array contents (depending on the Presto version, either with all_match, or with a workaround using not contains or cardinality + filter):
-- sample data
WITH dataset (A, B) AS (
    VALUES (1, 1),
           (1, 2),
           (2, 1),
           (2, 1),
           (3, 2),
           (3, 2)
)
-- query
SELECT a
FROM (
    SELECT a,
           array_agg(b) agg -- concatenate into array
    FROM dataset
    GROUP BY a
)
WHERE NOT contains(agg, 2) -- NEVER_IN('2')
-- For ALWAYS_IN('2') use all_match if available, otherwise:
-- cardinality(agg) = cardinality(filter(agg, v -> v = 2))
Output:
a
2
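For the ALWAYS_IN case, the same group by + array_agg pattern might look like the sketch below. It assumes a Presto version where all_match is available, and it writes the allowed values as an array literal (ARRAY[2] here, chosen only for illustration) so that the check extends to more than one value, as the question asks:

```sql
-- sample data (same as above)
WITH dataset (A, B) AS (
    VALUES (1, 1),
           (1, 2),
           (2, 1),
           (2, 1),
           (3, 2),
           (3, 2)
)
SELECT a
FROM (
    SELECT a,
           array_agg(b) agg -- one array of B values per A
    FROM dataset
    GROUP BY a
)
-- ALWAYS_IN: every B in the group must be one of the allowed values.
-- ARRAY[2] is an illustrative list; add elements as needed, e.g. ARRAY[2, 5].
WHERE all_match(agg, v -> contains(ARRAY[2], v))
```

With the sample data this should return only a = 3, since that is the only group whose B values are all 2. If all_match is not available, the cardinality + filter comparison mentioned above can take the same lambda as its predicate.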