SQL (Redshift): Find if a specific value occurs more than once in multiple columns
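Let's say I have 5 columns that can all contain the same values. I want to compute a new column that tells me whether a specific value occurs more than once. Example of the desired output in different cases:
I want to scan all rows that contain at least one of the value 'X':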
id | A | B | C | D | E | Result |
---|---|---|---|---|---|---|
1 | X | Y | X | Z | | True |
2 | X | Y | Y | Z | | False |
3 | Y | Y | Z | | | False |
4 | X | X | Y | X | | True |
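A 'case when' would work in theory, but going through every option is not feasible: it would require too many combinations. Maybe some kind of inner query?
Edit:
I actually found a solution using a join. But Gordon Linoff's answer is much cleaner.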
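select f.id,
       case when b.num_X > 1 then True else False end as result
from foo f
join (
    -- add up the per-column flags; id must be selected here so the join can use it
    select id, a + b + c + d + e as num_X
    from (
        -- one 0/1 flag per column: 1 when the column equals 'X'
        select id,
               case when A = 'X' then 1 else 0 end as a,
               case when B = 'X' then 1 else 0 end as b,
               case when C = 'X' then 1 else 0 end as c,
               case when D = 'X' then 1 else 0 end as d,
               case when E = 'X' then 1 else 0 end as e
        from foo
    ) flags  -- the derived table needs an alias
) b on f.id = b.id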
One approach is to just count them:
select t.*,
       ( (a = 'X')::int + (b = 'X')::int + (c = 'X')::int + (d = 'X')::int + (e = 'X')::int ) >= 2 as result
from t;
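To sanity-check the counting expression without a Redshift cluster, here is a minimal sketch that runs it against the example rows in an in-memory SQLite database from Python. SQLite standing in for Redshift is an assumption: it has no `::int` cast, but its comparisons already evaluate to 0 or 1, so the casts are dropped. The `count_flags` helper, and the choice to model the blank cells as empty strings in the trailing columns, are also assumptions made for illustration.

```python
import sqlite3

# Example rows from the question. Blank cells are ASSUMED to be empty strings
# and ASSUMED to sit in the trailing columns.
ROWS = [
    (1, "X", "Y", "X", "Z", ""),
    (2, "X", "Y", "Y", "Z", ""),
    (3, "Y", "Y", "Z", "", ""),
    (4, "X", "X", "Y", "X", ""),
]


def count_flags(rows, value="X"):
    """Return {id: True/False}, True when `value` appears in 2+ of the columns a..e."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (id integer, a text, b text, c text, d text, e text)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?, ?, ?)", rows)
    # In SQLite each comparison yields 0 or 1, so the flags can be summed directly.
    query = """
        select id,
               ((a = :v) + (b = :v) + (c = :v) + (d = :v) + (e = :v)) >= 2 as result
        from t
        order by id
    """
    result = {row_id: bool(flag) for row_id, flag in conn.execute(query, {"v": value})}
    conn.close()
    return result


flags = count_flags(ROWS)
print(flags)  # {1: True, 2: False, 3: False, 4: True}
```

On Redshift the query itself should keep the `::int` casts shown above, since booleans are not added directly there.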
If the columns can contain NULL values, then you need to take that into account. One way is to use coalesce() in the expression above:
( (coalesce(a, '') = 'X')::int +
  (coalesce(b, '') = 'X')::int +
  (coalesce(c, '') = 'X')::int +
  (coalesce(d, '') = 'X')::int +
  (coalesce(e, '') = 'X')::int
) >= 2 as result
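The reason NULLs matter is that a comparison with NULL yields NULL, and a single NULL term turns the whole sum (and therefore the result) into NULL. A small sketch of that effect, again using an in-memory SQLite database from Python as a stand-in for Redshift (an assumption, so the `::int` casts are omitted); the single test row is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, a text, b text, c text, d text, e text)")
# Hypothetical row: two 'X' values plus a NULL in column e.
conn.execute("insert into t values (1, 'X', 'Y', 'X', 'Z', NULL)")

plain, safe = conn.execute(
    """
    select
        -- plain version: (e = 'X') is NULL, so the sum and the comparison are NULL
        ((a = 'X') + (b = 'X') + (c = 'X') + (d = 'X') + (e = 'X')) >= 2,
        -- coalesce version: NULL is replaced by '' before comparing
        ((coalesce(a, '') = 'X') + (coalesce(b, '') = 'X') + (coalesce(c, '') = 'X')
         + (coalesce(d, '') = 'X') + (coalesce(e, '') = 'X')) >= 2
    from t
    """
).fetchone()
conn.close()

print(plain)  # None (SQL NULL)
print(safe)   # 1 (true)
```

The empty-string replacement only works as long as the value being searched for is not itself an empty string.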