CASE statement in Spark SQL

Question

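I am building a workflow for my company, and I need to use a Spark SQL CASE statement to filter some data.

I have a column named OPP_amount_euro (it stores an amount of money) and a column named OPP_amount_euro_binned (default value 1). I want to implement some kind of intervals: if the value in OPP_amount_euro is < 30000, the value in OPP_amount_euro_binned should be 1, and so on.

I have already tried to find a solution, but it is not ideal.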

select
case when OPP_amount_eur < 30000 then 1
when OPP_amount_eur >= 30000 then 2
when OPP_amount_eur >= 50000 then 3
when OPP_amount_eur >= 100000 then 4
when OPP_amount_eur >= 300000 then 5
when OPP_amount_eur >= 500000 then 6
when OPP_amount_eur >= 1000000 then 7
end as OPP_amount_eur_binned
from inputTable

This code runs fine, but I cannot select any other columns from the table. If I write a '*' after the select, I get the following error message:

Exception in processing: ParseException:
mismatched input 'when' expecting {<EOF>, ',', 'FROM', 'WHERE', 'GROUP', 'ORDER', 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 2, pos 5)

== SQL ==
Select *
case when OPP_amount_eur < 30000 then 1
-----^^^
when OPP_amount_eur >= 30000 then 2
when OPP_amount_eur >= 50000 then 3
when OPP_amount_eur >= 100000 then 4
when OPP_amount_eur >= 300000 then 5
when OPP_amount_eur >= 500000 then 6
when OPP_amount_eur >= 1000000 then 7
end as OPP_amount_eur_binned
from temptable3083b308bcec4124b6a4650f2bb40695

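Why can't I do this? I searched the internet, and in standard SQL it seems to work, so why is it not possible in Spark SQL? Is there a workaround?

I apologize for my poor description of the problem, but I am an absolute beginner here and have never worked with Spark SQL before. I am a student.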

Answer 1

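You should use a table alias, and separate the CASE expression from the other columns with a comma (the parser error above lists ',' among the tokens it expected):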

SELECT CASE....,
       t.*
FROM YourTable t
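
The alias-plus-comma pattern above can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for Spark so the snippet runs without a cluster, and the `id` column and sample rows are made up. The select-list shape (items separated by commas, `t.*` through an alias) is the same in Spark SQL.

```python
import sqlite3

# Assumption: SQLite is used purely as a stand-in engine, not Spark.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputTable (id INTEGER, OPP_amount_eur INTEGER)")
# Hypothetical sample rows for illustration only.
conn.executemany("INSERT INTO inputTable VALUES (?, ?)", [(1, 12000), (2, 45000)])

# The CASE expression is just one more item in the select list,
# so it must be separated from t.* by a comma.
query = """
SELECT t.*,
       CASE WHEN OPP_amount_eur < 30000 THEN 1
            ELSE 2
       END AS OPP_amount_eur_binned
FROM inputTable t
ORDER BY t.id
"""
rows = conn.execute(query).fetchall()
print(rows)


def fails_without_comma():
    """Return True if 'SELECT * CASE ...' (no comma) is rejected as a syntax error."""
    try:
        conn.execute(
            "SELECT * CASE WHEN OPP_amount_eur < 30000 THEN 1 END FROM inputTable"
        )
    except sqlite3.OperationalError:
        return True
    return False


print(fails_without_comma())
```

Leaving out the comma is rejected by the stand-in engine as well, which suggests the error in the question is a plain syntax issue rather than something specific to Spark SQL.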

Answer 2

This is the solution to my problem:

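-- between is inclusive on both ends and branches are checked top to bottom,
-- so a value of exactly 30000 lands in bin 1, exactly 50000 in bin 2, and so on
select inputTable.*,
       case
           when OPP_amount_eur between 0 and 30000 then 1
           when OPP_amount_eur between 30000 and 50000 then 2
           when OPP_amount_eur between 50000 and 100000 then 3
           when OPP_amount_eur between 100000 and 300000 then 4
           when OPP_amount_eur between 300000 and 500000 then 5
           when OPP_amount_eur between 500000 and 1000000 then 6
           else 7  -- integer rather than '7', so every branch returns the same type
       end as OPP_amount_eur_binned
from inputTable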
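
A CASE expression returns the first branch whose condition is true, so the order of the branches decides which bin a value gets. As a possible variant of the query above, the sketch below uses ascending `<` thresholds, which gives half-open bins matching the rule stated in the question (below 30000 is 1, from 30000 up is 2, and so on). It again assumes Python's built-in sqlite3 module as a stand-in for Spark, and the sample amounts are made up to sit exactly on the bin boundaries.

```python
import sqlite3

# Assumption: SQLite is used purely as a stand-in engine, not Spark.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputTable (OPP_amount_eur INTEGER)")
# Hypothetical amounts chosen to sit on the bin boundaries.
amounts = [29999, 30000, 50000, 100000, 300000, 500000, 1000000]
conn.executemany("INSERT INTO inputTable VALUES (?)", [(a,) for a in amounts])

# Branches are tested top to bottom and the first true one wins, so each
# "<" only needs the upper bound of its bin. Rows with a NULL amount match
# no WHEN branch and therefore fall through to ELSE.
binned_query = """
SELECT t.OPP_amount_eur,
       CASE WHEN OPP_amount_eur <   30000 THEN 1
            WHEN OPP_amount_eur <   50000 THEN 2
            WHEN OPP_amount_eur <  100000 THEN 3
            WHEN OPP_amount_eur <  300000 THEN 4
            WHEN OPP_amount_eur <  500000 THEN 5
            WHEN OPP_amount_eur < 1000000 THEN 6
            ELSE 7
       END AS OPP_amount_eur_binned
FROM inputTable t
ORDER BY t.OPP_amount_eur
"""
bins = [row[1] for row in conn.execute(binned_query)]
print(bins)
```

With the ordering used in the question (`>= 30000` listed before the larger thresholds), every amount of 30000 or more would match the second branch and be assigned 2, because the later branches are never reached.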

Tags: sql, case, case-statement, apache-spark-sql