CASE WHEN - LIKE - REGEXP in Hadoop Hive
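I want to write a query against a Hive table using CASE WHEN, LIKE, and regular expressions. I used regexp and rlike, but I am not getting the results I want. My attempts so far are as follows: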
select distinct ending from
  (select date, ending, name, count(distinct id)
   from (select CONCAT_WS("/",year,month,day,hour) as date, id, name,
           case when type = 'TRAN' then 'tran'
                when events regexp '%[:]no_reply[:]%[^o][^n][:]incomplete[:]%' and type rlike '%HUP' then 'con'
                when events not regexp '%[:]no_reply[:]%[^o][^n][:]incomplete[:]%' and type rlike '%HUP' then 'aban'
                else 'other'
           end as ending
         from data_struct1) tmp
   group by date, ending, name) tmp2;
and
select distinct ending from
  (select date, ending, name, count(distinct id)
   from (select CONCAT_WS("/",year,month,day,hour) as date, id, name,
           case when type = 'TRAN' then 'tran'
                when events rlike '%[:]no_reply[:]%[^o][^n][:]incomplete[:]%' and type rlike '%HUP' then 'con'
                when events not rlike '%[:]no_reply[:]%[^o][^n][:]incomplete[:]%' and type rlike '%HUP' then 'aban'
                else 'other'
           end as ending
         from data_struct1) tmp
   group by date, ending, name) tmp2;
Both queries return incorrect results (there is no syntax error, the results are simply wrong).
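The % wildcard belongs to LIKE; it is not a regular expression quantifier. regexp and rlike take Java regular expressions, where % is just an ordinary character and "any sequence of characters" is written .* instead. There is plenty of documentation on regular expression quantifiers, for example this one: https://docs.microsoft.com/en-us/dotnet/standard/base-types/quantifiers-in-regular-expressions
A pattern written as a regular expression matches the sample string: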
select 'opencase_2,initial_state:inquiry,inquiry:no_reply:initial_state:incomplete::,inquiry:reask:secondary_state:complete::' regexp 'no_reply:[^:]+:incomplete';
OK
true
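For comparison, the original pattern can be run against the same sample string. This is only a sketch that reuses the sample value from the check above; because every % in a regular expression has to match a literal percent sign, the first statement should not match. The second statement replaces % with .*, which is the closest regex counterpart of the LIKE wildcard.

```sql
-- Original pattern: each '%' must match a literal percent sign,
-- and the sample string contains none, so this is expected to return false.
select 'opencase_2,initial_state:inquiry,inquiry:no_reply:initial_state:incomplete::,inquiry:reask:secondary_state:complete::'
       regexp '%[:]no_reply[:]%[^o][^n][:]incomplete[:]%';

-- LIKE's '%' corresponds to '.*' in a regular expression (and '_' to '.').
select 'opencase_2,initial_state:inquiry,inquiry:no_reply:initial_state:incomplete::,inquiry:reask:secondary_state:complete::'
       regexp ':no_reply:.*:incomplete:';
```

Note that .* is greedy and may span several comma-separated events, whereas [^:]+ only accepts a single colon-free field between no_reply and incomplete. Which of the two is right depends on what the events column is meant to encode.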
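This is also wrong: rlike '%HUP'. It should be '.*HUP$' (HUP at the end of the string), or simply 'HUP' if it does not matter where HUP occurs: at the beginning, in the middle, or at the end of the string.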
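The difference between these forms can be checked with a few literal values. The strings 'CALLER_HUP' and 'HUP_EARLY' below are made-up examples, not values from the original table. Since rlike succeeds when the pattern is found anywhere in the string, 'HUP$' behaves the same as '.*HUP$'.

```sql
-- Hypothetical sample values, used only to illustrate anchoring.
select 'CALLER_HUP' rlike '%HUP';   -- false: no literal '%' in the value
select 'CALLER_HUP' rlike 'HUP$';   -- true: value ends with HUP
select 'CALLER_HUP' like  '%HUP';   -- true: LIKE treats '%' as a wildcard
select 'HUP_EARLY'  rlike 'HUP$';   -- false: HUP is not at the end
select 'HUP_EARLY'  rlike 'HUP';    -- true: HUP occurs somewhere in the value
```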
rlike and regexp behave identically in your query. The two are synonyms, so it is better to use one operator consistently: either only regexp or only rlike.
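Putting these points together, the query might be rewritten roughly as follows. This is a sketch under assumptions, not a verified fix: it assumes that 'con' means a no_reply event followed by exactly one colon-free field and then incomplete, and that type should end with HUP. The alias dt (instead of date) and the alias id_count are choices made here, since date is a reserved word in newer Hive versions.

```sql
select distinct ending
from (
  select dt, ending, name, count(distinct id) as id_count
  from (
    select concat_ws('/', year, month, day, hour) as dt, id, name,
           case
             when type = 'TRAN' then 'tran'
             -- HUP at the end of type, and the no_reply ... incomplete sequence present
             when type rlike 'HUP$'
                  and events rlike 'no_reply:[^:]+:incomplete' then 'con'
             -- HUP at the end of type, but the sequence is absent
             when type rlike 'HUP$'
                  and events not rlike 'no_reply:[^:]+:incomplete' then 'aban'
             else 'other'
           end as ending
    from data_struct1
  ) tmp
  group by dt, ending, name
) tmp2;
```

As in the original query, rows where events is NULL satisfy neither of the two HUP branches and fall through to 'other'.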