在执行 concat_ws 和 collect_list 后,Hive Coalesce 函数未返回预期结果

Hive Coalesce function isn't returning expected results after performing concat_ws and and collect_list

我需要在 table 服务中转换列 servicename,然后合并值。

services:
userid servicename
1      A
2      B
2      C
3      B
1      C
4
5      C

我正在尝试的查询是

SELECT userid,
       COALESCE(a,b,c) AS servicename
FROM (SELECT userid,
             concat_ws('',a) AS a,
             concat_ws('',b) AS b,
             concat_ws('',c) AS c
      FROM (SELECT userid,
                   collect_list(a.group_map['A']) AS a,
                   collect_list(a.group_map['B']) AS b,
                   collect_list(a.group_map['C']) AS c
            FROM (SELECT userid,
                         MAP(servicename,servicename) AS group_map
                  FROM (SELECT userid, servicename, servicename FROM services) AS table1) AS table2) AS table3) AS table4

这里是每个 SELECT 查询的细目及其结果:

查询 1:

SELECT userid,
       servicename,
       servicename
FROM services

结果 1:

userid servicename servicename
   1      A           A
   2      B           B
   2      C           C
   3      B           B
   1      C           C
   4 
   5      C           C 

查询 2:

SELECT userid,
       MAP(servicename,servicename) AS group_map
FROM table1

结果 2:

userid    group_map
    1      {"A":"A"}
    2      {"B":"B"}
    2      {"C":"C"}
    3      {"B":"B"}
    1      {"C":"C"}
    4      {"null":"null"}
    5      {"C":"C"}

查询 3:

SELECT userid,
       collect_list(a.group_map['A']) AS a,
       collect_list(a.group_map['B']) AS b,
       collect_list(a.group_map['C']) AS c
FROM table2

结果 3:

userid    a      b      c
1       ["A"]         ["C"]
2              ["B"]  ["C"]
3              ["B"]
4           
5                     ["C"]

下面的concat_ws函数是将数组转String

查询 4:

SELECT userid,
       concat_ws('',a) AS a,
       concat_ws('',b) AS b,
       concat_ws('',c) AS c
FROM table3

结果 4:

userid    a    b   c
1         A        C
2              B   C
3              B
4           
5                  C

最后在查询中使用 COALESCE 函数得到如下结果:

查询 5

SELECT userid,
       COALESCE(a,b,c) AS servicename
FROM table4

下面是我希望看到的结果

结果 5:

userid  servicename
1       A
2       B
3       B
4
5       C

但是,我只得到 Service A 或 null,如下所示:

userid  servicename
    1       A
    2       
    3       
    4
    5       

有人怀疑有什么问题吗?是否需要铸造?

我一直在尝试不同的东西,也尝试在网上搜索,但还没有成功。

提前致谢!

看起来您只需要每个 ID 的 1 个服务名称

select * from 
(
     select userid,servicename,row_number() over (partition by userid order by servicename) as row_no 
) res
where res.row_no = 1

这应该让你

userid  servicename
1       A
2       B
3       B
4
5       C

如果您尝试为每个用户 ID 转换服务名称

select userid, collect_set(servicename)
from table
group by userid;

这应该让你

userid  servicename
1       A,C
2       B,C
3       B
4
5       C