How to combine two columns on the same table using Hive
Right now I have:
Scorecard | team1 | team2 | Winner | Margin | Ground | Match Date | Year |
---|---|---|---|---|---|---|---|
ODI # 1 | Australia | England | Australia | 5 wickets | Melbourne | 5-Jan-71 | 1971 |
ODI # 2 | England | Australia | England | 6 wickets | Manchester | 24-Aug-72 | 1972 |
ODI # 3 | England | Australia | Australia | 5 wickets | Lord's | 26-Aug-72 | 1972 |
ODI # 4 | England | Australia | England | 2 wickets | Birmingham | 28-Aug-72 | 1972 |
ODI # 5 | New Zealand | Pakistan | New Zealand | 22 runs | Christchurch | 11-Feb-73 | 1973 |
What I want is to combine team1 and team2 and get a distinct list of teams.
Based on my example above:
teams |
---|
Australia |
England |
New Zealand |
Pakistan |
I'm using Cloudera Hive, and I'm trying to get union to work.
I've also tried:
SELECT concat_ws('^',(SPLIT('${team1,team2}',',')));
However, the output is just: ${team1^team2}
The simplest way is to use union:
select team1 as teams from tablename
union distinct
select team2 from tablename
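To try this against the sample data from the question, here is a minimal sketch. The temporary table name odi_demo is hypothetical and holds only the team1 and team2 values of the five rows shown. It assumes Hive 0.14 or later for INSERT ... VALUES and Hive 1.2.0 or later for UNION DISTINCT (earlier versions only accept UNION ALL). Both sides are aliased to the same name so the two schemas are identical.

```sql
-- Hypothetical scratch table with just the two team columns.
CREATE TEMPORARY TABLE odi_demo (team1 STRING, team2 STRING);

-- team1/team2 values of ODI # 1 through ODI # 5 from the question.
INSERT INTO TABLE odi_demo VALUES
  ('Australia', 'England'),
  ('England', 'Australia'),
  ('England', 'Australia'),
  ('England', 'Australia'),
  ('New Zealand', 'Pakistan');

-- Stack both columns into one column and drop duplicates.
SELECT team1 AS teams FROM odi_demo
UNION DISTINCT
SELECT team2 AS teams FROM odi_demo;
```

This should return Australia, England, New Zealand and Pakistan, in no guaranteed order.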
Here is another way, using a subquery:
Select distinct teams from (
select team1 as teams from tablename
union
select team2 from tablename
) t
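As for the concat_ws attempt: '${team1,team2}' is a quoted string literal, so SPLIT breaks the text itself at the comma and concat_ws joins the two pieces with ^. That is why the result is ${team1^team2} rather than anything read from the columns. If the goal is to put both columns into one array and work from there, one possible alternative (not part of the answer above, shown only as a sketch) is to build the array from the columns and explode it with a lateral view. tablename is the same placeholder as above; the aliases exploded and team are made up for illustration.

```sql
-- array(team1, team2) builds a two-element array for each row;
-- explode() turns each array element into its own row,
-- and DISTINCT then removes the repeated team names.
SELECT DISTINCT team AS teams
FROM tablename
LATERAL VIEW explode(array(team1, team2)) exploded AS team;
```

This form references tablename once instead of once per side of the union, and it does not depend on UNION DISTINCT being supported by the Hive version in use.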