SQL/Vertica - grouping multi-attribute combinations

I have a dataset of the following kind:

user_id   country1  city1      country2  city2
1         usa       new york   france    paris 
2         usa       dallas     japan     tokyo 
3         india     mumbai     italy     rome 
4         france    paris      usa       new york 
5         brazil    sao paulo  russia    moscow 

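I want to group by the combination of country1, city1, country2 and city2, where the order (whether a place appears in the country1 or the country2 slot) should not matter. Normally, I would try: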

SELECT   country1 
       , city1
       , country2
       , city2 
       , COUNT(*) 
FROM dataset
GROUP BY country1 
       , city1
       , country2
       , city2 

However, this snippet treats the rows with user_id=1 and user_id=4 as two different cases, whereas I would like them to be treated as equivalent.

Does anyone know how to solve this?

Thanks in advance!

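Typically, you would use least() and greatest() for this type of problem, but you have a pair of columns (country and city) for each place rather than a single column. So, let's do it by comparing the cities. I am guessing that city is more unique than country:
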
select (case when city1 < city2 then country1 else country2 end) as country1,
       (case when city1 < city2 then city1 else city2 end) as city1,
       (case when city1 < city2 then country2 else country1 end) as country2,
       (case when city1 < city2 then city2 else city1 end) as city2,
       count(*)
from dataset
group by (case when city1 < city2 then country1 else country2 end),
       (case when city1 < city2 then city1 else city2 end),
       (case when city1 < city2 then country2 else country1 end),
       (case when city1 < city2 then city2 else city1 end)
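
One edge case in the query above is a tie on the city name: when city1 = city2 (say, two different places that are both called paris), the else branch swaps the two places, so the two orderings still end up in different groups. The following is a minimal sketch of one way to handle that, assuming that breaking ties on the country is acceptable. The names in_order, flagged, country_a, city_a, country_b, city_b and cnt are made up for illustration and are not part of the original question or answer.

```sql
-- Sketch: decide once per row whether the two places are already in
-- canonical order, then group on the normalized columns.
-- Assumption: ties on city are broken by comparing the countries.
select (case when in_order = 1 then country1 else country2 end) as country_a,
       (case when in_order = 1 then city1    else city2    end) as city_a,
       (case when in_order = 1 then country2 else country1 end) as country_b,
       (case when in_order = 1 then city2    else city1    end) as city_b,
       count(*) as cnt
from (
    select d.*,
           -- 1 = keep the row as is, 0 = swap place 1 and place 2
           (case when city1 < city2
                   or (city1 = city2 and country1 <= country2)
                 then 1 else 0 end) as in_order
    from dataset d
) flagged
group by (case when in_order = 1 then country1 else country2 end),
         (case when in_order = 1 then city1    else city2    end),
         (case when in_order = 1 then country2 else country1 end),
         (case when in_order = 1 then city2    else city1    end);
```

With the sample data, the rows for user_id=1 and user_id=4 both normalize to (usa, new york, france, paris) and are counted together. If any of the four columns can be NULL, the comparison evaluates to unknown and the row falls into the swap branch, so NULLs would likely need explicit handling (for example with coalesce()).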