Impala SQL query with 1 table and finding what 3 hostnames have in common

I have one table and am trying to use Impala SQL to get the destination hostnames that all users have in common.

Proxy table:

sourcehostname destinationhostname
comp1          google.com
comp2          google.com
comp1          yahoo.com
comp1          facebook.com
comp2          facebook.com
comp3          facebook.com

When I run a query against the one table to get the distinct destination hostnames shared by 2 source hostnames, this works:

SELECT DISTINCT t1.destinationhostname
FROM proxy_table t1 JOIN proxy_table t2
  ON t1.destinationhostname = t2.destinationhostname AND t1.sourcehostname  ="comp1" AND t2.sourcehostname="comp2";

It returns:

google.com
facebook.com

I am trying to return the values that comp1, comp2, and comp3 all have in common, i.e. facebook.com, but I can't get this query quite right:

SELECT DISTINCT t1.destinationhostname
FROM proxy_table t1 JOIN proxy_table t2 JOIN proxy_table t3
  ON t1.destinationhostname = t2.destinationhostname AND t1.sourcehostname  ="comp1" AND t2.sourcehostname="comp2" t3.sourcehostname = "comp3";

In the query I want to name the 3 specific computers, because there are thousands of them and I only want to select particular ones.

Use aggregation. Assuming there are no duplicate rows:

select destinationhostname
from proxy_table 
group by destinationhostname
having count(*) = (select count(distinct sourcehostname) from proxy_table);

If there can be duplicate rows, just change the having:

having count(distinct sourcehostname) = (select count(distinct sourcehostname) from proxy_table);

If you need exactly three users, just use = 3 in place of the subquery.
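Since the question asks about three specific machines out of thousands, the aggregation approach probably also needs a filter on sourcehostname. Here is a minimal sketch, assuming the hosts of interest are comp1, comp2, and comp3 (taken from the question) and that the number in the having clause is kept equal to the number of hosts listed:

```sql
-- Sketch only: restrict to the hosts of interest, then keep the
-- destinations that every one of those hosts has contacted.
SELECT destinationhostname
FROM proxy_table
WHERE sourcehostname IN ('comp1', 'comp2', 'comp3')
GROUP BY destinationhostname
-- 3 must match the number of hosts in the IN list above;
-- COUNT(DISTINCT ...) keeps the result correct if rows are duplicated.
HAVING COUNT(DISTINCT sourcehostname) = 3;
```

On the sample table in the question this should leave only facebook.com, because it is the only destination that appears for all three hosts. Checking a different set of machines means editing both the IN list and the count.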

Can you try the following?

SELECT DISTINCT t1.destinationhostname
FROM proxy_table t1
JOIN proxy_table t2
  ON t1.destinationhostname = t2.destinationhostname
JOIN proxy_table t3
  ON t1.destinationhostname = t3.destinationhostname
  AND t2.destinationhostname = t3.destinationhostname
WHERE t1.sourcehostname = "comp1"
  AND t2.sourcehostname = "comp2"
  AND t3.sourcehostname = "comp3";

Let me know if you run into any problems.