Impala SQL query on 1 table: finding the destinations that 3 hostnames have in common
I have a table and am trying to use Impala SQL to get the destination hostnames that all users have in common.
Proxy table:
sourcehostname destinationhostname
comp1 google.com
comp2 google.com
comp1 yahoo.com
comp1 facebook.com
comp2 facebook.com
comp3 facebook.com
When I run the following against the one table to get the distinct destination hostnames shared by 2 source hostnames, it works:
SELECT DISTINCT t1.destinationhostname
FROM proxy_table t1 JOIN proxy_table t2
ON t1.destinationhostname = t2.destinationhostname AND t1.sourcehostname ="comp1" AND t2.sourcehostname="comp2";
It returns google.com and facebook.com.
I am trying to return the values that comp1, comp2 and comp3 all have in common, i.e. facebook.com, but I can't get this query quite right:
SELECT DISTINCT t1.destinationhostname
FROM proxy_table t1 JOIN proxy_table t2 JOIN proxy_table t3
ON t1.destinationhostname = t2.destinationhostname AND t1.sourcehostname ="comp1" AND t2.sourcehostname="comp2" t3.sourcehostname = "comp3";
In the query I want to name 3 specific computers: there are thousands of them, but I only want to select particular ones.
Use aggregation. Assuming there are no duplicate rows:
select destinationhostname
from proxy_table
group by destinationhostname
having count(*) = (select count(distinct sourcehostname) from proxy_table);
If there can be duplicate rows, just change the having clause:
having count(distinct sourcehostname) = (select count(distinct sourcehostname) from proxy_table);
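To see why the duplicate-row case matters, here is a minimal sketch that runs both having variants on the sample data. It assumes SQLite (via Python's sqlite3) as a stand-in for Impala, and it adds one duplicated row that is not in the question's data; the helper name common_destinations is made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Impala so the sketch can run anywhere.
# The first six rows are the question's sample data.
rows = [
    ("comp1", "google.com"),
    ("comp2", "google.com"),
    ("comp1", "yahoo.com"),
    ("comp1", "facebook.com"),
    ("comp2", "facebook.com"),
    ("comp3", "facebook.com"),
    ("comp1", "google.com"),  # duplicate row, added only for illustration
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proxy_table (sourcehostname TEXT, destinationhostname TEXT)"
)
conn.executemany("INSERT INTO proxy_table VALUES (?, ?)", rows)


def common_destinations(having_expr):
    """Hypothetical helper: run the aggregation with the given HAVING expression."""
    sql = f"""
        SELECT destinationhostname
        FROM proxy_table
        GROUP BY destinationhostname
        HAVING {having_expr} =
               (SELECT COUNT(DISTINCT sourcehostname) FROM proxy_table)
        ORDER BY destinationhostname
    """
    return [r[0] for r in conn.execute(sql)]


# COUNT(*) counts the duplicated comp1/google.com row twice.
with_count_star = common_destinations("COUNT(*)")
# COUNT(DISTINCT ...) counts each source hostname once per destination.
with_count_distinct = common_destinations("COUNT(DISTINCT sourcehostname)")

print(with_count_star)      # ['facebook.com', 'google.com']
print(with_count_distinct)  # ['facebook.com']
```

With the duplicate present, count(*) wrongly reports google.com as common to all three hosts, while count(distinct sourcehostname) does not.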
If you need exactly three users, just use = 3.
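Since the question asks for three specific computers out of thousands, one way to read "= 3" is to filter to those hosts first and then require all three per destination. The sketch below assumes that reading; the where ... in (...) filter and the helper common_to are additions not spelled out in the answer, and SQLite again stands in for Impala.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; rows are the question's sample data.
rows = [
    ("comp1", "google.com"),
    ("comp2", "google.com"),
    ("comp1", "yahoo.com"),
    ("comp1", "facebook.com"),
    ("comp2", "facebook.com"),
    ("comp3", "facebook.com"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proxy_table (sourcehostname TEXT, destinationhostname TEXT)"
)
conn.executemany("INSERT INTO proxy_table VALUES (?, ?)", rows)


def common_to(conn, hosts):
    """Hypothetical helper: destinations visited by every host in `hosts`."""
    placeholders = ", ".join("?" for _ in hosts)
    sql = f"""
        SELECT destinationhostname
        FROM proxy_table
        WHERE sourcehostname IN ({placeholders})      -- only the chosen hosts
        GROUP BY destinationhostname
        HAVING COUNT(DISTINCT sourcehostname) = ?     -- all of them must appear
        ORDER BY destinationhostname
    """
    return [r[0] for r in conn.execute(sql, (*hosts, len(hosts)))]


three_hosts = common_to(conn, ["comp1", "comp2", "comp3"])
two_hosts = common_to(conn, ["comp1", "comp2"])

print(three_hosts)  # ['facebook.com']
print(two_hosts)    # ['facebook.com', 'google.com']
```

The same query shape works for any number of hosts: only the in list and the number compared in having change, with no extra self-join per host.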
Could you try the following:
SELECT DISTINCT t1.destinationhostname
FROM proxy_table t1 JOIN proxy_table t2
ON t1.destinationhostname = t2.destinationhostname
JOIN proxy_table t3
ON t1.destinationhostname = t3.destinationhostname
and t2.destinationhostname = t3.destinationhostname
WHERE
t1.sourcehostname ="comp1"
AND t2.sourcehostname="comp2"
AND t3.sourcehostname = "comp3";
Let me know if you run into any problems.
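As a quick check of the three-way self-join, here is a small sketch that runs it on the sample data, assuming SQLite as a stand-in for Impala. It uses single-quoted string literals and leaves out the t2/t3 equality, which already follows from the other two join conditions.

```python
import sqlite3

# Assumption: SQLite stands in for Impala; rows are the question's sample data.
rows = [
    ("comp1", "google.com"),
    ("comp2", "google.com"),
    ("comp1", "yahoo.com"),
    ("comp1", "facebook.com"),
    ("comp2", "facebook.com"),
    ("comp3", "facebook.com"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proxy_table (sourcehostname TEXT, destinationhostname TEXT)"
)
conn.executemany("INSERT INTO proxy_table VALUES (?, ?)", rows)

# Each JOIN carries its own ON clause; the host filters go in WHERE.
sql = """
    SELECT DISTINCT t1.destinationhostname
    FROM proxy_table t1
    JOIN proxy_table t2
      ON t1.destinationhostname = t2.destinationhostname
    JOIN proxy_table t3
      ON t1.destinationhostname = t3.destinationhostname
    WHERE t1.sourcehostname = 'comp1'
      AND t2.sourcehostname = 'comp2'
      AND t3.sourcehostname = 'comp3'
"""
joined = [r[0] for r in conn.execute(sql)]

print(joined)  # ['facebook.com']
```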