Google BigQuery: Remove rows where column value is distinct (count = 1) without referencing table

Question

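Given a table, I would like to remove the rows whose value in a given column is distinct, i.e. occurs only once.

So if we wanted to do this on column 2 of the matrix A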

     c1 c2 c3
A = |1  2  4 |
    |1  2  5 |
    |1  1  6 |

yields

     c1 c2 c3
A = |1  2  4 |
    |1  2  5 |

This can easily be done with

SELECT * FROM Table WHERE c2 IN
(SELECT c2 FROM Table GROUP BY c2 HAVING COUNT(*) > 1)

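Unfortunately, the data is not stored in a Table; it only exists in the middle of a subquery, and I don't want to create a view because I need to do all the filtering in one query.

Any ideas on how I can still filter out the rows that are distinct with respect to a single column, without referencing the Table in a subquery?

The solution should be of the form: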

SELECT <something goes here>
FROM <the subquery which outputs A goes here>
<anything you want here that is legal Bigquery - e.g. can't reference A>

And there is no table to reference.

Answer 1

BigQuery supports window functions, so you can do:

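select t.*
from (select t.*,
             count(*) over (partition by c2) as cnt  -- number of rows sharing this c2 value
      from table t  -- "table" is a placeholder for the source of A
     ) t
where cnt >= 2;  -- keep only rows whose c2 value occurs more than once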

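This still references the table in a subquery, but it is the only reference to the table, so the subquery that outputs A can be put in its place.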
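
As a minimal sketch of how the window-function approach might look with no table at all, the query below inlines the example matrix A as a subquery. It assumes BigQuery standard SQL; the inline UNION ALL data and the aliases a, counted and cnt are illustrative names, not part of the original question.

```sql
-- Assumption: BigQuery standard SQL. The innermost SELECT is a stand-in
-- for "the subquery which outputs A"; it is written exactly once.
SELECT c1, c2, c3
FROM (
  SELECT
    a.*,
    COUNT(*) OVER (PARTITION BY c2) AS cnt  -- number of rows sharing this c2 value
  FROM (
    SELECT 1 AS c1, 2 AS c2, 4 AS c3 UNION ALL
    SELECT 1, 2, 5 UNION ALL
    SELECT 1, 1, 6
  ) AS a
) AS counted
WHERE cnt > 1;  -- drop rows whose c2 value occurs only once
```

For the sample data this should return the two rows with c2 = 2, namely (1, 2, 4) and (1, 2, 5), in no guaranteed order. Listing c1, c2, c3 explicitly in the outer SELECT keeps the helper column cnt out of the result.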
