Extracting duplicate values where other fields also match

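I'm using the following query to find duplicate zip values in my dataset.

It does show the country, city, and street for every duplicated zip value, but what I really want is for it to include only rows that are duplicated on country, city, and street as well, not just on zip. How can I do that?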

SELECT
  Country,
  City,
  Street,
  zip
FROM
  project.dataset.tablename
WHERE
  zip > 1
  AND CAST(zip AS string) IN (
  SELECT
    CAST(zip AS string)
  FROM
    project.dataset.tablename
  GROUP BY
    CAST(zip AS string)
  HAVING
    COUNT(CAST(zip AS string)) > 1 )
ORDER BY
  zip DESC

I think you want:

SELECT t.*
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY zip, country, city, street) as cnt
      FROM project.dataset.tablename t
     ) t 
WHERE cnt > 1
ORDER BY zip;

In any case, window functions usually provide the best solution for this type of problem.
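To check how the partitioned count behaves, here is a minimal sketch that runs the same query against a few inline rows. The `sample` CTE and all of its values are made up for illustration and stand in for `project.dataset.tablename`; the column types are assumptions.

```sql
-- Hypothetical sample rows; names and values are invented for this example.
WITH sample AS (
  SELECT 'US' AS country, 'Springfield' AS city, 'Main St' AS street, 12345 AS zip
  UNION ALL
  SELECT 'US', 'Springfield', 'Main St', 12345   -- same zip, country, city and street as the first row
  UNION ALL
  SELECT 'US', 'Springfield', 'Oak Ave', 12345   -- same zip, but a different street
  UNION ALL
  SELECT 'US', 'Shelbyville', 'Elm St', 67890    -- unique row
)
SELECT t.*
FROM (SELECT s.*,
             -- Count the rows that share all four columns with the current row.
             COUNT(*) OVER (PARTITION BY zip, country, city, street) AS cnt
      FROM sample s
     ) t
WHERE cnt > 1
ORDER BY zip;
```

Only the two 'Main St' rows should come back, each with `cnt` = 2. The 'Oak Ave' row shares the zip but not the street, so it sits in its own partition with a count of 1 and is filtered out.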