Percentage of rows with the same key out of the total number of rows in a table

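I need to find the percentage of duplicated log rows in a table. To check whether a key is repeated, I wrote a query using HAVING. The problem is that once I apply HAVING, I lose all the rows that are not duplicated, so I no longer have the total to divide by.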

Here is the table:

Here is my query:

(SELECT count(params_advertiserId) AS duplicates 
 FROM android_clicks 
 GROUP BY params_advertiserId, app_id, date -- my key is a triplet
 HAVING COUNT(params_advertiserId) > 1)

Any help would be appreciated.

GROUP BY takes a comma-separated list of columns, not columns joined with AND:

 SELECT count(params_advertiserId) AS duplicates 
 FROM android_clicks 
 GROUP BY params_advertiserId , app_id , date
 HAVING COUNT(params_advertiserId) > 1

Is this what you want?

select (count(*) - count(distinct params_advertiserId, app_id, date)) / count(*) as duplicate_ratio
from android_clicks ac;
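
If "duplicate percentage" should instead mean the share of rows whose key occurs more than once (counting every copy, including the first), one possible sketch is to count rows per key in a derived table and then aggregate conditionally, so that keys occurring only once still contribute to the total. This assumes MySQL and the column names from the question; the names cnt, per_key and duplicate_pct are made up for illustration.

```sql
-- Sketch: percentage of rows that belong to a key occurring more than once.
-- cnt, per_key and duplicate_pct are illustrative names, not from the question.
SELECT 100.0 * SUM(CASE WHEN cnt > 1 THEN cnt ELSE 0 END) / SUM(cnt) AS duplicate_pct
FROM (SELECT COUNT(*) AS cnt            -- number of rows per key triplet
      FROM android_clicks
      GROUP BY params_advertiserId, app_id, date
     ) per_key;
```

Because the filter lives in the CASE expression rather than in HAVING, the non-duplicated groups are kept and SUM(cnt) is still the total row count.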

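Your query is incorrect because AND is an operator for boolean expressions. With AND, the GROUP BY expression evaluates to true, false, or NULL, so the rows fall into at most three groups instead of one group per key.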
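
As a rough illustration of that point, a query along these lines groups on the boolean result rather than on the three columns. It is only a sketch, assuming MySQL (which converts the column values to numbers before applying AND and allows a select alias in GROUP BY); grp and row_count are hypothetical names.

```sql
-- Sketch: grouping on "a AND b AND c" groups on a single boolean value.
-- grp is 1 (true), 0 (false) or NULL, so at most three rows come back.
SELECT (params_advertiserId AND app_id AND date) AS grp,
       COUNT(*) AS row_count
FROM android_clicks
GROUP BY grp;
```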

If you need the number of duplicated keys, wrap the grouped query in a subquery:

SELECT COUNT(*) AS num_duplicates
FROM (SELECT params_advertiserId, app_id, date AS duplicates
      FROM android_clicks ac
      GROUP BY params_advertiserId, app_id, date
      HAVING COUNT(*) > 1   -- keep only keys that occur more than once
     ) t;                   -- a derived table needs an alias in MySQL
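
To relate that count to the total number of distinct keys, the same per-key counts can be aggregated once without HAVING. The following is a minimal sketch under the same assumptions (MySQL, column names from the question); cnt, per_key, num_keys and duplicate_key_pct are illustrative names only.

```sql
-- Sketch: how many distinct keys are duplicated, and what share of all keys that is.
SELECT SUM(CASE WHEN cnt > 1 THEN 1 ELSE 0 END)                    AS num_duplicates,
       COUNT(*)                                                    AS num_keys,
       100.0 * SUM(CASE WHEN cnt > 1 THEN 1 ELSE 0 END) / COUNT(*) AS duplicate_key_pct
FROM (SELECT COUNT(*) AS cnt            -- number of rows per key triplet
      FROM android_clicks
      GROUP BY params_advertiserId, app_id, date
     ) per_key;
```

Note that this measures duplicated keys, not duplicated rows, so the percentage can differ from a row-based ratio.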