Need to find the average and number of repetitions of a column

I have a SQL query:

SELECT application.id,title,url,company.name AS company_name,package_name,ranking,date,platform,country.name AS country_name,collection.name AS collection_name,category.name AS category_name FROM application
JOIN application_history ON application_history.application_id = application.id
JOIN company ON application.company_id = company.id
JOIN country ON application_history.country_id = country.id
JOIN collection ON application_history.collection_id = collection.id
JOIN category ON application_history.category_id = category.id
WHERE application.platform=0
AND country.name ='CZ'
AND collection.name='topfreeapplications'
AND category.name='UTILITIES'
AND application_history.ranking <= 10
AND date::date BETWEEN date (CURRENT_DATE - INTERVAL '1 month') AND CURRENT_DATE
ORDER BY application_history.ranking ASC

It produces this result:

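I would like to add two columns: the average ranking of a given package, and the number of appearances, which counts how many times the package appears in the list. I would also like to group the results by package_name so that I don't get redundant rows.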

So far, I have tried adding a GROUP BY clause before the ORDER BY:

GROUP BY package_name

But it returns an error:

column "application.id" must appear in the GROUP BY clause or be used in an aggregate function

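If I add every column it asks for, the query runs but no longer groups the way I want. I also tried counting the package names by adding this after the SELECT: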

COUNT(package_name) AS count

It produces a similar error.

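How can I get the result I want? Should I run two queries instead, or can I get everything in one? To be clear, I have looked at other answers on S.O., but none of them tries to do a count on a "produced" column.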

Thanks for your help.

EDIT:

Here is the result I originally expected:

Although Gordon's suggestion did not give me the right result, it put me on the right track when I read this in the docs: "Unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row."

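So I went back to using plain COUNT and AVG. My problem was that I wanted to display the ranking and date columns to check that things were correct. But putting those columns in the SELECT prevents GROUP BY from working as expected, as Jarlh mentioned in the comments: each of them would also have to appear in the GROUP BY, which splits every package back into one group per row.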

Working query:

SELECT application.id,title,url,company.name AS company_name,package_name,platform,country.name AS country_name,collection.name AS collection_name,category.name AS category_name, 
    COUNT(package_name) AS count, AVG(application_history.ranking) AS avg
    FROM application
    JOIN application_history ON application_history.application_id = application.id
    JOIN company ON application.company_id = company.id
    JOIN country ON application_history.country_id = country.id
    JOIN collection ON application_history.collection_id = collection.id
    JOIN category ON application_history.category_id = category.id
    WHERE application.platform=0
    AND country.name ='CZ'
    AND collection.name='topfreeapplications'
    AND category.name='UTILITIES'
    AND application_history.ranking <= 10
    AND date::date BETWEEN date (CURRENT_DATE - INTERVAL '1 month') AND CURRENT_DATE
    GROUP BY package_name,application.id,company.name,country.name,collection.name,category.name
    ORDER BY count DESC
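
To see the effect on a small scale, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The `history` table, its rows, and the `appearances` / `avg_ranking` aliases are made up for illustration; only the GROUP BY pattern mirrors the query above.

```python
import sqlite3

# Hypothetical toy data: one row per (package, day) ranking observation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (package_name TEXT, ranking INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        ("com.a", 1, "2024-01-01"),
        ("com.a", 3, "2024-01-02"),
        ("com.b", 2, "2024-01-01"),
    ],
)

# Grouping only by package_name collapses each package into one row.
grouped = conn.execute(
    """
    SELECT package_name, COUNT(*) AS appearances, AVG(ranking) AS avg_ranking
    FROM history
    GROUP BY package_name
    ORDER BY appearances DESC, package_name
    """
).fetchall()

# Adding ranking and date to the GROUP BY makes every row its own group,
# so every count is 1 and nothing is collapsed.
per_row = conn.execute(
    """
    SELECT package_name, ranking, date, COUNT(*) AS appearances
    FROM history
    GROUP BY package_name, ranking, date
    ORDER BY package_name, date
    """
).fetchall()

print(grouped)
print(per_row)
```

The first query returns one row per package; the second returns as many rows as the table has, which is why I had to drop ranking and date from the SELECT.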

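I think you want window/analytic functions. The following adds two columns, one with the number of rows for each package and the other with their average ranking: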

SELECT application.id, title, url, company.name AS company_name, package_name, 
       ranking, date, platform, country.name AS country_name,
       collection.name AS collection_name, category.name AS category_name,
       count(*) over (partition by package_name) as count,
       avg(ranking) over (partition by package_name) as avg_package_ranking
FROM application . . .
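
As a rough illustration of what the window functions do, here is a small sketch that runs the same pattern against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The `history` table, its rows, and the `appearances` alias are hypothetical and not the schema from the question.

```python
import sqlite3

# Hypothetical toy data: one row per (package, day) ranking observation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (package_name TEXT, ranking INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        ("com.a", 1, "2024-01-01"),
        ("com.a", 3, "2024-01-02"),
        ("com.b", 2, "2024-01-01"),
    ],
)

# The window functions compute per-package aggregates but keep every row,
# so ranking and date can stay in the SELECT list.
rows = conn.execute(
    """
    SELECT package_name, ranking, date,
           COUNT(*) OVER (PARTITION BY package_name) AS appearances,
           AVG(ranking) OVER (PARTITION BY package_name) AS avg_package_ranking
    FROM history
    ORDER BY package_name, date
    """
).fetchall()

for row in rows:
    print(row)
```

Each input row is still present in the output, with the package-level count and average repeated on every row of that package.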