获得每组得分的前 5 行
Getting the Top 5 rows by score for each group
我正在尝试为每个 Reddit post 获取得分最高的 5 条评论。我只想检索每个 post 标题的得分最高的 N 条评论。
示例:我只想要评论 1 和评论 2 post。
Post 1 | Comment 1 | Comment Score 10
Post 1 | Comment 2 | Comment Score 9
Post 1 | Comment 3 | Comment Score 8
Post 2 | Comment 1 | Comment Score 10
Post 2 | Comment 2 | Comment Score 9
Post 2 | Comment 3 | Comment Score 8
标准SQL
SELECT
posts.title,
posts.url,
posts.score AS postsscore,
DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
SUBSTR(comments.body, 0, 80),
comments.score AS commentsscore,
comments.id
FROM
`fh-bigquery.reddit_posts.2015*` AS posts
JOIN `fh-bigquery.reddit_comments.2015*` AS comments
ON posts.id = SUBSTR(comments.link_id, 4)
WHERE
posts.subreddit = 'Showerthoughts'
AND posts.score >100
AND comments.score >100
ORDER BY
posts.score DESC,
posts.title DESC,
comments.score DESC
以下适用于 BigQuery 标准 SQL
#standardSQL
SELECT * EXCEPT(pos) FROM (
SELECT
posts.title,
posts.url,
posts.score AS postsscore,
DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
SUBSTR(comments.body, 0, 80),
comments.score AS commentsscore,
comments.id,
ROW_NUMBER() OVER(PARTITION BY posts.url ORDER BY comments.score DESC) pos
FROM `fh-bigquery.reddit_posts.2015*` AS posts
JOIN `fh-bigquery.reddit_comments.2015*` AS comments
ON posts.id = SUBSTR(comments.link_id, 4)
WHERE posts.subreddit = 'Showerthoughts'
AND posts.score >100
AND comments.score >100
)
WHERE pos < 3
ORDER BY postsscore DESC, title DESC, commentsscore DESC
我正在尝试为每个 Reddit post 获取得分最高的 5 条评论。我只想检索每个 post 标题的得分最高的 N 条评论。
示例:我只想要评论 1 和评论 2 post。
Post 1 | Comment 1 | Comment Score 10
Post 1 | Comment 2 | Comment Score 9
Post 1 | Comment 3 | Comment Score 8
Post 2 | Comment 1 | Comment Score 10
Post 2 | Comment 2 | Comment Score 9
Post 2 | Comment 3 | Comment Score 8
标准SQL
SELECT
posts.title,
posts.url,
posts.score AS postsscore,
DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
SUBSTR(comments.body, 0, 80),
comments.score AS commentsscore,
comments.id
FROM
`fh-bigquery.reddit_posts.2015*` AS posts
JOIN `fh-bigquery.reddit_comments.2015*` AS comments
ON posts.id = SUBSTR(comments.link_id, 4)
WHERE
posts.subreddit = 'Showerthoughts'
AND posts.score >100
AND comments.score >100
ORDER BY
posts.score DESC,
posts.title DESC,
comments.score DESC
以下适用于 BigQuery 标准 SQL
#standardSQL
SELECT * EXCEPT(pos) FROM (
SELECT
posts.title,
posts.url,
posts.score AS postsscore,
DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
SUBSTR(comments.body, 0, 80),
comments.score AS commentsscore,
comments.id,
ROW_NUMBER() OVER(PARTITION BY posts.url ORDER BY comments.score DESC) pos
FROM `fh-bigquery.reddit_posts.2015*` AS posts
JOIN `fh-bigquery.reddit_comments.2015*` AS comments
ON posts.id = SUBSTR(comments.link_id, 4)
WHERE posts.subreddit = 'Showerthoughts'
AND posts.score >100
AND comments.score >100
)
WHERE pos < 3
ORDER BY postsscore DESC, title DESC, commentsscore DESC