Select Top Performers From BigQuery Table
I have a BigQuery table that looks like this:
User | URL | Sessions
user1 | example.com/1/ | 3000
user2 | example.com/2/ | 4000
user3 | example.com/2/ | 5000
user4 | example.com/1/ | 1000
... | ... | ...
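I want to pull the top-performing user for each URL. Ideally, the final output would be a smaller table with one user per URL: the user who drove the most sessions for that URL.
I tried a SQL query like this: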
SELECT User, URL, ARRAY_AGG(Sessions ORDER BY Sessions DESC LIMIT 1) FROM 'table'
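But it keeps throwing errors. Any help is much appreciated!
Assuming I've understood your question correctly, you just want to sum all sessions on a per-URL basis and split those values per user? If no user has duplicate URLs, the sum won't actually aggregate anything, but it lets you still display that column while grouping by the others.
Try the following: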
SELECT
User,
URL,
SUM(Sessions) AS Total_Sessions
FROM `table`
GROUP BY User, URL
ORDER BY Total_Sessions DESC
You will have to do something like using a rank or row_number function:
Sample:
WITH input AS (
  SELECT 1 AS user, 'x' AS url, 100 AS session
  UNION ALL SELECT 1 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 1 AS user, 'y' AS url, 400 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 300 AS session
)
SELECT user, url, session FROM (
  SELECT user, url, session,
    ROW_NUMBER() OVER (PARTITION BY user, url ORDER BY session DESC) AS top_rank
  FROM input)
WHERE top_rank = 1
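To get exactly one row per URL, as the question asks, the window would probably need to be partitioned by URL alone rather than by user and URL. Here is a minimal sketch of that variant, assuming the column names from the question (User, URL, Sessions). The CTE name sessions_by_user is hypothetical and only stands in for the real table:

```sql
-- Hypothetical stand-in for the real table, built from the question's sample rows.
WITH sessions_by_user AS (
  SELECT 'user1' AS User, 'example.com/1/' AS URL, 3000 AS Sessions
  UNION ALL SELECT 'user2', 'example.com/2/', 4000
  UNION ALL SELECT 'user3', 'example.com/2/', 5000
  UNION ALL SELECT 'user4', 'example.com/1/', 1000
)
SELECT User, URL, Sessions
FROM (
  SELECT
    User,
    URL,
    Sessions,
    -- Rank users within each URL, highest session count first.
    ROW_NUMBER() OVER (PARTITION BY URL ORDER BY Sessions DESC) AS top_rank
  FROM sessions_by_user
)
-- Keep only the top-ranked user for each URL.
WHERE top_rank = 1
```

With the four sample rows this should return user1 (3000) for example.com/1/ and user3 (5000) for example.com/2/. If two users tie on a URL, ROW_NUMBER() keeps just one of them arbitrarily; swapping in RANK() would keep all tied users instead.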