BigQuery:Select 使用 GroupBy 的前 3 天销售额并保存在单独的列中,留下其他列
BigQuery: Select top 3 Day of Sales with GroupBy and save in separate columns leaving others
输入:
得到了Table'A'
Store Category Sales Day
11 aaa 1.5 Sunday
11 aaa 0.5 Monday
11 aaa 2.5 Tuesday
11 aaa 2.0 Wednesday
11 aaa 3.0 Thursday
11 aaa 3.5 Friday
11 aaa 0.5 Saturday
22 bbb 0.5 Sunday
22 bbb 1.5 Monday
22 bbb 2.3 Tuesday
22 bbb 0.3 Wednesday
22 bbb 1.4 Thursday
22 bbb 4.1 Friday
22 bbb 0.2 Saturday
场景:
必须取按商店、类别分组的销售额的平均值并保存在单独的列中,以及选择前 3 个销售日并保存在单独的列中。结果,一行对应一个商店,类别如下示例输出所示。
预期输出:
Store Category AvgSales PeakDay1 PeakDay2 PeakDay3
11 aaa 1.92 Friday Thursday Tuesday
22 bbb 1.47 Friday Tuesday Monday
尝试查询:
SELECT
Store,
Category,
avg(Sales) as AvgSales,
ARRAY_AGG(Sales ORDER BY Sales DESC LIMIT 3) #but this line will not produce results in 3 separate columns
FROM A
GROUP BY Site, Category
提前致谢!
您可以使用数组聚合,但 row_number()
看起来很简单:
select store, category, avg(sales),
max(case when seqnum = 1 then day end) as peakday1,
max(case when seqnum = 2 then day end) as peakday2,
max(case when seqnum = 3 then day end) as peakday3
from (select store, category, day,
row_number() over (partition by store, category order by sales desc) as seqnum
from a
) a
group by store, category;
如果您想将其放入数组列中,您可以使用:
SELECT Store, Category, avg(Sales) as AvgSales,
ARRAY_AGG(day ORDER BY Sales DESC LIMIT 3)
FROM A
GROUP BY Site, Category
以下适用于 BigQuery 标准 SQL
#standardSQL
SELECT Store, Category, AvgSales,
Days[OFFSET(0)] PeakDay1,
Days[SAFE_OFFSET(1)] PeakDay2,
Days[SAFE_OFFSET(2)] PeakDay3
FROM (
SELECT Store, Category,
ROUND(AVG(Sales), 2) AvgSales,
ARRAY_AGG(Day ORDER BY Sales DESC LIMIT 3) Days
FROM `project.dataset.table` t
GROUP BY Store, Category
)
如果应用于您问题中的示例数据 - 输出为
Row Store Category AvgSales PeakDay1 PeakDay2 PeakDay3
1 11 aaa 1.93 Friday Thursday Tuesday
2 22 bbb 1.47 Friday Tuesday Monday
输入: 得到了Table'A'
Store Category Sales Day
11 aaa 1.5 Sunday
11 aaa 0.5 Monday
11 aaa 2.5 Tuesday
11 aaa 2.0 Wednesday
11 aaa 3.0 Thursday
11 aaa 3.5 Friday
11 aaa 0.5 Saturday
22 bbb 0.5 Sunday
22 bbb 1.5 Monday
22 bbb 2.3 Tuesday
22 bbb 0.3 Wednesday
22 bbb 1.4 Thursday
22 bbb 4.1 Friday
22 bbb 0.2 Saturday
场景: 必须取按商店、类别分组的销售额的平均值并保存在单独的列中,以及选择前 3 个销售日并保存在单独的列中。结果,一行对应一个商店,类别如下示例输出所示。
预期输出:
Store Category AvgSales PeakDay1 PeakDay2 PeakDay3
11 aaa 1.92 Friday Thursday Tuesday
22 bbb 1.47 Friday Tuesday Monday
尝试查询:
SELECT
Store,
Category,
avg(Sales) as AvgSales,
ARRAY_AGG(Sales ORDER BY Sales DESC LIMIT 3) #but this line will not produce results in 3 separate columns
FROM A
GROUP BY Site, Category
提前致谢!
您可以使用数组聚合,但 row_number()
看起来很简单:
select store, category, avg(sales),
max(case when seqnum = 1 then day end) as peakday1,
max(case when seqnum = 2 then day end) as peakday2,
max(case when seqnum = 3 then day end) as peakday3
from (select store, category, day,
row_number() over (partition by store, category order by sales desc) as seqnum
from a
) a
group by store, category;
如果您想将其放入数组列中,您可以使用:
SELECT Store, Category, avg(Sales) as AvgSales,
ARRAY_AGG(day ORDER BY Sales DESC LIMIT 3)
FROM A
GROUP BY Site, Category
以下适用于 BigQuery 标准 SQL
#standardSQL
SELECT Store, Category, AvgSales,
Days[OFFSET(0)] PeakDay1,
Days[SAFE_OFFSET(1)] PeakDay2,
Days[SAFE_OFFSET(2)] PeakDay3
FROM (
SELECT Store, Category,
ROUND(AVG(Sales), 2) AvgSales,
ARRAY_AGG(Day ORDER BY Sales DESC LIMIT 3) Days
FROM `project.dataset.table` t
GROUP BY Store, Category
)
如果应用于您问题中的示例数据 - 输出为
Row Store Category AvgSales PeakDay1 PeakDay2 PeakDay3
1 11 aaa 1.93 Friday Thursday Tuesday
2 22 bbb 1.47 Friday Tuesday Monday