How to pivot the table (BigQuery) in this case without listing all categories manually?
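I have a table whose schema is roughly as follows:
user_id | segment_id | day
segment_id has a fairly wide range of values: from 1 to 70.
day takes values 0-2.
Ideally, I would like to transform this table into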
user_id | segment_1_day_1_count | segment_2_day_1_count | ... segment_70_day_1_count | ... | segment_70_day_3_count | ... segment_1_count | segment_2_count | segment_3_count | day_1_count | day_2_count | day_3_count | total_count
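Roughly speaking, for each user_id I want to get counts for all segment combinations:
- by segment
- by day
- by segment and day
- total
This looks a lot like pivoting, but I am not sure whether there is a way to pivot on multiple columns.
My current attempt is as follows: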
SELECT
  user_id,
  segment_id,
  day,
  COUNT(*) OVER (PARTITION BY user_id, segment_id),
  COUNT(*) OVER (PARTITION BY user_id, day),
  COUNT(*) OVER (PARTITION BY user_id, segment_id, day),
  COUNT(*) OVER (PARTITION BY user_id)
FROM some_table
This is the data I need, but not in the format I want.
Consider the approach below:
-- Build the quoted column list from the distinct values, then run the generated query.
execute immediate (
  select '''select * from your_table
pivot (count(*) for 'segment_' || segment_id || '_day_' || day || '_count' in (''' ||
    string_agg('"segment_' || segment_id || '_day_' || day || '_count"', ',' order by day, segment_id) || '))'
  from (select distinct segment_id from your_table),
    (select distinct day from your_table)
);
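Applied to dummy data similar to what you describe, the output has one row per user_id and one count column per segment/day combination.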
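To make the generated statement easier to inspect, here is a minimal Python sketch that assembles the same text the dynamic SQL would pass to execute immediate. The helper names `pivot_in_list` and `build_pivot_sql` are hypothetical, and the sketch assumes segment_id and day are integers, so the ordering is numeric.

```python
from itertools import product


def pivot_in_list(segment_ids, days):
    """Mirror the string_agg call: quoted labels ordered by day, then segment_id."""
    # De-duplicate like SELECT DISTINCT, cross join, and sort as (day, segment_id).
    pairs = sorted(product(set(days), set(segment_ids)))
    return ",".join(f'"segment_{seg}_day_{day}_count"' for day, seg in pairs)


def build_pivot_sql(segment_ids, days, table="your_table"):
    """Assemble the statement text that the dynamic SQL would execute."""
    return (
        f"select * from {table}\n"
        "pivot (count(*) for 'segment_' || segment_id || '_day_' || day || '_count' in ("
        f"{pivot_in_list(segment_ids, days)}))"
    )


if __name__ == "__main__":
    # Two segments and two days are enough to see the shape of the IN list.
    print(build_pivot_sql([1, 2], [0, 1]))
```

Printing the statement for a handful of values is a cheap way to check the column labels before running the real query against all 70 segments.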
Is it possible to also add counts for segment, day and total separately?
Sure. It is just a [relatively] simple extension of the solution I originally proposed:
-- Same idea, extended: one pivot per grouping, all joined on user_id.
execute immediate (select '''
select * from (
  select *
  from your_table
  pivot (count(*) for 'segment_' || segment_id || '_day_' || day || '_count' in (''' || list1 || '''))
)
join (
  select *
  from (select * except(day) from your_table)
  pivot (count(*) for 'segment_' || segment_id || '_count' in (''' || list2 || '''))
)
using(user_id)
join (
  select *
  from (select * except(segment_id) from your_table)
  pivot (count(*) for 'day_' || day || '_count' in (''' || list3 || '''))
)
using(user_id)
join (
  select user_id, count(*) total
  from your_table
  group by user_id
)
using(user_id)
'''
from (
  -- list1: labels for every segment/day combination
  select string_agg('"segment_' || segment_id || '_day_' || day || '_count"', ',' order by day, segment_id) list1
  from (select distinct segment_id from your_table), (select distinct day from your_table)
),(
  -- list2: labels per segment
  select string_agg('"segment_' || segment_id || '_count"', ',' order by segment_id) list2
  from (select distinct segment_id from your_table)
),(
  -- list3: labels per day
  select string_agg('"day_' || day || '_count"', ',' order by day) list3
  from (select distinct day from your_table)
)
)
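Applied to dummy data similar to what you describe, the output additionally contains the per-segment, per-day, and total count columns for each user_id.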
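As a cross-check on what the joined pivots should contain, the four kinds of counts can be computed for a toy dataset in plain Python. This is only an illustrative sketch: `wide_counts` is a hypothetical helper, the rows are made up, and unlike the SQL result it simply omits combinations that never occur instead of reporting 0.

```python
from collections import Counter


def wide_counts(rows):
    """rows: iterable of (user_id, segment_id, day). Returns {user_id: {column: count}}."""
    out = {}
    for user_id, segment_id, day in rows:
        cols = out.setdefault(user_id, Counter())
        cols[f"segment_{segment_id}_day_{day}_count"] += 1  # by segment and day
        cols[f"segment_{segment_id}_count"] += 1  # by segment
        cols[f"day_{day}_count"] += 1  # by day
        cols["total"] += 1  # all rows for this user
    return {user: dict(cols) for user, cols in out.items()}


if __name__ == "__main__":
    # Made-up sample rows, not data from the question.
    sample = [(1, 1, 0), (1, 1, 0), (1, 2, 1), (2, 1, 1)]
    for user, cols in wide_counts(sample).items():
        print(user, cols)
```

Comparing a few users from this sketch against the query result is a quick sanity check that the three pivots and the total were joined correctly.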