How to pivot multiple columns in Python pandas or BigQuery (preferably BigQuery)
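Original data:
Desired shape of the transformed data:
I tried the melt function in Python pandas, but I could only pivot on one column. I'm sure I must be missing something.
The following is for BigQuery Standard SQL: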
execute immediate (
  with types as (
    -- Names of every column except the keys, read from the JSON keys of one row.
    -- values_list holds them as column references, e.g. Col1,Col2
    -- columns_list holds them as string literals, e.g. "Col1","Col2"
    select
      array_to_string(types, ',') values_list,
      regexp_replace(array_to_string(types, ','), r'([^,]+)', r'"\1"') columns_list
    from (
      select regexp_extract_all(to_json_string(t), r'"([^""]+)":') types
      from (
        select * except(Country, Branch, Category)
        from `project.dataset.your_table` limit 1
      ) t
    )
  ), categories as (
    -- Each distinct Category becomes one output column.
    select distinct Category
    from `project.dataset.your_table`
  )
  -- Build the pivot query as a string; execute immediate then runs it.
  select '''
  select Country, Branch, Output, ''' ||
    (select string_agg('''
    max(if(Category = "''' || Category || '''", val, null)) as ''' || Category )
    from categories)
  || '''
  from (
    select Country, Branch, Category,
      type[offset(offset)] Output, val
    from `project.dataset.your_table` t,
    unnest([''' || values_list || ''']) val with offset,
    unnest([struct([''' || columns_list || '''] as type)])
  )
  group by Country, Branch, Output
  '''
  from types
);
If applied to the sample data in your question, the output is:
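Separately, for the pandas side of the question: melt on its own only unpivots, so a second step is needed to spread the categories back into columns. Below is a minimal sketch of the same reshaping as the query above, assuming the table has the key columns Country, Branch and Category. The value columns Sales and Units, the helper name pivot_multiple_columns, and all data values are hypothetical, since the sample data is not reproduced here.

```python
import pandas as pd

# Hypothetical sample data: the value columns "Sales" and "Units" and all
# values are invented for illustration. Only Country, Branch and Category
# come from the query above.
df = pd.DataFrame({
    "Country": ["US", "US", "UK", "UK"],
    "Branch": ["B1", "B1", "B2", "B2"],
    "Category": ["Food", "Toys", "Food", "Toys"],
    "Sales": [100, 200, 300, 400],
    "Units": [1, 2, 3, 4],
})


def pivot_multiple_columns(frame, id_cols=("Country", "Branch"), category_col="Category"):
    """Hypothetical helper: one row per (id columns, original value column)."""
    # Step 1: unpivot every value column into (Output, val) pairs,
    # where Output holds the original column name.
    long = frame.melt(
        id_vars=[*id_cols, category_col],
        var_name="Output",
        value_name="val",
    )
    # Step 2: spread the categories back out into columns.
    # aggfunc="max" mirrors the max(if(...)) aggregation in the SQL.
    wide = long.pivot_table(
        index=[*id_cols, "Output"],
        columns=category_col,
        values="val",
        aggfunc="max",
    )
    wide.columns.name = None  # drop the leftover "Category" axis label
    return wide.reset_index()


result = pivot_multiple_columns(df)
print(result)
```

The result has the columns Country, Branch, Output, plus one column per distinct Category, which is the same layout the generated BigQuery query returns.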