Change DataFrame pivot_table headers to save it in an Excel file
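After pivoting a DataFrame, I end up with a nested header. I would like to flatten it so that every row repeats its items (as it would in Excel).
Can you help me?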
import pandas as pd

data = {
    "year": [2022, 2022, 2021, 2021, 2020],
    "client": ["A", "B", "B", "C", "C"],
    "product": ["P1", "P2", "P1", "P2", "P1"],
    "sales": [10, 20, 20, 22, 25],
}
df = pd.DataFrame(data)
df2 = df.pivot_table(index=["year", "client"], columns=["product"], values=["sales"], aggfunc="sum")
df2
             sales
product         P1    P2
year client
2020 C        25.0   NaN
2021 B        20.0   NaN
     C         NaN  22.0
2022 A        10.0   NaN
     B         NaN  20.0
I would like to get a flat header, with the year repeated on every row:
year client    P1    P2
2020      C  25.0   NaN
2021      B  20.0   NaN
2021      C   NaN  22.0
2022      A  10.0   NaN
2022      B   NaN  20.0
Thanks a lot,
Nicole
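You need to use droplevel(0), which removes the top level of the columns (sales). Setting columns.name to None then removes the "product" label. Finally, reset_index() turns year and client back into ordinary columns, so the year is repeated on every row. So, after creating df2, add these lines: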
>>> df2.columns = df2.columns.droplevel(0)
>>> df2.columns.name = None
>>> df2 = df2.reset_index()
>>> df2
   year client    P1    P2
0  2020      C  25.0   NaN
1  2021      B  20.0   NaN
2  2021      C   NaN  22.0
3  2022      A  10.0   NaN
4  2022      B   NaN  20.0
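If the pivot_table call itself can be changed, the extra "sales" level may not need to be created at all. The sketch below assumes the same data as in the question and passes values as a plain string instead of a one-element list, which should give single-level columns, so only the "product" label is left to clear. The variable name flat is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2022, 2022, 2021, 2021, 2020],
    "client": ["A", "B", "B", "C", "C"],
    "product": ["P1", "P2", "P1", "P2", "P1"],
    "sales": [10, 20, 20, 22, 25],
})

# values="sales" (a string, not ["sales"]) avoids the outer "sales" column level
flat = (
    df.pivot_table(index=["year", "client"], columns="product",
                   values="sales", aggfunc="sum")
      .rename_axis(None, axis=1)   # clear the "product" label on the columns
      .reset_index()               # turn year/client back into regular columns
)
print(flat)
```

This only helps when a single value column is being pivoted; with several value columns the outer level carries real information and dropping it would produce duplicate column names.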
You can try:
df = (df2.droplevel(0, axis=1)
         .rename_axis(None, axis=1)
         .reset_index())
print(df)
   year client    P1    P2
0  2020      C  25.0   NaN
1  2021      B  20.0   NaN
2  2021      C   NaN  22.0
3  2022      A  10.0   NaN
4  2022      B   NaN  20.0
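Since the goal is an Excel file, here is a rough sketch of how the chain above could be wrapped up and exported. The helper names flatten_pivot and export_flat and the file name used in the note below are made up for illustration; the export step assumes an Excel writer engine such as openpyxl is installed.

```python
import pandas as pd


def flatten_pivot(pivoted: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: flatten a pivot table that has a two-level column header."""
    return (
        pivoted.droplevel(0, axis=1)      # drop the outer "sales" level
               .rename_axis(None, axis=1)  # clear the "product" label
               .reset_index()              # year/client become ordinary columns
    )


def export_flat(pivoted: pd.DataFrame, path: str) -> None:
    """Hypothetical helper: write the flattened table to an Excel file."""
    # index=False keeps the 0..n-1 row labels out of the sheet
    flatten_pivot(pivoted).to_excel(path, index=False)


df = pd.DataFrame({
    "year": [2022, 2022, 2021, 2021, 2020],
    "client": ["A", "B", "B", "C", "C"],
    "product": ["P1", "P2", "P1", "P2", "P1"],
    "sales": [10, 20, 20, 22, 25],
})
df2 = df.pivot_table(index=["year", "client"], columns=["product"],
                     values=["sales"], aggfunc="sum")

flat = flatten_pivot(df2)
print(flat)
```

Calling export_flat(df2, "sales_by_product.xlsx") should then write one header row (year, client, P1, P2) followed by the five data rows, with the empty cells left blank.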