How to custom pivot a table in pandas

Code to build the DataFrame df:

import pandas as pd
from io import StringIO
df = pd.read_csv(StringIO('qi_variable,qi_variable_type,value\nDEBIT_STUCCO_FT303A_KG_MIN,Time,2021-04-01 00:00:10\nDEBIT_STUCCO_FT303A_KG_MIN,Time,2021-04-01 00:00:30\nDEBIT_STUCCO_FT303A_KG_MIN,ValueY,"338,25"\nDEBIT_STUCCO_FT303A_KG_MIN,ValueY,"337,799987792969"\nDEBIT_EAU_MOUSSE_KG_MIN,Time,2021-04-01 00:00:10\nDEBIT_EAU_MOUSSE_KG_MIN,Time,2021-04-01 00:00:30\nDEBIT_EAU_MOUSSE_KG_MIN,ValueY,"55,1691627502441"\nDEBIT_EAU_MOUSSE_KG_MIN,ValueY,"55,3335952758789"\nCORRECTION_MOUSSE,Time,2021-04-01 00:04:12\nCORRECTION_MOUSSE,Time,2021-04-01 00:04:35\nCORRECTION_MOUSSE,ValueY,"1,04863631725311"\nCORRECTION_MOUSSE,ValueY,"1,04946064949036"\n'))

Current DataFrame df:

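Expected result: I am trying to expand my table into the result shown below. I tried pd.pivot and pd.pivot_table but could not get either to work. Ignore the blank rows; I left them in for readability.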
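
A likely reason a plain pd.pivot fails here is that every (qi_variable, qi_variable_type) pair occurs more than once, so the reshaped table would need several values in a single cell. A minimal sketch of that failure, using a small hypothetical frame (the variable name "A" and the shortened values are made up) with the same three columns:

```python
import pandas as pd

# Hypothetical miniature of the question's data: two readings for variable "A".
df = pd.DataFrame({
    "qi_variable": ["A", "A", "A", "A"],
    "qi_variable_type": ["Time", "Time", "ValueY", "ValueY"],
    "value": ["2021-04-01 00:00:10", "2021-04-01 00:00:30", "338,25", "337,80"],
})

# Each (qi_variable, qi_variable_type) pair appears twice.
has_duplicates = df.duplicated(["qi_variable", "qi_variable_type"]).any()

try:
    df.pivot(index="qi_variable", columns="qi_variable_type", values="value")
    pivot_error = None
except ValueError as exc:
    # pandas refuses to reshape when the index/column pairs are not unique.
    pivot_error = str(exc)

print(has_duplicates)
print(pivot_error)
```

So the rows need either an aggregation that keeps all values or an extra key that makes each row unique.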

One way:

df = (
    df.pivot_table(index='qi_variable', columns='qi_variable_type', values='value', aggfunc=list)
      .apply(pd.Series.explode)
)
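
To see what this one-liner does, it can help to look at the intermediate table. In the sketch below (a hypothetical miniature frame, not the question's data), aggfunc=list first collects every value belonging to a cell into a list, and pd.Series.explode then spreads each list over its own rows:

```python
import pandas as pd

# Hypothetical miniature frame: "A" has two readings, "B" has one.
df = pd.DataFrame({
    "qi_variable": ["A", "A", "A", "A", "B", "B"],
    "qi_variable_type": ["Time", "Time", "ValueY", "ValueY", "Time", "ValueY"],
    "value": ["t1", "t2", "1,5", "2,5", "t3", "3,5"],
})

# Step 1: one row per qi_variable, each cell holding a list of its values.
wide = df.pivot_table(
    index="qi_variable",
    columns="qi_variable_type",
    values="value",
    aggfunc=list,
)

# Step 2: explode every column so each list element gets its own row.
exploded = wide.apply(pd.Series.explode)

print(wide)
print(exploded)
```

As far as I can tell, this relies on each variable having the same number of Time and ValueY entries, since the exploded columns must line up row for row.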

Alternative:

df = df.pivot_table(
    index=['qi_variable', df.groupby(['qi_variable', 'qi_variable_type']).cumcount()],
    columns='qi_variable_type',
    values='value',
    aggfunc=''.join,
).reset_index(-1, drop=True)

Another alternative:

df = (
    df.set_index(['qi_variable',
                  df.groupby(['qi_variable', 'qi_variable_type']).cumcount(),
                  'qi_variable_type'])
      .unstack(-1)
      .reset_index(-1, drop=True)
)
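
Both alternatives hinge on groupby(...).cumcount(), which numbers the repeats within each (qi_variable, qi_variable_type) group so that every row gets a unique key. A minimal sketch on a hypothetical miniature frame; the column name occurrence is introduced here purely for illustration:

```python
import pandas as pd

# Hypothetical miniature frame: "A" has two readings, "B" has one.
df = pd.DataFrame({
    "qi_variable": ["A", "A", "A", "A", "B", "B"],
    "qi_variable_type": ["Time", "Time", "ValueY", "ValueY", "Time", "ValueY"],
    "value": ["t1", "t2", "1,5", "2,5", "t3", "3,5"],
})

# Number the repeats inside each (qi_variable, qi_variable_type) group.
occurrence = df.groupby(["qi_variable", "qi_variable_type"]).cumcount()
print(occurrence.tolist())  # [0, 1, 0, 1, 0, 0]

# With the counter as an extra key, every (row key, column key) pair is
# unique, so a plain unstack works without any aggregation.
keyed = df.assign(occurrence=occurrence)
wide = (
    keyed.set_index(["qi_variable", "occurrence", "qi_variable_type"])["value"]
         .unstack("qi_variable_type")
)
print(wide)
```

Dropping the occurrence level afterwards, as the answer does with reset_index(-1, drop=True), leaves one row per reading indexed by qi_variable.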

Output:

qi_variable_type                           Time            ValueY
qi_variable                                                      
CORRECTION_MOUSSE           2021-04-01 00:04:12  1,04863631725311
CORRECTION_MOUSSE           2021-04-01 00:04:35  1,04946064949036
DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:10  55,1691627502441
DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:30  55,3335952758789
DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:10            338,25
DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:30  337,799987792969
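
Another approach is to split the rows by qi_variable_type and join the two halves by position (this works on the original df from the question, before any reassignment):

df_1 = df[df.qi_variable_type=='Time'].rename(columns={'value': 'Time'})[['qi_variable', 'Time']].reset_index()
df_2 = df[df.qi_variable_type=='ValueY'].rename(columns={'value': 'ValueY'})[['qi_variable', 'ValueY']].reset_index()
df_1.join(df_2, lsuffix='_2')[['qi_variable', 'Time', 'ValueY']]

Output:
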
    qi_variable                 Time                  ValueY
0   DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:10   338,25
1   DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:30   337,799987792969
2   DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:10   55,1691627502441
3   DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:30   55,3335952758789
4   CORRECTION_MOUSSE           2021-04-01 00:04:12   1,04863631725311
5   CORRECTION_MOUSSE           2021-04-01 00:04:35   1,04946064949036
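
Note that this join pairs rows purely by position, so it appears to assume that the Time rows and the ValueY rows list the variables in the same order and in equal numbers. A small sketch of that check on a hypothetical miniature frame (time_part, value_part and same_order are illustrative names), using reset_index(drop=True) so that no helper index column is carried along:

```python
import pandas as pd

# Hypothetical miniature frame: "A" has two readings, "B" has one.
df = pd.DataFrame({
    "qi_variable": ["A", "A", "A", "A", "B", "B"],
    "qi_variable_type": ["Time", "Time", "ValueY", "ValueY", "Time", "ValueY"],
    "value": ["t1", "t2", "1,5", "2,5", "t3", "3,5"],
})

# Split the long frame into its two halves and renumber each from 0.
time_part = (
    df.loc[df["qi_variable_type"] == "Time", ["qi_variable", "value"]]
      .rename(columns={"value": "Time"})
      .reset_index(drop=True)
)
value_part = (
    df.loc[df["qi_variable_type"] == "ValueY", ["qi_variable", "value"]]
      .rename(columns={"value": "ValueY"})
      .reset_index(drop=True)
)

# Positional pairing is only meaningful if both halves name the same
# variable at every position.
same_order = time_part["qi_variable"].equals(value_part["qi_variable"])

# Join on the fresh 0..n-1 index, bringing over only the ValueY column.
result = time_part.join(value_part["ValueY"])
print(same_order)
print(result)
```

If same_order came out False, the cumcount-based approaches would be the safer choice, since they pair values by key instead of by position.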