Reshape Pandas DataFrame with TimeSeries in rows instead of columns

I have a DataFrame df that contains price data (open, close, high, low) for every day from January 2010 to December 2021:

Name ISIN Data 02.01.2010 05.01.2010 06.01.2010 ... 31.12.2021
Apple US9835635986 Price Open 12.45 13.45 12.48 ... 54.12
Apple US9835635986 Price Close 12.58 15.35 12.38 ... 54.43
Apple US9835635986 Price High 12.78 15.85 12.83 ... 54.91
Apple US9835635986 Price Low 12.18 13.35 12.21 ... 53.98
Microsoft US1223928384 Price Open 12.45 13.45 12.48 ... 43.56
... .. ... ... ... ... ... ...

I am trying to reshape the table into the following format:

Date Name ISIN Price Open Price Close Price High Price Low
02.01.2010 Apple US9835635986 12.45 12.58 12.78 12.18
05.01.2010 Apple US9835635986 13.45 15.35 15.85 13.35
... ... ... ... ... ... ... ...
02.01.2010 Microsoft US1223928384 12.45 13.67 13.74 12.35

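Simply transposing the DataFrame does not work. I also tried pivot, which raised an error saying that the operands could not be broadcast together with different shapes.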

dates = ['NAME','ISIN']
dates.append(df.columns.tolist()[3:]) # appends all columns names starting with 02.01.2010
df.pivot(index = dates, columns = 'Data', Values = 'Data')

How can I get the DataFrame into the desired format?

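Use DataFrame.melt to move the date columns into rows, convert the resulting Date column to datetimes, pivot on Data, and finally sort the MultiIndex: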

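import pandas as pd

# wide -> long: one row per (Name, ISIN, Data, Date)
df = (df.melt(['Name','ISIN','Data'], var_name='Date')
        # parse the former column labels (dd.mm.yyyy) as datetimes
        .assign(Date = lambda x: pd.to_datetime(x['Date'], format='%d.%m.%Y'))
        # long -> wide: one column per value of Data
        .pivot(index = ['Date','Name','ISIN'], columns = 'Data', values = 'value')
        # order rows by Name, ISIN, then Date
        .sort_index(level=[1,2,0])
        .reset_index()
        )
print(df)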
Data       Date       Name          ISIN  Price Close  Price High  Price Low  \
0    2010-01-02      Apple  US9835635986        12.58       12.78      12.18   
1    2010-01-05      Apple  US9835635986        15.35       15.85      13.35   
2    2010-01-06      Apple  US9835635986        12.38       12.83      12.21   
3    2021-12-31      Apple  US9835635986        54.43       54.91      53.98   
4    2010-01-02  Microsoft  US1223928384          NaN         NaN        NaN   
5    2010-01-05  Microsoft  US1223928384          NaN         NaN        NaN   
6    2010-01-06  Microsoft  US1223928384          NaN         NaN        NaN   
7    2021-12-31  Microsoft  US1223928384          NaN         NaN        NaN   

Data  Price Open  
0          12.45  
1          13.45  
2          12.48  
3          54.12  
4          12.45  
5          13.45  
6          12.48  
7          43.56  
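
For anyone who wants to try the melt and pivot steps without the full data set, here is a minimal self-contained sketch. The frame `raw` is a hypothetical sample built from the first rows of the question's table with only three of the date columns, and the names `raw`, `long_df` and `out` are illustrative. It assumes a pandas version in which `pivot` accepts a list for `index` (1.1 or later, as far as I know).

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the question's table
# (only three of the date columns are included).
raw = pd.DataFrame({
    'Name': ['Apple', 'Apple', 'Apple', 'Apple', 'Microsoft'],
    'ISIN': ['US9835635986'] * 4 + ['US1223928384'],
    'Data': ['Price Open', 'Price Close', 'Price High', 'Price Low', 'Price Open'],
    '02.01.2010': [12.45, 12.58, 12.78, 12.18, 12.45],
    '05.01.2010': [13.45, 15.35, 15.85, 13.35, 13.45],
    '06.01.2010': [12.48, 12.38, 12.83, 12.21, 12.48],
})

# Wide -> long: every date column becomes a row, its label goes into 'Date'.
long_df = raw.melt(id_vars=['Name', 'ISIN', 'Data'],
                   var_name='Date', value_name='value')
long_df['Date'] = pd.to_datetime(long_df['Date'], format='%d.%m.%Y')

# Long -> wide: one column per price type, one row per (Date, Name, ISIN).
out = (long_df.pivot(index=['Date', 'Name', 'ISIN'],
                     columns='Data', values='value')
              .sort_index(level=['Name', 'ISIN', 'Date'])
              .reset_index())
print(out)
```

The Microsoft rows contain NaN for every price type except Price Open simply because the sample, like the excerpt in the question, only has a Price Open row for Microsoft.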

Another idea is to convert the column names to datetimes first, then reshape with DataFrame.stack and Series.unstack:

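L = df.columns.tolist()
# keep the first three labels, parse the remaining (date) labels as datetimes
df = (df.set_axis(L[:3] + pd.to_datetime(L[3:], format='%d.%m.%Y').tolist(), axis=1)
         .rename_axis('Date', axis=1)   # name the columns axis Date
         .set_index(L[:3])              # Name, ISIN, Data become the index
         .stack()                       # dates move from the columns into the index
         .unstack(2)                    # Data (index level 2) moves into the columns
         .reorder_levels([2,0,1])       # index order: Date, Name, ISIN
         .reset_index())
print(df)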
Data       Date       Name          ISIN  Price Close  Price High  Price Low  \
0    2010-01-02      Apple  US9835635986        12.58       12.78      12.18   
1    2010-01-05      Apple  US9835635986        15.35       15.85      13.35   
2    2010-01-06      Apple  US9835635986        12.38       12.83      12.21   
3    2021-12-31      Apple  US9835635986        54.43       54.91      53.98   
4    2010-01-02  Microsoft  US1223928384          NaN         NaN        NaN   
5    2010-01-05  Microsoft  US1223928384          NaN         NaN        NaN   
6    2010-01-06  Microsoft  US1223928384          NaN         NaN        NaN   
7    2021-12-31  Microsoft  US1223928384          NaN         NaN        NaN   

Data  Price Open  
0          12.45  
1          13.45  
2          12.48  
3          54.12  
4          12.45  
5          13.45  
6          12.48  
7          43.56
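
Both results list the price columns alphabetically and hold Date as datetimes, while the layout requested in the question has the order Open, Close, High, Low and dates written as dd.mm.yyyy. A possible final step, sketched here on a hypothetical two-row stand-in called `result` (the names `wanted` and `formatted` are illustrative too), is to select the columns in the requested order and format the dates with `Series.dt.strftime`:

```python
import pandas as pd

# Hypothetical stand-in for the reshaped result shown above (two rows only).
result = pd.DataFrame({
    'Date': pd.to_datetime(['2010-01-02', '2010-01-05']),
    'Name': ['Apple', 'Apple'],
    'ISIN': ['US9835635986', 'US9835635986'],
    'Price Close': [12.58, 15.35],
    'Price High': [12.78, 15.85],
    'Price Low': [12.18, 13.35],
    'Price Open': [12.45, 13.45],
})

# Column order taken from the desired table in the question.
wanted = ['Date', 'Name', 'ISIN',
          'Price Open', 'Price Close', 'Price High', 'Price Low']

# Reorder the columns, then turn the datetimes back into dd.mm.yyyy strings.
formatted = (result[wanted]
             .assign(Date=lambda x: x['Date'].dt.strftime('%d.%m.%Y')))
print(formatted)
```

After this step Date holds plain strings, so sorting by it is no longer chronological. It is probably best applied only at the very end, for display or export.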