Transpose Dataframe problem: For each df.index and df.column combination create a row in new dataframe

I have a dataframe that looks like the one below.

The index of my dataframe is the "Dates" column.


Dates   3M INDIA LTD    ALKYL AMINES CHEMICALS LTD  AAVAS FINANCIERS LTD    ABB INDIA LTD   ADITYA BIRLA CAPITAL LTD
01-01-2020  1.738819    -0.054496   -0.600676   -0.535873   -1.837524   0.514004    -0.853701   -0.101420   2.192982
02-01-2020  -1.110939   3.668744    1.371749    1.346907    4.367026    2.930212    3.540222    4.080081    1.185880
03-01-2020  -0.862856   0.008598    2.543608    2.104247    0.795136    -0.290943   -0.726246   -1.021898   1.368421
06-01-2020  -2.135963   -1.952790   -2.201474   -2.643822   -4.166667   -2.250709   -1.815881   -2.933202   0.300000
07-01-2020  1.692019    8.431578    -1.116379   0.674114    0.097800    -3.166751   0.677638    -1.873767   0.837922

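I want to create a new dataframe that has one row for each combination of date and company name.

The resulting dataframe would have the columns: Dates, Company Name, Value

How can I achieve this transformation with pandas?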

You can use pd.melt from pandas to reshape your dataset. Assuming your dataframe is called df, use the following:

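import pandas as pd
# 'Dates   3M' is the name the date column apparently received when the pasted
# sample was parsed; with your own data pass the real column name, e.g. 'Dates'.
df_reshaped = pd.melt(df, id_vars='Dates   3M')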

This gives you:

df_reshaped
Out[13]: 
   Dates   3M                                           variable                                              value
0  2020-01-01                                          INDIA LTD                                              1.739
1  2020-02-01                                          INDIA LTD                                             -1.111
2  2020-03-01                                          INDIA LTD                                             -0.863
3  2020-06-01                                          INDIA LTD                                             -2.136
4  2020-07-01                                          INDIA LTD                                              1.692
5  2020-01-01                             ALKYL AMINES CHEMICALS                                             -0.655
6  2020-02-01                             ALKYL AMINES CHEMICALS                               3.668744    1.371749
7  2020-03-01                             ALKYL AMINES CHEMICALS                               0.008598    2.543608
8  2020-06-01                             ALKYL AMINES CHEMICALS                                             -4.154
9  2020-07-01                             ALKYL AMINES CHEMICALS                              8.431578    -1.116379
10 2020-01-01                                         LTD  AAVAS                                             -0.536
11 2020-02-01                                         LTD  AAVAS                                              1.347
12 2020-03-01                                         LTD  AAVAS                                              2.104
13 2020-06-01                                         LTD  AAVAS                                             -2.644
14 2020-07-01                                         LTD  AAVAS                                              0.674
15 2020-01-01  FINANCIERS LTD    ABB INDIA LTD   ADITYA BIRLA...  -1.837524   0.514004    -0.853701   -0.101420 ...
16 2020-02-01  FINANCIERS LTD    ABB INDIA LTD   ADITYA BIRLA...  4.367026    2.930212    3.540222    4.080081  ...
17 2020-03-01  FINANCIERS LTD    ABB INDIA LTD   ADITYA BIRLA...  0.795136    -0.290943   -0.726246   -1.021898 ...
18 2020-06-01  FINANCIERS LTD    ABB INDIA LTD   ADITYA BIRLA...  -4.166667   -2.250709   -1.815881   -2.933202 ...
19 2020-07-01  FINANCIERS LTD    ABB INDIA LTD   ADITYA BIRLA...  0.097800    -3.166751   0.677638    -1.873767 ...

Note that you can rename the newly created columns by adding the following arguments to the code above:

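df_reshaped = pd.melt(df, id_vars='Dates   3M', var_name="newname1", value_name="newname2")

Alternatively, stack the company columns into rows:

# Assumes 'Dates' is a regular column; if it is already the index, drop the set_index call.
df = df.set_index('Dates').stack().reset_index()
df.columns = ['Dates', 'Company Name', 'Value']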

df.sort_values(by=['Company Name', 'Dates'])
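
Since the question says "Dates" is already the index, here is a minimal sketch of the stack approach for that case. The two-company, two-date frame is a cut-down copy of the sample in the question, and long_df is just an illustrative name; this is one way to write it under those assumptions, not the only one.

```python
import pandas as pd

# Toy frame shaped like the question: 'Dates' is the index, one column per company.
df = pd.DataFrame(
    {
        "3M INDIA LTD": [1.738819, -1.110939],
        "ALKYL AMINES CHEMICALS LTD": [-0.054496, 3.668744],
    },
    index=pd.Index(["01-01-2020", "02-01-2020"], name="Dates"),
)

# stack() moves the column labels into an inner index level, producing a Series
# indexed by (Dates, Company Name). Naming the column axis first means
# reset_index() yields the wanted column names without a separate rename.
long_df = (
    df.rename_axis(columns="Company Name")
      .stack()
      .reset_index(name="Value")
)

# Group each company's rows together, ordered by date string within a company.
long_df = long_df.sort_values(["Company Name", "Dates"]).reset_index(drop=True)
print(long_df)
```

Note that sort_values returns a new frame, so the result has to be assigned (or displayed) to have any effect.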

Or:

pd.melt(df, 
        id_vars=['Dates'], 
        value_vars=[x for x in df.columns if x!='Dates'],
        var_name='Company Name',
        value_name='Values')
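
The melt call above expects "Dates" to be a column. A hedged sketch for the case described in the question, where "Dates" is the index, follows. The toy data is a cut-down copy of the sample, and the day-month-year date format is an assumption based on how the sample dates appear to run (1, 2, 3, 6, 7 January 2020).

```python
import pandas as pd

# Toy frame shaped like the question: 'Dates' is the index, one column per company.
df = pd.DataFrame(
    {
        "3M INDIA LTD": [1.738819, -1.110939],
        "ALKYL AMINES CHEMICALS LTD": [-0.054496, 3.668744],
    },
    index=pd.Index(["01-01-2020", "02-01-2020"], name="Dates"),
)

# melt() only works on columns, so turn the index back into a 'Dates' column first.
# With id_vars given and value_vars omitted, every other column is melted.
long_df = df.reset_index().melt(
    id_vars="Dates",
    var_name="Company Name",
    value_name="Value",
)

# Assumption: the strings are day-month-year. An explicit format avoids
# 02-01-2020 being read as 1 February instead of 2 January.
long_df["Dates"] = pd.to_datetime(long_df["Dates"], format="%d-%m-%Y")

long_df = long_df.sort_values(["Company Name", "Dates"]).reset_index(drop=True)
print(long_df)
```

Leaving out value_vars is equivalent to the list comprehension in the answer above, because melt uses all non-id columns by default.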