Setting columns in a pandas MultiIndex

I have this pandas df, imported from a CSV:

df
0  0 apple  banana  orange                        dates apple  banana  orange
1  1      1       1      1     Friday, January 01, 2021      1       1      1
2  2      1       1      1   Saturday, January 02, 2021      2       2      2
3  3      1       1      1     Sunday, January 03, 2021      3       3      3
4  4      1       1      1     Monday, January 04, 2021      4       4      4
5  5      1       1      1    Tuesday, January 05, 2021      5       5      5
6  6      1       1      1  Wednesday, January 06, 2021      6       6      6
7  7      1       1      1   Thursday, January 07, 2021      7       7      7
8  8      1       4      1     Friday, January 08, 2021      8       8      8
9  9      1       1      1   Saturday, January 09, 2021      9       9      9

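Is it possible, using a MultiIndex on the columns, to group everything to the left of dates under fresh and everything to the right of dates under spoil? For example, each group would contain the columns [apple, banana, orange]. I want this because the columns on both sides of dates have the same names, so grouping them avoids confusion later when I set dates as the index.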

This might help: df.columns.values[1] = "apple 1", df.columns.values[2] = "banana 1"

df.columns = pd.MultiIndex.from_arrays([['', '', 'fresh', 'fresh', 'fresh', '', 'spoil', 'spoil', 'spoil'],
                                        df.columns])

Output:

         fresh                                              spoil                  
   0   0 apple banana orange                        dates   apple   banana   orange
0  1   1     1      1      1     Friday, January 01, 2021       1        1        1
1  2   2     1      1      1   Saturday, January 02, 2021       2        2        2
2  3   3     1      1      1     Sunday, January 03, 2021       3        3        3
3  4   4     1      1      1     Monday, January 04, 2021       4        4        4
4  5   5     1      1      1    Tuesday, January 05, 2021       5        5        5
5  6   6     1      1      1  Wednesday, January 06, 2021       6        6        6
6  7   7     1      1      1   Thursday, January 07, 2021       7        7        7
7  8   8     1      4      1     Friday, January 08, 2021       8        8        8
8  9   9     1      1      1   Saturday, January 09, 2021       9        9        9

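Note: if you plan to call set_index('dates'), it is easier to do that before this operation, because the dates column then no longer needs a placeholder label in the top level.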
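A minimal sketch of that ordering, assuming a cut-down, made-up frame with the same duplicated fruit names (the two extra leading columns are left out, and the group size of 3 is hard-coded for this illustration):

```python
import pandas as pd

# Hypothetical miniature of the frame in the question; duplicated names are deliberate.
df = pd.DataFrame(
    [[1, 1, 1, "Friday, January 01, 2021", 1, 1, 1],
     [1, 1, 1, "Saturday, January 02, 2021", 2, 2, 2]],
    columns=["apple", "banana", "orange", "dates", "apple", "banana", "orange"],
)

# Move `dates` into the index first, so only the six fruit columns remain.
df = df.set_index("dates")

# Top level: three `fresh` labels, then three `spoil` labels.
# Second level: the existing column names.
df.columns = pd.MultiIndex.from_arrays(
    [["fresh"] * 3 + ["spoil"] * 3, df.columns]
)

fresh = df["fresh"]  # sub-frame with columns apple, banana, orange
spoil = df["spoil"]  # same names, but the right-hand group
```

With dates already in the index, every remaining column belongs to exactly one group, so no empty-string label is needed.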

You can try:

# Get the column number of column `dates`
dates_loc = df.columns.get_loc('dates')

arrays = [['fresh'] * dates_loc + [''] + ['spoil'] * (len(df.columns) - dates_loc -1), df.columns.tolist()]

df.columns = pd.MultiIndex.from_arrays(arrays)

Output:



  fresh                                                        spoil                  
      0   0 apple banana orange                        dates   apple   banana  orange
0     1   1     1      1      1     Friday, January 01, 2021       1        1        1
1     2   2     1      1      1   Saturday, January 02, 2021       2        2        2
2     3   3     1      1      1     Sunday, January 03, 2021       3        3        3
3     4   4     1      1      1     Monday, January 04, 2021       4        4        4
4     5   5     1      1      1    Tuesday, January 05, 2021       5        5        5
5     6   6     1      1      1  Wednesday, January 06, 2021       6        6        6
6     7   7     1      1      1   Thursday, January 07, 2021       7        7        7
7     8   8     1      4      1     Friday, January 08, 2021       8        8        8
8     9   9     1      1      1   Saturday, January 09, 2021       9        9        9
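
The same position-based idea can be wrapped in a small function. This is only a sketch: group_around is a hypothetical helper name, the sample frame is made up, and it assumes the pivot label (dates here) appears exactly once so that get_loc returns a single integer.

```python
import pandas as pd

def group_around(df, pivot, left, right):
    """Hypothetical helper: label columns before `pivot` as `left`, after it as `right`."""
    loc = df.columns.get_loc(pivot)  # integer position, assuming `pivot` is unique
    n_right = len(df.columns) - loc - 1
    top = [left] * loc + [""] + [right] * n_right
    out = df.copy()
    out.columns = pd.MultiIndex.from_arrays([top, df.columns.tolist()])
    return out

# Made-up two-row sample with the duplicated names from the question.
df = pd.DataFrame(
    [[1, 1, 1, "Friday, January 01, 2021", 1, 1, 1],
     [1, 4, 1, "Friday, January 08, 2021", 8, 8, 8]],
    columns=["apple", "banana", "orange", "dates", "apple", "banana", "orange"],
)
grouped = group_around(df, "dates", "fresh", "spoil")

# A (top, name) tuple now picks out exactly one column.
fresh_banana = grouped[("fresh", "banana")]
spoil_banana = grouped[("spoil", "banana")]
```

Once the top level is in place, grouped["fresh"] and grouped["spoil"] return the two sub-frames, and the tuple form avoids the ambiguity of the repeated names.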