Import from Excel into Multi Indexed dataframe

I'm trying to import a dataframe from Excel and keep its multi-index format.

The import itself works fine:

import pandas as pd

def import_cp(cp_sheet_name):
    xl = pd.ExcelFile('FileNameA.xlsx')
    df_first = xl.parse(cp_sheet_name)
    df_second = xl.parse(cp_sheet_name)
    # there are many more
    return df_first, df_second

# the function returns a tuple, so unpack it
df_first, df_second = import_cp("Sheet 1")

The Excel sheet is laid out like this:

|        |       Alpha       |       Bravo      |    Charlie     |
|Position|  Area   |  Gain   |   Area  |  Gain  |  Area  |  Gain |
|    1   |   0.5   |   1.1   |    0.5  |  1.1   |   1.7  |  1.6  |
|    2   |   0.6   |   1.0   |    0.6  |  1.0   |   1.5  |  1.4  |

The Alpha, Bravo, etc. header cells are merged across their Area and Gain columns.

When I import it, I get:

   |Unnamed: 0 Alpha| Unnamed: 2 Bravo| Unnamed: 4 Charlie|
0  |Position    Area|   Gain    Area  |  Gain    Area     |
1  |    1     0.5   |   1.17    0.5   |    1.13     0.5   |
2  |    2     0.5   |   1.17    0.5   |    1.13     0.5   |

I tried using header=0, but it made little difference, and fillna isn't ideal because I don't want Alpha Alpha Bravo Bravo Charlie Charlie.

Any help would be much appreciated.

I think you need to pass header=[0,1] to read_excel to read the two header rows into a column MultiIndex, index_col=0 to read the first column into the index, and sheet_name='sheet1' (spelled sheetname in pandas versions before 0.21) to read the sheet named sheet1. Then you can reset the column level names with rename_axis (new in pandas 0.18.0):

import pandas as pd

# sheet_name is spelled sheetname in pandas versions before 0.21
df = pd.read_excel('test.xlsx', header=[0,1], index_col=0, sheet_name='sheet1')
print(df)
         Alpha      Bravo      Charlie     
Position  Area Gain  Area Gain    Area Gain
1          0.5  1.1   0.5  1.1     1.7  1.6
2          0.6  1.0   0.6  1.0     1.5  1.4
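
On the worry about ending up with Alpha Alpha Bravo Bravo Charlie Charlie: a column MultiIndex does store the top-level label once per column, it just prints repeated labels only once. The following is a rough sketch of what header=[0,1] amounts to, done by hand. The `raw` frame is a hypothetical stand-in for what a read with header=None would give (it assumes merged cells come back as a value followed by NaN), so no Excel file is needed to run it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sheet read with header=None: each merged
# header cell is assumed to appear as its value followed by NaN.
raw = pd.DataFrame([
    [np.nan, "Alpha", np.nan, "Bravo", np.nan, "Charlie", np.nan],
    ["Position", "Area", "Gain", "Area", "Gain", "Area", "Gain"],
    [1, 0.5, 1.1, 0.5, 1.1, 1.7, 1.6],
    [2, 0.6, 1.0, 0.6, 1.0, 1.5, 1.4],
])

# Forward-fill the merged header row: Alpha Alpha Bravo Bravo Charlie Charlie
top = raw.iloc[0, 1:].ffill().tolist()
bottom = raw.iloc[1, 1:].tolist()

# Data rows: everything below the two header rows, minus the Position column
body = raw.iloc[2:, 1:].astype(float)
body.index = raw.iloc[2:, 0].astype(int).to_numpy()
body.columns = pd.MultiIndex.from_arrays([top, bottom])

print(body)
```

The printed frame shows each of Alpha, Bravo and Charlie once, even though the underlying index holds one entry per column.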

df = df.rename_axis((None,None), axis=1)
print(df)
  Alpha      Bravo      Charlie     
   Area Gain  Area Gain    Area Gain
1   0.5  1.1   0.5  1.1     1.7  1.6
2   0.6  1.0   0.6  1.0     1.5  1.4

print(df.index)
Int64Index([1, 2], dtype='int64')

print(df.columns)
MultiIndex(levels=[[u'Alpha', u'Bravo', u'Charlie'],
                   [u'Area', u'Gain']],
           labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]])
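
Once the columns are a two-level MultiIndex like the one printed above, selecting data by level might look like the sketch below. The frame is rebuilt in memory with the values from the question so that it runs without test.xlsx; the variable names are illustrative only.

```python
import pandas as pd

# Rebuild the frame from the question in memory (no Excel file needed)
columns = pd.MultiIndex.from_product(
    [["Alpha", "Bravo", "Charlie"], ["Area", "Gain"]]
)
df = pd.DataFrame(
    [[0.5, 1.1, 0.5, 1.1, 1.7, 1.6],
     [0.6, 1.0, 0.6, 1.0, 1.5, 1.4]],
    index=[1, 2],
    columns=columns,
)

alpha = df["Alpha"]                     # sub-frame with columns Area, Gain
alpha_gain = df[("Alpha", "Gain")]      # a single column as a Series
areas = df.xs("Area", axis=1, level=1)  # the Area column of every group

print(alpha)
print(alpha_gain)
print(areas)
```

Selecting with a tuple picks one exact column, while xs takes a cross-section of the second level across all top-level groups.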