Set single value on pandas MultiIndex DataFrame

For a single-index DataFrame, we can use loc to get, set, and change values:

>>> df=pd.DataFrame()
>>> df.loc['A',1]=1
>>> df
     1
A  1.0
>>> df.loc['A',1]=2
>>> df.loc['A',1]
2.0

For a MultiIndex DataFrame, however, loc can get and change existing values:

>>> df=pd.DataFrame([['A','B',1]])
>>> df=df.set_index([0,1])
>>> df.loc[('A','B'),2]
1
>>> df.loc[('A','B'),2]=3
>>> df.loc[('A','B'),2]
3

But setting a new value on an empty DataFrame seems to fail:

>>> df=pd.DataFrame()
>>> df.loc[('A','B'),2]=3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\indexing.py", line 688, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\indexing.py", line 630, in _get_setitem_indexer
    return self._convert_tuple(key, is_setter=True)
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\indexing.py", line 754, in _convert_tuple
    idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\indexing.py", line 1212, in _convert_to_indexer
    return self._get_listlike_indexer(key, axis, raise_missing=True)[1]
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\indexing.py", line 1266, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\indexing.py", line 1308, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['A', 'B'], dtype='object')] are in the [index]"

Why is this, and what is the "right" way to set a single value in a MultiIndex DataFrame with loc?

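This fails because the index does not have the right number of levels. pd.DataFrame() creates a frame with a flat, single-level index, so loc reads the tuple ('A','B') as a list of two separate row labels rather than as one two-level key. That is what the KeyError reports: neither 'A' nor 'B' is in the index.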
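The difference in level count can be checked directly. The following is a minimal sketch; the variable names flat_levels and multi_levels are only illustrative:

```python
import pandas as pd

# A default empty DataFrame has a flat index with a single level.
flat_levels = pd.DataFrame().index.nlevels

# An empty MultiIndex built from two empty arrays has two levels.
multi_levels = pd.MultiIndex.from_arrays([[], []]).nlevels

print(flat_levels, multi_levels)
```

A two-element tuple key can only be matched as a single row label when the index has two levels.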

You need to initialize the empty DataFrame with the correct number of levels, for example using pandas.MultiIndex.from_arrays:

import pandas as pd

# Empty index with two levels, so ('A','B') is treated as a single key
idx = pd.MultiIndex.from_arrays([[],[]])
df = pd.DataFrame(index=idx)

df.loc[('A','B'), 2] = 3

Output:

       2
A B  3.0
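If the row keys are known up front, one possible alternative is to create the index from those keys with pd.MultiIndex.from_tuples, so the row already exists when loc assigns to it. This is a sketch and not part of the approach above. The level names outer and inner, the pre-created float column, and the variable value are illustrative assumptions.

```python
import pandas as pd

# Illustrative level names; any names (or none) would work.
idx = pd.MultiIndex.from_tuples([("A", "B")], names=["outer", "inner"])

# Pre-create column 2 as float, so the frame starts out filled with NaN.
df = pd.DataFrame(index=idx, columns=[2], dtype="float64")

# Row ('A', 'B') and column 2 already exist, so this is a plain update.
df.loc[("A", "B"), 2] = 3

value = df.loc[("A", "B"), 2]
print(value)
```

Because both the row and the column already exist, the assignment only updates a cell and does not need to add a new row to the frame.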