How to fill in NaN values based on previously found values in a pandas DataFrame?

Question

I have the following example DataFrame (the real DataFrame has more than 1000 rows):

import numpy as np
import pandas as pd

df = pd.DataFrame({'P1':['jaap','tim','piet','tim','tim'],
                   'P2':['piet','jaap','jaap','piet','jaap'],
                   'Count1':[2, 3, np.nan, np.nan, np.nan], 'Count2':[3, 1, np.nan, np.nan, np.nan]})
print(df)

     P1    P2  Count1  Count2
0  jaap  piet     2.0     3.0
1   tim  jaap     3.0     1.0
2  piet  jaap     NaN     NaN
3   tim  piet     NaN     NaN
4   tim  jaap     NaN     NaN

Now I want to find a neat way to fill in the NaN values according to the following rule:

The names found in P1 and P2 have to be the same.

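So the NaN values in row 2 must be the same as the values in row 0, only swapped, because the names are swapped as well. The NaN values in row 3 should stay NaN, because the combination of tim and piet is not found in any earlier row. The NaN values in row 4 must be the same as the values in row 1. So the desired result is: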

     P1    P2  Count1  Count2
0  jaap  piet     2.0     3.0
1   tim  jaap     3.0     1.0
2  piet  jaap     3.0     2.0
3   tim  piet     NaN     NaN
4   tim  jaap     3.0     1.0
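
The rule can be written out literally as a loop over the rows that remembers each pair of names the first time it appears with known counts. This is only a minimal sketch to pin down the intended behaviour, assuming the example `df` from above; `fill_from_prior_rows` and `seen` are hypothetical names, and a row-by-row loop may be slow on large frames.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'P1': ['jaap', 'tim', 'piet', 'tim', 'tim'],
                   'P2': ['piet', 'jaap', 'jaap', 'piet', 'jaap'],
                   'Count1': [2, 3, np.nan, np.nan, np.nan],
                   'Count2': [3, 1, np.nan, np.nan, np.nan]})


def fill_from_prior_rows(frame):
    """Hypothetical helper: fill NaN counts from the first earlier row with the same two names."""
    out = frame.copy()
    seen = {}  # (P1, P2) -> (Count1, Count2), stored for both name orders
    for idx, row in frame.iterrows():
        key = (row['P1'], row['P2'])
        if pd.isna(row['Count1']) or pd.isna(row['Count2']):
            # Only fill when this pair was seen in an earlier row.
            if key in seen:
                c1, c2 = seen[key]
                out.at[idx, 'Count1'] = c1
                out.at[idx, 'Count2'] = c2
        else:
            # Remember the pair as given and in reversed order with swapped counts.
            seen.setdefault(key, (row['Count1'], row['Count2']))
            seen.setdefault(key[::-1], (row['Count2'], row['Count1']))
    return out


filled = fill_from_prior_rows(df)
print(filled)
```

Because `seen` is only updated while walking down the frame, a row can never be filled from a row that comes after it.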

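This question is similar:

Applying the solution proposed in that post directly to the problem in this post gives a result that is slightly off: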

# ffill() replaces the deprecated fillna(method='ffill'); the result is kept so it can be printed
out = df.groupby(['P1','P2'])[['Count1','Count2']].apply(lambda x: x.ffill())
print(out)

             Count1  Count2
  P1   P2                  
0 jaap piet     2.0     3.0
1 tim  jaap     3.0     1.0
2 piet jaap     NaN     NaN
3 tim  piet     NaN     NaN
4 tim  jaap     3.0     1.0

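As you can see, the names jaap and piet from row 0 appear in swapped columns in row 2, so the grouping does not match them and row 2 is not filled.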
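
Grouping on `['P1','P2']` treats (jaap, piet) and (piet, jaap) as different keys. One possible workaround, sketched here under the assumption that the example `df` is used and not taken from the linked post, is to put the two names of every row into alphabetical order (swapping the counts along with them), forward-fill within each pair, and then swap the affected rows back. The names `swap`, `norm` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'P1': ['jaap', 'tim', 'piet', 'tim', 'tim'],
                   'P2': ['piet', 'jaap', 'jaap', 'piet', 'jaap'],
                   'Count1': [2, 3, np.nan, np.nan, np.nan],
                   'Count2': [3, 1, np.nan, np.nan, np.nan]})

# Rows whose two names are not in alphabetical order.
swap = df['P1'] > df['P2']

# Normalised copy: names sorted within each row, counts swapped to match.
norm = df.copy()
norm.loc[swap, ['P1', 'P2']] = df.loc[swap, ['P2', 'P1']].to_numpy()
norm.loc[swap, ['Count1', 'Count2']] = df.loc[swap, ['Count2', 'Count1']].to_numpy()

# Both name orders now share one group key, so a forward fill reaches swapped rows too.
norm[['Count1', 'Count2']] = norm.groupby(['P1', 'P2'])[['Count1', 'Count2']].ffill()

# Undo the normalisation for the rows that were swapped.
result = norm.copy()
result.loc[swap, ['P1', 'P2']] = norm.loc[swap, ['P2', 'P1']].to_numpy()
result.loc[swap, ['Count1', 'Count2']] = norm.loc[swap, ['Count2', 'Count1']].to_numpy()
print(result)
```

The `.to_numpy()` calls are there so that the assignment goes by position; without them pandas would align on column labels and the swap would have no effect.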

Answer 1

The idea is to first create a DataFrame with the swapped order using concat with rename, remove rows with missing values and possible duplicates, and then change the original values with DataFrame.update:

d = {'P2':'P1','P1':'P2','Count1':'Count2','Count2':'Count1'}
df1 = (pd.concat([df, df.rename(columns=d)])
         .dropna(subset=['Count1','Count2'])
         .drop_duplicates(['P1','P2']))

df = df.set_index(['P1','P2'])
df1 = df1.set_index(['P1','P2'])

df.update(df1)

df = df.reset_index()
print (df)

     P1    P2  Count1  Count2
0  jaap  piet     2.0     3.0
1   tim  jaap     3.0     1.0
2  piet  jaap     3.0     2.0
3   tim  piet     NaN     NaN
4   tim  jaap     3.0     1.0
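
To see what the lookup table built by concat and rename contains, it can be constructed on its own and joined back with a left merge. This is a hedged variant rather than the answer's method: `merge` plus `fillna` stands in for `set_index` and `DataFrame.update`, and the names `swap_cols`, `lookup`, `merged` and `filled` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'P1': ['jaap', 'tim', 'piet', 'tim', 'tim'],
                   'P2': ['piet', 'jaap', 'jaap', 'piet', 'jaap'],
                   'Count1': [2, 3, np.nan, np.nan, np.nan],
                   'Count2': [3, 1, np.nan, np.nan, np.nan]})

# Stack the frame on top of a copy with names and counts swapped,
# then keep one fully known row per ordered (P1, P2) pair.
swap_cols = {'P1': 'P2', 'P2': 'P1', 'Count1': 'Count2', 'Count2': 'Count1'}
lookup = (pd.concat([df, df.rename(columns=swap_cols)], ignore_index=True)
            .dropna(subset=['Count1', 'Count2'])
            .drop_duplicates(['P1', 'P2']))
print(lookup)

# Left merge keeps one output row per original row, in the original order.
merged = df[['P1', 'P2']].merge(lookup, on=['P1', 'P2'], how='left')
merged.index = df.index  # restore the original index so fillna aligns row by row

filled = df.copy()
filled[['Count1', 'Count2']] = df[['Count1', 'Count2']].fillna(merged[['Count1', 'Count2']])
print(filled)
```

Note that a lookup table built this way draws on every row with known counts, not only on earlier rows; for the sample data both readings give the same result.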

Tags: python, duplicates, pandas, fillna, pandas-groupby