Fill NaN and empty values with another column

Question

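I want to fill NaN and empty values with the values from another column. In this case, column barcode_y should be filled from column barcode_x.

Here is my data: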

    id      barcode_x     barcode_y A   B
0   7068    38927887      38927895  0   12
1   7068    38927895      38927895  0   1
2   7068    39111141      38927895  0   4
3   7116    73094237                18  309
4   7154    37645215      37645215  0   9
5   7342    86972909           NaN  7   25

This is what I need:

    id      barcode_x     barcode_y A   B
0   7068    38927887      38927895  0   12
1   7068    38927895      38927895  0   1
2   7068    39111141      38927895  0   4
3   7116    73094237      73094237  18  309
4   7154    37645215      37645215  0   9
5   7342    86972909      86972909  7   25

How can I do this?

Answer 1

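I suggest using a boolean mask to do what you want:

mask = df['barcode_y'].isna()
df.loc[mask, 'barcode_y'] = df.loc[mask, 'barcode_x']

This works in general and does not depend on how the columns are ordered, e.g. whether barcode_y comes before or after barcode_x.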
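A minimal, self-contained sketch of this mask approach, written as a single .loc assignment. The sample frame is rebuilt from the question, and it is an assumption that the blank cell in row 3 is an empty string rather than NaN.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (columns A and B omitted).
# Assumption: row 3 holds an empty string and row 5 holds a real NaN.
df = pd.DataFrame({
    "id": [7068, 7068, 7068, 7116, 7154, 7342],
    "barcode_x": [38927887, 38927895, 39111141, 73094237, 37645215, 86972909],
    "barcode_y": [38927895, 38927895, 38927895, "", 37645215, np.nan],
})

# Boolean mask of the rows whose barcode_y is NaN.
missing = df["barcode_y"].isna()

# One .loc assignment writes into df itself, avoiding chained indexing.
df.loc[missing, "barcode_y"] = df.loc[missing, "barcode_x"]

print(df)
```

Because isna() only detects NaN, the empty string in row 3 is left untouched here; it needs a separate check such as the ones shown in the other answers.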

Answer 2

I would use combine_first in this case, especially if barcode_y is not dtype object:

df.barcode_y.combine_first(df.barcode_x)

If barcode_y is dtype object, I think you can take an extra step like this:

>>> df
   barcode_x barcode_y
0          1         0
1        123      None
2        543
>>> df.barcode_y = df.barcode_y.combine_first(df.barcode_x)
>>> df
   barcode_x barcode_y
0          1         0
1        123       123
2        543
>>> df.loc[df.barcode_y.str.strip()=='', 'barcode_y'] = df.loc[df.barcode_y.str.strip()=='', 'barcode_x']
>>> df
   barcode_x  barcode_y
0          1          0
1        123        123
2        543        543

Answer 3

You can convert the empty values to NaN and then use .fillna().

df['barcode_y'].replace(r'^\s*$', np.nan, regex=True).fillna(df['barcode_x']).astype(int)

Output:

0    38927895
1    38927895
2    38927895
3    73094237
4    37645215
5    86972909
Name: barcode_y, dtype: int32
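
A runnable sketch of the replace-then-fillna idea. The data is hypothetical, and the anchored pattern `^\s*$` is an assumption chosen here so that only empty or whitespace-only cells become NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data: barcode_y mixes a whitespace-only string, a number,
# a NaN and an empty string.
df = pd.DataFrame({
    "barcode_x": [73094237, 37645215, 86972909, 38927887],
    "barcode_y": ["  ", 37645215, np.nan, ""],
})

# Step 1: turn empty / whitespace-only strings into NaN.
cleaned = df["barcode_y"].replace(r"^\s*$", np.nan, regex=True)

# Step 2: fill every NaN from barcode_x (aligned on the index),
# then cast the result to integers.
filled = cleaned.fillna(df["barcode_x"]).astype(int)

print(filled)
```

The result is a new Series; assign it back with df['barcode_y'] = filled if the frame itself should change.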

Answer 4

Try this:

def fillValues(x):
    # Use barcode_x when barcode_y is NaN or a blank string
    y = x['barcode_y']
    if pd.isna(y) or str(y).strip() == '':
        return x['barcode_x']
    return y

df["barcode_y"] = df.apply(fillValues, axis=1)
print(df)
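
Row-wise apply can be slow on large frames. As a possible alternative, here is a vectorized sketch of the same rule using np.where; this is a different technique from the apply call in this answer, and the sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one valid value, one empty string, one NaN.
df = pd.DataFrame({
    "barcode_x": [38927887, 73094237, 86972909],
    "barcode_y": [38927895, "", np.nan],
})

y = df["barcode_y"]

# True where barcode_y is NaN or a blank string. NaN is checked separately
# because converting it to str does not produce an empty string.
needs_fill = y.isna() | y.astype(str).str.strip().eq("")

# Take barcode_x where a fill is needed, otherwise keep barcode_y.
df["barcode_y"] = np.where(needs_fill, df["barcode_x"], y)

print(df)
```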

Answer 5

Use mask:

x, y = df['barcode_x'], df['barcode_y']
y.mask(y.eq('') | y.isna(), x)

0    38927895
1    38927895
2    38927895
3    73094237
4    37645215
5    86972909
Name: barcode_y, dtype: object
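
The result above is still dtype object and is not written back to the frame. A small sketch of how that might be done, assuming every filled value is a whole number; the sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one valid value, one empty string, one NaN.
df = pd.DataFrame({
    "barcode_x": [38927887, 73094237, 86972909],
    "barcode_y": [38927895, "", np.nan],
})

x, y = df["barcode_x"], df["barcode_y"]

# Replace entries that are empty strings or NaN with the value from x,
# then store the result back and cast it to a 64-bit integer column.
df["barcode_y"] = y.mask(y.eq("") | y.isna(), x).astype("int64")

print(df.dtypes)
```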
