Forward Fill NA

I have a table that contains NaN values.

import pandas as pd

data = {'name': ['may','may', 'mary', 'james','james','john','paul', 'paul', 'joseph'],
       'email' : ['may@gmail.com','NaN','Mary@gmail.com','James@gmail.com','NaN','NaN','Paul@gmail.com','NaN','NaN']}

df = pd.DataFrame(data)

Before

Expected output

However, when I use ffill, the result is incorrect. Is there a way to apply ffill conditionally?

Try:

df.groupby('name')['email'].ffill()
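To illustrate why the grouping matters, here is a minimal sketch comparing a plain forward fill with a grouped one. It assumes the gaps are real missing values (np.nan) rather than the 'NaN' strings in the question's data; the variable names plain and grouped are only for illustration.

```python
import numpy as np
import pandas as pd

# Same names as the question, but with real missing values (an assumption;
# the question's data uses the string 'NaN' instead).
df = pd.DataFrame({
    'name': ['may', 'may', 'mary', 'james', 'james', 'john', 'paul', 'paul', 'joseph'],
    'email': ['may@gmail.com', np.nan, 'Mary@gmail.com', 'James@gmail.com',
              np.nan, np.nan, 'Paul@gmail.com', np.nan, np.nan],
})

# Plain ffill ignores the name column, so john (row 5) inherits james's address.
plain = df['email'].ffill()

# Grouped ffill only copies a value forward within the same name.
grouped = df.groupby('name')['email'].ffill()

print(plain[5])             # James@gmail.com
print(pd.isna(grouped[5]))  # True
```

With the grouped version, john and joseph stay empty because no earlier row with the same name has an email.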

In your example, the NaN values are strings with the value "NaN". You therefore have to convert them to actual null values before filling.

import pandas as pd
import numpy as np

data = {'name': ['may','may', 'mary', 'james','james','john','paul', 'paul', 'joseph'],
       'email' : ['may@gmail.com','NaN','Mary@gmail.com','James@gmail.com','NaN','NaN','Paul@gmail.com','NaN','NaN']}

df = pd.DataFrame(data)

# Turn the 'NaN' strings into real missing values, then fill within each name.
df['email'] = df['email'].replace({'NaN': np.nan})
df['email'] = df.groupby('name')['email'].ffill()
df
     name            email
0     may    may@gmail.com
1     may    may@gmail.com
2    mary   Mary@gmail.com
3   james  James@gmail.com
4   james  James@gmail.com
5    john              NaN
6    paul   Paul@gmail.com
7    paul   Paul@gmail.com
8  joseph              NaN
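The point about string placeholders can be checked directly. The following is a small sketch on a three-row Series (the names s and cleaned are illustrative, not from the question) showing that ffill has nothing to fill until the 'NaN' strings are replaced:

```python
import numpy as np
import pandas as pd

s = pd.Series(['may@gmail.com', 'NaN', 'Mary@gmail.com'])

# The string 'NaN' is ordinary text, so pandas does not treat it as missing.
print(s.isna().sum())            # 0
print(s.ffill().tolist())        # ['may@gmail.com', 'NaN', 'Mary@gmail.com']

# After replacing the placeholder with np.nan, the gap is detected and filled.
cleaned = s.replace('NaN', np.nan)
print(cleaned.isna().sum())      # 1
print(cleaned.ffill().tolist())  # ['may@gmail.com', 'may@gmail.com', 'Mary@gmail.com']
```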

Another way could be:

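import pandas as pd
import numpy as np

# df is the DataFrame built from the question's data.
df = df.replace("NaN", np.nan)
# Forward fill within each name, then write the result back into df.
df.update(df.groupby('name')['email'].ffill().fillna("NaN"))
df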

     name            email
0     may    may@gmail.com
1     may    may@gmail.com
2    mary   Mary@gmail.com
3   james  James@gmail.com
4   james  James@gmail.com
5    john              NaN
6    paul   Paul@gmail.com
7    paul   Paul@gmail.com
8  joseph              NaN
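One difference from the previous answer is worth noting: because of the trailing fillna("NaN"), the rows that stay unfilled end up holding the text 'NaN' again rather than a real missing value. A minimal sketch on a three-row subset of the data (the subset is chosen here only for brevity):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['james', 'james', 'john'],
    'email': ['James@gmail.com', 'NaN', 'NaN'],
})

df = df.replace('NaN', np.nan)
df.update(df.groupby('name')['email'].ffill().fillna('NaN'))

# john's entry looks like NaN when printed, but it is the text 'NaN' again.
print(df['email'].tolist())      # ['James@gmail.com', 'James@gmail.com', 'NaN']
print(df['email'].isna().sum())  # 0
```

If later code relies on isna() or dropna(), dropping the fillna("NaN") step and assigning the grouped ffill result directly may be the safer choice.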