Forward Fill NA
I have a table that contains NaN values.
import pandas as pd
data = {'name': ['may','may', 'mary', 'james','james','john','paul', 'paul', 'joseph'],
'email' : ['may@gmail.com','NaN','Mary@gmail.com','James@gmail.com','NaN','NaN','Paul@gmail.com','NaN','NaN']}
df = pd.DataFrame(data)
[Table: before]
[Table: expected output]
However, when I use ffill, the result is incorrect. Is there a way to apply ffill conditionally?
Please try:
# Forward-fill emails within each name group only (fillna(method='ffill') is deprecated in recent pandas).
df.groupby('name')['email'].ffill()
In your example, the NaN values are strings with the value "NaN". You therefore have to convert them to real missing values before filling.
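To make the point above concrete, here is a minimal sketch that counts missing values before and after the conversion. It reuses the question's data; the names missing_before, cleaned and missing_after are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["may", "may", "mary", "james", "james", "john", "paul", "paul", "joseph"],
    "email": ["may@gmail.com", "NaN", "Mary@gmail.com", "James@gmail.com",
              "NaN", "NaN", "Paul@gmail.com", "NaN", "NaN"],
})

# The placeholder is an ordinary string, so pandas does not count it as missing.
missing_before = int(df["email"].isna().sum())

# Once the placeholder is replaced with np.nan, the same cells count as missing.
cleaned = df["email"].replace("NaN", np.nan)
missing_after = int(cleaned.isna().sum())

print(missing_before, missing_after)
```

Because ffill only acts on real missing values, calling it on the original column would leave every "NaN" string untouched.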
import pandas as pd
import numpy as np
data = {'name': ['may','may', 'mary', 'james','james','john','paul', 'paul', 'joseph'],
'email' : ['may@gmail.com','NaN','Mary@gmail.com','James@gmail.com','NaN','NaN','Paul@gmail.com','NaN','NaN']}
df = pd.DataFrame(data)
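# Turn the "NaN" placeholder strings into real missing values.
df['email'] = df['email'].replace({'NaN': np.nan})
# Forward-fill within each name group; groupby(...).fillna(method='ffill') is deprecated in recent pandas.
df['email'] = df.groupby('name')['email'].ffill()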
df
     name            email
0     may    may@gmail.com
1     may    may@gmail.com
2    mary   Mary@gmail.com
3   james  James@gmail.com
4   james  James@gmail.com
5    john              NaN
6    paul   Paul@gmail.com
7    paul   Paul@gmail.com
8  joseph              NaN
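The difference between a plain ffill and a grouped one can be checked with a small sketch. It assumes the emails already hold real missing values; the names plain and grouped are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["may", "may", "mary", "james", "james", "john", "paul", "paul", "joseph"],
    "email": ["may@gmail.com", np.nan, "Mary@gmail.com", "James@gmail.com",
              np.nan, np.nan, "Paul@gmail.com", np.nan, np.nan],
})

# A plain forward fill carries the last seen email into the next person's row.
plain = df["email"].ffill()

# A grouped forward fill only propagates values between rows with the same name.
grouped = df.groupby("name")["email"].ffill()

print(pd.DataFrame({"name": df["name"], "plain": plain, "grouped": grouped}))
```

With the plain fill, john inherits the email from james and joseph inherits the one from paul. The grouped fill leaves both rows empty because neither name has an earlier email of its own.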
Another way might be:
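import pandas as pd
import numpy as np
# Convert the "NaN" placeholder strings to real missing values.
df = df.replace("NaN", np.nan)
# Forward-fill within each name, then write the "NaN" placeholder back into cells that are still empty.
df.update(df.groupby('name')['email'].ffill().fillna("NaN"))
df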
     name            email
0     may    may@gmail.com
1     may    may@gmail.com
2    mary   Mary@gmail.com
3   james  James@gmail.com
4   james  James@gmail.com
5    john              NaN
6    paul   Paul@gmail.com
7    paul   Paul@gmail.com
8  joseph              NaN
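One thing worth noting about this second approach: the NaN shown for john and joseph is again the string "NaN", not a real missing value, because fillna("NaN") writes the placeholder back. A short sketch to check this, reusing the question's data (the names remaining and real_missing are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["may", "may", "mary", "james", "james", "john", "paul", "paul", "joseph"],
    "email": ["may@gmail.com", "NaN", "Mary@gmail.com", "James@gmail.com",
              "NaN", "NaN", "Paul@gmail.com", "NaN", "NaN"],
})

# Same steps as the update-based answer.
df = df.replace("NaN", np.nan)
df.update(df.groupby("name")["email"].ffill().fillna("NaN"))

# Rows that still carry the placeholder string after the fill.
remaining = df.loc[df["email"] == "NaN", "name"].tolist()

# Number of real missing values left in the column.
real_missing = int(df["email"].isna().sum())

print(remaining, real_missing)
```

If later steps rely on isna() or dropna(), the first approach is probably the safer choice, since it keeps real missing values in the column.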