Python - ffill() with subtraction
I am trying to fill the NaNs in Total by taking the previous Total and subtracting Change.
            Change  Total
01/01/2021     -12    100
02/01/2021     -54    154
03/01/2021     -23    177
04/01/2021      -2    NaN
05/01/2021     -54    NaN
06/01/2021     -72    NaN
Desired output:
            Change  Total
01/01/2021     -12    100
02/01/2021     -54    154
03/01/2021     -23    177
04/01/2021      -2    179
05/01/2021     -54    233
06/01/2021     -72    305
I have tried several ways of manipulating ffill(), but none of them worked:
df['Total'] = df['Total'].fillna(method = 'ffill' - df['Change'])
Is there a better way to approach this?
Any help is much appreciated!
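You can use np.where together with a forward fill, as follows (ffill() is used here because fillna(method='ffill') is deprecated in recent pandas versions):
import numpy as np
change_when_total_null = df.loc[df['Total'].isnull(), 'Change']
df['Total'] = np.where(df['Total'].isnull(), df['Total'].ffill() - change_when_total_null.cumsum(), df['Total'])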
Output:
            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
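To make the steps easier to follow, here is a minimal end-to-end sketch of the same idea. The DataFrame construction is an assumption (the question only shows printed output), and the names missing, filled and running_change are introduced purely for illustration.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample data; the question only shows printed output.
df = pd.DataFrame(
    {
        "Change": [-12, -54, -23, -2, -54, -72],
        "Total": [100, 154, 177, np.nan, np.nan, np.nan],
    },
    index=["01/01/2021", "02/01/2021", "03/01/2021",
           "04/01/2021", "05/01/2021", "06/01/2021"],
)

missing = df["Total"].isna()

# 177 is carried forward into the three NaN rows.
filled = df["Total"].ffill()

# Running sum of Change over the NaN rows only: -2, -56, -128.
running_change = df.loc[missing, "Change"].cumsum()

# Subtraction aligns on the index, so rows that already had a Total become NaN
# in the difference; np.where takes the original Total for those rows instead.
df["Total"] = np.where(missing, filled - running_change, df["Total"])
print(df)
```

Subtracting the running sum works because each missing Total equals the last known Total minus every Change seen since then.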
You can use pd.Series.where:
df["Total"] = df["Total"].where(df["Total"].notnull(),
                                df["Total"].ffill() - df.loc[df["Total"].isnull(), "Change"].cumsum())
print(df)
            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
The expected output looks like ffill minus cumsum. Forward-fill the last valid value, then subtract the cumulative sum of Change over the NaN rows:
# Select NaN rows
m = df['Total'].isna()
# Update NaN rows with the last valid value minus the running total of Change
df.loc[m, 'Total'] = df['Total'].ffill() - df.loc[m, 'Change'].cumsum()
df:
            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
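One caveat, offered as my own observation rather than something stated in the thread: a single cumsum over all NaN rows keeps accumulating across separate gaps, so it only gives the intended result when the NaNs form one run, as in the sample. If Total can contain several NaN runs, a sketch that restarts the running sum at each valid Total might look like this. The helper name fill_total and the gaps frame are hypothetical.

```python
import numpy as np
import pandas as pd

def fill_total(df):
    """Fill NaN Totals as previous Total minus Change, restarting at each gap.

    Hypothetical helper, not part of pandas.
    """
    missing = df["Total"].isna()
    # Every valid Total starts a new block; NaN rows join the block of the
    # last valid row before them.
    block = (~missing).cumsum()
    # Running Change inside each block, counting only the NaN rows
    # (valid rows contribute 0, so their own Total is left unchanged).
    running = df["Change"].where(missing, 0).groupby(block).cumsum()
    out = df.copy()
    out["Total"] = df["Total"].ffill() - running
    return out

# Made-up data with two separate gaps, for illustration only.
gaps = pd.DataFrame({
    "Change": [-12, -2, -54, 10, -5],
    "Total": [100, np.nan, np.nan, 50, np.nan],
})
print(fill_total(gaps))
```

With this data the last row comes out as 50 - (-5) = 55, whereas an ungrouped cumsum would also subtract the -2 and -54 from the first gap.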
Let's try combine_first:
df = df.combine_first(df[['Total']].ffill().sub(df.loc[df.Total.isnull(),'Change'].cumsum(),axis=0))
df
            Change  Total
01/01/2021     -12  100.0
02/01/2021     -54  154.0
03/01/2021     -23  177.0
04/01/2021      -2  179.0
05/01/2021     -54  233.0
06/01/2021     -72  305.0
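Because every filled value depends on the Total just before it, a plain loop can serve as a simple cross-check for the vectorized answers above. This is only a sketch: fill_total_loop is a hypothetical name, and the lists are typed in from the question's sample.

```python
import math

def fill_total_loop(changes, totals):
    """Reference loop: a missing Total becomes the previous Total minus Change."""
    filled = []
    previous = math.nan  # no earlier Total is known before the first row
    for change, total in zip(changes, totals):
        if math.isnan(total):
            # Stays NaN if nothing valid precedes this row.
            total = previous - change
        filled.append(total)
        previous = total
    return filled

changes = [-12, -54, -23, -2, -54, -72]
totals = [100.0, 154.0, 177.0, math.nan, math.nan, math.nan]
result = fill_total_loop(changes, totals)
print(result)  # [100.0, 154.0, 177.0, 179.0, 233.0, 305.0]
```

The loop is slower than the pandas versions on large frames, but it states the rule directly and handles multiple gaps without extra bookkeeping.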