Pandas new column using equation
I have the following dataset:
import numpy as np
import pandas as pd
df = pd.DataFrame({'index': [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26],
                   'avg': [130, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 135, np.nan, np.nan, np.nan, np.nan, np.nan, 136, np.nan, np.nan],
                   'slope': [.02, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, .08, np.nan, np.nan, np.nan, np.nan, np.nan, .03, np.nan, np.nan]})
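I want to create a new column 'fit' that fits a linear equation between the known (non-NaN) values in 'avg': each known 'avg' value starts a line whose slope is the 'slope' value in the same row. I wrote the code below: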
df.loc[0,'fit'] = df.loc [0,'avg']
def fitt ():
    for i in range (0, len(df)):
        if df.loc [i,'avg'] > 0:
            a = df.loc[i,'index']
            b = df.loc [i,'slope']
            c= df.loc [i,'avg']
            df.loc [i,'fit'] = df.loc [i, 'avg']
            continue
        while df.loc [i,'avg'] == np.NaN:
            df.loc[i,'fit'] = c + b * (i-a)
    return df
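The output column 'fit' should contain the following values:
df['fit'] = [130, 130.02, 130.04, 130.06, 130.08, 130.10, 130.12, 130.14, 135, 135.08, 135.16, 135.24, 135.32, 135.40, 136, 136.03, 136.06]
I would like to know how to write this code correctly. Any help is much appreciated.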
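If you first propagate each slope forward over the missing values that follow it, you can compute the 'fit' values step by step by adding the slope to the previous value: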
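# Forward-fill each known slope over the NaN rows that follow it.
# (fillna(method='ffill') is deprecated in recent pandas; ffill() is equivalent.)
df['slope'] = df.slope.ffill()
fit = df.avg.values.copy()
missing = df.avg.isna()
for i in range(len(df)):
    if missing[i]:
        # Extend the current line by one step from the previous fitted value.
        fit[i] = fit[i - 1] + df.slope[i]
df['fit'] = fit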
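The loop above can also be written without explicit iteration. The following is only a sketch, not part of the original answer: it applies the equation from the question, c + b * (index - a), directly, by forward-filling the last known 'avg' (c), its 'slope' (b), and the 'index' value at which it occurred (a). The function name add_fit_vectorized is a hypothetical helper introduced for illustration, and the sketch assumes that 'slope' is given on the same rows as the known 'avg' values, as in the sample data.

```python
import numpy as np
import pandas as pd


def add_fit_vectorized(df):
    """Return a copy of df with a 'fit' column (hypothetical helper)."""
    out = df.copy()
    known = out["avg"].notna()
    # c: last known average, carried forward over the gaps.
    base = out["avg"].ffill()
    # b: slope that belongs to that average, carried forward.
    slope = out["slope"].ffill()
    # a: 'index' value of the row where that average was observed.
    anchor = out["index"].where(known).ffill()
    # fit = c + b * (index - a); this equals c exactly on known rows.
    out["fit"] = base + slope * (out["index"] - anchor)
    return out


nan = np.nan
df = pd.DataFrame({
    "index": list(range(10, 27)),
    "avg": [130] + [nan] * 7 + [135] + [nan] * 5 + [136] + [nan] * 2,
    "slope": [0.02] + [nan] * 7 + [0.08] + [nan] * 5 + [0.03] + [nan] * 2,
})
result = add_fit_vectorized(df)
print(result[["index", "avg", "fit"]])
```

Unlike the loop, this version uses the distance in the 'index' column rather than the row position, so it should also cope with uneven 'index' steps, and it leaves the original 'slope' column untouched because it works on a copy.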