Replace np.nan values in a list with values calculated from a polynomial regression
I have two lists of y values:
y_list1 = [45,np.nan,np.nan,np.nan, 40,50,6,2,7,np.nan, np.nan,np.nan, np.nan, np.nan]
y_list2 = [4,23,np.nan, np.nan, np.nan, np.nan, np.nan,5, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]
Both were measured at the same set of time points:
x = np.array([0,3,4,5,6,7,8,9,10,11,12,13,14,15])
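Goal: return y_list1 and y_list2 with the np.nan entries replaced by values, by fitting a polynomial regression to the data that is present and then computing the missing points.
I am able to fit the polynomial: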
import numpy as np

x = np.array([0,3,4,5,6,7,8,9,10,11,12,13,14,15])
id_list = ['1','2']
list_y = np.array([
    [45,np.nan,np.nan,np.nan, 40,50,6,2,7,np.nan, np.nan,np.nan, np.nan, np.nan],
    [4,23,np.nan, np.nan, np.nan, np.nan, np.nan,5, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
])
for each_id,y in zip(id_list,list_y):
    # mask out the missing data
    idx = np.isfinite(x) & np.isfinite(y)
    # fit
    ab = np.polyfit(x[idx], y[idx], len(list_y[0]))
I then wanted to use this fit to replace the missing values in y, so I found this and implemented:
replace_nan = np.polyval(x,y)
print(replace_nan)
The output is:
[2.13161598e+20 nan nan nan
5.20634185e+19 7.52453405e+20 8.35884417e+09 3.27510000e+04
5.11358666e+10 nan nan nan
nan nan]
test_polyreg.py:16: RankWarning: Polyfit may be poorly conditioned
ab = np.polyfit(x[idx], y[idx], len(list_y[0])) #understand how many degrees
[7.45653990e+07 6.97736286e+16 nan nan
nan nan nan 9.91821285e+08
nan nan nan nan
nan nan]
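I'm not worried about the poorly conditioned warning, since this is only test data to help me understand how it should work, but the output still contains nans (and it does not use the fit I generated earlier). How can I replace the nans in the y values with points estimated from the polynomial regression?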
First, you should change the definition of ab to:
ab = np.polyfit(x[idx], np.array(y)[idx], idx.sum())
ab holds your polynomial coefficients, so you have to pass them to np.polyval as:
replace_nan = np.polyval(ab,x)
print(replace_nan)
Output:
[ 4. 23. 26.54413638 28.01419869 27.00250156
23.10135965 15.90308758 5. -10.01558845 -29.55136312
-54.01500938 -83.81421259 -119.3566581 -161.05003127]
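Note that np.polyval(ab, x) returns the fitted value at every time point, not only at the missing ones. If the goal is to keep the measured values and fill only the nans, the two steps can be combined with np.where. The following is a minimal sketch, not a definitive implementation: the helper name fill_nans_with_polyfit and the choice degree=2 are assumptions made for illustration. It also caps the degree at one less than the number of known points, since a higher degree is not uniquely determined by the data and is what triggers the RankWarning.

```python
import numpy as np

def fill_nans_with_polyfit(x, y, degree=None):
    """Return a copy of y with nan entries replaced by polynomial-fit values.

    Hypothetical helper for illustration; not part of NumPy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Points that can be used for the fit
    idx = np.isfinite(x) & np.isfinite(y)
    n_known = int(idx.sum())
    if n_known == 0:
        raise ValueError("no finite points to fit")
    # A degree above n_known - 1 is not uniquely determined by the data
    max_degree = n_known - 1
    degree = max_degree if degree is None else min(degree, max_degree)
    ab = np.polyfit(x[idx], y[idx], degree)
    fitted = np.polyval(ab, x)
    # Keep measured values; use the fit only where y was missing
    return np.where(idx, y, fitted)

x = np.array([0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
y_list2 = [4, 23, np.nan, np.nan, np.nan, np.nan, np.nan, 5,
           np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]

# degree=2 is an assumed choice: three known points define one quadratic
filled = fill_nans_with_polyfit(x, y_list2, degree=2)
print(filled)
```

With a low degree and more known points than coefficients (as in y_list1), the fitted curve will generally not pass through the measurements, which is why the sketch leaves the measured entries untouched. Values far outside the range of the known points are extrapolations and should be treated with caution.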