将曲线拟合到数据,获取分析形式,并检查曲线何时超过阈值
Fit curve to data, get analytical form, & check when curve crosses threshold
每条曲线有 40 个点,我想平滑函数并估计曲线在 y 轴上越过阈值的时间。
有没有我可以轻松应用它的拟合函数,我可以使用插值来绘制新函数,但我不知道如何请求 y = threshold 的 x 值。
不幸的是,曲线的形状并不完全相同,所以我不能使用 scipy.optimize.curve_fit。
谢谢!
更新:添加两条曲线:
曲线 1
[942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688]
曲线 2
[-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980]
x 值为
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40]
为了拟合平滑曲线,可以拟合Legendre polynomials using numpy.polynomial.legendre.Legendre's fit方法。
# import packages we need later
import matplotlib.pyplot as plt
import numpy as np
拟合勒让德多项式
正在准备数据as numpy arrays:
curve1 = \
np.asarray([942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688])
curve2 = \
np.asarray([-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980])
xvals = \
np.asarray([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40])
让我们拟合勒让德多项式,degree
是使用的最高阶多项式,the first few is here for example。
degree=10
legendrefit_curve1 = np.polynomial.legendre.Legendre.fit(xvals, curve1, deg=degree)
legendrefit_curve2 = np.polynomial.legendre.Legendre.fit(xvals, curve2, deg=degree)
使用 linspace method 在均匀分布的点处计算这些拟合曲线。 n
是我们想要的点对的数量。
n=100
fitted_vals_curve1 = legendrefit_curve1.linspace(n=n)
fitted_vals_curve2 = legendrefit_curve2.linspace(n=n)
让我们绘制结果,以及 threshold
(使用 axvline):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
threshold=100000
plt.axhline(y=threshold)
曲线完美贴合。
什么时候超过门槛?
要检查每个系列中 threshold
的交叉位置,您可以这样做:
for x, y in zip(fitted_vals_curve1[0], fitted_vals_curve1[1]):
if y > threshold:
xcross_curve1 = x
break
for x, y in zip(fitted_vals_curve2[0], fitted_vals_curve2[1]):
if y > threshold:
xcross_curve2 = x
break
xcross_curve1
和 xcross_curve2
将保持 x
值,其中 curve1
和 curve2
如果确实穿过 threshold
=31=];如果没有,它们将是未定义的。
让我们绘制它们以检查它是否有效 (link to axhline docs):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
plt.axhline(y=threshold)
try: plt.axvline(x=xcross_curve1)
except NameError: print('curve1 is not passing the threshold',c='b')
try: plt.axvline(x=xcross_curve2)
except NameError: print('curve2 is not passing the threshold')
不出所料,我们得到了这个图:
(和文本输出:curve2 is not passing the threshold
。)
如果您想提高 xcross_curve1
或 xcross_curve2
的准确性,您可以增加上面定义的 degree
和 n
。
从勒让德到多项式
我们拟合了一条曲线,大致有以下形式:
其中 P_n
是第 n
th Legendre 多项式,s(x)
是一些将 x
转换为 P_n
期望范围的函数(一些我们不知道的数学东西现在需要知道)。
我们希望拟合线的形式为:
我们将使用 legendre()
of scipy.special:
from scipy.special import legendre
我们还将使用 np.pad
(docs, ).
legendredict={}
for icoef, coef in enumerate(legendrefit_curve1.coef):
legendredict[icoef]=coef*np.pad(legendre(icoef).coef,(10-icoef,0),mode='constant')
legendredict
将保持 keys
从 0
到 10
,并且 dict
中的每个值将是 float
的列表秒。 key
指的是多项式的阶数,float
的列表表示我们拟合的构成多项式中 x**n
值的系数,以倒序排列.
例如:
P_4
是:
legendredict[4]
是:
isarray([ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 3.29634565e+05, 3.65967884e-11,
-2.82543913e+05, 1.82983942e-11, 2.82543913e+04])
意味着在 P_n
的 总和 中(f(x)
,上面),我们有 q_4
很多 P_4
,相当于 1
的 2.82543913e+04
,x
的 1.82983942e-11
,x^2
的 -2.82543913e+05
,等等,仅来自 P_4
组件。
所以如果我们想知道有多少1
s, x
s, x^2
s等我们需要形成多项式和,我们需要添加对1
s、x
s、x^2
s 等来自所有不同的 P_n
s。这就是我们所做的:
polycoeffs = np.sum(np.stack(list(legendredict.values())),axis=0)
然后让我们形成一个多项式和:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*s(x)**'+str(icoef),end='\n +')
给出输出:
-874.1456709637822*s(x)**0
+2893.7228005540596*s(x)**1
+50415.38472217957*s(x)**2
+-6979.322584205707*s(x)**3
+-453363.49985790614*s(x)**4
+-250464.7549807652*s(x)**5
+1250129.5521521813*s(x)**6
+1267709.5031024509*s(x)**7
+-493280.0177807359*s(x)**8
+-795684.224334346*s(x)**9
+-134370.1696946264*s(x)**10
+
(我们将忽略最后的+
符号,格式不是这里的重点。)
我们还需要计算s(x)
。如果我们在 Jupyter Notebook / Google Colab 中工作,只执行一个带有 legendrefit_curve1
returns:
的单元格
从这里我们可以清楚地看出s(x)
是-1.0512820512820513+0.05128205128205128x
。如果我们想以更编程的方式进行:
2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
是 0.05128205128205128
&
-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
只是 -1.0512820512820513
出于某些数学原因,这是正确的,但与此无关 ()。
所以我们可以定义:
def s(input):
a=-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
b=2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
return a+b*input
另外,让我们根据上面得到的s(x)
的多项式之和来定义:
def polyval(x):
return -874.1456709637822*s(x)**0+2893.7228005540596*s(x)**1+50415.38472217957*s(x)**2+-6979.322584205707*s(x)**3+-453363.49985790614*s(x)**4+-250464.7549807652*s(x)**5+1250129.5521521813*s(x)**6+1267709.5031024509*s(x)**7+-493280.0177807359*s(x)**8+-795684.224334346*s(x)**9+-134370.1696946264*s(x)**10
以更程序化的方式:
def polyval(x):
return sum([coef*s(x)**icoef for icoef, coef in enumerate(reversed(polycoeffs))])
检查我们的多项式是否确实适合:
plt.scatter(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve1[0],[polyval(val) for val in fitted_vals_curve1[0]])
确实如此:
所以让我们打印出我们的纯多项式和,其中 s(x)
被显式函数替换:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*(-1.0512820512820513+0512820512820513*x)**'+str(icoef),end='\n +')
给出输出:
-874.1456709637822*(-1.0512820512820513+0512820512820513*x)**0
+2893.7228005540596*(-1.0512820512820513+0512820512820513*x)**1
+50415.38472217957*(-1.0512820512820513+0512820512820513*x)**2
+-6979.322584205707*(-1.0512820512820513+0512820512820513*x)**3
+-453363.49985790614*(-1.0512820512820513+0512820512820513*x)**4
+-250464.7549807652*(-1.0512820512820513+0512820512820513*x)**5
+1250129.5521521813*(-1.0512820512820513+0512820512820513*x)**6
+1267709.5031024509*(-1.0512820512820513+0512820512820513*x)**7
+-493280.0177807359*(-1.0512820512820513+0512820512820513*x)**8
+-795684.224334346*(-1.0512820512820513+0512820512820513*x)**9
+-134370.1696946264*(-1.0512820512820513+0512820512820513*x)**10
+
可以根据需要进行简化。 (忽略最后一个 +
符号。)
如果想要高(低)次多项式拟合,就拟合高(低)次勒让德多项式。
每条曲线有 40 个点,我想平滑函数并估计曲线在 y 轴上越过阈值的时间。 有没有我可以轻松应用它的拟合函数,我可以使用插值来绘制新函数,但我不知道如何请求 y = threshold 的 x 值。
不幸的是,曲线的形状并不完全相同,所以我不能使用 scipy.optimize.curve_fit。
谢谢!
更新:添加两条曲线:
曲线 1
[942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688]
曲线 2
[-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980]
x 值为
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40]
为了拟合平滑曲线,可以拟合Legendre polynomials using numpy.polynomial.legendre.Legendre's fit方法。
# import packages we need later
import matplotlib.pyplot as plt
import numpy as np
拟合勒让德多项式
正在准备数据as numpy arrays:
curve1 = \
np.asarray([942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688])
curve2 = \
np.asarray([-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980])
xvals = \
np.asarray([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40])
让我们拟合勒让德多项式,degree
是使用的最高阶多项式,the first few is here for example。
degree=10
legendrefit_curve1 = np.polynomial.legendre.Legendre.fit(xvals, curve1, deg=degree)
legendrefit_curve2 = np.polynomial.legendre.Legendre.fit(xvals, curve2, deg=degree)
使用 linspace method 在均匀分布的点处计算这些拟合曲线。 n
是我们想要的点对的数量。
n=100
fitted_vals_curve1 = legendrefit_curve1.linspace(n=n)
fitted_vals_curve2 = legendrefit_curve2.linspace(n=n)
让我们绘制结果,以及 threshold
(使用 axvline):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
threshold=100000
plt.axhline(y=threshold)
曲线完美贴合。
什么时候超过门槛?
要检查每个系列中 threshold
的交叉位置,您可以这样做:
for x, y in zip(fitted_vals_curve1[0], fitted_vals_curve1[1]):
if y > threshold:
xcross_curve1 = x
break
for x, y in zip(fitted_vals_curve2[0], fitted_vals_curve2[1]):
if y > threshold:
xcross_curve2 = x
break
xcross_curve1
和 xcross_curve2
将保持 x
值,其中 curve1
和 curve2
如果确实穿过 threshold
=31=];如果没有,它们将是未定义的。
让我们绘制它们以检查它是否有效 (link to axhline docs):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
plt.axhline(y=threshold)
try: plt.axvline(x=xcross_curve1)
except NameError: print('curve1 is not passing the threshold',c='b')
try: plt.axvline(x=xcross_curve2)
except NameError: print('curve2 is not passing the threshold')
不出所料,我们得到了这个图:
(和文本输出:curve2 is not passing the threshold
。)
如果您想提高 xcross_curve1
或 xcross_curve2
的准确性,您可以增加上面定义的 degree
和 n
。
从勒让德到多项式
我们拟合了一条曲线,大致有以下形式:
P_n
是第 n
th Legendre 多项式,s(x)
是一些将 x
转换为 P_n
期望范围的函数(一些我们不知道的数学东西现在需要知道)。
我们希望拟合线的形式为:
我们将使用 legendre()
of scipy.special:
from scipy.special import legendre
我们还将使用 np.pad
(docs,
legendredict={}
for icoef, coef in enumerate(legendrefit_curve1.coef):
legendredict[icoef]=coef*np.pad(legendre(icoef).coef,(10-icoef,0),mode='constant')
legendredict
将保持 keys
从 0
到 10
,并且 dict
中的每个值将是 float
的列表秒。 key
指的是多项式的阶数,float
的列表表示我们拟合的构成多项式中 x**n
值的系数,以倒序排列.
例如:
P_4
是:
legendredict[4]
是:
isarray([ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 3.29634565e+05, 3.65967884e-11,
-2.82543913e+05, 1.82983942e-11, 2.82543913e+04])
意味着在 P_n
的 总和 中(f(x)
,上面),我们有 q_4
很多 P_4
,相当于 1
的 2.82543913e+04
,x
的 1.82983942e-11
,x^2
的 -2.82543913e+05
,等等,仅来自 P_4
组件。
所以如果我们想知道有多少1
s, x
s, x^2
s等我们需要形成多项式和,我们需要添加对1
s、x
s、x^2
s 等来自所有不同的 P_n
s。这就是我们所做的:
polycoeffs = np.sum(np.stack(list(legendredict.values())),axis=0)
然后让我们形成一个多项式和:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*s(x)**'+str(icoef),end='\n +')
给出输出:
-874.1456709637822*s(x)**0
+2893.7228005540596*s(x)**1
+50415.38472217957*s(x)**2
+-6979.322584205707*s(x)**3
+-453363.49985790614*s(x)**4
+-250464.7549807652*s(x)**5
+1250129.5521521813*s(x)**6
+1267709.5031024509*s(x)**7
+-493280.0177807359*s(x)**8
+-795684.224334346*s(x)**9
+-134370.1696946264*s(x)**10
+
(我们将忽略最后的+
符号,格式不是这里的重点。)
我们还需要计算s(x)
。如果我们在 Jupyter Notebook / Google Colab 中工作,只执行一个带有 legendrefit_curve1
returns:
从这里我们可以清楚地看出s(x)
是-1.0512820512820513+0.05128205128205128x
。如果我们想以更编程的方式进行:
2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
是 0.05128205128205128
&
-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
只是 -1.0512820512820513
出于某些数学原因,这是正确的,但与此无关 (
所以我们可以定义:
def s(input):
a=-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
b=2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
return a+b*input
另外,让我们根据上面得到的s(x)
的多项式之和来定义:
def polyval(x):
return -874.1456709637822*s(x)**0+2893.7228005540596*s(x)**1+50415.38472217957*s(x)**2+-6979.322584205707*s(x)**3+-453363.49985790614*s(x)**4+-250464.7549807652*s(x)**5+1250129.5521521813*s(x)**6+1267709.5031024509*s(x)**7+-493280.0177807359*s(x)**8+-795684.224334346*s(x)**9+-134370.1696946264*s(x)**10
以更程序化的方式:
def polyval(x):
return sum([coef*s(x)**icoef for icoef, coef in enumerate(reversed(polycoeffs))])
检查我们的多项式是否确实适合:
plt.scatter(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve1[0],[polyval(val) for val in fitted_vals_curve1[0]])
确实如此:
所以让我们打印出我们的纯多项式和,其中 s(x)
被显式函数替换:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*(-1.0512820512820513+0512820512820513*x)**'+str(icoef),end='\n +')
给出输出:
-874.1456709637822*(-1.0512820512820513+0512820512820513*x)**0
+2893.7228005540596*(-1.0512820512820513+0512820512820513*x)**1
+50415.38472217957*(-1.0512820512820513+0512820512820513*x)**2
+-6979.322584205707*(-1.0512820512820513+0512820512820513*x)**3
+-453363.49985790614*(-1.0512820512820513+0512820512820513*x)**4
+-250464.7549807652*(-1.0512820512820513+0512820512820513*x)**5
+1250129.5521521813*(-1.0512820512820513+0512820512820513*x)**6
+1267709.5031024509*(-1.0512820512820513+0512820512820513*x)**7
+-493280.0177807359*(-1.0512820512820513+0512820512820513*x)**8
+-795684.224334346*(-1.0512820512820513+0512820512820513*x)**9
+-134370.1696946264*(-1.0512820512820513+0512820512820513*x)**10
+
可以根据需要进行简化。 (忽略最后一个 +
符号。)
如果想要高(低)次多项式拟合,就拟合高(低)次勒让德多项式。