Fitting data to a Fourier3 series always produces a straight line
I have data that should fit a Fourier3 series. I looked at this answer: and tried different algorithms from different packages (such as symfit and scipy). But when I plot the data, the different packages all give me this result:
[image: the fitted curve comes out as a straight line through the data]
Currently I am using scipy's curve_fit, and this is my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import pandas as pd
def fourier(x, *as_bs):
    sum_a = 0
    sum_b = 0
    j = 1
    w = as_bs[0]
    a0 = as_bs[1]
    for i in range(2, len(as_bs)-1, 2):
        sum_a += as_bs[i] * np.cos(j * w * x)
        sum_b += as_bs[i+1] * np.sin(j * w * x)
        j = j + 1
    return a0 + sum_a + sum_b
T = pd.read_excel('FS_data.xlsx')
A = pd.DataFrame(T)
xdata = np.array(A.iloc[:, 0])
ydata = np.array(A.iloc[:, 1])
# fits
popt, pcov = curve_fit(fourier, xdata, ydata, [np.random.rand(1)] * 8)
print(popt)
data_fit = fourier(ydata, *popt)
print(data_fit)
plt.plot(ydata)
plt.plot(data_fit, label='after fitting')
plt.legend()
plt.show()
So my code basically takes 8 random numbers and assigns them, respectively, as the initial guesses for (f, a0, a1, b1, a2, b2, a3, b3).
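As an aside, the expression [np.random.rand(1)] * 8 does not actually draw eight independent random numbers; it repeats a single one-element random array eight times, so every parameter starts from the same value. A quick check (the seed is only there to make the illustration reproducible):

import numpy as np

np.random.seed(0)                 # fixed seed, only for a reproducible illustration
guess = [np.random.rand(1)] * 8   # one random draw, repeated eight times
print(guess)                      # eight identical one-element arrays
print(all(g is guess[0] for g in guess))   # True: all list entries are the same object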
I tried fitting the data in Matlab to check whether it can be fitted with fourier3 at all, and the result was very good:
[image: the Matlab fourier3 fit closely follows the data]
I printed the output from both Python and Matlab for comparison; here are both results:
Python:
w = 5.66709943e-01
a0 = 3.80499132e+01
a1 = 5.56883486e-04
b1 = -3.88408379e-04
a2 = -3.88408379e-04
b2 = 3.32951592e-04
a3 = 3.15641900e-04
b3 = 1.96414168e-04
Matlab:
a0 = 38.07 (38.07, 38.08)
a1 = 0.5352 (0.4951, 0.5753)
b1 = -0.5788 (-0.5863, -0.5714)
a2 = -0.3728 (-0.413, -0.3326)
b2 = 0.5411 (0.492, 0.5901)
a3 = 0.2357 (0.2226, 0.2488)
b3 = 0.05895 (0.02773, 0.09018)
w = 0.0003088
As mentioned, only the value of a0 is correct; the others are far off from the Matlab values.
So why am I getting this result in Python? What am I doing wrong?
Here is the data, for anyone who would like to test it:
I am not into Matlab, so I do not know what additional work the Matlab fit does to estimate starting values for the non-linear fit. I can say, though, that curve_fit does none at all, i.e. all values are assumed to be of order 1. The easiest remedy is to rescale the x axis to the range [0, 2 pi]. So, once again, the OP's problem is bad starting values. (This is presumably also why only a0 came out right and the plot is a straight line: with the starting w far from the true base frequency, the sine and cosine terms essentially average out over the data, so the fit collapses onto the constant a0, i.e. the mean.) The rescaling, however, requires the knowledge that the main wave to be fitted is roughly the width of the data set. Moreover, we need to assume that all other fit parameters are also of order 1. Luckily, that is the case here, so this works:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
xdat, ydat = np.loadtxt( "data.tsv", unpack=True, skiprows=1 )
def fourier(x, *as_bs):
    sum_a = 0
    sum_b = 0
    j = 1
    w = as_bs[0]
    a0 = as_bs[1]
    for i in range( 2, len( as_bs ) - 1, 2 ):
        sum_a += as_bs[i] * np.cos( j * w * x )
        sum_b += as_bs[i+1] * np.sin( j * w * x )
        j = j + 1
    return a0 + sum_a + sum_b
"""
lets rescale the data to get the base frequency in the range of one
"""
xmin = min( xdat )
xmax = max( xdat )
xdat = ( xdat - xmin ) / (xmax - xmin ) * 2 * np.pi
### with x rescaled to [0, 2 pi], the base frequency and, luckily,
### all other fit parameters are of order one, so p0 = 1 throughout works
popt, pcov = curve_fit(
    fourier,
    xdat, ydat,
    p0=np.ones(8)
)
print(popt)
### scale w back, noting that it scales inversely with x
print( popt[0] * 2 * np.pi / (xmax - xmin ) )
data_fit = fourier( xdat, *popt )
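For a quick visual check of this first variant, the same kind of plot as in the second script further down can be reused (a minimal sketch, using the variables defined in the block above; note that the x axis here is the rescaled one):

plt.plot( xdat, ydat, ls="", marker="o", ms=0.5, label="data" )
plt.plot( xdat, data_fit, label="fit (rescaled x)" )
plt.legend()
plt.show()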
If we cannot make the above assumptions, we may only assume that there is a base frequency that contributes dominantly to the signal (note that this is not always the case). In that situation we can pre-compute the starting guesses in a non-iterative way.
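Spelled out, the non-iterative guess for the frequency rests on the following identity (this is the integral-equation method from the reference linked in the code's docstring): if $y(t) = a\sin(\omega t) + b\cos(\omega t) + c$, then integrating twice gives
$$\iint y \,\mathrm{d}t\,\mathrm{d}t = -\frac{1}{\omega^{2}}\,y + \frac{c}{2}\,t^{2} + d\,t + e,$$
with the integration constants absorbed into $d$ and $e$. Fitting the numerically integrated left-hand side linearly against the regressors $y$, $t^{2}$, $t$, and $1$ therefore yields $-1/\omega^{2}$ as an ordinary least-squares coefficient, so $\omega$ can be estimated without any iterative optimization.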
The solution then looks a bit more involved:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
from scipy.integrate import cumtrapz  # renamed cumulative_trapezoid in newer SciPy
xdat, ydat = np.loadtxt( "data.tsv", unpack=True, skiprows=1 )
def fourier(x, *as_bs):
    sum_a = 0
    sum_b = 0
    j = 1
    w = as_bs[0]
    a0 = as_bs[1]
    for i in range( 2, len( as_bs ) - 1, 2 ):
        sum_a += as_bs[i] * np.cos( j * w * x )
        sum_b += as_bs[i+1] * np.sin( j * w * x )
        j = j + 1
    return a0 + sum_a + sum_b
#### initial guess
"""
This uses the fact that if y = a sin(w t) + b cos(w t) + c, then
int int y dt dt = -y/w^2 + (c/2) t^2 + d t + e,
i.e. we can get 1/w^2 as a linear fit parameter, without the danger of
a non-linear iterative fit running into a local minimum.
For details see:
https://scikit-guess.readthedocs.io/en/sine/_downloads/4b4ed1e691ff195be3ca73879a674234/Regressions-et-equations-integrales.pdf
"""
Sy = cumtrapz( ydat, xdat, initial=0 )      # running integral of y
SSy = cumtrapz( Sy, xdat, initial=0 )       # running double integral of y
ST = np.array( [
    ydat, xdat**2, xdat, np.ones( len( xdat ) )
] )
S = np.transpose( ST )
eta = np.dot( ST, SSy )
A = np.dot( ST, S )
sol = np.linalg.solve( A, eta )             # linear fit of SSy against y, x^2, x, 1
wFit = np.sqrt( -1 / sol[0] )               # sol[0] = -1/w^2, see the note above
### linear parameters
"""
Once we have a good guess for w we can get starting guesses for
a, b and c from a standard linear fit
"""
ST = np.array( [
    np.sin( wFit * xdat ), np.cos( wFit * xdat ), np.ones( len( xdat ) )
] )
S = np.transpose( ST )
eta = np.dot( ST, ydat )
A = np.dot( ST, S )
sol = np.linalg.solve( A, eta )
a1 = sol[1]   # cosine amplitude, since a1 multiplies cos in fourier()
b1 = sol[0]   # sine amplitude, since b1 multiplies sin in fourier()
a0 = sol[2]   # constant offset
### final non-linear fit
"""
Now we can use the guesses from above as input for the final
non-linear fit. Hopefully, we are now close enough to the global
minimum for the algorithm to converge reasonably.
"""
### here I assume that the higher orders are similar to the lower
### orders but slightly smaller ... hoping that the fit corrects
### errors in this assumption
popt, pcov = curve_fit(
    fourier,
    xdat, ydat,
    p0=[
        wFit, a0, a1, b1,
        a1 / 2, b1 / 2,
        a1 / 4, b1 / 4
    ]
)
print(popt)
data_fit = fourier( xdat, *popt )
plt.plot( xdat, ydat, ls="", marker="o", ms=0.5, label="data" )
plt.plot( xdat, data_fit, label='fitting')
plt.legend()
plt.show()
Both provide basically the same solution, but the latter code works in more cases and makes fewer assumptions.
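A small usage sketch (assuming popt, xdat and fourier from the second script are still in scope): the fitted base angular frequency can be turned into a fundamental period, and the model can be evaluated on a denser grid for a smooth curve:

w_fit = popt[0]                      # fitted base angular frequency, in the original x units
period = 2 * np.pi / w_fit           # fundamental period of the fitted Fourier3 series
print( "fundamental period:", period )

x_dense = np.linspace( xdat.min(), xdat.max(), 1000 )
y_dense = fourier( x_dense, *popt )  # smooth evaluation of the fitted model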