Baseline fitting using NumPy poly1d

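I have the following baseline:

As can be seen, it has an almost sinusoidal shape. I am trying to use polyfit on it. I have two arrays of data, one called x and the other called y, so what I am using is: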

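import numpy as np

porder = 2  # polynomial order of the baseline model
coefs = np.polyfit(x, y, porder)  # least-squares coefficients, highest power first
baseline = np.poly1d(coefs)  # callable polynomial built from the coefficients
cleanspec = y - baseline(x)  # spectrum with the fitted baseline subtracted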

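My goal is to end up with a clean spectrum whose baseline is flat, without any ripple. However, the fit is not working. Any suggestions for a more effective approach? I tried changing porder to 3, but that changed nothing and I got this warning: Polyfit may be poorly conditioned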

My x data:

[1.10192816e+11 1.10192893e+11 1.10192969e+11 1.10193045e+11
 1.10193122e+11 1.10193198e+11 1.10193274e+11 1.10193350e+11
 1.10193427e+11 1.10193503e+11 1.10193579e+11 1.10193656e+11
 1.10193732e+11 1.10193808e+11 1.10193885e+11 1.10193961e+11
 1.10194037e+11 1.10194113e+11 1.10194190e+11 1.10194266e+11
 1.10194342e+11 1.10194419e+11 1.10194495e+11 1.10194571e+11
 1.10194647e+11 1.10194724e+11 1.10194800e+11 1.10194876e+11
 1.10194953e+11 1.10195029e+11 1.10195105e+11 1.10195182e+11
 1.10195258e+11 1.10195334e+11 1.10195410e+11 1.10195487e+11
 1.10195563e+11 1.10195639e+11 1.10195716e+11 1.10195792e+11
 1.10195868e+11 1.10195944e+11 1.10196021e+11 1.10196097e+11
 1.10196173e+11 1.10196250e+11 1.10196326e+11 1.10196402e+11
 1.10196479e+11 1.10196555e+11 1.10196631e+11 1.10196707e+11
 1.10196784e+11 1.10196860e+11 1.10196936e+11 1.10197013e+11
 1.10197089e+11 1.10197165e+11 1.10197241e+11 1.10197318e+11
 1.10197394e+11 1.10197470e+11 1.10197547e+11 1.10197623e+11
 1.10197699e+11 1.10197776e+11 1.10197852e+11 1.10197928e+11
 1.10198004e+11 1.10198081e+11 1.10198157e+11 1.10198233e+11
 1.10198310e+11 1.10198386e+11 1.10198462e+11 1.10198538e+11
 1.10198615e+11 1.10198691e+11 1.10198767e+11 1.10198844e+11
 1.10198920e+11 1.10198996e+11 1.10199073e+11 1.10199149e+11
 1.10199225e+11 1.10199301e+11 1.10199378e+11 1.10199454e+11
 1.10199530e+11 1.10199607e+11 1.10199683e+11 1.10199759e+11
 1.10199835e+11 1.10199912e+11 1.10199988e+11 1.10200064e+11
 1.10200141e+11 1.10202582e+11 1.10202658e+11 1.10202735e+11
 1.10202811e+11 1.10202887e+11 1.10202963e+11 1.10203040e+11
 1.10203116e+11 1.10203192e+11 1.10203269e+11 1.10203345e+11
 1.10203421e+11 1.10203498e+11 1.10203574e+11 1.10203650e+11
 1.10203726e+11 1.10203803e+11 1.10203879e+11 1.10203955e+11
 1.10204032e+11 1.10204108e+11 1.10204184e+11 1.10204260e+11
 1.10204337e+11 1.10204413e+11 1.10204489e+11 1.10204566e+11
 1.10204642e+11 1.10204718e+11 1.10204795e+11 1.10204871e+11
 1.10204947e+11 1.10205023e+11 1.10205100e+11 1.10205176e+11
 1.10205252e+11 1.10205329e+11 1.10205405e+11 1.10205481e+11
 1.10205557e+11 1.10205634e+11 1.10205710e+11 1.10205786e+11
 1.10205863e+11 1.10205939e+11 1.10206015e+11 1.10206092e+11
 1.10206168e+11 1.10206244e+11 1.10206320e+11 1.10206397e+11
 1.10206473e+11 1.10206549e+11 1.10206626e+11 1.10206702e+11
 1.10206778e+11 1.10206854e+11 1.10206931e+11 1.10207007e+11
 1.10207083e+11 1.10207160e+11 1.10207236e+11 1.10207312e+11
 1.10207389e+11 1.10207465e+11 1.10207541e+11 1.10207617e+11
 1.10207694e+11 1.10207770e+11 1.10207846e+11 1.10207923e+11
 1.10207999e+11 1.10208075e+11 1.10208151e+11 1.10208228e+11
 1.10208304e+11 1.10208380e+11 1.10208457e+11 1.10208533e+11
 1.10208609e+11 1.10208686e+11 1.10208762e+11 1.10208838e+11
 1.10208914e+11 1.10208991e+11 1.10209067e+11 1.10209143e+11
 1.10209220e+11 1.10209296e+11 1.10209372e+11 1.10209448e+11
 1.10209525e+11 1.10209601e+11 1.10209677e+11 1.10209754e+11
 1.10209830e+11] 

And y:

[ 0.00143858  0.05495827  0.07481739  0.03287334 -0.06275658  0.03744501
 -0.04392341  0.02849104  0.03173781  0.09748282  0.02854265  0.06573162
  0.08215295  0.0240697   0.00931477  0.17572605  0.06783381  0.04853354
 -0.00226023  0.03722596  0.09687121  0.10767829  0.04922701  0.08036865
  0.02371989  0.13885361  0.13903188  0.09910567  0.08793601  0.06048823
  0.03932097  0.04061129  0.03706228  0.13764936  0.14150589  0.12226208
  0.09041878  0.13638676  0.11107155  0.12261369  0.11765545  0.07425344
  0.06643712  0.1449991   0.14256909  0.0924173   0.09291525  0.12216271
  0.11272059  0.07618891  0.16787807  0.07832849  0.10786856  0.12381844
  0.14182937  0.08078092  0.11932429  0.06383649  0.02923562  0.0864741
  0.07806758  0.04514088  0.12929371  0.11769577  0.03619867  0.02811366
  0.06401639  0.06883735  0.01162673  0.0956252   0.11206549  0.0485106
  0.07269545  0.01662149  0.01287365  0.13401546  0.06300487  0.01994627
  0.00721926  0.04863274 -0.01578364  0.0235379   0.03102316  0.00392559
  0.05662182  0.04643381 -0.00665026  0.05532307 -0.01533339  0.04838893
  0.02097954  0.02551123  0.03727188 -0.04001189 -0.04294883  0.02837669
 -0.06062512 -0.0743994  -0.04665618 -0.03553261 -0.07057554 -0.07028277
 -0.07502298 -0.07247965 -0.03540266 -0.03226398 -0.08014487 -0.11907543
 -0.18521053 -0.1117617  -0.14377897 -0.07113503 -0.02480966 -0.07459746
 -0.07994097 -0.02648713 -0.10288478 -0.13328137 -0.08121377 -0.13742166
 -0.024583   -0.11391389 -0.02717251 -0.08876166 -0.04369363 -0.0790144
 -0.09589054 -0.12058701  0.00041344 -0.06646403 -0.06368366 -0.10335613
 -0.04508286 -0.18360729 -0.0551775  -0.06476622 -0.0834523  -0.01276785
 -0.04145486 -0.14549992 -0.11186823 -0.07663398 -0.11920359 -0.0539315
 -0.10507118 -0.09112374 -0.09751319 -0.06848278 -0.09031172 -0.07218853
 -0.03129234 -0.04543539 -0.00942861 -0.06711099 -0.00712202 -0.11696418
 -0.06344093  0.03624227 -0.04798777  0.01174394 -0.08326314 -0.06761215
 -0.12063419 -0.05236908 -0.03914692 -0.05370061 -0.01620056  0.06731788
 -0.06600111 -0.04601257 -0.02144361  0.00256863 -0.00093034  0.00629604
 -0.0252835  -0.00907992  0.03583489 -0.03761906  0.10325763  0.08016437
 -0.04900467  0.0110328   0.05019604 -0.04428984 -0.03208058  0.05095359
 -0.01807463  0.0691733   0.07472691  0.00659871  0.00947692  0.0014422
  0.05227057]

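Having such a large offset in x probably does not help. Once it is removed for the fitting procedure, the fit definitely works. It looks like this: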

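import matplotlib.pyplot as plt
import numpy as np

# Remove the large offset: maps x from about 1.1e11 to values near 2
scaledx = xdata * 1e-8 - 1100

# Fit a 7th-order polynomial baseline in the rescaled coordinate
coefs = np.polyfit(scaledx, ydata, 7)
base = np.poly1d(coefs)

# Evaluate the fit on a fine grid for plotting
xt = np.linspace(1.9, 2.1, 150)
yt = base(xt)

fig = plt.figure()
ax = fig.add_subplot(2, 1, 1)  # top: data and fitted baseline
bx = fig.add_subplot(2, 1, 2)  # bottom: baseline-subtracted data
ax.scatter(scaledx, ydata)
ax.plot(xt, yt)
bx.plot(scaledx, ydata - base(scaledx))

plt.show()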

where xdata and ydata are numpy arrays of the OP's data lists.

This gives:

[Figure: fit result]
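As a possible variation on the manual rescaling, NumPy's `numpy.polynomial.Polynomial.fit` maps x linearly onto the window [-1, 1] before solving the least-squares problem, so no hand-picked offset or scale factor is needed. The sketch below is not the code used for the plot above. The synthetic `xdata` and `ydata` are assumptions standing in for the OP's arrays, and the degree 7 is simply carried over from the fit above.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic stand-in for the OP's data (assumption): x values near 1.1e11
# with a slow, roughly sinusoidal baseline plus Gaussian noise.
rng = np.random.default_rng(0)
xdata = np.linspace(1.10192816e11, 1.10209830e11, 200)
phase = (xdata - xdata[0]) / (xdata[-1] - xdata[0])
ydata = 0.1 * np.sin(2 * np.pi * phase) + 0.03 * rng.standard_normal(xdata.size)

# Polynomial.fit rescales xdata to [-1, 1] internally before fitting,
# which keeps the least-squares problem well conditioned.
baseline = Polynomial.fit(xdata, ydata, deg=7)

# The returned object is evaluated with the original, unscaled x values.
cleanspec = ydata - baseline(xdata)

# Compare the scatter before and after baseline subtraction.
print(ydata.std(), cleanspec.std())
```

Note that the coefficients stored in `baseline.coef` refer to the scaled variable. Calling `baseline.convert()` would express them in terms of the raw x, which brings back the huge dynamic range, so it is usually more convenient to keep working with the callable object.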

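Concerning the "poorly conditioned" warning, one should keep in mind how simple linear least-squares optimization works. In the case of a polynomial, one builds the matrix: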

A = [
 [1, x1, x1**2, ...],
 [1, x2, x2**2, ...],
 ...
 [1, xn, xn**2, ...]
]

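and one needs the inverse B^(-1) of B, where B = AT.A and AT is the transpose of A. Now, with x values of the order of 1e11, B will have entries of order 1 at one end of the diagonal and, for a second-order polynomial, entries of order 1e44 at the other. For a third-order polynomial this gets correspondingly worse. As a result, computing the inverse numerically becomes unstable. Luckily, as done above, this can be solved easily by simply rescaling the problem at hand.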
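The effect can be checked numerically. The following is a minimal sketch that builds B = AT.A for a second-order polynomial, once with raw x values and once with the rescaled ones. The evenly spaced grid `x` and the helper `normal_matrix` are hypothetical, introduced only for illustration; the grid merely spans roughly the same range as the OP's data.

```python
import numpy as np

# Hypothetical x grid covering roughly the OP's range (assumption).
x = np.linspace(1.10192816e11, 1.10209830e11, 200)
scaledx = x * 1e-8 - 1100  # same rescaling as in the fit above

def normal_matrix(xs, order):
    """Return B = A^T A, where A has columns 1, x, x**2, ..., x**order."""
    A = np.vander(xs, order + 1, increasing=True)
    return A.T @ A

B_raw = normal_matrix(x, 2)
B_scaled = normal_matrix(scaledx, 2)

# The largest entry of B is sum(x**4): each term is about 1e44 for raw x,
# but only of order 10 once x has been rescaled to values near 2.
print(B_raw.max(), B_scaled.max())

# Condition numbers of B. For the raw case the value is far beyond what
# float64 can resolve, so treat it as a rough indicator only.
cond_raw = np.linalg.cond(B_raw)
cond_scaled = np.linalg.cond(B_scaled)
print(cond_raw, cond_scaled)
```

With raw x the condition number of B exceeds anything double precision can invert reliably, while the rescaled problem stays within a workable range.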