Find inflection point: connect gradient spline to original data

Question

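My ultimate goal is to determine the inflection points of the two main peaks. To do that, I want to fit a spline to the data and then somehow find the inflection points.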

t, c, k = interpolate.splrep(df_sensors_100ppm["Measurement_no"], np.gradient(df_sensors_100ppm["101"]), 
                             s=len(df_sensors_100ppm["Measurement_no"]), k=3)

N = 500
xmin, xmax = df_sensors_100ppm["Measurement_no"].min(), df_sensors_100ppm["Measurement_no"].max()
xx = np.linspace(xmin, xmax, N)
spline = interpolate.BSpline(t, c, k, extrapolate=False)

plt.plot(df_sensors_100ppm["Measurement_no"], df_sensors_100ppm["101"], 'bo', label='Original points')
plt.plot(df_sensors_100ppm["Measurement_no"], df_sensors_100ppm["101"], '-', label='', alpha = 0.3)
plt.plot(xx, spline(xx), 'r', label='BSpline')
plt.grid()
plt.legend(loc='best')
plt.show()

max_idx = np.argmax(spline(xx))
> 336

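My problem is that I do not know what the number 336 represents. I assumed it would be the data point with the highest gradient, but there are only 61 data points. How can I connect the gradient spline to my data points to find the one I am looking for? It does not matter that the inflection point does not fall exactly on a data point, so I am happy with a data point next to it. I also do not need the exact number of the data point (on the x-axis of the plot above, the range is from 6830 to ~6890); either that number or simply the zero-based index of the data point would be fine. Thanks for your help!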

df_sensors_100ppm
Measurement_no 101
6833    1081145.8
6834    1071195.6
6835    1061668.0
6836    841877.0
6837    227797.5
6838    154449.2
6839    130070.3
6840    119169.5
6841    113275.4
6842    92762.5
6843    103557.7
6844    324869.6
6845    318933.3
6846    275562.4
6847    243599.4
6848    220276.8
6849    203228.2
6850    189876.8
6851    178849.3
6852    169680.8
6853    162223.4
6854    156308.3
6855    151195.9
6856    147203.1
6857    143907.5
6858    141076.7
6859    138626.1
6860    136471.3
6861    134422.2
6862    132542.0
6863    130661.8
6864    128845.0
6865    126880.3
6866    125084.6
6867    123162.2
6868    121282.0
6869    119275.1
6870    117352.7
6871    115219.0
6872    113402.2
6873    111353.0
6874    94959.5
6875    102269.0
6876    327911.7
6877    318193.9
6878    273175.2
6879    241212.2
6880    218354.3
6881    201073.4
6882    187806.5
6883    176821.2
6884    167864.0
6885    160406.6
6886    154385.8
6887    149653.7
6888    145851.1
6889    142534.4
6890    139893.7
6891    137464.2
6892    135246.0
6893    133239.1
6894    131422.3
6895    129499.9
6896    127577.5

Answer 1

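There is no need to compute the gradient of the data yourself: you can fit the spline to the data directly and use its derivative method. For data that does not need smoothing, I personally prefer InterpolatedUnivariateSpline, since it passes through all the points: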

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline as IUS

# Convert to NumPy arrays so that the positional indexing below works as intended
x = df_sensors_100ppm["Measurement_no"].to_numpy(dtype=float)
y = df_sensors_100ppm["101"].to_numpy(dtype=float)

spline = IUS(x, y)  # interpolating cubic spline through every data point
N = 500
xx = np.linspace(x.min(), x.max(), N)

plt.plot(x, y, 'go')
plt.plot(xx, spline(xx))
plt.plot(xx, spline.derivative()(xx))
plt.show()

# np.argsort gives the positions that sort the array from min to max;
# in your case you want the last two (the two largest derivative values)

x[np.argsort(spline.derivative()(x))[-2:]]
>>array([6843., 6875.])
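One caveat with taking the two largest derivative values is that both could, in other data sets, belong to the same rising edge. A possible refinement, which is not part of the original answer, is to evaluate the derivative on a dense grid and keep only separate local maxima with scipy.signal.find_peaks. The sketch below copies the data from the question; the 0.5 height factor is an arbitrary illustrative threshold and would need tuning for other signals.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
from scipy.signal import find_peaks

# Data copied from the question (Measurement_no 6833..6896, column "101")
x = np.arange(6833, 6897, dtype=float)
y = np.array([
    1081145.8, 1071195.6, 1061668.0, 841877.0, 227797.5, 154449.2, 130070.3, 119169.5,
    113275.4, 92762.5, 103557.7, 324869.6, 318933.3, 275562.4, 243599.4, 220276.8,
    203228.2, 189876.8, 178849.3, 169680.8, 162223.4, 156308.3, 151195.9, 147203.1,
    143907.5, 141076.7, 138626.1, 136471.3, 134422.2, 132542.0, 130661.8, 128845.0,
    126880.3, 125084.6, 123162.2, 121282.0, 119275.1, 117352.7, 115219.0, 113402.2,
    111353.0, 94959.5, 102269.0, 327911.7, 318193.9, 273175.2, 241212.2, 218354.3,
    201073.4, 187806.5, 176821.2, 167864.0, 160406.6, 154385.8, 149653.7, 145851.1,
    142534.4, 139893.7, 137464.2, 135246.0, 133239.1, 131422.3, 129499.9, 127577.5,
])
assert len(x) == len(y)

spline = InterpolatedUnivariateSpline(x, y)   # cubic (k=3) by default
xx = np.linspace(x.min(), x.max(), 500)
slope = spline.derivative()(xx)

# Keep local maxima of the slope that reach at least half of the largest slope.
# The factor 0.5 is an assumption chosen for illustration only.
peak_idx, _ = find_peaks(slope, height=0.5 * slope.max())
steepest_x = xx[peak_idx]

# Map each location on the dense grid to the nearest actual measurement
nearest = x[np.abs(x[:, None] - steepest_x[None, :]).argmin(axis=0)]
print(steepest_x, nearest)
```

Each reported location is a point of steepest rise on the interpolated curve, and `nearest` gives the measurement number next to it.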

Answer 2

To answer your explicit question:

max_idx = np.argmax(spline(xx)) being 336 is an index into the linspace array, i.e. xx[336], which is 6875.42.
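To go from that index back to a data point, one option is to look up the x-value in the dense grid and then pick the closest measurement. The sketch below assumes the Measurement_no values run from 6833 to 6896 in steps of 1, as in the table in the question; the variable names `nearest_pos` and `nearest_measurement` are made up for illustration.

```python
import numpy as np

# Measurement_no values as listed in the question (assumed contiguous integers)
x = np.arange(6833, 6897)
xx = np.linspace(x.min(), x.max(), 500)   # same dense grid as in the question

max_idx = 336                              # index reported by np.argmax(spline(xx))
x_at_max = float(xx[max_idx])              # position on the Measurement_no axis

# Zero-based position of the closest data point, and its measurement number
nearest_pos = int(np.abs(x - x_at_max).argmin())
nearest_measurement = int(x[nearest_pos])

print(round(x_at_max, 2), nearest_pos, nearest_measurement)
# prints: 6875.42 42 6875
```

So 336 counts positions in the 500-point grid, while the matching data point is the one at zero-based position 42, i.e. Measurement_no 6875.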


python

spline