Do a curve fit for every serial number?

Question

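I am trying to do a curve fit for every serial number. My first idea was to do it with a group by or a list, and then check the numbers in my list of dfs. But whichever approach I come up with, I have to type in the serial numbers by hand. Is there a way to check my serial number column and fit a curve for the first serial number it finds, then for the second one, and so on?

Here is part of my df: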

    Date    Hour    Minute  Second  Seriel number   mrwSmpVWi   mrwSmpP
0   04.06.2019  0   0   1   701086  4.2 51
1   04.06.2019  0   0   1   701092  4.6 75
2   04.06.2019  0   0   1   701088  4.3 58
3   04.06.2019  0   0   2   701085  4.2 52
4   04.06.2019  0   0   2   701091  4.5 71
5   04.06.2019  0   0   2   701089  4.3 59
6   04.06.2019  0   0   3   701087  4.0 56
7   04.06.2019  0   0   4   701090  3.8 44
8   04.06.2019  0   10  0   701092  4.3 58
9   04.06.2019  0   10  0   701086  4.3 59
10  04.06.2019  0   10  1   701088  4.4 63
11  04.06.2019  0   10  1   701085  4.4 65
12  04.06.2019  0   10  1   701091  4.5 71
13  04.06.2019  0   10  2   701089  4.5 69
14  04.06.2019  0   10  3   701087  4.4 71
15  04.06.2019  0   10  4   701090  3.5 34
16  04.06.2019  0   20  0   701092  4.3 64
17  04.06.2019  0   20  1   701086  4.4 69
18  04.06.2019  0   20  1   701088  4.3 63
19  04.06.2019  0   20  1   701091  4.5 73
20  04.06.2019  0   20  1   701085  4.2 61
21  04.06.2019  0   20  2   701089  4.4 71

This is how I want to do the curve fit:

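import numpy as np
from scipy.optimize import curve_fit

# ohlala is the df above; columns 5 and 6 are mrwSmpVWi and mrwSmpP
x=ohlala.T.iloc[5]
y=ohlala.T.iloc[6]

# Logistic curve with parameters c, a and b
def logifunc(x,c,a,b):
    return c / (1 + (a) * np.exp(-b*(x)))

result, pcov = curve_fit(logifunc, x, y, p0=[110,400,-2])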

My goal is to end up with a df like the one in the picture.

Answer 1

The data is grouped by Serial and then passed to the data_fit function. The fitted values it returns are loaded into 3 separate columns.

Input:

   Serial  mrwSmpVWi  mrwSmpP
0  701086        4.2       52
1  701087        4.3       61
2  701086        4.5       34
3  701087        3.2       22
4  701086        2.5       23
5  701087        4.2       34

Code:

from scipy.optimize import curve_fit
import pandas as pd
import numpy as np


df = pd.DataFrame({'Serial': [701086, 701087, 701086, 701087, 701086, 701087], 'mrwSmpVWi': [4.2, 4.3, 4.5, 3.2, 2.5, 4.2], 'mrwSmpP': [52, 61, 34, 22,23, 34]})


def logifunc(x,c,a,b):
    return c / (1 + (a) * np.exp(-b*(x)))


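def data_fit(row, x, y):
    # curve_fit returns the parameters in logifunc's argument order (c, a, b),
    # so column 'a' receives c, 'b' receives a, and 'c' receives b.
    result, pcov = curve_fit(logifunc, x.values, y.values)
    row['a'] = result[0]
    row['b'] = result[1]
    row['c'] = result[2]
    return row

# Group the rows by serial number; no serial number has to be typed in by hand.
grouped_data = df.groupby('Serial')

for group_name, grouped_df in grouped_data:
    x = grouped_df['mrwSmpVWi']
    y = grouped_df['mrwSmpP']
    # Three parameters need at least three points per serial number.
    if x.shape[0] >= 3:
        # apply runs over every row of df, so each pass overwrites a, b, c
        # in all rows with the current group's fit.
        df = df.apply(data_fit, args=(x,y), axis=1)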

Output:

     Serial  mrwSmpVWi  mrwSmpP     a           b          c
0  701086.0        4.2     52.0  39.0  548.084806  16.941529
1  701087.0        4.3     61.0  39.0  548.084806  16.941529
2  701086.0        4.5     34.0  39.0  548.084806  16.941529
3  701087.0        3.2     22.0  39.0  548.084806  16.941529
4  701086.0        2.5     23.0  39.0  548.084806  16.941529
5  701087.0        4.2     34.0  39.0  548.084806  16.941529
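
In the output above every row carries the same a, b and c, because each fit is applied to the whole DataFrame. If one set of parameters per serial number is wanted, something along the following lines might work. This is only a sketch under assumptions: the helper name fit_group, the starting values p0, the maxfev setting and the synthetic test data are all made up for illustration and are not part of the answer above. The fitted values are stored under the names logifunc uses (c, a, b).

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def logifunc(x, c, a, b):
    return c / (1 + a * np.exp(-b * x))


def fit_group(group, p0=(100.0, 50.0, 1.0)):
    """Fit logifunc to the rows of one serial number (hypothetical helper).

    Returns the parameters as a Series indexed c, a, b, or NaN values when
    the group has fewer than three points or the fit does not converge.
    """
    names = ["c", "a", "b"]
    if len(group) < len(names):
        return pd.Series(np.nan, index=names)
    x = group["mrwSmpVWi"].to_numpy(dtype=float)
    y = group["mrwSmpP"].to_numpy(dtype=float)
    try:
        params, _ = curve_fit(logifunc, x, y, p0=p0, maxfev=10000)
    except RuntimeError:
        # curve_fit raises RuntimeError when the least-squares fit fails
        return pd.Series(np.nan, index=names)
    return pd.Series(params, index=names)


# Synthetic, noise-free data (an assumption for illustration only):
# two serial numbers, each following its own logistic curve.
true_params = {701086: (100.0, 50.0, 1.0), 701087: (90.0, 40.0, 1.1)}
frames = []
for serial, (c, a, b) in true_params.items():
    x = np.linspace(1.0, 8.0, 15)
    frames.append(pd.DataFrame({"Serial": serial,
                                "mrwSmpVWi": x,
                                "mrwSmpP": logifunc(x, c, a, b)}))
df = pd.concat(frames, ignore_index=True)

# One fit per serial number; groupby finds the serial numbers by itself.
fits = df.groupby("Serial")[["mrwSmpVWi", "mrwSmpP"]].apply(fit_group)

# Attach each serial number's own parameters to its rows.
result = df.join(fits, on="Serial")
print(fits)
```

Here fits has one row per serial number, and result repeats those values on every row of the matching serial number. With real measurements the starting values p0 usually matter a lot for a logistic fit, so they would need to be chosen to suit the data.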

Tags: python, matplotlib, curve-fitting, scipy, pandas