How to approximate points more correctly

I am trying to approximate my data, but I need a smoother line. How can I achieve that?

import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import numpy as np

m_x = [0.22, 0.29, 0.38, 0.52, 0.55, 0.67, 0.68, 0.74, 0.83, 1.05, 1.06, 1.19, 1.26, 1.32, 1.37, 1.38, 1.46, 1.51, 1.61, 1.62, 1.66, 1.87, 1.93, 2.01, 2.09, 2.24, 2.26, 2.3, 2.33, 2.41, 2.44, 2.51, 2.53, 2.58, 2.64, 2.65, 2.76, 3.01, 3.17, 3.21, 3.24, 3.3, 3.42, 3.51, 3.67, 3.72, 3.74, 3.83, 3.84, 3.86, 3.95, 4.01, 4.02, 4.13, 4.28, 4.36, 4.4]
m_y = [3.96, 4.21, 2.48, 4.77, 4.13, 4.74, 5.06, 4.73, 4.59, 4.79, 5.53, 6.14, 5.71, 5.96, 5.31, 5.38, 5.41, 4.79, 5.33, 5.86, 5.03, 5.35, 5.29, 7.41, 5.56, 5.48, 5.77, 5.52, 5.68, 5.76, 5.99, 5.61, 5.78, 5.79, 5.65, 5.57, 6.1, 5.87, 5.89, 5.75, 5.89, 6.1, 5.81, 6.05, 8.31, 5.84, 6.36, 5.21, 5.81, 7.88, 6.63, 6.39, 5.99, 5.86, 5.93, 6.29, 6.07]
x = np.array(m_x)
y = np.array(m_y)

plt.plot(x, y, 'ro', ms = 5)
plt.show()

spl = interp1d(x, y, fill_value = 'extrapolate')
xs = np.linspace(-3, 3, 1000)
plt.plot(xs, spl(xs), 'g', lw = 3)
plt.axis([0, 5, 2, 10])
plt.show()

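Raw data:


What I need:


What the program produces:


UPD: Among other things, I need access to all the values of the resulting curve, and I need to extrapolate it on the left to the y-axis and on the right to the end of the plot.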


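Seaborn's lmplot will fit a curve and display the confidence interval. It accepts an order parameter, which lets you fit a polynomial instead of a straight line. The higher the order, the more flexible (and the more prone to overfitting) the fit.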

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

m_x = [0.22, 0.29, 0.38, 0.52, 0.55, 0.67, 0.68, 0.74, 0.83, 1.05, 1.06, 1.19, 1.26, 1.32, 1.37, 1.38, 1.46, 1.51, 1.61, 1.62, 1.66, 1.87, 1.93, 2.01, 2.09, 2.24, 2.26, 2.3, 2.33, 2.41, 2.44, 2.51, 2.53, 2.58, 2.64, 2.65, 2.76, 3.01, 3.17, 3.21, 3.24, 3.3, 3.42, 3.51, 3.67, 3.72, 3.74, 3.83, 3.84, 3.86, 3.95, 4.01, 4.02, 4.13, 4.28, 4.36, 4.4]
m_y = [3.96, 4.21, 2.48, 4.77, 4.13, 4.74, 5.06, 4.73, 4.59, 4.79, 5.53, 6.14, 5.71, 5.96, 5.31, 5.38, 5.41, 4.79, 5.33, 5.86, 5.03, 5.35, 5.29, 7.41, 5.56, 5.48, 5.77, 5.52, 5.68, 5.76, 5.99, 5.61, 5.78, 5.79, 5.65, 5.57, 6.1, 5.87, 5.89, 5.75, 5.89, 6.1, 5.81, 6.05, 8.31, 5.84, 6.36, 5.21, 5.81, 7.88, 6.63, 6.39, 5.99, 5.86, 5.93, 6.29, 6.07]
x = np.array(m_x)
y = np.array(m_y)

df = pd.DataFrame({'x':x,'y':y})
sns.lmplot(x='x',y='y', data=df, order=2)
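As far as I know, lmplot only draws the fit and does not hand back the fitted values, which the update to the question asks for. A minimal sketch of getting comparable numbers yourself, assuming that order=2 corresponds to a degree-2 np.polyfit; the arrays are a small subset of the question's data, and the names curve, grid and fitted are only illustrative:

```python
import numpy as np

# Subset of the question's data, used only to keep the example short
x = np.array([0.22, 0.52, 0.83, 1.19, 1.61, 2.09, 2.51, 3.01, 3.51, 4.01, 4.4])
y = np.array([3.96, 4.77, 4.59, 6.14, 5.33, 5.56, 5.61, 5.87, 6.05, 6.39, 6.07])

# Degree-2 least-squares polynomial (assumed equivalent of order=2)
coeffs = np.polyfit(x, y, deg=2)
curve = np.poly1d(coeffs)

# Evaluate on a grid wider than the data: from the y-axis to the plot edge
grid = np.linspace(0, 5, 200)
fitted = curve(grid)
```

fitted then holds every value of the curve on grid, and plt.plot(grid, fitted) would draw it.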

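Also, if you know your data follows some trend (a logarithmic one, for example), you can transform the data so that it lies along a straight line and find the regression coefficients of that line:

a = np.polyfit(np.log(x), y, 1)
y_fit = a[0] * np.log(x) + a[1]  # separate name so the original y is not overwritten

Then

plt.plot(x, y_fit, 'g', lw = 3)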
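To extrapolate the logarithmic fit as the update asks, the fitted model can be evaluated on a wider grid. A minimal sketch on a subset of the question's data; log_model and grid are illustrative names, and the grid starts slightly above 0 because log(0) is undefined, so this curve cannot reach the y-axis itself:

```python
import numpy as np

# Subset of the question's data, used only to keep the example short
x = np.array([0.22, 0.52, 0.83, 1.19, 1.61, 2.09, 2.51, 3.01, 3.51, 4.01, 4.4])
y = np.array([3.96, 4.77, 4.59, 6.14, 5.33, 5.56, 5.61, 5.87, 6.05, 6.39, 6.07])

# Straight-line fit of y against log(x)
slope, intercept = np.polyfit(np.log(x), y, 1)

def log_model(t):
    """Evaluate the fitted curve y = slope * log(t) + intercept."""
    return slope * np.log(t) + intercept

# Grid extending past the data on both sides (must stay strictly positive)
grid = np.linspace(0.01, 5, 500)
curve = log_model(grid)
```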

You can perform a polynomial fit on your data to get a smoother line

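d = 10

x2 = x.reshape(-1, 1)  # column vector, so hstack builds a matrix
# Design matrix with columns x**0, x**1, ..., x**d
xd = np.hstack([x2**i for i in range(d+1)])

# Least-squares coefficients from the normal equations
theta = np.linalg.inv(xd.T @ xd) @ xd.T @ y
plt.plot(x, xd @ theta)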

You can change the value of d to get different curves
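One caveat: explicitly inverting xd.T @ xd can become numerically unstable as d grows. A sketch of the same least-squares fit using np.linalg.lstsq instead, on a subset of the question's data and with a lower degree chosen only for illustration:

```python
import numpy as np

# Subset of the question's data, used only to keep the example short
x = np.array([0.22, 0.52, 0.83, 1.19, 1.61, 2.09, 2.51, 3.01, 3.51, 4.01, 4.4])
y = np.array([3.96, 4.77, 4.59, 6.14, 5.33, 5.56, 5.61, 5.87, 6.05, 6.39, 6.07])

d = 3  # illustrative degree, not a recommendation

# Design matrix with columns x**0, x**1, ..., x**d
xd = np.vander(x, d + 1, increasing=True)

# Solve the least-squares problem without forming an explicit inverse
theta, residuals, rank, singular_values = np.linalg.lstsq(xd, y, rcond=None)
fitted = xd @ theta
```

theta is ordered from the constant term upward, which is the reverse of what np.polyfit returns.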

EDIT:

Here is a simpler way

d = 10

theta = np.polyfit(x, y, deg= d)
model = np.poly1d(theta)

plt.plot(x, y, 'ro')
plt.plot(x, model(x))

Yes, you can use this method to compute the delta values (the residuals between the data and the fit)

delta = y - model(x)
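Because the model is a callable, it can also be evaluated outside the data range. Keep in mind that high-degree polynomials tend to swing wildly once they leave the data, so a lower degree is usually safer for extrapolation. A sketch using numpy.polynomial.Polynomial.fit on a subset of the question's data; the degree 3 and the grid limits are assumptions for illustration:

```python
import numpy as np
from numpy.polynomial import Polynomial

# Subset of the question's data, used only to keep the example short
x = np.array([0.22, 0.52, 0.83, 1.19, 1.61, 2.09, 2.51, 3.01, 3.51, 4.01, 4.4])
y = np.array([3.96, 4.77, 4.59, 6.14, 5.33, 5.56, 5.61, 5.87, 6.05, 6.39, 6.07])

# Low-degree least-squares fit; Polynomial.fit rescales x internally
model = Polynomial.fit(x, y, deg=3)

# All curve values from the y-axis to the right edge of the plot
grid = np.linspace(0, 5, 500)
curve = model(grid)

# Residuals at the original points
delta = y - model(x)
```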

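A very standard way of smoothing data is to use a smoothing window (which amounts to a convolution). Basically, a window of a specified size is rolled over your data, and each point is replaced by the average of the data points around it (i.e. those inside the window). Below is an implementation using numpy. It offers several options for handling edge effects. I use a uniform window here, but your window could also be shaped like a Gaussian.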

import numpy as np

def smooth_moving_window(l, window_len=11, include_edges='Off'):
    """Smooth l with a uniform moving average; include_edges is 'Off', 'On' or 'Wrap'."""

    if window_len%2==0:
        raise ValueError('>window_len< kwarg in function >smooth_moving_window< must be odd')

    # print l
    l = np.array(l,dtype=float)
    w = np.ones(window_len,'d')

    if include_edges == 'On':
        edge_list = np.ones(window_len)
        begin_list = [x * l[0] for x in edge_list]
        end_list = [x * l[-1] for x in edge_list]
    
        s = np.r_[begin_list, l, end_list]
    
        y = np.convolve(w/w.sum(), s , mode='same')
        y = y[window_len:-window_len]  # drop the padding so y lines up with l
    
    elif include_edges == 'Wrap':
        s=np.r_[2 * l[0] - l[window_len-1::-1], l, 2 * l[-1] - l[-1:-window_len:-1]]
        y = np.convolve(w/w.sum(), s , mode='same')
        y = y[window_len:-window_len+1]

    elif include_edges == 'Off':
        y = np.convolve(w/w.sum(), l, mode='valid')

    else:
        raise NameError('Error in >include_edges< kwarg of function >smooth_moving_window<')

    return y
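For the Gaussian-shaped window mentioned above, a minimal sketch might look like the following. The function name smooth_gaussian_window and the sigma parameter are my own additions, not part of the code above, and edges are handled by repeating the first and last values. Like any moving window over array indices, it treats the samples as evenly spaced, which the question's x values are not.

```python
import numpy as np

def smooth_gaussian_window(values, window_len=11, sigma=2.0):
    """Weighted moving average with Gaussian weights (illustrative helper)."""
    if window_len % 2 == 0:
        raise ValueError("window_len must be odd")
    values = np.asarray(values, dtype=float)
    half = window_len // 2

    # Gaussian weights centred on the window, normalised to sum to 1
    offsets = np.arange(-half, half + 1)
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    weights /= weights.sum()

    # Repeat the edge values so the output has the same length as the input
    padded = np.pad(values, half, mode="edge")
    return np.convolve(padded, weights, mode="valid")

# First few y values from the question
noisy = np.array([3.96, 4.21, 2.48, 4.77, 4.13, 4.74,
                  5.06, 4.73, 4.59, 4.79, 5.53, 6.14])
smoothed = smooth_gaussian_window(noisy, window_len=5, sigma=1.0)
```

A larger sigma flattens the weights towards the uniform window; a smaller one keeps the result closer to the raw data.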