Detrend with mask Python

Question

I have a file that I read as follows:

data1 = np.loadtxt('lc1.out') 
x = data1[:, 0]
y = data1[:, 1]

I want to detrend it, and I found a very useful link:

model = np.polyfit(x, y, 2)
predicted = np.polyval(model, x)

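However, I would like to mask part of the data, so that only the points outside the mask are used for the fit. For example, I want to use only the data below 639.5 and above 641.5, together with a second-order polynomial fit.

I had the idea of using ma.masked_outside(x, 639.5, 641.5), which makes it easy to keep the elements outside the mask in an array, but I do not understand how to use it with polyfit.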

Answer 1

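There is probably no compelling reason to use masked arrays in your use case, unless it is for performance reasons or because you need the mask later on. So I will show how to do it both with and without masked arrays.

But let us first detrend without any masking, so that we have a reference:

Second-order polynomial detrending without masking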

import numpy as np
import matplotlib.pyplot as plt

data1 = np.loadtxt('lc1.out')
x, y = data1.T

fig = plt.figure()

plt.subplot(2, 1, 1)
plt.title('polyfit, original data set')
plt.plot(x, y, 'c.')

coeff = np.polyfit(x, y, 2)

# no need to use the original x values here just for visualizing the polynomial
x_poly = np.linspace(x.min(), x.max())
y_poly = np.polyval(coeff, x_poly)
plt.plot(x_poly, y_poly, 'r-', linewidth=3)

mid = len(x_poly) // 2
plt.annotate('y = {:.7g} x\xB2 + {:.7g} x + {:.7g}'.format(*coeff),
             (x_poly[mid], y_poly[mid]), (0, 48), textcoords='offset points',
             arrowprops={'arrowstyle': '->'}, horizontalalignment='center')

plt.subplot(2, 1, 2)
plt.title('detrended')

# we need the original x values here, so we can remove the trend from all points
trend = np.polyval(coeff, x)
# note that simply subtracting the trend might not be enough for other data sets
plt.plot(x, y - trend, 'b.')
fig.show()

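Take note of the coefficients of the polynomial.

Second-order polynomial detrending, selecting significant points

We can simply create new x and y arrays that contain only the wanted points. There is less room for error here.

This works in 3 steps. First, we use comparison operators on the array of interest, which results in a bool array with True values at the indices where the comparison holds and False values everywhere else.

Then we pass the bool array to np.where(), which returns the index numbers at which the bool array is True (strictly speaking, a tuple holding one index array per dimension), i.e. it answers the question: "Where is my array truthy?"

Finally, we make use of NumPy's advanced indexing and apply the resulting index array as an index to the x and y arrays, which filters out all unwanted indices.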
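To make the three steps easier to follow, here is a minimal sketch on a tiny array. The sample values are made up for illustration and are not taken from lc1.out; only the interval bounds match the example further down.

```python
import numpy as np

# Hypothetical sample values, chosen only to illustrate the three steps.
x = np.array([640.0, 640.5, 641.0, 641.5, 642.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Step 1: comparison operators give a bool array, True where a point is wanted.
wanted = (x < 640.75) | (x > 641.25)

# Step 2: np.where() with a single argument returns a tuple holding
# one index array per dimension; x is 1-D, so the tuple has one entry.
select = np.where(wanted)

# Step 3: advanced indexing keeps only the elements at those indices.
x_selection = x[select]
y_selection = y[select]

print(wanted)
print(select[0])
print(x_selection, y_selection)
```

Indexing directly with the bool array, x[wanted], selects the same elements, so the np.where() step is optional if the index numbers themselves are not needed.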

import numpy as np
import matplotlib.pyplot as plt

data1 = np.loadtxt('lc1.out')
x, y = data1.T
select = np.where((x < 640.75) | (x > 641.25))
x_selection = x[select]  # numpy advanced indexing
y_selection = y[select]  # numpy advanced indexing

fig = plt.figure()

plt.subplot(2, 1, 1)
plt.title('polyfit, selecting significant points')
plt.plot(x_selection, y_selection, 'c.')

coeff = np.polyfit(x_selection, y_selection, 2)

# no need to use the original x values here just for visualizing the polynomial
x_poly = np.linspace(x_selection.min(), x_selection.max())
y_poly = np.polyval(coeff, x_poly)
plt.plot(x_poly, y_poly, 'r-', linewidth=3)

mid = len(x_poly) // 2
plt.annotate('y = {:.7g} x\xB2 + {:.7g} x + {:.7g}'.format(*coeff),
             (x_poly[mid], y_poly[mid]), (0, 48), textcoords='offset points',
             arrowprops={'arrowstyle': '->'}, horizontalalignment='center')

plt.subplot(2, 1, 2)
plt.title('detrended')

# we need the original x values here, so we can remove the trend from all points
trend = np.polyval(coeff, x)
# note that simply subtracting the trend might not be enough for other data sets
plt.plot(x, y - trend, 'b.')
fig.show()

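As expected, the coefficients are different now.

Second-order polynomial detrending, masking unwanted points

Of course we can also use masked arrays. Note the inverted logic: the masked points are the ones we do not want. In the example data, we do not want the points inside an interval, so we use ma.masked_inside().

If, for performance reasons, we want to avoid creating a full copy of the original array, we can pass the keyword copy=False. Making the original array read-only then protects us from accidentally changing values in the masked array by altering the original array.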
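The effect of copy=False and the read-only flag can be checked on a small array. This is only a sketch: the values are made up, and np.shares_memory() is used here just to show whether the masked array sits on top of the original data.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical sample values, not taken from lc1.out.
x = np.array([640.0, 640.5, 641.0, 641.5, 642.0])
x.flags.writeable = False  # safety measure, as we don't copy

# With copy=False the masked array is a view on the original data.
x_view = ma.masked_inside(x, 640.75, 641.25, copy=False)
# The default (copy=True) gives the masked array its own data.
x_copy = ma.masked_inside(x, 640.75, 641.25)

shares_view = np.shares_memory(x_view.data, x)
shares_copy = np.shares_memory(x_copy.data, x)
print(shares_view, shares_copy)

# Writing to the read-only original is refused instead of
# silently changing the data behind the masked view.
try:
    x[0] = 0.0
    write_refused = False
except ValueError:
    write_refused = True
print(write_refused)
```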

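With masked arrays, we need to use the version of the polyfit() function from the numpy.ma submodule, which properly ignores the unwanted values of the masked x array together with the corresponding values of the unmasked companion array y. If we do not do that, we get wrong results. Note that this is an easy mistake to make, so we might want to stick to the other method if we can.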

import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt

data1 = np.loadtxt('lc1.out')
x, y = data1.T
x.flags.writeable = False  # safety measure, as we don't copy
x_masked = ma.masked_inside(x, 640.75, 641.25, copy=False)

fig = plt.figure()

plt.subplot(2, 1, 1)
plt.title('polyfit, masking unwanted points')
plt.plot(x_masked, y, 'c.')

coeff = ma.polyfit(x_masked, y, 2)

# no need to use the original x values here just for visualizing the polynomial
x_poly = np.linspace(x_masked.min(), x_masked.max())
y_poly = np.polyval(coeff, x_poly)
plt.plot(x_poly, y_poly, 'r-', linewidth=3)

mid = len(x_poly) // 2
plt.annotate('y = {:.7g} x\xB2 + {:.7g} x + {:.7g}'.format(*coeff),
             (x_poly[mid], y_poly[mid]), (0, 48), textcoords='offset points',
             arrowprops={'arrowstyle': '->'}, horizontalalignment='center')

plt.subplot(2, 1, 2)
plt.title('detrended')

# we need the original x values here, so we can remove the trend from all points
trend = np.polyval(coeff, x)
# note that simply subtracting the trend might not be enough for other data sets
plt.plot(x, y - trend, 'b.')
fig.show()

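The coefficients are the same as with the other method, which is good. If we had mistakenly used np.polyfit(), we would have ended up with the same coefficients as in the unmasked reference.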
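The relationship between the three sets of coefficients can be checked without plotting. The sketch below uses synthetic data as a stand-in for lc1.out (an assumption: a quadratic trend with a dip inside the excluded interval), so the numbers will differ from those of the real file.

```python
import numpy as np
import numpy.ma as ma

# Synthetic stand-in for lc1.out (an assumption): a quadratic trend
# with a dip inside the interval that should be excluded from the fit.
x = np.linspace(639.0, 643.0, 201)
y = 0.02 * (x - 641.0) ** 2 + 1.0
inside = (x >= 640.75) & (x <= 641.25)
y[inside] -= 0.5

# Reference: fit on all points, no masking.
coeff_reference = np.polyfit(x, y, 2)

# Selection approach: fit only the points outside the interval.
coeff_selected = np.polyfit(x[~inside], y[~inside], 2)

# Masked-array approach: ma.polyfit() skips the masked points.
x_masked = ma.masked_inside(x, 640.75, 641.25)
coeff_masked = ma.polyfit(x_masked, y, 2)

# The mistake: np.polyfit() works on the underlying data and ignores the mask.
coeff_mistake = np.polyfit(x_masked, y, 2)

print(np.allclose(coeff_masked, coeff_selected))
print(np.allclose(coeff_mistake, coeff_reference))
```

Comparing coefficient arrays like this is a cheap way to notice when the wrong polyfit() has been picked up.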
