List (data) and returns a list of only those entries x of data for which |x−μ| < M α_μ (in the same order as they occur in data)

Question

This problem has me a bit confused about how to interpret x. So far I have done the following:

import numpy as np

def keep(data, M):

    array_DATA = np.array(data)
    
    N = len(data)
    std = np.std(data)
    
    stderr = (std)/(np.sqrt(N-1))
    mean = (sum(data))/N
    
    L = abs((array_DATA)-(mean))
    R = (M)*(stderr)
    
    for i in range(0, N):
        if L < R:
            x = np.append(x, data[i])
    return (list(x))

This is for a standard-error function. Can anyone help me sort this out?

Answer 1

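As Tim Roberts commented, the bug is in your final selection of data points. Your code is almost there, but not quite: it looks like you mixed up a Python list with a numpy.array, which has the powerful feature of conditional selection (NumPy's documentation covers this under boolean array indexing). Simply replacing the for loop with x = arr[L < R] does what you expect: every element of L is compared with R, and the resulting Boolean array is used to index your arr variable, selecting the elements whose mask value is True.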
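To make the selection step concrete, here is a minimal sketch of Boolean-mask indexing. The values and the threshold R are made up for illustration and are not taken from the question. It also shows why the original `if L < R:` line cannot work on a whole array.

```python
import numpy as np

# Illustrative values only (an assumption, not the asker's data).
arr = np.array([0, 1, 2, 3, 4, 5])
L = np.abs(arr - arr.mean())   # distance of each entry from the mean (2.5)
R = 2.0                        # hypothetical threshold

mask = L < R                   # element-wise comparison gives a Boolean array
selected = arr[mask]           # keeps only the entries where mask is True

print(mask)
print(selected)                # entries 1, 2, 3, 4 survive

# Using the whole array as an `if` condition fails, because an array with
# more than one element has no single truth value.
try:
    if L < R:
        pass
except ValueError as exc:
    print("ValueError:", exc)
```

The mask preserves the original order of the entries, which is what the problem statement asks for.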

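I would also add that, since you already import numpy and build a numpy.array from the input data, you should use it more. Your code will likely run much faster on larger datasets, because numpy.array operations are vectorized.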

So, with a small change to the final data selection in your code:

import numpy as np


def keep(data, M):

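    arr = np.array(data)
    
    N = len(arr)
    std = arr.std()  # population standard deviation (ddof=0)
    
    # Standard error of the mean; dividing the ddof=0 std by sqrt(N - 1)
    # equals dividing the ddof=1 std by sqrt(N).
    stderr = std / np.sqrt(N - 1)
    mean = arr.sum() / N
    
    L = abs(arr - mean)  # |x - mean| for every entry at once
    R = M * stderr       # cut-off distance from the mean

    x = arr[L < R]       # Boolean mask keeps entries in their original order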

    return list(x)


print(keep(range(6), 2))
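A note on the standard-error line, assuming α_μ in the problem statement denotes the standard error of the mean: dividing the default `std()` (which uses ddof=0) by sqrt(N − 1) gives the same number as the more familiar form, the sample standard deviation (ddof=1) divided by sqrt(N). The sketch below checks this numerically on the same `range(6)` input; the variable names are illustrative.

```python
import numpy as np

data = np.arange(6)
N = data.size

# Form used in keep(): population std (ddof=0) over sqrt(N - 1)
stderr_a = data.std() / np.sqrt(N - 1)

# Textbook form: sample std (ddof=1) over sqrt(N)
stderr_b = data.std(ddof=1) / np.sqrt(N)

print(stderr_a, stderr_b)  # both are approximately 0.7638

# The two expressions agree up to floating-point rounding.
assert np.isclose(stderr_a, stderr_b)
```

With M = 2 the cut-off is about 1.53, so for `range(6)` (mean 2.5) the entries 1, 2, 3 and 4 are kept.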

If you would rather not use numpy.array and stick with a Python list, this code performs the same calculation:

def keep_list_based(data, M):

    N = len(data)
    # np.std and np.sqrt still rely on the `import numpy as np` above
    std = np.std(data)
    
    stderr = std / np.sqrt(N - 1)
    mean = sum(data) / N
    
    L = [abs(point - mean) for point in data]
    R = M * stderr

    # equivalent to x = data[L < R]
    bool_mask = [y < R for y in L]
    x = [point for point, passed_condition in zip(data, bool_mask) if passed_condition]
    
    return x


print(keep_list_based(list(range(6)), 2))

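Here bool_mask = [y < R for y in L] builds a list of Booleans marking which points satisfy the condition, and x = [point for point, passed_condition in zip(data, bool_mask) if passed_condition] filters the points in data according to that Boolean list, using zip to pair up the two lists.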
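If the goal is to avoid NumPy entirely, the same filter can probably be written with only the standard library. The following is a sketch, not part of the original answer: `keep_stdlib` is a hypothetical name, and it folds the mask and the filter into a single comprehension.

```python
import math
import statistics


def keep_stdlib(data, M):
    """Hypothetical NumPy-free variant of keep_list_based (illustration only)."""
    data = list(data)  # accept any iterable, e.g. range(6)
    N = len(data)
    mean = statistics.mean(data)
    # statistics.stdev is the sample standard deviation (N - 1 denominator),
    # so the standard error of the mean is stdev / sqrt(N).
    stderr = statistics.stdev(data) / math.sqrt(N)
    R = M * stderr
    # One pass: keep each point whose distance from the mean is below R.
    return [point for point in data if abs(point - mean) < R]


print(keep_stdlib(range(6), 2))  # [1, 2, 3, 4]
```

Note that `statistics.stdev` needs at least two data points, and that results for values lying exactly on the cut-off may differ from the NumPy version because of floating-point rounding.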


python

arrays

numpy

physics