Thresholding a python list with multiple values

Question

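OK, I have a 1000x100 array of random numbers. I want to threshold this array with a list of several numbers; these thresholds range from 3 to 9. Where the values are above the threshold, I want the sum of the rows appended to a list.

I have tried many approaches, including three nested conditionals. Right now I have found a way to compare the array against a list of numbers, but every time the comparison runs I draw a new set of random numbers.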

import numpy as np

xpatient=5
sd_healthy=2
xhealthy=7
sd_patient=2
thresholdvalue1=(xpatient-sd_healthy)*10        # 30
thresholdvalue2=(((xhealthy+sd_patient))*10)    # 90
thresholdlist=[]
x1=[]
Ahealthy=np.random.randint(10,size=(1000,100))
Apatient=np.random.randint(10,size=(1000,100))
TParray=np.random.randint(10,size=(1,61))
def thresholding(A,B):
    # collect the integers A..B-1 in thresholdlist
    for i in range(A,B):
        thresholdlist.append(i)
thresholding(thresholdvalue1,thresholdvalue2+1)
thresholdarray=np.asarray(thresholdlist)
thedivisor=10
newthreshold=(thresholdarray/thedivisor)        # 3.0, 3.1, ..., 9.0
for x in range(61):
    # the array is regenerated on every pass, which is the problem described below
    Apatient=np.random.randint(10,size=(1000,100))
    Apatient=[Apatient>=newthreshold[x]]*Apatient
    x1.append([sum(x) for x in zip(*Apatient)])

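So my for loop regenerates the random integers on every pass, but if I don't do that I can't see the thresholding for each round. I want the whole array thresholded at 3, 3.1, 3.2, and so on. I hope I got my point across. Thanks in advance.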

Answer 1

You can solve your problem with the following approach:

import numpy as np

def get_sums_by_threshold(data, threshold, axis): # axis=0 sums down the rows (one total per column), axis=1 sums across the columns (one total per row)
    result = list(np.where(data >= threshold, data, 0).sum(axis=axis))
    return result

xpatient=5
sd_healthy=2
xhealthy=7
sd_patient=2
thresholdvalue1=(xpatient-sd_healthy)*10
thresholdvalue2=(((xhealthy+sd_patient))*10)

np.random.seed(100) # to keep the generated array reproducible
data = np.random.randint(10,size=(1000,100))
thresholds = [num / 10.0 for num in range(thresholdvalue1, thresholdvalue2+1)]

sums = list(map(lambda x: get_sums_by_threshold(data, x, axis=0), thresholds))
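
As an alternative to mapping over the thresholds one at a time, the same totals can be computed in a single step with NumPy broadcasting. This is only a sketch under assumptions: the names `mask` and `all_sums` are made up for illustration, and the thresholds are hard-coded as 3.0 to 9.0 to match the values above.

```python
import numpy as np

np.random.seed(100)                              # seeded only for reproducibility
data = np.random.randint(10, size=(1000, 100))
thresholds = np.arange(30, 91) / 10.0            # 3.0, 3.1, ..., 9.0 (61 values)

# A (61, 1, 1) array compared with a (1000, 100) array broadcasts to (61, 1000, 100):
# one boolean mask of the whole data array per threshold.
mask = data >= thresholds[:, None, None]

# Zero out the values below each threshold, then collapse the row axis.
# This corresponds to axis=0 in get_sums_by_threshold.
all_sums = np.where(mask, data, 0).sum(axis=1)   # shape (61, 100)
```

Here `all_sums[k]` holds the column totals for `thresholds[k]`. The intermediate array has 61 x 1000 x 100 elements, which is fine at this size, but the per-threshold loop uses less memory for much larger inputs.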

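Be aware, though, that your initial array contains only integer values, so thresholds that round up to the same integer give identical results (e.g. 3.1, 3.2, ..., 4.0 all keep exactly the values >= 4). If you want your initial array to hold floating-point values between 0 and 9, you can do the following: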

data = np.random.randint(90,size=(1000,100)) / 10.0
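
To see the difference between integer and floating-point data, a quick check along these lines may help. It is a sketch only: `column_sums`, `int_data`, `float_data` and the two `same_for_*` flags are names invented for this illustration.

```python
import numpy as np

def column_sums(data, threshold):
    # Keep values >= threshold, zero out the rest, and sum down each column.
    return np.where(data >= threshold, data, 0).sum(axis=0)

np.random.seed(100)
int_data = np.random.randint(10, size=(1000, 100))            # integers 0..9
float_data = np.random.randint(90, size=(1000, 100)) / 10.0   # 0.0..8.9 in steps of 0.1

# Integer data: thresholds 3.1 and 4.0 both keep exactly the values >= 4.
same_for_ints = np.array_equal(column_sums(int_data, 3.1),
                               column_sums(int_data, 4.0))

# Float data: values between 3.1 and 3.9 exist, so the two thresholds differ.
same_for_floats = np.array_equal(column_sums(float_data, 3.1),
                                 column_sums(float_data, 4.0))

print(same_for_ints, same_for_floats)
```

With integer data the two thresholds cannot be told apart, while the float array gives a separate result for each 0.1 step.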

