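Find index of largest difference from median with numpy
I am trying to find the index of the outlier in each row, based on its difference from the median.
I get the correct index when the high number is the outlier, but whenever the low number is the outlier I still only get the index of the high number.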
import numpy as np

def findoutlier(lis):
    outliermax = np.absolute(np.max(lis) - np.median(lis))
    outliermin = np.absolute(np.min(lis) - np.median(lis))
    if outliermax > outliermin:
        argmax = np.argmax(lis, axis = 1)
        return argmax
    else:
        argmin = np.argmin(lis, axis = 1)
        return argmin

def main():
    Matx = np.array([[10,3,2],[1,2,6]])
    print(findoutlier(Matx))
    threeMatx = np.array([[1,10,2,8,5],[2,7,3,9,11],[19,2,1,1,5]])
    print(findoutlier(threeMatx))

main()
You need to specify the axis when computing the median, max and min, so that each is taken per row:
import numpy as np

def findoutlier(lis):
    # Per-row distance of the max and of the min from the row median
    omaxs = np.absolute(np.max(lis, axis=1) - np.median(lis, axis=1))
    omins = np.absolute(np.min(lis, axis=1) - np.median(lis, axis=1))
    # For each row, pick the index of whichever extreme lies farther from the median
    return [np.argmax(l) if omax > omin else np.argmin(l) for omax, omin, l in zip(omaxs, omins, lis)]

def main():
    mat_x = np.array([[10, 3, 2], [1, 2, 6]])
    print(findoutlier(mat_x))
    three_mat_x = np.array([[1, 10, 2, 8, 5], [2, 7, 3, 9, 11], [19, 2, 1, 1, 5]])
    print(findoutlier(three_mat_x))

main()
Output
[0, 2]
[1, 0, 0]
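To see why the axis matters, here is a small sketch comparing the reductions with and without `axis=1` on the second test matrix from the question. The variable names (`global_median`, `row_medians`, and so on) are only illustrative and do not appear in the original code.

```python
import numpy as np

three_mat_x = np.array([[1, 10, 2, 8, 5], [2, 7, 3, 9, 11], [19, 2, 1, 1, 5]])

# Without an axis, each reduction collapses the whole matrix to one scalar.
global_median = np.median(three_mat_x)  # 5.0
global_max = np.max(three_mat_x)        # 19
global_min = np.min(three_mat_x)        # 1

# With axis=1, each reduction returns one value per row.
row_medians = np.median(three_mat_x, axis=1)  # 5.0, 7.0, 2.0
row_maxs = np.max(three_mat_x, axis=1)        # 10, 11, 19
row_mins = np.min(three_mat_x, axis=1)        # 1, 2, 1

# The question's code makes a single global decision: |19 - 5| > |1 - 5|,
# so it returns argmax for every row, even where the min is the outlier.
global_choice = np.argmax(three_mat_x, axis=1)  # indices 1, 4, 0
print(global_choice)
```

In the second row the minimum (2) is farther from the row median (7) than the maximum (11), which is why the per-row comparison gives index 0 there while the global comparison gives index 4.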
Update
As mentioned by @user3483203, you can use numpy.where:
import numpy as np

def findoutlier(lis):
    omaxs = np.absolute(np.max(lis, axis=1) - np.median(lis, axis=1))
    omins = np.absolute(np.min(lis, axis=1) - np.median(lis, axis=1))
    return np.where(omaxs > omins, np.argmax(lis, axis=1), np.argmin(lis, axis=1))

def main():
    mat_x = np.array([[10, 3, 2], [1, 2, 6]])
    print(findoutlier(mat_x))
    three_mat_x = np.array([[1, 10, 2, 8, 5], [2, 7, 3, 9, 11], [19, 2, 1, 1, 5]])
    print(findoutlier(three_mat_x))

main()
Output
[0 2]
[1 0 0]
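A possible alternative, offered only as a sketch and not taken from the answers above: since the element farthest from the row median is always either the row's max or its min, one could compute the absolute deviation of every element and take `argmax` over it directly. The function name `largest_deviation_index` is hypothetical, and the tie behaviour differs from the versions above, as noted after the code.

```python
import numpy as np

def largest_deviation_index(arr):
    """Return, for each row, the index of the element farthest from that row's median."""
    # keepdims=True keeps the medians as a column so they broadcast against each row.
    medians = np.median(arr, axis=1, keepdims=True)
    deviations = np.abs(arr - medians)
    # argmax returns the first index in case of equal deviations.
    return np.argmax(deviations, axis=1)

mat_x = np.array([[10, 3, 2], [1, 2, 6]])
three_mat_x = np.array([[1, 10, 2, 8, 5], [2, 7, 3, 9, 11], [19, 2, 1, 1, 5]])

print(largest_deviation_index(mat_x))        # [0 2]
print(largest_deviation_index(three_mat_x))  # [1 0 0]
```

When the max and min of a row are equally far from the median, the versions above return the index of the min, whereas this sketch returns whichever of the two comes first in the row.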