Running median excluding zeros
I borrowed some code to compute the running median of an array, but I want to exclude zeros from each running window. Here is the code:
    def RunningMedian(seq, M):
        seq = iter(seq)
        s = []
        m = M // 2
        # Set up list s (to be sorted) and load deque with first window of seq
        s = [item for item in islice(seq, M)]
        d = deque(s)
        # Simple lambda function to handle even/odd window sizes
        median = lambda : s[m] if bool(M&1) else (s[m-1]+s[m]) * 0.5
        # Sort it in increasing order and extract the median ("center" of the sorted window)
        s.sort()
        # remove zeros from the array
        s = np.trim_zeros(s)
        print(s)
        medians = [median()]
        for item in seq:
            old = d.popleft()          # pop oldest from left
            d.append(item)             # push newest in from right
            del s[bisect_left(s, old)] # locate insertion point and then remove old
            insort(s, item)            # insert newest such that new sort is not required
            s = np.trim_zeros(s)
            print(s)
            medians.append(median())
        return medians
I am testing the code, but it fails. My example is a = np.array([5, 2, 0, 9, 4, 2, 6, 8]), and I call the function as RunningMedian(a, 3). The running windows I want are:
[2,5]
[2,9]
[4,9]
[2,4,9]
[2,4,6]
[2,6,8]
However, when I call the function above, it gives:
[2, 5]
[2, 9]
[4, 9]
[2, 9]
[2, 6]
[2, 8]
It also returns the wrong medians. The medians returned by the call are: [5, 9, 9, 9, 6, 8]
Can anyone help me fix this? Thanks.
Try:
[s[s!=0] for s in np.dstack((a[:-2], a[1:-1], a[2:]))[0]]
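The one-liner above builds the zero-free windows for a fixed width of 3, but it stops short of the medians. A minimal sketch in plain Python that goes the rest of the way might look like the following. The helper name `running_median_nonzero` is hypothetical, and returning `None` for an all-zero window is an assumption, since the question does not say what should happen there. It rebuilds each window from scratch, trading the incremental update of the original code for simplicity.

```python
from collections import deque
from statistics import median


def running_median_nonzero(seq, M):
    """Median of the non-zero values in each length-M window of seq."""
    window = deque(maxlen=M)  # a bounded deque drops the oldest item itself
    medians = []
    for item in seq:
        window.append(item)
        if len(window) < M:
            continue  # the first full window has not been reached yet
        nonzero = [x for x in window if x != 0]
        # Assumption: an all-zero window has no median, so report None
        medians.append(median(nonzero) if nonzero else None)
    return medians


print(running_median_nonzero([5, 2, 0, 9, 4, 2, 6, 8], 3))
# [3.5, 5.5, 6.5, 4, 4, 6]
```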
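The main problem with your code is that throwing away the zeros in s changes the length of the list you are working with, while the deque d still holds the zeros. When a zero later leaves the window, del s[bisect_left(s, old)] removes a non-zero element instead, which explains why you did not get 3-length windows at the end.
I suggest another approach: use a proper function for median and ignore the zeros locally, inside that function. It is cleaner that way, and you do not need trim_zeros (importing numpy just for that is quite bad practice). Based on your function, I came up with the following: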
    from itertools import islice
    from collections import deque
    from bisect import bisect_left,insort

    def median(s):
        sp = [nz for nz in s if nz!=0]
        print(sp)
        Mnow = len(sp)
        mnow = Mnow // 2
        return sp[mnow] if bool(Mnow&1) else (sp[mnow-1]+sp[mnow])*0.5

    def RunningMedian(seq, M):
        seq = iter(seq)
        s = []
        m = M // 2
        # Set up list s (to be sorted) and load deque with first window of seq
        s = [item for item in islice(seq, M)]
        d = deque(s)
        ## Simple lambda function to handle even/odd window sizes
        #median = lambda: s[m] if bool(M&1) else (s[m-1]+s[m])*0.5
        # Sort it in increasing order and extract the median ("center" of the sorted window)
        s.sort()
        medians = [median(s)]
        for item in seq:
            old = d.popleft()          # pop oldest from left
            d.append(item)             # push newest in from right
            del s[bisect_left(s, old)] # locate insertion point and then remove old
            insort(s, item)            # insert newest such that new sort is not required
            medians.append(median(s))
        return medians
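Most of the changes are in the new median function, and I moved the print there. I also added your imports. Note that I would approach this problem very differently, and the current "fixed" version quite possibly smells of duct tape.
Anyway, it seems to do what you want: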
>>> a = [5, 2, 0, 9, 4, 2, 6, 8]
>>> RunningMedian(a,3)
[2, 5]
[2, 9]
[4, 9]
[2, 4, 9]
[2, 4, 6]
[2, 6, 8]
[3.5, 5.5, 6.5, 4, 4, 6]
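One edge case is not covered by the run above: if a window consists only of zeros, the filtered list inside `median` is empty and the indexing raises `IndexError`. A possible variant is sketched below, offered only as an illustration and not as the different approach alluded to earlier. It keeps the zeros in the deque but never puts them in the sorted list, so deletions always hit the right element. The name `running_median_skip_zeros` and the `None` result for an all-zero window are assumptions.

```python
from bisect import bisect_left, insort
from collections import deque
from itertools import islice


def running_median_skip_zeros(seq, M):
    """Running median over length-M windows, ignoring zeros in each window."""
    seq = iter(seq)
    d = deque(islice(seq, M))            # full window, zeros included
    s = sorted(x for x in d if x != 0)   # sorted non-zero values only

    def median():
        n = len(s)
        if n == 0:
            return None  # assumption: an all-zero window has no median
        m = n // 2
        return s[m] if n & 1 else (s[m - 1] + s[m]) * 0.5

    medians = [median()]
    for item in seq:
        old = d.popleft()                # oldest value leaves the window
        d.append(item)                   # newest value enters the window
        if old != 0:
            del s[bisect_left(s, old)]   # zeros were never stored in s
        if item != 0:
            insort(s, item)              # keep s sorted without re-sorting
        medians.append(median())
    return medians


print(running_median_skip_zeros([5, 2, 0, 9, 4, 2, 6, 8], 3))
# [3.5, 5.5, 6.5, 4, 4, 6]
```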
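The reason the medians were off in your version is that the parity of the window was determined by M, the input window width. If you discard the zeros, you end up with smaller (even-length) windows. In that case you do not want the middle (= second) element; you need to average the two middle elements. Hence your wrong output.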
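The parity problem can be reproduced in isolation. In this small sketch the two function names are made up for the comparison; the first copies the lambda from the question, the second takes the parity from the list as it actually is after trimming.

```python
def median_fixed_parity(s, M):
    """Median as in the question: index and parity come from the nominal width M."""
    m = M // 2
    return s[m] if M & 1 else (s[m - 1] + s[m]) * 0.5


def median_actual_parity(s):
    """Median using the length the list really has after zeros were dropped."""
    n = len(s)
    m = n // 2
    return s[m] if n & 1 else (s[m - 1] + s[m]) * 0.5


trimmed = [2, 9]  # the sorted window [0, 2, 9] with the zero removed
print(median_fixed_parity(trimmed, 3))  # 9
print(median_actual_parity(trimmed))    # 5.5
```

With M = 3 the first function always picks index 1, which is the larger of the two remaining values. That matches the 5 and 9 reported at the start of the question's output.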