Python: Find running median with Max-Heap and Min-Heap
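I am trying to return the running median of a stream of numbers. To do that, I use a max-heap (which stores the values in the lower half of the series) and a min-heap (which stores the values in the upper half of the series).
In particular, I am using Python's (2.0) built-in min-heap data structure from the heapq module (https://docs.python.org/2/library/heapq.html). To build the max-heap, I simply push the negative of each number I need to insert.
My Python code is as follows: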
import heapq

maxh = []
minh = []
vals=[1,2,3,4,5,6,7,8,9,10]
for val in vals:

    # Initialize the data-structure and insert/push the 1st streaming value
    if not maxh and not minh:
        heapq.heappush(maxh,-val)
        print float(val)
    elif maxh:

        # Insert/push the other streaming values
        if val>-maxh[0]:
            heapq.heappush(minh,val)
        elif val<-maxh[0]:
            heapq.heappush(maxh,-val)

        # Calculate the median
        if len(maxh)==len(minh):
            print float(-maxh[0]+minh[0])/2
        elif len(maxh)==len(minh)+1:
            print float(-maxh[0])
        elif len(minh)==len(maxh)+1:
            print float(minh[0])

        # If min-heap and max-heap grow unbalanced we rebalance them by
        # removing/popping one element from a heap and inserting/pushing
        # it into the other heap, then we calculate the median
        elif len(minh)==len(maxh)+2:
            heapq.heappush(maxh,-heapq.heappop(minh))
            print float(-maxh[0]+minh[0])/2
        elif len(maxh)==len(minh)+2:
            heapq.heappush(minh,-heapq.heappop(maxh))
            print float(-maxh[0]+minh[0])/2
Below is the full list of test cases I built to check my code:
vals=[1,2,3,4,5,6,7,8,9,10] # positive numbers, increasing series
vals=[10,9,8,7,6,5,4,3,2,1] # positive numbers, decreasing series
vals=[10,9,11,8,12,7,13,6,14,5] # positive numbers, jumping series (keeping heaps balanced)
vals=[-10,-9,-8,-7,-6,-5,-4,-3,-2,-1] # negative numbers, increasing series
vals=[-1,-2,-3,-4,-5,-6,-7,-8,-9,-10] # negative numbers, decreasing series
vals=[-10,-9,-11,-8,-12,-7,-13,-6,-14,-5] # negative numbers, jumping series (keeping heaps balanced)
vals=[-5,-4,-3,-2,-1,0,1,2,3,4,5] # mixed positive-negative numbers, increasing series
vals=[5,4,3,2,1,0,-1,-2,-3,-4,-5] # mixed positive-negative numbers, decreasing series
vals=[0,-1,1,-2,2,-3,3,-4,4,-5,5] # mixed positive-negative numbers, jumping series (keeping heaps balanced)
My code seems fine to me, but I cannot pass 4 of the 10 test cases of the online judge (https://www.hackerrank.com/challenges/ctci-find-the-running-median/problem).
Do you have any hints?
The problem is here:
# Insert/push the other streaming values
if val>-maxh[0]:
    heapq.heappush(minh,val)
elif val<-maxh[0]:
    heapq.heappush(maxh,-val)
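If val == -maxh[0] (the new value equals the current top of the max-heap), the item is never pushed onto either heap. The test case [1,1,2] should reveal the error.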
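To see the dropped value concretely, here is a condensed restatement of the question's insertion logic, written for Python 3 rather than the Python 2 of the original. The function name `running_medians_strict` and the returned element count are illustrative assumptions, not part of the original code, and the rebalance step is placed before the median calculation for brevity.

```python
import heapq


def running_medians_strict(vals):
    """Hypothetical helper: same strict > / < insertion as in the question."""
    maxh, minh = [], []  # maxh stores negated values of the lower half
    medians = []
    for val in vals:
        if not maxh and not minh:
            heapq.heappush(maxh, -val)
        elif val > -maxh[0]:
            heapq.heappush(minh, val)
        elif val < -maxh[0]:
            heapq.heappush(maxh, -val)
        # val == -maxh[0] matches no branch, so the value is silently dropped

        # Rebalance when one heap is two elements larger than the other
        if len(minh) == len(maxh) + 2:
            heapq.heappush(maxh, -heapq.heappop(minh))
        elif len(maxh) == len(minh) + 2:
            heapq.heappush(minh, -heapq.heappop(maxh))

        if len(maxh) == len(minh):
            medians.append((-maxh[0] + minh[0]) / 2.0)
        elif len(maxh) > len(minh):
            medians.append(float(-maxh[0]))
        else:
            medians.append(float(minh[0]))
    return medians, len(maxh) + len(minh)


medians, stored = running_medians_strict([1, 1, 2])
print(medians)  # [1.0, 1.0, 1.5], but the true running medians are 1, 1, 1
print(stored)   # 2, although three values were read
```

The second 1 never reaches a heap, so by the third value the heaps hold only 1 and 2, and the reported median is their average.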
A simple fix is:
# Insert/push the other streaming values
if val >= -maxh[0]:
    heapq.heappush(minh,val)
else:
    heapq.heappush(maxh,-val)
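For reference, a minimal sketch of the whole loop with the `>=` fix applied might look like the following. It is written for Python 3 and collects the medians in a list instead of printing them; the function name `running_medians_fixed` is an assumption made for illustration, and the rebalance is done before the median is read, which gives the same results as the branch layout in the question.

```python
import heapq


def running_medians_fixed(vals):
    """Hypothetical helper: two-heap running median with the >= fix."""
    maxh, minh = [], []  # maxh stores negated values of the lower half
    medians = []
    for val in vals:
        if not maxh and not minh:
            heapq.heappush(maxh, -val)   # first value of the stream
        elif val >= -maxh[0]:
            heapq.heappush(minh, val)    # ties now go to the upper half
        else:
            heapq.heappush(maxh, -val)

        # Keep the heap sizes within one element of each other
        if len(minh) == len(maxh) + 2:
            heapq.heappush(maxh, -heapq.heappop(minh))
        elif len(maxh) == len(minh) + 2:
            heapq.heappush(minh, -heapq.heappop(maxh))

        if len(maxh) == len(minh):
            medians.append((-maxh[0] + minh[0]) / 2.0)
        elif len(maxh) > len(minh):
            medians.append(float(-maxh[0]))
        else:
            medians.append(float(minh[0]))
    return medians


print(running_medians_fixed([1, 1, 2]))  # [1.0, 1.0, 1.0]
```

Because every value now lands in exactly one heap, the heap sizes always add up to the number of values read so far, which is what the median calculation relies on.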