慢 Python 红黑树性能

Slow Python Red Black Tree Performance

下面的Python红黑树实现需要~2s来执行测试函数,这没有多大意义,因为在测试打印出来的情况下树高是14。所以..我假设我在代码中做了一些愚蠢的事情。您能否提供一些加快代码速度的技巧?

import math

BLACK = 0
RED = 1

class RBTreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None
        self.color = RED


def setColor(n, c):
    n.color = c
    return n


def color(n):
    if not n:
        return BLACK
    return n.color


def isRed(n):
    return color(n) == RED


def rotateLeft(h):
    x = h.right
    h.right = x.left
    x.left = h
    x.color = h.color
    h.color = RED
    return x


def rotateRight(h):
    x = h.left
    h.left = x.right
    x.right = h
    x.color = h.color
    h.color = RED
    return x


def flipColor(n):
    n.left.color = BLACK
    n.right.color = BLACK
    n.color = RED
    return n


def insert(n, v):
    if not n:
        return RBTreeNode(v)
    if v >= n.val:
        n.right = insert(n.right, v)
    else:
        n.left = insert(n.left, v)

    if isRed(n.right) and not isRed(n.left):
        n = rotateLeft(n)
    if isRed(n.left) and isRed(n.left.left):
        n = rotateRight(n)
    if isRed(n.left) and isRed(n.right):
        n = flipColor(n)
    return n


def height(n):
    if not n:
        return 0

    h1 = height(n.left)
    h2 = height(n.right)
    return 1 + max(h1, h2)


def printNode(n, indent):
    if not n:
        return
    print " " * indent,
    (v, c) = n.val
    print v, "(BLACK)" if c == 0 else "(RED)"
    printNode(n.left, indent + 4)
    printNode(n.right, indent + 4)


def test():
    c = xrange(0, 32768)
    tree = None
    for i in c:
        tree = insert(tree, i)
    print 2 * math.log(len(c)) / math.log(2), height(tree)
#    printNode(tree, 0)

if __name__ == "__main__":
    test()

我用 cProfile 快速分析了您的程序。看来,大部分时间都花在了 color 功能上。

vagrant@vagrant-ubuntu-utopic-64:/vagrant$ python -m cProfile tree.py
         4227046 function calls (3801045 primitive calls) in 3.498 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    3.498    3.498 tree.py:1(<module>)
  1835000    1.762    0.000    1.762    0.000 tree.py:17(color)
  1835000    0.602    0.000    2.364    0.000 tree.py:22(isRed)
    32753    0.016    0.000    0.016    0.000 tree.py:25(rotateLeft)
    32752    0.014    0.000    0.014    0.000 tree.py:41(flipColor)
458769/32768    1.071    0.000    3.478    0.000 tree.py:47(insert)
        1    0.000    0.000    0.000    0.000 tree.py:6(RBTreeNode)
    32768    0.013    0.000    0.013    0.000 tree.py:7(__init__)
        1    0.014    0.014    3.492    3.492 tree.py:80(test)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

如果将 if not n: 替换为 if n is None:,您应该会得到不错的加速。

not n 本质上是测试 __nonzero__ is implemented and then calls it. If it is not implemented then __len__ 是否在定义时被调用。

另请注意,PEP 8 表示,与 None 的比较应始终与 is 进行比较。


color 仅由 isRed 函数使用,因此您可能希望完全删除 color 函数或重新实现 isRed 而不使用 `color 以获得额外的加速。

def isRed(n):
    if n is None:
        return False
    else:
        return n.color == RED

通过这个实现,我得到了以下结果

vagrant@vagrant-ubuntu-utopic-64:/vagrant$ python -m cProfile tree_none_no_color.py
         2392046 function calls (1966045 primitive calls) in 0.985 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.985    0.985 tree_none_no_color.py:1(<module>)
  1835000    0.286    0.000    0.286    0.000 tree_none_no_color.py:22(isRed)
    32753    0.014    0.000    0.014    0.000 tree_none_no_color.py:28(rotateLeft)
    32752    0.013    0.000    0.013    0.000 tree_none_no_color.py:44(flipColor)
458769/32768    0.646    0.000    0.969    0.000 tree_none_no_color.py:50(insert)
        1    0.000    0.000    0.000    0.000 tree_none_no_color.py:6(RBTreeNode)
    32768    0.011    0.000    0.011    0.000 tree_none_no_color.py:7(__init__)
        1    0.011    0.011    0.980    0.980 tree_none_no_color.py:83(test)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}