Why is Python dict update insanely slow?

I have a Python program that reads lines from a file and accumulates them into a dict. Simplified, it looks like this:

data = {'file_name':''}
with open('file_name') as in_fd:
    for line in in_fd:
        data['file_name'] += line

I found that it took hours to finish.

Then I made a small change to the program:

data = {'file_name':[]}
with open('file_name') as in_fd:
    for line in in_fd:
        data['file_name'].append(line)
    data['file_name'] = ''.join(data['file_name'])

It finished within a few seconds.

I thought it was += that made the program slow, but apparently it is not; see the test results below.

I know we can improve performance when concatenating strings by appending to a list and joining at the end. But I did not expect such a huge performance gap between append-and-join and add-and-assign.

So I decided to do some more tests, and finally found that it is the dict update operation that makes the program insanely slow. Here is a script:

import time
LOOPS = 10000
WORD = 'ABC'*100

# 1) append to a list, join once at the end
s1=time.time()
buf1 = []
for i in xrange(LOOPS):
    buf1.append(WORD)
ss = ''.join(buf1)

# 2) += on a plain local string
s2=time.time()
buf2 = ''
for i in xrange(LOOPS):
    buf2 += WORD

# 3) += on a string stored in a dict
s3=time.time()
buf3 = {'1':''}
for i in xrange(LOOPS):
    buf3['1'] += WORD

# 4) append to a list stored in a dict, join once at the end
s4=time.time()
buf4 = {'1':[]}
for i in xrange(LOOPS):
    buf4['1'].append(WORD)
buf4['1'] = ''.join(buf4['1'])

s5=time.time()
print s2-s1, s3-s2, s4-s3, s5-s4

On my laptop (MacBook Pro mid-2013, OS X 10.9.5, CPython 2.7.10), it prints:

0.00299620628357 0.00415587425232 3.49465799332 0.00231599807739

Inspired by juanpa.arrivillaga's comment, I made a small change to the second loop:

trivial_reference = []
buf2 = ''
for i in xrange(LOOPS):
    buf2 += WORD
    trivial_reference.append(buf2)  # add a trivial reference to avoid optimization

After this change, the second loop now takes 19 seconds to complete. So it does seem to be just an optimization issue, as juanpa.arrivillaga said.

+= performs very badly when building large strings, but it can be efficient in one case in CPython, as explained below.

For guaranteed fast string concatenation, use str.join().


From the String Concatenation section under Python Performance Tips:

Avoid this:

s = ""
for substring in list:
    s += substring

Use s = "".join(list) instead. The former is a very common and catastrophic mistake when building large strings.


Why is s += x faster than s['1'] += x or s[0] += x?

From Note 6:

CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.
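
The quadratic behaviour this note warns about is easy to reproduce by scaling the loop count in a case where the optimization cannot apply, for example a string held in a dict (a rough sketch; absolute times depend on the machine and CPython version):

import time

WORD = 'ABC' * 100

def dict_concat(loops):
    # += on a dict value: the string has more than one reference,
    # so CPython cannot resize it in place and must copy everything each time
    d = {'1': ''}
    for _ in xrange(loops):
        d['1'] += WORD

for loops in (1000, 2000, 4000):
    start = time.time()
    dict_concat(loops)
    print loops, time.time() - start   # doubling `loops` roughly quadruples the time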

CPython's optimization is that if a string has only one reference, it can be resized in place. Without that optimization, every += has to copy the whole string built so far, which is what makes the loop's total cost grow quadratically.

/* Note that we don't have to modify *unicode for unshared Unicode objects, since we can modify them in-place. */
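
The reference-count difference is easy to observe with sys.getrefcount (a minimal sketch; the exact numbers are CPython-specific and include the temporary reference created by the call itself):

import sys

s = 'ABC' * 100              # a fresh string bound to a bare name: essentially one reference
print sys.getrefcount(s)     # typically 2: `s` itself plus the argument temporary

d = {'1': 'ABC' * 100}
temp = d['1']                # how `d['1'] += x` really starts: load the value out first
print sys.getrefcount(temp)  # typically 3: `temp`, `d['1']` and the argument temporary,
                             # so CPython cannot resize the string in place during the +=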

The other two forms are not simple in-place additions; in fact, they are not in-place additions at all.

s[0] += x

is equivalent to:

temp = s[0]  # Extra reference: `s[0]` and `temp` both point to the same string now.
temp += x
s[0] = temp

Example:

>>> lst = [1, 2, 3]
>>> def func():
...     lst[0] = 90
...     return 100
...
>>> lst[0] += func()
>>> print lst
[101, 2, 3]  # Not [190, 2, 3]
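
The bytecode tells the same story. Disassembling both forms (a quick sketch; the opcodes in the comments are for CPython 2.7 and vary across versions) shows the subscript form reading the value out and storing it back around the add:

import dis

dis.dis(compile("s += x", "<expr>", "exec"))
# LOAD_NAME s / LOAD_NAME x / INPLACE_ADD / STORE_NAME s

dis.dis(compile("s[0] += x", "<expr>", "exec"))
# LOAD_NAME s / LOAD_CONST 0 / DUP_TOPX 2 / BINARY_SUBSCR   <- read the value out,
# LOAD_NAME x / INPLACE_ADD / ROT_THREE / STORE_SUBSCR      <- add, then store it back;
# the container keeps its own reference to the string the whole time.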

But in general, never use s += x to concatenate strings; always use str.join on a collection of strings.
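
Applied back to the original program, that means joining once at the end, or simply reading the whole file in one call since every line is kept anyway (a minimal sketch, reusing the 'file_name' key from the question):

data = {}
with open('file_name') as in_fd:
    data['file_name'] = ''.join(in_fd)   # a file object iterates over its lines

# or, even simpler:
with open('file_name') as in_fd:
    data['file_name'] = in_fd.read()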


Timings

LOOPS = 1000
WORD = 'ABC'*100


def list_append():
    buf1 = [WORD for _ in xrange(LOOPS)]
    return ''.join(buf1)


def str_concat():
    buf2 = ''
    for i in xrange(LOOPS):
        buf2 += WORD


def dict_val_concat():
    buf3 = {'1': ''}
    for i in xrange(LOOPS):
        buf3['1'] += WORD
    return buf3['1']


def list_val_concat():
    buf4 = ['']
    for i in xrange(LOOPS):
        buf4[0] += WORD
    return buf4[0]


def val_pop_concat():
    buf5 = ['']
    for i in xrange(LOOPS):
        val = buf5.pop()
        val += WORD
        buf5.append(val)
    return buf5[0]


def val_assign_concat():
    buf6 = ['']
    for i in xrange(LOOPS):
        val = buf6[0]
        val += WORD
        buf6[0] = val
    return buf6[0]


>>> %timeit list_append()
10000 loops, best of 3: 71.2 us per loop
>>> %timeit str_concat()
1000 loops, best of 3: 276 us per loop
>>> %timeit dict_val_concat()
100 loops, best of 3: 9.66 ms per loop
>>> %timeit list_val_concat()
100 loops, best of 3: 9.64 ms per loop
>>> %timeit val_pop_concat()
1000 loops, best of 3: 556 us per loop
>>> %timeit val_assign_concat()
100 loops, best of 3: 9.31 ms per loop

val_pop_concat is fast here because by using pop() we remove the list's reference to that string, and CPython can then resize it in place (guessed correctly).
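
That guess is easy to check with sys.getrefcount (a rough sketch; the exact counts are CPython-specific and include the temporary reference added by the call itself):

import sys

buf = ['ABC' * 100]
val = buf[0]                 # indexing: the list and `val` both still reference the string
print sys.getrefcount(val)   # typically 3, so += must build a new string

buf = ['ABC' * 100]
val = buf.pop()              # pop(): the list's reference is gone
print sys.getrefcount(val)   # typically 2, so += can resize the string in place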