Making use of pstats and cProfile in Python. How to make array work faster?

Question

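This is my first attempt at optimizing code and I'm excited. I have read some articles but still have some questions.

1) First, what takes up so much time in my code below? I think it is the array here: array.append(len(set(line.split()))). I read online that lists in Python work faster, but I don't see how a list would be used here. Does anyone know how to improve this?

2) Are there any other improvements I am missing?

3) Also, I read online that for loops slow code down a lot. Can that be improved here? (I guess writing the code in C would be best, but :D )

4) Why do people suggest always looking at "ncalls" and "tottime"? To me "percall" makes more sense. It tells you how fast a function or a call is.

5) In the accepted answer to the question linked here, Class B is said to use a list. Does it? To me, I still see an array and a FOR loop, which should slow things down. Fastest way to grow a numpy numeric array

Thanks.

New cProfile result: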

 618384 function calls in 9.966 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    19686    3.927    0.000    4.897    0.000 <ipython-input-120-d8351bb3dd17>:14(f)
    78744    3.797    0.000    3.797    0.000 {numpy.core.multiarray.array}
    19686    0.948    0.000    0.948    0.000 {range}
    19686    0.252    0.000    0.252    0.000 {method 'partition' of 'numpy.ndarray' objects}
    19686    0.134    0.000    0.930    0.000 function_base.py:2896(_median)
        1    0.126    0.126    9.965    9.965 <ipython-input-120-d8351bb3dd17>:22(<module>)
    19686    0.125    0.000    0.351    0.000 _methods.py:53(_mean)
    19686    0.120    0.000    0.120    0.000 {method 'reduce' of 'numpy.ufunc' objects}
    19686    0.094    0.000    4.793    0.000 function_base.py:2747(_ureduce)
    19686    0.071    0.000    0.071    0.000 {method 'flatten' of 'numpy.ndarray' objects}
    19686    0.065    0.000    0.065    0.000 {method 'format' of 'str' objects}
    78744    0.055    0.000    3.852    0.000 numeric.py:464(asanyarray)

New code:

import numpy
import cProfile

pr = cProfile.Profile()
pr.enable()

#paths to files
read_path = '../tweet_input/tweets.txt'
write_path = "../tweet_output/ft2.txt"


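def f(a):
    # insert a into the global list `array`, keeping it sorted
    for i in range(0, len(array)):
        if a <= array[i]:
            array.insert(i, a)
            break
    else:
        # a is larger than every element (or the list is empty): append at the end
        array.append(a)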

try:
    with open(read_path) as inf, open(write_path, 'a') as outf:
        array = []
        #for every line (tweet) in the file
        for line in inf:                                            ###Loop is bad. Builtin function is good
            #append amount of unique words to the array
            wordCount = len(set(line.split()))
            #print wordCount, array
            f(wordCount)
            #write current median of the array to the file
            result = "{:.2f}\n".format(numpy.median(array))
            outf.write(result)
except IOError as e:
    print 'Operation failed: %s' % e.strerror


###Service
pr.disable()
pr.print_stats(sort = 'time')

OLD cProfile result:

 551211 function calls in 13.195 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    78744   10.193    0.000   10.193    0.000 {numpy.core.multiarray.array}

Old code:

    with open(read_path) as inf, open(write_path, 'a') as outf:
        array = []
        #for every line in the file
        for line in inf:                            
            #append amount of unique words to the array
            array.append(len(set(line.split())))
            #write current median of the array to the file
            result = "{:.2f}\n".format(numpy.median(array))
            outf.write(result)

Answer 1

The median-finding algorithm that NumPy uses is O(n log n). You call numpy.median once per line, so your algorithm ends up being O(n^2 log n).

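There are several ways to improve on this. One is to keep the array sorted (i.e. insert each element at the position that preserves the sorted order). Each insertion takes O(n) (inserting into an array is a linear-time operation), and getting the median of a sorted array is O(1), so this ends up being O(n^2).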
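The sorted-array idea could be sketched roughly as follows. This is only an illustration, not the asker's code: it is written for Python 3, the function name running_medians is made up for the example, and it uses the standard-library bisect.insort in place of the hand-written insertion loop f from the question. The search for the insertion point is a binary search, but the insertion itself still shifts elements, so each step stays O(n).

```python
import bisect


def running_medians(values):
    """Yield the median of all values seen so far, after each new value.

    Hypothetical helper for illustration; `values` is any iterable of numbers.
    """
    sorted_values = []
    for value in values:
        # Binary search for the position, then an O(n) shift to insert.
        bisect.insort(sorted_values, value)

        # The list is sorted, so the median is read off by index in O(1).
        n = len(sorted_values)
        mid = n // 2
        if n % 2:
            yield float(sorted_values[mid])
        else:
            yield (sorted_values[mid - 1] + sorted_values[mid]) / 2.0


if __name__ == "__main__":
    # Stand-in for the per-line unique word counts from the question.
    for median in running_medians([5, 1, 3, 8]):
        print("{:.2f}".format(median))
```

In the question's loop, each median would be written to the output file instead of printed, and numpy.median would no longer be called, which also avoids converting the list to a NumPy array on every line.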

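For profiling, the main thing to look at is tottime, since it tells you how much time in total your program spent in that function. percall is often less useful, as in your example: if you have a slow function (high percall) that is only called a few times (low ncalls), its tottime ends up being insignificant compared to the other functions.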
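To look at tottime first, the profile can be loaded into pstats and sorted by that column. The snippet below is a minimal Python 3 sketch under that assumption; workload, slow_but_rare, fast_but_frequent and profile_report are made-up names for the example and are not part of the question's code.

```python
import cProfile
import io
import pstats


def slow_but_rare():
    # Comparatively expensive per call, but called only once.
    return sum(i * i for i in range(200000))


def fast_but_frequent(x):
    # Cheap per call, but called many times.
    return x + 1


def workload():
    slow_but_rare()
    for i in range(100000):
        fast_but_frequent(i)


def profile_report(func, limit=5):
    """Profile func() and return a text report sorted by tottime."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    # Send the report to a string instead of stdout so it can be inspected.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("tottime").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_report(workload))
```

Sorting by "tottime" corresponds to the "Ordered by: internal time" line in the output above. Comparing the ncalls and percall columns in the same report shows whether a large tottime comes from one slow call or from many cheap ones.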


python

optimization

profiling

cprofile

pstats