Threading Performance Degradation when Handling Pixels

Question

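When my Python script spawns one thread and runs the code below, it takes 0.8 seconds. When it spawns five threads that each run the same code, it takes about 5.0 seconds.

Obviously, I would like the code to finish in ~0.8 seconds even with 5 or 15 threads. Why is this happening? I have already used threads to improve the run time of other parts of my program, but for some reason they become a bottleneck here. Also, I never spawn more than 60 threads, so the thread count itself should not be hurting performance.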

from PIL import Image

# Open the image (imgName and threshold are defined elsewhere in the script)
imgx = Image.open(imgName)
imgx = imgx.convert("RGBA")
pix = imgx.load()

# Set every pixel to pure white or pure black
for y in range(imgx.size[1]):
    for x in range(imgx.size[0]):

        # Get RGBA values for a pixel
        (first, second, third, alpha) = pix[x, y]

        # Adjust the RGBA values accordingly
        if (first > threshold) or (second > threshold) or (third > threshold):
            first = 255
            second = 255
            third = 255
            alpha = 255
        else:
            first = 0
            second = 0
            third = 0
            alpha = 255

        # Set new pixel values
        pix[x, y] = (first, second, third, alpha)

Answer 1

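The standard Python interpreter (CPython) has a global lock, called the Global Interpreter Lock or GIL, that prevents pure Python code from running concurrently in multiple threads. Five threads that each run this loop therefore execute one after another, which is why five threads take roughly five times as long as one.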
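The effect can be reproduced without any image library. The sketch below is only an illustration: `count_dark` and `run_in_threads` are hypothetical names, and the loop stands in for the per-pixel work in the question. On a standard CPython build with the GIL, the elapsed time would be expected to grow roughly in proportion to the number of threads rather than stay flat.

```python
import threading
import time


def count_dark(pixels, threshold):
    """Pure-Python, CPU-bound loop, similar in spirit to the per-pixel loop."""
    dark = 0
    for value in pixels:
        if value <= threshold:
            dark += 1
    return dark


def run_in_threads(n_threads, pixels, threshold):
    """Run count_dark in n_threads threads; return (results, elapsed seconds)."""
    results = [None] * n_threads

    def worker(index):
        # Every thread does the full amount of work on the same input.
        results[index] = count_dark(pixels, threshold)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start
```

Calling `run_in_threads(1, pixels, 100)` and then `run_in_threads(5, pixels, 100)` on a list of a few million integers and comparing the two elapsed times shows whether the threads actually overlap on a given interpreter.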

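Iterating over individual pixels in a Python loop is very inefficient anyway. You should use NumPy's vectorized functions, which operate on the whole array at once. This will be much faster even in a single thread, and it has the added advantage that NumPy releases the GIL during array operations, so they can actually run in parallel in multiple threads.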

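You probably won't even need threads for this application. Using multiple processes is also an option, but it is considerably more subtle than using threads.

NumPy code roughly equivalent to what you wrote: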
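For comparison, here is a minimal sketch of the multi-process route, assuming the image has already been read into a list of rows of `(r, g, b, a)` tuples. The names `binarize_rows`, `split_rows`, and `binarize_parallel` are hypothetical. It also hints at why processes are more subtle: the worker must be a picklable top-level function, and every chunk is copied to and from the worker processes.

```python
from multiprocessing import Pool


def binarize_rows(args):
    """Binarize one chunk of rows; each pixel is an (r, g, b, a) tuple."""
    rows, threshold = args
    white, black = (255, 255, 255, 255), (0, 0, 0, 255)
    return [
        [white if max(r, g, b) > threshold else black for (r, g, b, a) in row]
        for row in rows
    ]


def split_rows(rows, n_chunks):
    """Split a list of rows into at most n_chunks contiguous chunks."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def binarize_parallel(rows, threshold, processes=4):
    """Process row chunks in separate worker processes and reassemble them."""
    chunks = [(chunk, threshold) for chunk in split_rows(rows, processes)]
    with Pool(processes=processes) as pool:
        results = pool.map(binarize_rows, chunks)
    return [row for chunk in results for row in chunk]
```

`binarize_parallel` should be called from under an `if __name__ == "__main__":` guard, which is required on platforms that start workers by spawning a fresh interpreter. Because of the copying overhead, this is only likely to pay off for large images.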

import numpy
from PIL import Image

img = Image.open(imgName).convert("RGBA")
arr = numpy.array(img)

# Split the channels of the image by rotating the axes
r, g, b, a = arr.transpose(2, 0, 1)

# Create Boolean array: True means pixel is above threshold
bw = (r > threshold) | (g > threshold) | (b > threshold)

# Set the R, G and B channels to 255 times the B/W array
# (the slice must be :3 to cover all three colour channels)
arr[:, :, :3] = 255 * bw[:, :, numpy.newaxis]

# Set alpha channel to 255
arr[:, :, 3] = 255

# Create new PIL image from array
new_img = Image.fromarray(arr)
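If several images still need to be processed concurrently, the array code above can be wrapped in a function and handed to a thread pool. This is a sketch under assumptions, not part of the original answer: `binarize` and `binarize_many` are hypothetical names, and the inputs are assumed to be `(height, width, 4)` `uint8` RGBA arrays such as those returned by `numpy.array(img)`.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy


def binarize(arr, threshold):
    """Return a black/white copy of an (H, W, 4) uint8 RGBA array."""
    out = arr.copy()
    # True where any of R, G, B exceeds the threshold
    bright = (arr[:, :, :3] > threshold).any(axis=2)
    # Broadcast the 0/255 mask across the three colour channels
    out[:, :, :3] = numpy.where(bright, 255, 0)[:, :, numpy.newaxis]
    # Force the alpha channel to fully opaque
    out[:, :, 3] = 255
    return out


def binarize_many(arrays, threshold, max_workers=4):
    """Binarize several arrays using a pool of threads; order is preserved."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda a: binarize(a, threshold), arrays))
```

How much the threads overlap depends on how much of the time is spent inside NumPy's compiled loops, so it is worth timing `binarize_many` against a plain loop over `binarize` before relying on it.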


Tags: python, rgb, image, pixel, python-multithreading