Does the new implementation of the GIL in Python handle the race condition issue?

Question

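I read an article about multithreading in Python in which they try to use synchronization to solve a race condition issue. I ran the example code below to reproduce the race condition: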

import threading 

# global variable x 
x = 0

def increment(): 
    """ 
    function to increment global variable x 
    """
    global x 
    x += 1

def thread_task(): 
    """ 
    task for thread 
    calls increment function 100000 times. 
    """
    for _ in range(100000): 
        increment() 

def main_task(): 
    global x 
    # setting global variable x as 0 
    x = 0

    # creating threads 
    t1 = threading.Thread(target=thread_task) 
    t2 = threading.Thread(target=thread_task) 

    # start threads 
    t1.start() 
    t2.start() 

    # wait until threads finish their job 
    t1.join() 
    t2.join() 

if __name__ == "__main__": 
    for i in range(10): 
        main_task() 
        print("Iteration {0}: x = {1}".format(i,x))

When I use Python 2.7.15, it returns the same results as in the article. But it does not when I use Python 3.6.9 (every iteration returns the same result, 200000).

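I wonder whether the new implementation of the GIL (since Python 3.2) handles the race condition issue. If it does, why do Lock and Mutex still exist in Python >3.2? If it doesn't, why is there no conflict when running multiple threads that modify a shared resource, as in the example above?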

These questions have been on my mind lately as I try to learn more about how Python works under the hood.

Answer 1

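The change you are referring to replaced the check interval with the switch interval. This means that rather than switching threads every 100 bytecodes, the interpreter switches every 5 milliseconds.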

References: https://pymotw.com/3/sys/threads.html https://mail.python.org/pipermail/python-dev/2009-October/093321.html

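So if your code runs fast enough, it will never experience a thread switch, and the operations may appear atomic to you when in fact they are not. The race condition does not show up because there is no actual interleaving of threads. x += 1 is actually four bytecodes (the last two instructions in the listing below just return None):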

>>> dis.dis(sync.increment)
 11           0 LOAD_GLOBAL              0 (x)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD         
              7 STORE_GLOBAL             0 (x)
             10 LOAD_CONST               2 (None)
             13 RETURN_VALUE

A thread switch in the interpreter can happen between any two bytecodes.
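
To check this on your own interpreter, the following is a minimal sketch that lists the opcodes of increment and looks at what sits between the load and the store of x. The exact opcode names depend on the CPython version (older releases show INPLACE_ADD, newer ones a different add instruction), so the sketch only relies on LOAD_GLOBAL and STORE_GLOBAL being present, which is an assumption about the interpreter in use.

```python
import dis

x = 0

def increment():
    global x
    x += 1

# Collect the opcode names of increment(); names vary between CPython versions.
opnames = [ins.opname for ins in dis.get_instructions(increment)]

# Assumption: the global read and write compile to LOAD_GLOBAL / STORE_GLOBAL.
load_index = opnames.index("LOAD_GLOBAL")
store_index = opnames.index("STORE_GLOBAL")

# Anything in this gap runs after x was read but before it is written back.
gap = opnames[load_index + 1:store_index]

print(opnames)
print("instructions between load and store:", gap)
```

If another thread updates x while this thread is inside that gap, the later STORE_GLOBAL overwrites the other thread's update.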

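Consider that in 2.7 the following always prints 200000, because the check interval is set so high that each thread runs to completion before the next one starts. The same effect can be constructed with the switch interval.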

import sys
import threading 

print(sys.getcheckinterval())
sys.setcheckinterval(1000000)

# global variable x 
x = 0

def increment(): 
    """ 
    function to increment global variable x 
    """
    global x 
    x += 1

def thread_task(): 
    """ 
    task for thread 
    calls increment function 100000 times. 
    """
    for _ in range(100000): 
        increment() 

def main_task(): 
    global x 
    # setting global variable x as 0 
    x = 0

    # creating threads 
    t1 = threading.Thread(target=thread_task) 
    t2 = threading.Thread(target=thread_task) 

    # start threads 
    t1.start() 
    t2.start() 

    # wait until threads finish their job 
    t1.join() 
    t2.join() 

if __name__ == "__main__": 
    for i in range(10): 
        main_task() 
        print("Iteration {0}: x = {1}".format(i,x))
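
For Python 3, a rough equivalent might look like the sketch below, which uses sys.setswitchinterval instead of sys.setcheckinterval. The helper name run_with_interval and the chosen interval values are only illustrative. A long interval makes a lost update unlikely but does not rule it out, so the printed totals are not guaranteed.

```python
import sys
import threading

x = 0

def thread_task(n):
    """Increment the global x n times without any synchronization."""
    global x
    for _ in range(n):
        x += 1  # unsynchronized read-modify-write

def run_with_interval(interval, n=100000):
    """Run two unsynchronized threads under the given switch interval (seconds)."""
    global x
    x = 0
    old_interval = sys.getswitchinterval()
    sys.setswitchinterval(interval)
    try:
        threads = [threading.Thread(target=thread_task, args=(n,)) for _ in range(2)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        # Always restore the previous setting.
        sys.setswitchinterval(old_interval)
    return x

# Long interval: each thread most likely finishes before a switch is requested.
print("interval 1 s:", run_with_interval(1.0))
# Very short interval: switches are requested far more often.
print("interval 1 us:", run_with_interval(1e-6))
```

Whether the short interval actually produces a total below 200000 depends on the interpreter version and the machine.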

Answer 2

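The GIL protects individual bytecode instructions. A race condition, in contrast, is an incorrect ordering of instructions, which means it involves multiple bytecode instructions. As a result, the GIL cannot protect against race conditions outside of the Python VM itself.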
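
This is why Lock is still needed: it makes the whole read-modify-write sequence a single critical section. Below is a minimal sketch of the question's example with a threading.Lock added. The name x_lock and the n parameter are introduced here for illustration.

```python
import threading

x = 0
x_lock = threading.Lock()  # guards every access to x

def increment():
    """Increment the global x while holding the lock."""
    global x
    with x_lock:
        # Load, add and store now happen as one critical section.
        x += 1

def thread_task(n):
    for _ in range(n):
        increment()

def main_task(n=100000):
    """Run two threads that each increment x n times and return the total."""
    global x
    x = 0
    threads = [threading.Thread(target=thread_task, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x

print("x =", main_task())
```

With the lock in place the total no longer depends on when the interpreter decides to switch threads, at the cost of acquiring and releasing the lock on every increment.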

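However, by their very nature, race conditions do not always trigger. Certain GIL strategies are more or less likely to trigger certain race conditions. A thread shorter than the GIL window is never interrupted, and one longer than the GIL window is always interrupted.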

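Your increment function has 6 bytecode instructions, as does the inner loop that calls it. Of these, 4 instructions must finish at once, meaning there are 3 possible switching points that can corrupt the result. Your entire thread_task function takes about 0.015 s to 0.020 s (on my system).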

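With the old GIL switching every 100 instructions, the loop is guaranteed to be interrupted every 8.3 calls, or roughly 12,000 times over 100,000 calls. With the new GIL switching every 5 ms, the loop is interrupted only about 3 times.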
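
As a back-of-the-envelope check of these figures, the sketch below redoes the arithmetic. It assumes 12 bytecode instructions per loop iteration (6 in increment plus 6 in the loop), takes the runtime range measured above as given, and treats the intervals as exact, which a real interpreter does not do.

```python
# Old GIL: a switch is considered every `check_interval` bytecode instructions.
instructions_per_call = 6 + 6   # increment() body plus the loop step calling it
check_interval = 100            # default check interval in Python 2.7
calls = 100000                  # iterations per thread

calls_per_switch = check_interval / instructions_per_call
old_gil_switches = calls / calls_per_switch

# New GIL: a switch is requested every `switch_interval` seconds.
switch_interval = 0.005                   # default switch interval (5 ms)
runtime_low, runtime_high = 0.015, 0.020  # measured thread_task runtime, seconds

new_gil_switches_low = runtime_low / switch_interval
new_gil_switches_high = runtime_high / switch_interval

print("old GIL: one switch every %.1f calls, about %d switches"
      % (calls_per_switch, round(old_gil_switches)))
print("new GIL: about %d to %d switches"
      % (round(new_gil_switches_low), round(new_gil_switches_high)))
```

The gap of three to four orders of magnitude between the two counts is what makes the lost update so much rarer under the new GIL, even though it remains possible.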
