Python multiprocessing context switching of CPUs

Question

I wrote this simple code to test multiprocessing reads from a global dictionary object:

import numpy as np
import multiprocessing as mp
import psutil

from itertools import repeat

def computations_x( max_int ):
    
    # Draw 1000 random keys (with replacement) for each of the two vectors.
    
    mask_1   = np.random.randint( low=0, high=max_int, size=1000  )
    mask_2   = np.random.randint( low=0, high=max_int, size=1000  )
    
    exponent_1 = np.sqrt( np.pi )
    vector_1   = np.array( [ read_obj[ k ]**( exponent_1 ) for k in mask_1  ]  )
    vector_2   = np.array( [ read_obj[ k ]**np.pi for k in mask_2  ]  )
    
    result = []
    
    for j in range(100):
        res_col = []
        for i in range(100):
            
            c = np.multiply( vector_1, vector_2 ).sum( axis=0 )
            res_col.append(c)
        
        res_col = np.array( res_col )
        
        result.append( res_col )
        
    result = np.array( result )
    
    return result
            

# Module-level, read-only lookup table. With the default "fork" start method
# on Linux, each worker inherits it from the parent without pickling.
max_int  = 1000
keys     = np.arange(0, max_int)
read_obj = { k: np.random.rand( 1000 ) for k in keys }

if __name__ == "__main__":

    # Leave one physical core free for the OS and other processes.
    number_processors      = psutil.cpu_count( logical=False )
    #number_used_processors = 1
    number_used_processors = max( 1, number_processors - 1 )

    number_tasks           = number_used_processors

    pool        = mp.Pool( processes = number_used_processors )

    args        = list( repeat( max_int, number_tasks ) )
    results     = pool.map( computations_x, args )

    pool.close()
    pool.join()

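However, when looking at CPU usage, I see that the OS switches the work between CPUs while the calculations are running. I am running Ubuntu 18.04. Is this normal behavior when using Python's multiprocessing module? This is what I observe in the system monitor while debugging the code (I am using Eclipse 2019 for debugging).

Any help is appreciated. In my main project I need to share a global "read-only" object across processes in the same spirit as here, and I want to make sure this does not severely affect performance. I also want to make sure that all tasks are executed concurrently within the Pool class. Thanks.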

Answer 1

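I would say this is normal behavior, since the OS has to ensure that other processes are not starved of CPU time.

Here is a good article on the basics of OS schedulers: https://www.ardanlabs.com/blog/2018/08/scheduling-in-go-part1.html

It focuses on Golang, but the first part is quite general.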
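
If you want to see the effect of the scheduler for yourself, here is a minimal sketch of pinning each pool worker to a single CPU. It is not part of the question's code and rests on a few assumptions: it is Linux-only (os.sched_setaffinity and os.sched_getaffinity are not available on every platform), and the names pinned_worker and run_pinned are made up for illustration. Pinning only stops the kernel from migrating a process to another CPU; the process can still be preempted on the CPU it is pinned to.

```python
import multiprocessing as mp
import os


def pinned_worker(cpu_id):
    """Pin the calling process to one CPU, do some work, report the affinity."""
    # Linux-only call; pid 0 means "the calling process".
    os.sched_setaffinity(0, {cpu_id})
    # Placeholder busy work so the process actually spends time on the CPU.
    sum(i * i for i in range(200_000))
    return os.getpid(), sorted(os.sched_getaffinity(0))


def run_pinned(n_workers=2):
    """Run one pinned task per worker, cycling through the allowed CPUs."""
    allowed = sorted(os.sched_getaffinity(0))
    cpu_ids = [allowed[i % len(allowed)] for i in range(n_workers)]
    with mp.Pool(processes=n_workers) as pool:
        return cpu_ids, pool.map(pinned_worker, cpu_ids)


if __name__ == "__main__":
    cpu_ids, reports = run_pinned()
    for cpu_id, (pid, affinity) in zip(cpu_ids, reports):
        print(f"pid {pid}: requested CPU {cpu_id}, allowed CPUs {affinity}")
```

Whether pinning actually helps depends on the workload. For a CPU-bound job like the one in the question, the occasional migration you see in the system monitor is usually not worth fighting.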


python

python-multiprocessing