Implementing Multiprocessing with the Same Function in a While Loop

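I have implemented an evolutionary algorithm in Python 3.8 and am trying to optimise/reduce its runtime. Because of the tight constraints on valid solutions, generating a single valid chromosome can take several minutes. To avoid spending hours just generating the initial population, I would like to use multiprocessing to generate several at once.

My code at this point is: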

populationCount = 500

def readDistanceMatrix():
    # code removed
    ...

def generateAvailableValues():
    # code removed
    ...

def generateAvailableValuesPerColumn(availableValues):
    # code removed
    ...

def generateScheduleTemplate(distanceMatrix):
    # code removed
    ...

def generateChromosome(availableValuesPerColumn, scheduleTemplate, distanceMatrix):
    # code removed
    ...

if __name__ == '__main__':
    # Data type = DataFrame
    distanceMatrix = readDistanceMatrix()
    
    # Data type = List of Integers
    availableValues = generateAvailableValues()

    # Data type = List containing Lists of Integers
    availableValuesPerColumn = generateAvailableValuesPerColumn(availableValues)
        
    # Data type = DataFrame
    scheduleTemplate = generateScheduleTemplate(distanceMatrix)
    
    # Data type = List containing custom class (with Integer and DataFrame)
    population = []
    while len(population) < populationCount:
        chrmSolution = generateChromosome(availableValuesPerColumn, scheduleTemplate, distanceMatrix)
        population.append(chrmSolution)

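The population list is filled by the while loop at the end. I want to replace that while loop with a multiprocessing solution that can use up to a preset number of cores. For example: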

population = []
availableCores = 6 
while len(population) < populationCount:
    while usedCores < availableCores:
        # start generating another chromosome as 'chrmSolution'
    population.append(chrmSolution)

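However, after hours of reading and watching tutorials, I cannot get the loop to run. How should I do this?

It sounds like a simple multiprocessing.Pool should do the trick, or at least be a starting point. Here is a simple example: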

from multiprocessing import Pool, cpu_count

# Mutable module-level object that holds the per-process constants.
child_globals = {}

# Both functions are defined at module level (not inside the __main__ guard)
# so that child processes can find them under the spawn start method, which
# is the default on Windows and macOS.
def init_child(availableValuesPerColumn, scheduleTemplate, distanceMatrix):
    # Passing the variables to the child process with every task is inefficient
    # if they are constant, so pass them to the initializer once and let each
    # child reuse them every time generateChromosome is called.
    child_globals['availableValuesPerColumn'] = availableValuesPerColumn
    child_globals['scheduleTemplate'] = scheduleTemplate
    child_globals['distanceMatrix'] = distanceMatrix

def child_work(i):
    # Wraps generateChromosome with its inputs and discards the dummy `i` from range().
    return generateChromosome(child_globals['availableValuesPerColumn'],
                              child_globals['scheduleTemplate'],
                              child_globals['distanceMatrix'])

if __name__ == '__main__':
    # ...

    # Pass a fixed number (e.g. 6) instead of cpu_count() to cap the cores used.
    with Pool(cpu_count(),
              initializer=init_child,  # stores the constants in each child's globals
              initargs=(availableValuesPerColumn, scheduleTemplate, distanceMatrix)) as p:
        # imap_unordered does not make child processes wait to preserve order,
        # so it keeps the CPUs busy more of the time. It returns an iterator,
        # so list() collects the results into a list.
        population = list(p.imap_unordered(child_work, range(populationCount)))
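
Since the real generateChromosome was removed from the question, here is a minimal self-contained sketch of the same Pool pattern that can be run as-is. Everything in it is a stand-in rather than the asker's code: generate_chromosome is a hypothetical toy that retries random draws until a constraint is met, and build_population, worker_count and target_sum are names invented for the illustration.

```python
import random
from multiprocessing import Pool

# Per-process storage for constants, filled once by init_child in each worker.
child_globals = {}


def generate_chromosome(available_values, target_sum):
    """Hypothetical stand-in for generateChromosome: retry random draws until
    a candidate meets a constraint (here: three values with a given sum)."""
    while True:
        candidate = [random.choice(available_values) for _ in range(3)]
        if sum(candidate) == target_sum:
            return candidate


def init_child(available_values, target_sum):
    # Runs once in every worker process when the pool starts.
    child_globals["available_values"] = available_values
    child_globals["target_sum"] = target_sum


def child_work(_):
    # The argument is only a task counter from range() and is ignored.
    return generate_chromosome(child_globals["available_values"],
                               child_globals["target_sum"])


def build_population(population_count, worker_count, available_values, target_sum):
    # worker_count caps how many processes work at the same time.
    with Pool(worker_count, initializer=init_child,
              initargs=(available_values, target_sum)) as pool:
        return list(pool.imap_unordered(child_work, range(population_count)))


if __name__ == "__main__":
    # 20 chromosomes generated by at most 6 worker processes.
    population = build_population(20, 6, list(range(10)), 15)
    print(len(population))  # prints 20
```

The pool hands out exactly population_count tasks, so no outer while loop or manual core counting is needed: a worker that finishes one chromosome simply picks up the next task until all of them are done.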