How to use multiprocessing.Pool with the .apply function?

I have a function that processes strings, and I am applying it to a DataFrame column:

import pandas as pd
import numpy as np
  
def test_upper(d):
    return d.upper()

def mainfunc():
    df = pd.read_csv("file.csv", sep='\t', encoding='utf-8')

    print(df.head())

    lambdafunc = lambda x: test_upper(x)

    df['upper_cols'] = df['cols'].apply(lambdafunc)

    print(df.head())

mainfunc()

Now I want to do the same thing with multiprocessing.Pool. I searched Stack Overflow for how to do this, and this is what I came up with:

import pandas as pd
import numpy as np
import multiprocessing as mp

def test_upper(d):
    return d.upper()

def mainfunc():
    df = pd.read_csv("file.csv", sep='\t', encoding='utf-8')

    print(df.head())

    lambdafunc = lambda x: test_upper(x)

    list_results = pd.Series()
    def log_result(result):
        list_results.append(result)
        
    pool = mp.Pool(processes=4)
    pool.apply_async(lambdafunc, (df['cols'], ), callback=log_result)
    pool.close()
    pool.join()

    print(list_results)

mainfunc()

The result is an empty Series (or an empty list; I tried both). What am I doing wrong here? Thanks!

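Finally figured it out. The function handed to the pool has to be a regular module-level function that takes the whole Series, and the result comes back through result.get():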

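import pandas as pd
import multiprocessing as mp

def test_upper(d):
    # d is the whole Series; upper-case each element inside the worker
    output = d.apply(lambda x: x.upper())
    return output

def mainfunc():
    df = pd.read_csv("file.csv", sep='\t', encoding='utf-8')

    print(df.head())

    pool = mp.Pool(processes=4)
    # test_upper is a module-level function, so it can be pickled and sent to a worker
    result = pool.apply_async(test_upper, (df['cols'], ))
    pool.close()
    pool.join()

    # get() returns the Series built by the worker and re-raises any worker error
    print(result.get())

# The guard keeps worker processes from re-running mainfunc() on platforms that spawn
if __name__ == "__main__":
    mainfunc()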