Joblib crashes after 2x n_jobs

Joblib crashes with the following error:

      Parallel(n_jobs=-1, prefer="threads", verbose=10)(
      File "/home/developer/.local/lib/python3.8/site-packages/joblib/parallel.py", line 1054, in __call__
        self.retrieve()
      File "/home/developer/.local/lib/python3.8/site-packages/joblib/parallel.py", line 933, in retrieve
        self._output.extend(job.get(timeout=self.timeout))
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
        raise self._value
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
        result = (True, func(*args, **kwds))
      File "/home/developer/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
        return self.func(*args, **kwargs)
      File "/home/developer/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
        return [func(*args, **kwargs)
      File "/home/developer/.local/lib/python3.8/site-packages/joblib/parallel.py", line 263, in <listcomp>
        for func, args, kwargs in self.items]
    TypeError: cannot unpack non-iterable function object

It happens in this code (some names have been changed to hide information):

    import csv
    from joblib import Parallel, delayed

    with open(inputFile) as file:
        csv_reader = csv.DictReader(
            file, fieldnames=["Header1", "Header2"])
        Parallel(n_jobs=3, prefer="threads", verbose=10)(
            delayed(pullSummaryData(row["Header1"]))
            for row in csv_reader
        )

Interestingly, it always crashes after exactly 2*n_jobs calls to pullSummaryData. With n_jobs=3, pullSummaryData is called 6 times before the crash.

Joblib v1.0.1

csv v1.0

Python v3.8.5

Try changing delayed(pullSummaryData(row["Header1"])) to delayed(pullSummaryData)(row["Header1"]).

In the original code, pullSummaryData(row["Header1"]) is called immediately, and its return value is passed to delayed instead of the function itself. delayed expects the function as its argument; the arguments are then supplied in a separate call, which records them as a (function, args, kwargs) task tuple without executing anything. When a worker later tries to unpack a task built from a bare function, you get "cannot unpack non-iterable function object". The crash after exactly 2*n_jobs calls matches joblib's default pre_dispatch='2 * n_jobs': that many items are eagerly consumed from the generator before the first results are retrieved and the error surfaces.
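A minimal sketch of the corrected pattern, using an in-memory CSV and a hypothetical stand-in for pullSummaryData (the real function and input file are not shown in the post):

```python
import csv
import io
from joblib import Parallel, delayed

def pullSummaryData(value):
    # Hypothetical stand-in for the real function.
    return value.upper()

# In-memory substitute for the input file from the question.
data = io.StringIO("a,1\nb,2\nc,3\n")
csv_reader = csv.DictReader(data, fieldnames=["Header1", "Header2"])

# delayed(pullSummaryData) wraps the function itself; the second call
# site, (row["Header1"]), only records the arguments, producing a
# (func, args, kwargs) tuple that the workers unpack and execute later.
results = Parallel(n_jobs=3, prefer="threads", verbose=0)(
    delayed(pullSummaryData)(row["Header1"])
    for row in csv_reader
)

print(results)  # ['A', 'B', 'C']
```

Note that delayed(f)(x) never runs f up front; it returns the task tuple (f, (x,), {}), which is exactly the shape the worker loop `for func, args, kwargs in self.items` expects to unpack.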

Reference: Document

Answered based on user696969's comment under the original post.