Segmentation fault (core dumped) with optuna

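I know this question has been asked many times, but I could not solve my problem using the answers in other threads. I run my Python script from the command line with python3 CTGAN_noscale.py --database_name CTGAN_noshift and get the following error (with faulthandler enabled):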

Fatal Python error: Segmentation fault

Current thread 0x00007f57e97fe700 (most recent call first):
<no Python frame>

Thread 0x00007f593db07740 (most recent call first):
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 145 in backward
  File "/usr/local/lib/python3.8/dist-packages/torch/tensor.py", line 245 in backward
  File "/usr/local/lib/python3.8/dist-packages/ctgan/synthesizers/ctgan.py", line 374 in fit
  File "CTGAN_noscale.py", line 140 in objective
  File "CTGAN_noscale.py", line 162 in <lambda>
  File "/usr/local/lib/python3.8/dist-packages/optuna/_optimize.py", line 216 in _run_trial
  File "/usr/local/lib/python3.8/dist-packages/optuna/_optimize.py", line 162 in _optimize_sequential
  File "/usr/local/lib/python3.8/dist-packages/optuna/_optimize.py", line 65 in _optimize
  File "/usr/local/lib/python3.8/dist-packages/optuna/study.py", line 401 in optimize
  File "CTGAN_noscale.py", line 162 in run_CTGAN
  File "CTGAN_noscale.py", line 210 in <module>
Segmentation fault (core dumped)
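
For readers who want to get the same kind of per-thread traceback: it comes from Python's standard faulthandler module. The question does not show where it was enabled in CTGAN_noscale.py, so the placement below (at the very top of the script) is an assumption, and this is only a minimal sketch.

```python
import faulthandler

# On a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL), dump the
# Python traceback of all threads to stderr before the process dies.
# Assumption: this runs before torch, ctgan, and optuna are imported.
faulthandler.enable()

# The rest of the script (imports, objective, study.optimize, ...) follows.
print("faulthandler enabled:", faulthandler.is_enabled())
```

The handler can also be switched on without editing the script, either with python3 -X faulthandler CTGAN_noscale.py --database_name CTGAN_noshift or by setting the PYTHONFAULTHANDLER environment variable.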

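The problem seems to come from optuna. Strangely, everything worked fine on another server; it only started crashing like this after I switched servers.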

Update

I found that the problem does not occur when I don't use a Docker container, or when I use a Docker container without a GPU.
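
Since the crash depends on whether the container has a GPU, it may help to compare what PyTorch sees on the working and the failing setup. The following is a rough sketch and not part of the original script; describe_environment is a hypothetical helper name, and the code assumes only that torch may or may not be importable.

```python
import platform
import sys


def describe_environment():
    """Collect basic version and GPU facts for comparing two setups."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; report that and stop.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version the installed torch build was compiled against (None for CPU builds).
    info["torch_cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this on the host, in the container without a GPU, and in the container with a GPU gives three outputs to compare side by side. A difference in the torch or CUDA build version between them would be a plausible place to start looking, though the question itself does not confirm the root cause.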

I solved the problem by building a new image and starting a container from it. In that container the error no longer appears.