Segmentation fault (core dumped) with optuna
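I know this question has been asked many times, but I could not solve my problem with the answers from the other threads. I run a Python script from the command line with python3 CTGAN_noscale.py --database_name CTGAN_noshift
and get the following error (with faulthandler enabled):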
Fatal Python error: Segmentation fault
Current thread 0x00007f57e97fe700 (most recent call first):
<no Python frame>
Thread 0x00007f593db07740 (most recent call first):
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 145 in backward
File "/usr/local/lib/python3.8/dist-packages/torch/tensor.py", line 245 in backward
File "/usr/local/lib/python3.8/dist-packages/ctgan/synthesizers/ctgan.py", line 374 in fit
File "CTGAN_noscale.py", line 140 in objective
File "CTGAN_noscale.py", line 162 in <lambda>
File "/usr/local/lib/python3.8/dist-packages/optuna/_optimize.py", line 216 in _run_trial
File "/usr/local/lib/python3.8/dist-packages/optuna/_optimize.py", line 162 in _optimize_sequential
File "/usr/local/lib/python3.8/dist-packages/optuna/_optimize.py", line 65 in _optimize
File "/usr/local/lib/python3.8/dist-packages/optuna/study.py", line 401 in optimize
File "CTGAN_noscale.py", line 162 in run_CTGAN
File "CTGAN_noscale.py", line 210 in <module>
Segmentation fault (core dumped)
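For reference, a per-thread dump like the one above comes from Python's standard faulthandler module. The snippet below is a minimal sketch of how it can be switched on at the top of a script. The helper name enable_crash_traceback is hypothetical and not part of CTGAN_noscale.py.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask Python to dump the traceback of every thread on a fatal signal.

    `stream` must be a real file with a file descriptor. By default the
    original stderr is used, so a framework that replaces sys.stderr does
    not get in the way. (Hypothetical helper, for illustration only.)
    """
    target = stream if stream is not None else sys.__stderr__
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


# Call this once, before any training code runs.
enable_crash_traceback()
```

The same effect is available without touching the code by running python3 -X faulthandler CTGAN_noscale.py --database_name CTGAN_noshift, or by setting the environment variable PYTHONFAULTHANDLER=1.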
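It looks like the problem is with optuna.
Strangely, everything worked fine on another server; the script only started crashing like this after I switched servers.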
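Update
I found that the problem does not occur when I don't use a Docker container, or when I use a Docker container without a GPU.
I solved the problem by building a new image and starting a container from it. In that container the error no longer appears.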
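Since the last Python frame in the traceback is torch's backward and the crash only showed up in a GPU-enabled container, one way to narrow this kind of problem down is to run a single tiny backward pass on the GPU with neither optuna nor CTGAN involved. The following is only a sketch under that assumption, not a reproduction of my actual script; the function name gpu_backward_check and the tensor shape are made up for illustration.

```python
import importlib.util


def gpu_backward_check():
    """Run one small backward pass on the GPU, outside of optuna and CTGAN.

    Returns a short status string. If the process segfaults inside this
    function, the GPU/PyTorch setup in the container is the likely suspect.
    (Hypothetical helper, for illustration only.)
    """
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"

    import torch

    if not torch.cuda.is_available():
        return "no-cuda"

    # Arbitrary small tensor; only the backward call on the GPU matters here.
    x = torch.randn(8, 4, device="cuda", requires_grad=True)
    loss = (x * x).sum()
    loss.backward()
    # Wait for the GPU work to finish so a failure surfaces here.
    torch.cuda.synchronize()
    return "ok"


if __name__ == "__main__":
    print(gpu_backward_check())
```

If this already crashes inside the container, the hyperparameter search is probably not the cause. If it prints ok, the check proves little on its own, and the full script still has to be tested.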