How to run gdb in gnome-terminal from mpirun?
I have a program that uses MPI. To debug it, I can use mpirun -np 2 xterm -e gdb myprog.
However, xterm is buggy on my machine. I want to try gnome-terminal, but I don't know what to type. I have tried:
1) mpirun -np 2 gnome-terminal -- gdb myprog
2) mpirun -np 2 gnome-terminal -- "gdb myprog"
3) mpirun -np 2 gnome-terminal -- bash -c "gdb myprog"
4) mpirun -np 2 gnome-terminal -- bash -c "gdb myprog; exec bash"
But none of them seem to work. With 1), 3), and 4), gdb prints the following after run:
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "(null)" (-43) instead of "Success" (0)
-------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[oleg-VirtualBox:4169] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[Inferior 1 (process 4169) exited with code 01]
With 2), the terminal says:
There was an error creating the child process for this terminal
Failed to execute child process “gdb app” (No such file or directory)
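By the way, I use Ubuntu 18.04.02 LTS.
What am I doing wrong?
EDIT: It turned out that the problem was not xterm but gdb with the --tui option. If your program prints something, the gdb window starts to display incorrectly, no matter which terminal is used.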
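The problem is that gnome-terminal hands the requested program over to a terminal server and then exits immediately. mpirun then sees that the program it launched has exited and destroys the MPI runtime environment. By the time the MPI program actually starts, mpirun has already exited. As far as I know, there is no way to make gnome-terminal wait until the given command has ended.
There is a workaround: instead of starting gnome-terminal directly from mpirun, use two wrapper scripts. The first is started by mpirun. It creates a temporary file, tells gnome-terminal to start the second wrapper script, and then waits for the temporary file to disappear. The second wrapper script runs the command you actually want to run, e.g. gdb myprog, waits until it ends, and then removes the temporary file. At that point the first wrapper notices that the temporary file is gone and exits. mpirun can then safely destroy the MPI environment.
This is probably easier to understand from the scripts themselves.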
debug.sh:
#!/bin/bash
# This is run outside gnome-terminal by mpirun.
# Create a tmp file that we can wait on.
export MY_MPIRUN_TMP_FILE="$(mktemp)"
# Start the gnome-terminal. It will exit immediately.
# Call the wrapper script which removes the tmp file
# after the actual command has ended.
gnome-terminal -- ./helper.sh "$@"
# Wait for the file to disappear.
while [ -f "${MY_MPIRUN_TMP_FILE}" ] ; do
    sleep 1
done
# Now exit, so mpirun can destroy the MPI environment
# and exit itself.
helper.sh:
#!/bin/bash
# This is run by gnome-terminal.
# The command you actually want to run.
"$@"
# Remove the tmp file to show that the command has exited.
rm "${MY_MPIRUN_TMP_FILE}"
Run it as mpirun debug.sh gdb myprog.
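If you want to convince yourself that the temp-file handshake works before involving mpirun or gnome-terminal, it can be tried in isolation. The following is only a sketch under stated assumptions: a background subshell stands in for gnome-terminal (which returns immediately), sleep 2 stands in for gdb myprog, and the names marker, run_and_signal, and waited are made up for this illustration and are not part of the scripts above.

```shell
#!/bin/bash
# Self-contained demo of the temp-file handshake used by debug.sh and helper.sh.
# Assumption: a background subshell stands in for gnome-terminal, and
# "sleep 2" stands in for "gdb myprog".

marker="$(mktemp)"   # plays the role of MY_MPIRUN_TMP_FILE

# Hypothetical stand-in for helper.sh: run the given command, then remove
# the marker. The rm runs whether the command succeeded or failed.
run_and_signal() {
    "$@"
    rm -f "$marker"
}

# Stand-in for "gnome-terminal -- ./helper.sh ...": returns immediately
# while the command keeps running in the background.
( run_and_signal sleep 2 ) &

# Launcher side (the role of debug.sh): block until the marker disappears.
waited=0
while [ -f "$marker" ]; do
    sleep 1
    waited=$((waited + 1))
done

echo "marker removed after about ${waited}s; the launcher can exit now"
```

The loop only ends once the background command has finished, which is the property mpirun needs from the first wrapper: it must not return while the debugger is still running.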