MPICH cluster test error: unable to change wdir

I set up an MPICH2 cluster. The machinefile is:

pc3@ub3:4   # this will spawn 4 processes on ub3
pc1@ub1     # this will spawn 1 process on ub1

When I run the test program, it should print:

Hello from processor 0 of 8
Hello from processor 1 of 8
Hello from processor 2 of 8
Hello from processor 3 of 8
Hello from processor 4 of 8
Hello from processor 5 of 8
Hello from processor 6 of 8
Hello from processor 7 of 8

But it returns:

pc1@ub1:~$ mpiexec -n 8 -f machinefile ./mpi_hello
[proxy:0:0@ub3] launch_procs (./pm/pmiserv/pmip_cb.c:648): unable to change wdir to /home/pc1 (No such file or directory)
[proxy:0:0@ub3] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:893): launch_procs returned error
[proxy:0:0@ub3] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@ub3] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
[mpiexec@ub1] control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert (!closed) failed
[mpiexec@ub1] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@ub1] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
[mpiexec@ub1] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion

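I have successfully enabled passwordless SSH, so pc1 can connect to pc3 without a password. Even so, I still suspect that something is wrong with SSH or with access permissions. My OS is Ubuntu 14.04 LTS 32-bit.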

Thanks for your help.

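Make sure the username is the same on every node. The log shows the proxy on ub3 trying to change into /home/pc1, the directory mpiexec was started from on ub1; that directory does not exist on ub3, where the account is pc3. Once the usernames match, change the machinefile to: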

ub3:4   # this will spawn 4 processes on ub3
ub1     # this will spawn 1 process on ub1

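Then copy the compiled executable into the same directory on every node, and make sure every node can resolve all of the hostnames (for example, by listing them in /etc/hosts on each node).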
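
Before rerunning mpiexec, it can help to confirm that the working directory really exists on every host in the machinefile. The following is only a rough sketch: machinefile_hosts and check_wdir are hypothetical helper names, not part of MPICH, and it assumes the machinefile uses plain host[:count] entries as in the corrected file above, and that passwordless SSH works under the same username on every node.

```shell
#!/bin/sh
# Hypothetical helper: print the host name of each machinefile entry,
# dropping "#" comments, blank lines, and the optional ":<count>" suffix.
machinefile_hosts() {
    awk '{ sub(/#.*/, "") } NF { split($1, parts, ":"); print parts[1] }' "$1"
}

# Hypothetical helper: report whether directory $2 exists on every host
# listed in machinefile $1. Relies on passwordless SSH to each host.
check_wdir() {
    for host in $(machinefile_hosts "$1"); do
        if ssh -n "$host" test -d "$2"; then
            echo "$host: $2 exists"
        else
            echo "$host: $2 is missing"
        fi
    done
}
```

After sourcing the functions, run check_wdir machinefile "$PWD" from the directory where mpiexec will be started. Any host reported as missing is a likely source of the "unable to change wdir" error.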