Wrong GPU order in MXNet and TensorFlow

My desktop has 2 GPUs installed: a 1080 and a 1080 Ti. nvidia-smi shows GPU 0 as the 1080 and GPU 1 as the 1080 Ti:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
| 26%   57C    P2    53W / 215W |    696MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 55%   70C    P2   204W / 250W |   8641MiB / 11178MiB |     28%      Default |
+-------------------------------+----------------------+----------------------+

However, TensorFlow and MXNet use the opposite order: when I specify gpu=0 I get the 1080 Ti, and when I specify gpu=1 I get the 1080.

Why does this happen, and how can I make the TensorFlow and MXNet GPU order match the nvidia-smi GPU order?

Code snippet for MXNet:

mod = mx.mod.Module(symbol, label_names=None, context=mx.gpu(0))

For TensorFlow I use the environment variable

CUDA_VISIBLE_DEVICES="0"   

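By default, CUDA orders devices fastest first, whereas nvidia-smi lists them in PCI bus order. To make CUDA use the PCI bus order as well, set

export CUDA_DEVICE_ORDER=PCI_BUS_ID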
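
If you would rather not rely on the shell, the same variables can be set from Python, as long as this happens before TensorFlow or MXNet is imported. The sketch below is only an illustration: the helper name pin_gpu_by_pci_order and its env parameter are made up for this example and are not part of either framework.

```python
import os


def pin_gpu_by_pci_order(index, env=os.environ):
    """Hypothetical helper: expose one GPU, numbered as in nvidia-smi.

    Must be called before importing TensorFlow or MXNet, because CUDA
    reads these variables when it is first initialised in the process.
    """
    # Enumerate devices by PCI bus ID instead of the default fastest-first order.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Make only the chosen GPU visible to this process.
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env


# Example usage (uncomment in a real script):
# pin_gpu_by_pci_order(1)   # GPU 1 in nvidia-smi, the 1080 Ti in the output above
# import mxnet as mx        # import the framework only after the call
```

Note that once CUDA_VISIBLE_DEVICES limits the process to a single GPU, that GPU is renumbered as device 0 inside the process, so the MXNet context would still be mx.gpu(0).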

See also this question.