Cuda issue in TensorFlow 1.0 tutorial: looks like TF can't find CUPTI/lib64?

Question

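This problem is not related to the SSE/AVX warnings; I have included the full output only for completeness. The issue, I think, is that a CUDA library fails to load at the very end of the run. The machine has an NVIDIA GTX 1070 card, and the CUDA libraries used earlier in the process load fine, so something seems to be missing only at the end. I installed TensorFlow 1.0 with pip, and separately downloaded the repo to get the most recent tutorials. This particular tutorial is meant to show an example of all the TensorBoard features. I copied Minst_with_summaries.py from the TensorFlow tutorials in the repo into my working directory and ran it with Anaconda and Python 3.6. This is what I got: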

(py36) tom@tomServal:~/Documents/LearningRepos/Working$ python Minst_with_summaries.py
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.645
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.48GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
Accuracy at step 0: 0.1213
Accuracy at step 10: 0.6962
Accuracy at step 20: 0.8054
Accuracy at step 30: 0.8447
Accuracy at step 40: 0.8718
Accuracy at step 50: 0.8779
Accuracy at step 60: 0.8846
Accuracy at step 70: 0.8783
Accuracy at step 80: 0.8853
Accuracy at step 90: 0.8989
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcupti.so.8.0. LD_LIBRARY_PATH: :/usr/local/cuda/lib64
F tensorflow/core/platform/default/gpu/cupti_wrapper.cc:59] Check failed: ::tensorflow::Status::OK() == (::tensorflow::Env::Default()->GetSymbolFromLibrary( GetDsoHandle(), kName, &f)) (OK vs. Not found: /home/tom/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cuptiActivityRegisterCallbacks)could not find cuptiActivityRegisterCallbacksin libcupti DSO
Aborted

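It looks to me as if the TensorFlow installation is missing something: see the last few lines above, where libcupti.so.8.0 cannot be opened. How do I fix this? See also this issue on GitHub: https://github.com/tensorflow/tensorflow/issues/7975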
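One way to confirm what the loader message is saying is to check whether any directory on LD_LIBRARY_PATH actually contains a libcupti shared library. The following is only a minimal sketch, assuming a bash or POSIX sh shell; find_cupti is a hypothetical helper name chosen for illustration and is not part of TensorFlow or CUDA.

```shell
# Hypothetical helper: print every libcupti shared library found in the
# directories listed on LD_LIBRARY_PATH (empty entries are skipped).
find_cupti() {
    old_ifs=$IFS
    IFS=:
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -z "$dir" ]; then
            continue
        fi
        for lib in "$dir"/libcupti.so*; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -e "$lib" ]; then
                echo "$lib"
            fi
        done
    done
    IFS=$old_ifs
}

find_cupti
```

If this prints nothing, no directory on the search path holds libcupti, which would be consistent with the "Couldn't open CUDA library libcupti.so.8.0" line in the output.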

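The answer was posted on GitHub. It appears to be an installation bug that can be fixed by:

adding /usr/local/cuda/extras/CUPTI/lib64 to your LD_LIBRARY_PATH

It would be helpful if @mrry reopened this so that others can see the correct resolution.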

Answer 1

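See also this issue on GitHub: https://github.com/tensorflow/tensorflow/issues/7975

You can try the apt-get install suggested in the GitHub issue, but that did not work for me. This did:

The answer posted on GitHub indicates an installation bug that can be fixed by:

adding /usr/local/cuda/extras/CUPTI/lib64 to your LD_LIBRARY_PATH

You can do this by editing your bash profile (for example ~/.bashrc or ~/.bash_profile).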
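As a rough sketch of that change, assuming bash and the default CUDA install prefix /usr/local/cuda (CUPTI_DIR is just an illustrative variable name), the path can be appended for the current session like this:

```shell
# Assumed default location of the CUPTI libraries; adjust if CUDA is
# installed under a different prefix.
CUPTI_DIR=/usr/local/cuda/extras/CUPTI/lib64

# Append the CUPTI directory, adding a ":" separator only when
# LD_LIBRARY_PATH already has a value.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${CUPTI_DIR}"

# To make the change permanent, the same export line can go in the profile:
# echo 'export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64"' >> ~/.bashrc
```

The export only affects the current shell and the processes started from it, so the script has to be rerun from that shell, or from a new one once the profile has been updated.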

Answer 2

I ran into this on Windows. I solved it by adding C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\extras\CUPTI\libx64 to my environment variables (the PATH entry).

python

tensorflow

tensorboard