GPU memory management issues when using TensorFlow
| Processes:                                                      GPU Memory |
|  GPU       PID   Type   Process name                            Usage      |
|    0      6944      C   python3                                 11585MiB   |
|    1      6944      C   python3                                 11587MiB   |
|    2      6944      C   python3                                 10621MiB   |
After stopping TensorFlow midway, nvidia-smi shows that the memory was not released.
I tried using this:
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allocator_type = 'BFC'
config.gpu_options.per_process_gpu_memory_fraction = 0.90
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
And also:
with tf.device('/gpu:0'):
with tf.Graph().as_default():
I have also tried resetting the GPU:
sudo nvidia-smi --gpu-reset -i 0
The memory simply cannot be released.
The solution was obtained from Yaroslav, so thanks to Yaroslav.
Most of the information was obtained from the TensorFlow documentation and Stack Overflow. I am not allowed to post it; not sure why.
Insert this at the beginning of your code.
import os
import tensorflow as tf
from tensorflow.python.client import device_lib

# Set the environment variables
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Double check that you have the correct devices visible to TF
print("{0}\nThe available CPU/GPU devices on your system\n{0}".format('=' * 100))
print(device_lib.list_local_devices())
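If you only care about the GPUs, a minimal sketch along these lines filters the device list down to GPU entries; the helper name list_visible_gpus is my own choice for illustration, not something from the original answer or the TensorFlow API:

import os
from tensorflow.python.client import device_lib

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

def list_visible_gpus():
    # Keep only the GPU entries and report their name and memory limit (in bytes)
    devices = device_lib.list_local_devices()
    return [(d.name, d.memory_limit) for d in devices if d.device_type == 'GPU']

print(list_visible_gpus())  # e.g. [('/device:GPU:0', <memory limit>)] when one GPU is visible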
There are different options for starting with the GPU or the CPU. I am using the CPU; this can be changed with the options below (a minimal usage example follows them):
with tf.device('/cpu:0'):
# with tf.device('/gpu:0'):
# with tf.Graph().as_default():
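As a sanity check of what such a device block looks like in use, here is a minimal sketch; the constants a, b, c are placeholder values I made up for illustration:

import tensorflow as tf

with tf.Graph().as_default():
    # Pin the ops below to the CPU; switch the string to '/gpu:0' to place them on the first GPU
    with tf.device('/cpu:0'):
        a = tf.constant([1.0, 2.0, 3.0])
        b = tf.constant([4.0, 5.0, 6.0])
        c = a * b

    with tf.Session() as sess:
        print(sess.run(c))  # [ 4. 10. 18.]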
Use the following lines for the session:
config = tf.ConfigProto(device_count={'GPU': 1}, log_device_placement=False,
allow_soft_placement=True)
# allocate only as much GPU memory based on runtime allocations
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
# Session needs to be closed
sess.close()
The following line will also solve the problem of resources being locked by Python (a complete sketch follows it):
with tf.Session(config=config) as sess:
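Putting the pieces together, a minimal sketch of that pattern, with a toy computation of my own added for illustration, looks like this; the session is closed and its resources released as soon as the with block exits:

import tensorflow as tf

config = tf.ConfigProto(device_count={'GPU': 1},
                        log_device_placement=False,
                        allow_soft_placement=True)
# Allocate GPU memory incrementally instead of grabbing it all up front
config.gpu_options.allow_growth = True

with tf.Graph().as_default():
    x = tf.constant(2.0)
    y = x * 3.0
    # No explicit sess.close() needed; the context manager closes the session on exit
    with tf.Session(config=config) as sess:
        print(sess.run(y))  # 6.0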
Another article that helps in understanding the importance of this: please check the official tf.Session() documentation from TensorFlow.
Parameter descriptions:
To find out which devices your operations and tensors are assigned to, create the session with the log_device_placement configuration option set to True.
To have TensorFlow automatically choose an existing and supported device to run the operations in case the specified one doesn't exist, set allow_soft_placement=True in the configuration when creating the session.
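For example, a small sketch (with toy matrices of my own, not from the original) that enables both options; with log_device_placement=True, TensorFlow logs the device chosen for every op when the session runs:

import tensorflow as tf

config = tf.ConfigProto(log_device_placement=True,   # log which device each op runs on
                        allow_soft_placement=True)    # fall back if the requested device is missing

with tf.Graph().as_default():
    # Request a GPU; if none exists, soft placement moves the ops to an available device
    with tf.device('/gpu:0'):
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.matmul(a, a)

    with tf.Session(config=config) as sess:
        print(sess.run(b))  # [[ 7. 10.] [15. 22.]]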