PyTorch on CPU: loading a CUDA device checkpoint without a GPU
I found this great PyTorch MobileNet code, but I can't get it to run on a CPU.
https://github.com/rdroste/unisal
I'm new to PyTorch, so I'm not sure what to do.
The device is set on line 174 of the train.py module:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
As far as I understand PyTorch, this is correct.
Do I also have to change torch.load? I tried that, without success.
class BaseModel(nn.Module):
    """Abstract model class with functionality to save and load weights"""

    def forward(self, *input):
        raise NotImplementedError

    def save_weights(self, directory, name):
        torch.save(self.state_dict(), directory / f'weights_{name}.pth')

    def load_weights(self, directory, name):
        self.load_state_dict(torch.load(directory / f'weights_{name}.pth'))

    def load_best_weights(self, directory):
        self.load_state_dict(torch.load(directory / f'weights_best.pth'))

    def load_epoch_checkpoint(self, directory, epoch):
        """Load state_dict from a Trainer checkpoint at a specific epoch"""
        chkpnt = torch.load(directory / f"chkpnt_epoch{epoch:04d}.pth")
        self.load_state_dict(chkpnt['model_state_dict'])

    def load_checkpoint(self, file):
        """Load state_dict from a specific Trainer checkpoint"""
        """Load """
        chkpnt = torch.load(file)
        self.load_state_dict(chkpnt['model_state_dict'])

    def load_last_chkpnt(self, directory):
        """Load state_dict from the last Trainer checkpoint"""
        last_chkpnt = sorted(list(directory.glob('chkpnt_epoch*.pth')))[-1]
        self.load_checkpoint(last_chkpnt)
I don't get it. Where do I have to tell PyTorch that there is no GPU?
Full error:
Traceback (most recent call last):
  File "run.py", line 99, in <module>
    fire.Fire()
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "run.py", line 95, in predict_examples
    example_folder, is_video, train_id=train_id, source=source)
  File "run.py", line 72, in predictions_from_folder
    folder_path, is_video, source=source, model_domain=model_domain)
  File "/home/b256/Data/saliency_models/unisal-master/unisal/train.py", line 871, in generate_predictions_from_path
    self.model.load_best_weights(self.train_dir)
  File "/home/b256/Data/saliency_models/unisal-master/unisal/train.py", line 1057, in model
    self._model = model_cls(**self.model_cfg)
  File "/home/b256/Data/saliency_models/unisal-master/unisal/model.py", line 190, in __init__
    self.cnn = MobileNetV2(**self.cnn_cfg)
  File "/home/b256/Data/saliency_models/unisal-master/unisal/models/MobileNetV2.py", line 156, in __init__
    Path(__file__).resolve().parent / 'weights/mobilenet_v2.pth.tar')
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 367, in load
    return _load(f, map_location, pickle_module)
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 538, in _load
    result = unpickler.load()
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 504, in persistent_load
    data_type(size), location)
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 113, in default_restore_location
    result = fn(storage, location)
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 94, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 78, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
The tutorial at https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-on-gpu-load-on-cpu describes a map_location keyword argument that sends the weights to the correct device:
model.load_state_dict(torch.load(PATH, map_location=device))
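Applied to a loader like the ones in the question, this could look roughly as follows. It is only a sketch: TinyModel is a hypothetical stand-in for the real network, and the weights are written to a temporary directory so the snippet runs on its own. Note that the traceback in the question ends in a torch.load call inside MobileNetV2.py (line 156), so that call presumably needs the same argument as the ones in BaseModel.

```python
import tempfile
from pathlib import Path

import torch
import torch.nn as nn

# Pick the GPU when one is present, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real model, for illustration only."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def load_best_weights(self, directory):
        # map_location remaps every stored tensor onto `device`, so a
        # checkpoint written on a GPU machine can be read on a CPU-only one.
        state = torch.load(directory / 'weights_best.pth', map_location=device)
        self.load_state_dict(state)


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)

    # Write some weights first so there is a file to load.
    source = TinyModel()
    torch.save(source.state_dict(), directory / 'weights_best.pth')

    model = TinyModel()
    model.load_best_weights(directory)
    model.to(device)

# The loaded weights should equal the saved ones, whatever the device.
weights_match = torch.equal(model.fc.weight.cpu(), source.fc.weight.cpu())
print(weights_match)
```

Every torch.load call that reads a GPU-saved file needs the argument; adding it to only some of them leaves the others failing with the same RuntimeError.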
From the documentation at https://pytorch.org/docs/stable/generated/torch.load.html#torch.load:
torch.load() uses Python’s unpickling facilities but treats storages,
which underlie tensors, specially. They are first deserialized on the
CPU and are then moved to the device they were saved from. If this
fails (e.g. because the run time system doesn’t have certain devices),
an exception is raised. However, storages can be dynamically remapped
to an alternative set of devices using the map_location argument.
If map_location is a callable, it will be called once for each
serialized storage with two arguments: storage and location. The
storage argument will be the initial deserialization of the storage,
residing on the CPU. Each serialized storage has a location tag
associated with it which identifies the device it was saved from, and
this tag is the second argument passed to map_location. The builtin
location tags are 'cpu' for CPU tensors and 'cuda:device_id' (e.g.
'cuda:2') for CUDA tensors. map_location should return either None or
a storage. If map_location returns a storage, it will be used as the
final deserialized object, already moved to the right device.
Otherwise, torch.load() will fall back to the default behavior, as if
map_location wasn’t specified.
If map_location is a torch.device object or a string containing a
device tag, it indicates the location where all tensors should be
loaded.
Otherwise, if map_location is a dict, it will be used to remap
location tags appearing in the file (keys), to ones that specify where
to put the storages (values).
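The forms of map_location described in the quoted documentation can be tried side by side. The sketch below assumes a CPU-only setting and uses an in-memory buffer instead of a real checkpoint file; the helper load_with is a made-up name for this example. Because the tensor here is saved from the CPU, its location tag is 'cpu', so the dict entry for 'cuda:0' simply does not match; with a file saved on a GPU it would redirect those storages to the CPU.

```python
import io

import torch

# Serialize a small tensor into an in-memory buffer. It is saved from the
# CPU here, so its location tag is 'cpu'.
buffer = io.BytesIO()
torch.save(torch.arange(3), buffer)


def load_with(map_location):
    """Hypothetical helper: reload the buffer with a given map_location."""
    buffer.seek(0)
    return torch.load(buffer, map_location=map_location)


# A string or torch.device: load all tensors onto that device.
as_string = load_with('cpu')
as_device = load_with(torch.device('cpu'))

# A callable: receives (storage, location) and returns the storage to use.
# Returning the CPU-resident storage unchanged keeps everything on the CPU.
as_callable = load_with(lambda storage, location: storage)

# A dict: remaps location tags found in the file (keys) to new ones (values).
as_dict = load_with({'cuda:0': 'cpu'})

results = [as_string, as_device, as_callable, as_dict]
print([t.device.type for t in results])
```

For the situation in the question, the string or torch.device form is usually the simplest choice; the callable and dict forms matter mostly when different storages should go to different devices.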