Skorch training object from scratch
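I am trying to use the skorch class to run a GridSearch on a classifier.
I tried running the vanilla NeuralNetClassifier object, but I haven't found a way to pass only the trainable weights to the Adam optimizer (I am using pre-trained embeddings and would like to keep them frozen). It is doable if a module is initialized and those weights are then passed with the optimizer__params option, but module needs an uninitialized model. Is there a way around this?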
net = NeuralNetClassifier(module=RNN, module__vocab_size=vocab_size, module__hidden_size=hidden_size,
                          module__embedding_dim=embedding_dim, module__pad_id=pad_id,
                          module__dataset=ClaimsDataset, lr=lr, criterion=nn.CrossEntropyLoss,
                          optimizer=torch.optim.Adam, optimizer__weight_decay=35e-3, device='cuda',
                          max_epochs=nb_epochs, warm_start=True)
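The code above works. However, with batch_size set to 64, I have to run the model for the specified number of epochs on each batch! That is not the behavior I am looking for. I would appreciate it if someone could suggest a better way to do this.
My other problem is with subclassing skorch.NeuralNet. I ran into a similar issue: figuring out a way to pass only the trainable weights to the Adam optimizer. The code below is what I have so far.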
class Train(skorch.NeuralNet):
    def __init__(self, module, lr, norm, *args, **kwargs):
        self.module = module
        self.lr = lr
        self.norm = norm
        self.params = [p for p in self.module.parameters(self) if p.requires_grad]
        super(Train, self).__init__(*args, **kwargs)

    def initialize_optimizer(self):
        self.optimizer = torch.optim.Adam(params=self.params, lr=self.lr, weight_decay=35e-3, amsgrad=True)

    def train_step(self, Xi, yi, **fit_params):
        self.module.train()
        self.optimizer.zero_grad()
        yi = variable(yi)
        output = self.module(Xi)
        loss = self.criterion(output, yi)
        loss.backward()
        nn.utils.clip_grad_norm_(self.params, max_norm=self.norm)
        self.optimizer.step()

    def score(self, y_t, y_p):
        return accuracy_score(y_t, y_p)
Initializing the class gives the error:
Traceback (most recent call last):
  File "/snap/pycharm-community/74/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/snap/pycharm-community/74/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/snap/pycharm-community/74/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/snap/pycharm-community/74/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/l/Documents/Bsrc/cv.py", line 115, in <module>
    main()
  File "/home/l/B/src/cv.py", line 86, in main
    trainer = Train(module=RNN, criterion=nn.CrossEntropyLoss, lr=lr, norm=max_norm)
  File "/home/l/B/src/cv.py", line 22, in __init__
    self.params = [p for p in self.module.parameters(self) if p.requires_grad]
  File "/home/l/B/src/cv.py", line 22, in <listcomp>
    self.params = [p for p in self.module.parameters(self) if p.requires_grad]
  File "/home/l/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 739, in parameters
    for name, param in self.named_parameters():
AttributeError: 'Train' object has no attribute 'named_parameters'
"but module needs an uninitialized model"
This is not correct; you can pass an initialized model as well. The documentation of the module parameter states:
It is, however, also possible to pass an instantiated module, e.g. a PyTorch Sequential instance.
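The problem is that, when passing an initialized model, you cannot pass any module__ parameters to the NeuralNet, since that would require the module to be re-initialized. That is of course a problem if you want to grid search over module parameters.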
One solution is to override initialize_module and, after creating the new instance, load and freeze the parameters (by setting the parameter's requires_grad attribute to False):
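# Methods of a skorch.NeuralNet (or NeuralNetClassifier) subclass.
def _load_embedding_weights(self):
    # placeholder: return the pre-trained embedding matrix here
    return torch.randn(1, 100)

def initialize_module(self):
    kwargs = self._get_params_for('module')
    self.module_ = self.module(**kwargs)

    # load weights and freeze the layer; a parameter attribute only accepts
    # an nn.Parameter, so the loaded tensor is wrapped before assignment
    self.module_.embedding0.weight = torch.nn.Parameter(
        self._load_embedding_weights(), requires_grad=False)

    return self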
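As for the traceback in the question: RNN is passed as a class, so self.module.parameters(self) calls Module.parameters with the Train instance standing in for the module, and that object has no named_parameters. The parameters have to be collected from an instantiated module instead (in skorch, that is module_ once the net has been initialized). A minimal plain-PyTorch sketch of both situations follows; TinyRNN and NotAModule are hypothetical names used only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)


class TinyRNN(nn.Module):
    """Hypothetical stand-in for the question's RNN module."""

    def __init__(self, vocab_size=20, embedding_dim=8, hidden_size=16, n_classes=2):
        super().__init__()
        self.embedding0 = nn.Embedding(vocab_size, embedding_dim)
        self.rnn = nn.GRU(embedding_dim, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, n_classes)

    def forward(self, X):
        _, hidden = self.rnn(self.embedding0(X))
        return self.out(hidden[-1])


class NotAModule:
    """Plays the role of the Train instance from the traceback."""


# 1) parameters() called on the class, with a foreign object as `self`,
#    fails in the same way as in the question.
try:
    list(TinyRNN.parameters(NotAModule()))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# 2) On an instance, the trainable parameters can be filtered directly.
model = TinyRNN()
model.embedding0.weight.requires_grad = False  # frozen embeddings
embedding_before = model.embedding0.weight.detach().clone()

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3, weight_decay=35e-3, amsgrad=True)

# One optimisation step on random data, to check the embedding stays fixed.
X = torch.randint(0, 20, (4, 5))
y = torch.tensor([0, 1, 0, 1])
loss = nn.CrossEntropyLoss()(model(X), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

embedding_unchanged = torch.equal(model.embedding0.weight, embedding_before)
```

Here trainable holds the four GRU tensors and the two linear-layer tensors; the embedding weight is left out and is not touched by the optimizer step.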