Pytorch dynamic amount of Layers?
I am trying to specify a dynamic number of layers, but I seem to be doing it wrong.
My issue is that when I define the 100 layers here, I get an error in the forward step.
But when I define the layers properly, it works.
Simplified example below:
import torch.nn as nn
from pytorch_lightning import LightningModule

class PredictFromEmbeddParaSmall(LightningModule):
    def __init__(self, hyperparams={'lr': 0.0001}):
        super(PredictFromEmbeddParaSmall, self).__init__()
        # Input is something like tensor.size=[768*100]
        self.TO_ILLUSTRATE = nn.Linear(768, 5)
        self.para_count = 100
        self.enc_red = []
        for i in range(100):
            self.enc_red.append(nn.Linear(768, 5))
        # gather the layers' outputs
        self.dense_simple1 = nn.Linear(5 * 100, 2)
        self.output = nn.Sigmoid()

    def forward(self, x):
        # first input to enc_red
        x_vecs = []
        for i in range(self.para_count):
            layer = self.enc_red[i]
            # The first dim is the batch size here, output is correct
            processed_slice = x[:, i * 768:(i + 1) * 768]
            # This works and gives an output of size 5
            rand = self.TO_ILLUSTRATE(processed_slice)
            # This will fail -- error below
            ret = layer(processed_slice)
        # more things happen here that we can ignore for now, since we fail earlier
I get this error when executing "ret = layer.forward(processed_slice)":

    RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_addmm

Is there a smarter way to program this, or a way to solve the error?
You should use a ModuleList from PyTorch instead of a plain Python list: https://pytorch.org/docs/master/generated/torch.nn.ModuleList.html. This is because PyTorch has to keep a graph with all the modules of your model; if you just add them to a plain list, they are not properly indexed in that graph, which results in the error you are facing.
Your code should look something like this:
class PredictFromEmbeddParaSmall(LightningModule):
    def __init__(self, hyperparams={'lr': 0.0001}):
        super(PredictFromEmbeddParaSmall, self).__init__()
        # Input is something like tensor.size=[768*100]
        self.TO_ILLUSTRATE = nn.Linear(768, 5)
        self.para_count = 100
        self.enc_red = nn.ModuleList()   # << MODIFIED LINE <<
        for i in range(100):
            self.enc_red.append(nn.Linear(768, 5))
        # gather the layers' outputs
        self.dense_simple1 = nn.Linear(5 * 100, 2)
        self.output = nn.Sigmoid()

    def forward(self, x):
        # first input to enc_red
        x_vecs = []
        for i in range(self.para_count):
            layer = self.enc_red[i]
            # The first dim is the batch size here, output is correct
            processed_slice = x[:, i * 768:(i + 1) * 768]
            # This gives an output of size 5
            rand = self.TO_ILLUSTRATE(processed_slice)
            # This now works as well
            ret = layer(processed_slice)
        # more things happen here that we can ignore for now
Then it should work fine!
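If it helps, here is a minimal, self-contained sketch (separate from the class above, with made-up layer sizes) of why the plain list fails: modules stored in a Python list are not registered as submodules, so .parameters() does not see them and .cuda() does not move them, while an nn.ModuleList handles both.

import torch
import torch.nn as nn

class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(8, 4) for _ in range(3)]   # not registered as submodules

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(8, 4) for _ in range(3))  # properly registered

plain = WithPlainList()
proper = WithModuleList()

print(len(list(plain.parameters())))   # 0 -> an optimizer would never see these layers
print(len(list(proper.parameters())))  # 6 -> 3 weights + 3 biases

if torch.cuda.is_available():
    plain.cuda()
    proper.cuda()
    print(plain.layers[0].weight.device)   # cpu  -> the device mismatch from the question
    print(proper.layers[0].weight.device)  # cuda:0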
Edit: an alternative approach.
Instead of using an nn.ModuleList, you can also just use an nn.Sequential; this lets you avoid the for loop in the forward pass. It also means you won't have access to the intermediate activations, so if you need those, this is not the solution for you.
class PredictFromEmbeddParaSmall(LightningModule):
    def __init__(self, hyperparams={'lr': 0.0001}):
        super(PredictFromEmbeddParaSmall, self).__init__()
        # Input is something like tensor.size=[768*100]
        self.TO_ILLUSTRATE = nn.Linear(768, 5)
        enc_red = []
        for i in range(100):
            enc_red.append(nn.Linear(768, 5))
        self.enc_red = nn.Sequential(*enc_red)   # << MODIFIED LINE <<
        # gather the layers' outputs
        self.dense_simple1 = nn.Linear(5 * 100, 2)
        self.output = nn.Sigmoid()

    def forward(self, x):
        # first input to enc_red
        x_vecs = []
        out = self.enc_red(x)   # << MODIFIED LINE <<
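For reference, here is a tiny, generic sketch (with arbitrary placeholder layer sizes, not the ones from the example above) of how nn.Sequential chains registered submodules so the whole stack can be called like a single module:

import torch
import torch.nn as nn

# Each module's output feeds the next one in order.
stack = nn.Sequential(
    nn.Linear(768, 128),
    nn.ReLU(),
    nn.Linear(128, 5),
)

x = torch.randn(4, 768)                # batch of 4 vectors
out = stack(x)                          # same as calling the three modules in sequence
print(out.shape)                        # torch.Size([4, 5])
print(len(list(stack.parameters())))    # 4 -> submodules are registered, like with ModuleList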