axes don't match array / size mismatch, m1: [132096 x 344], m2: [118336 x 128]
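This is a linear autoencoder. The original images are 344*344 RGB. After training finishes, I want to display the decoded image with the code below, but I get ValueError: axes don't match array.
Environment: PyTorch on Google Colab (GPU).
Code: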
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision

EPOCH = 20
BATCH_SIZE = 128
LR = 0.005  # learning rate
torch.cuda.empty_cache()

# Random crop to 344x344, random horizontal flip, then convert to a tensor in [0, 1]
data_transforms = torchvision.transforms.Compose([
    torchvision.transforms.RandomResizedCrop(344),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor()])
path1 = 'drive/My Drive/Colab/image/test/'
train_data = torchvision.datasets.ImageFolder(path1,
                                              transform=data_transforms)
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE,
                               shuffle=True)
class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(3*344*344, 128),
            nn.Tanh(),  # activation
            nn.Linear(128, 64),
            nn.Tanh(),
            nn.Linear(64, 12),
            nn.Tanh(),
            nn.Linear(12, 3),  # compress to 3 features which can be visualized in plt
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.Tanh(),
            nn.Linear(12, 64),
            nn.Tanh(),
            nn.Linear(64, 128),
            nn.Tanh(),
            nn.Linear(128, 3*344*344),
            nn.Sigmoid(),  # compress to a range (0, 1)
        )

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten each image to a single row
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded
autoencoder = AutoEncoder()
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=LR)
loss_func = nn.MSELoss()

for epoch in range(EPOCH):
    for step, (x, b_label) in enumerate(train_loader):
        b_x = x.view(-1, 3*344*344)  # batch x, shape (batch, 3*344*344)
        b_y = x.view(-1, 3*344*344)  # batch y, shape (batch, 3*344*344)
        encoded, decoded = autoencoder(b_x)
        loss = loss_func(decoded, b_y)  # mean square error
        optimizer.zero_grad()  # clear gradients for this training step
        loss.backward()  # backpropagation, compute gradients
        optimizer.step()  # apply gradients
###################################################
######## below is used to plot decoded pic ########
device = next(autoencoder.parameters()).device  # device the model lives on
with torch.no_grad():
    for img, label in train_loader:
        fig = plt.figure()
        imggg = np.transpose(img[0], (1, 2, 0))
        ax1 = fig.add_subplot(121)
        ax1.imshow(imggg)
        img = img.to(device)  # keep the batch on the same device as the model
        encoded, decoded = autoencoder(img)
        # decoded[0] is 1-D here, so this is the line that raises the ValueError
        decodeddd = np.transpose(decoded.cpu()[0], (1, 2, 0))
        ax2 = fig.add_subplot(122)
        ax2.imshow(decodeddd)
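I expect two images as output, but only the original image is shown; the decoded image is not displayed.
Training runs fine. I just don't know what is wrong with the image size.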
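The decoder returns a linear output of shape BATCH_SIZE x 355008 (355008 = 3*344*344). Before applying the transpose, we first need to reshape that second dimension into three dimensions of shape 3 x 344 x 344. Replacing decodeddd with the following should solve the problem: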
decodeddd = np.transpose(decoded.cpu()[0].view(3, 344, 344),(1,2,0))
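To see why the reshape matters, here is a minimal NumPy-only sketch of the same situation. The sizes C, H, W and the array named decoded are hypothetical stand-ins chosen to keep the example small (the post uses 3 x 344 x 344 and a real decoder output); it only illustrates the shape logic, not the trained model.

```python
import numpy as np

# Hypothetical small sizes for illustration; the post uses C=3, H=W=344.
C, H, W = 3, 4, 5
batch = 2

# Stand-in for the decoder output: one flat row of C*H*W values per image.
decoded = np.arange(batch * C * H * W, dtype=np.float32).reshape(batch, C * H * W)

flat = decoded[0]  # shape (C*H*W,), a 1-D array

# Asking for three axes on a 1-D array fails, as in the question.
try:
    np.transpose(flat, (1, 2, 0))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Restore (channel, height, width) first, then move channels last for imshow.
chw = flat.reshape(C, H, W)
hwc = np.transpose(chw, (1, 2, 0))

print(error_message)
print(flat.shape, chw.shape, hwc.shape)
```

The transpose only reorders axes that already exist, so the flat row has to be given its three axes back before (1, 2, 0) means anything.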