Linear autoencoder using PyTorch
How do we build a simple linear autoencoder and train it using torch.optim optimizers?
How do I use autograd (.backward()) and optimize the MSE loss to learn the values of the weights and biases in the encoder and the decoder (i.e. 3 parameters in the encoder and 4 in the decoder)? The data must be random, and each training run should start from random weights and biases, for example:
wEncoder = torch.randn(D,1, requires_grad=True)
wDecoder = torch.randn(1,D, requires_grad=True)
bEncoder = torch.randn(1, requires_grad=True)
bDecoder = torch.randn(1,D, requires_grad=True)
The target optimizer is SGD with learning rate 0.01, no momentum, and 1000 steps (starting from a random initialization); how do we then plot the loss against the epochs (steps)?
I tried the following, but the loss is the same at every epoch.
D = 2
x = torch.rand(100,D)
x[:,0] = x[:,0] + x[:,1]
x[:,1] = 0.5*x[:,0] + x[:,1]
loss_fn = nn.MSELoss()
optimizer = optim.SGD([x[:,0],x[:,1]], lr=0.01)
losses = []
for epoch in range(1000):
    running_loss = 0.0
    inputs = x_reconstructed
    targets = x
    loss = loss_fn(inputs, targets)
    loss.backward(retain_graph=True)
    optimizer.step()
    optimizer.zero_grad()
    running_loss += loss.item()
    epoch_loss = running_loss / len(data)
    losses.append(running_loss)
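The loss stays constant because optim.SGD is handed slices of the data x instead of trainable parameters, and x_reconstructed is never recomputed inside the loop, so nothing the optimizer updates ever affects the loss. A minimal sketch of the manual-parameter setup the question describes (a 1-dimensional code, i.e. 3 encoder parameters and 4 decoder parameters) could look like this:

import torch

D = 2
x = torch.rand(100, D)
x[:, 0] = x[:, 0] + x[:, 1]
x[:, 1] = 0.5 * x[:, 0] + x[:, 1]

# Start each run from random weights and biases
wEncoder = torch.randn(D, 1, requires_grad=True)
wDecoder = torch.randn(1, D, requires_grad=True)
bEncoder = torch.randn(1, requires_grad=True)
bDecoder = torch.randn(1, D, requires_grad=True)

loss_fn = torch.nn.MSELoss()
# Optimize the parameters, not the data
optimizer = torch.optim.SGD([wEncoder, wDecoder, bEncoder, bDecoder], lr=0.01)

losses = []
for epoch in range(1000):
    # Recompute the reconstruction at every step
    encoded = x @ wEncoder + bEncoder              # (100, 1)
    reconstructed = encoded @ wDecoder + bDecoder  # (100, D)
    loss = loss_fn(reconstructed, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())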
This example should get you started. See the code comments for further explanation:
import torch

# Use torch.nn.Module to create models
class AutoEncoder(torch.nn.Module):
    def __init__(self, features: int, hidden: int):
        # Necessary in order to log C++ API usage and other internals
        super().__init__()
        self.encoder = torch.nn.Linear(features, hidden)
        self.decoder = torch.nn.Linear(hidden, features)

    def forward(self, X):
        return self.decoder(self.encoder(X))

    def encode(self, X):
        return self.encoder(X)

# Random data
data = torch.rand(100, 4)
model = AutoEncoder(4, 10)
# Pass model.parameters() for increased readability
# Weights of encoder and decoder will be passed
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

# Per-epoch losses are gathered
# Loss is the mean of batch elements, in our case mean of 100 elements
losses = []
for epoch in range(1000):
    reconstructed = model(data)
    loss = loss_fn(reconstructed, data)
    # No need for retain_graph=True as you are not performing multiple passes
    # of backpropagation
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    losses.append(loss.item())
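To plot the loss against the epochs, as the question asks, a few lines of matplotlib are enough (assuming matplotlib is installed):

import matplotlib.pyplot as plt

plt.plot(losses)
plt.xlabel("epoch")
plt.ylabel("MSE loss")
plt.show()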
Note that a linear autoencoder is roughly equivalent to a PCA decomposition, which is more efficient.
Unless it is simply for learning purposes, you should probably use a non-linear autoencoder.
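As a rough illustration of that claim (a sketch, not part of the original answer; it assumes the data and loss_fn defined above and keeps 2 principal components):

U, S, V = torch.pca_lowrank(data, q=2)      # top-2 principal directions
mean = data.mean(dim=0)
pca_rec = (data - mean) @ V @ V.T + mean    # project onto the components and back
print("PCA reconstruction MSE:", loss_fn(pca_rec, data).item())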
We can also simply use nn.Sequential(), e.g., with the following code snippet:
import torch

encoded_dim = 32
encoder = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(28*28, 256),
    torch.nn.Sigmoid(),
    torch.nn.Linear(256, 64),
    torch.nn.Sigmoid(),
    torch.nn.Linear(64, encoded_dim)
)
decoder = torch.nn.Sequential(
    torch.nn.Linear(encoded_dim, 64),
    torch.nn.Sigmoid(),
    torch.nn.Linear(64, 256),
    torch.nn.Sigmoid(),
    torch.nn.Linear(256, 28*28),
    torch.nn.Sigmoid(),
    torch.nn.Unflatten(1, (28,28))
)
autoencoder = torch.nn.Sequential(encoder, decoder)
autoencoder
# Sequential(
# (0): Sequential(
# (0): Flatten(start_dim=1, end_dim=-1)
# (1): Linear(in_features=784, out_features=256, bias=True)
# (2): Sigmoid()
# (3): Linear(in_features=256, out_features=64, bias=True)
# (4): Sigmoid()
# (5): Linear(in_features=64, out_features=32, bias=True)
# )
# (1): Sequential(
# (0): Linear(in_features=32, out_features=64, bias=True)
# (1): Sigmoid()
# (2): Linear(in_features=64, out_features=256, bias=True)
# (3): Sigmoid()
# (4): Linear(in_features=256, out_features=784, bias=True)
# (5): Sigmoid()
# (6): Unflatten(dim=1, unflattened_size=(28, 28))
# )
#)
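A quick sanity check of the shapes, using a hypothetical batch of 8 MNIST-sized images:

batch = torch.rand(8, 28, 28)       # hypothetical batch
latent = encoder(batch)             # -> torch.Size([8, 32])
restored = decoder(latent)          # -> torch.Size([8, 28, 28])
print(latent.shape, restored.shape)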
Example of training with MNIST data
Load the (MNIST) data using torchvision:
import torchvision

train_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data', train=True, download=True,
        transform=torchvision.transforms.Compose([
            torchvision.transforms.ToTensor(),
            # ...
        ])),
    batch_size=64, shuffle=True)
Now let's train the autoencoder model; the optimizer used is Adam, although SGD could be used as well:
loss_fn = torch.nn.BCELoss()
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3, weight_decay=1e-5)
for epoch in range(10):
    for idx, (x, _) in enumerate(train_loader):
        x = x.squeeze()
        x = x / x.max()
        x_pred = autoencoder(x)  # forward pass
        loss = loss_fn(x_pred, x)
        if idx % 1024 == 0:
            print(epoch, loss.item())
        optimizer.zero_grad()
        loss.backward()  # backward pass
        optimizer.step()
# epoch loss
# 0 0.702496349811554
# 1 0.24611620604991913
# 2 0.20603498816490173
# 3 0.1827092468738556
# 4 0.1805819869041443
# 5 0.16927748918533325
# 6 0.17275433242321014
# 7 0.15827134251594543
# 8 0.1635081171989441
# 9 0.15693898499011993
The animation below shows the autoencoder's reconstructions of a few randomly selected images at different epochs; notice how the reconstructions of the MNIST digits improve as the number of epochs grows:
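A static version of such a comparison can be sketched with matplotlib (an illustrative snippet, not from the original post):

import matplotlib.pyplot as plt

# Plot a few digits above their reconstructions after training
x, _ = next(iter(train_loader))
x = x.squeeze()
with torch.no_grad():
    x_pred = autoencoder(x)
fig, axes = plt.subplots(2, 8, figsize=(12, 3))
for i in range(8):
    axes[0, i].imshow(x[i], cmap="gray")       # original
    axes[1, i].imshow(x_pred[i], cmap="gray")  # reconstruction
    axes[0, i].axis("off")
    axes[1, i].axis("off")
plt.show()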