Using a stateful LSTM to pass context between batches; is there an error in the context passing? I am not getting good results.
I have already checked the data before sending it to the network. The data is correct.
I am using an LSTM and passing context between batches. The per_class_accuracy is changing, but the loss is not decreasing. I have been stuck for a long time; is there an error in the code?
I have a multi-class classification problem on an imbalanced dataset.
Dataset_type: CSV
Dataset_size: 20000
Sensor-based CSV data
X = 0.6986111111111111,0,0,1,0,1,0,0,0,1,0,0,0,0,1,0,0,0,1,1,0,0,0
Y = leaveHouse
Per-class accuracy:
{'leaveHouse': 0.34932855, 'getDressed': 1.0, 'idle': 0.8074534, 'prepareBreakfast': 0.8, 'goToBed': 0.35583413, 'getDrink': 0.0, 'takeShower': 1.0, 'useToilet': 0.0, 'eatBreakfast': 0.8857143}
Training:
# Using loss weights, the inverse of class frequency
criterion = nn.CrossEntropyLoss(weight=class_weights)

hn, cn = model.init_hidden(batch_size)
for i, (input, label) in enumerate(trainLoader):
    # Detach the hidden/cell state so gradients do not flow across batches
    hn.detach_()
    cn.detach_()
    input = input.view(-1, seq_dim, input_dim)
    if torch.cuda.is_available():
        input = input.float().cuda()
        label = label.cuda()
    else:
        input = input.float()

    # Forward pass to get output/logits
    output, (hn, cn) = model((input, (hn, cn)))

    # Calculate loss: softmax --> cross entropy loss
    loss = criterion(output, label)  # weight param
    running_loss += loss

    loss.backward()        # Backward pass
    optimizer.step()       # Now we can do an optimizer step
    optimizer.zero_grad()  # Reset gradient tensors
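For reference, the class_weights tensor used above is not shown in the snippet. Below is a minimal sketch of one common way to build it as inverse class frequencies; the train_labels variable and the use of integer class ids are assumptions for illustration only:

import torch
import torch.nn as nn
from collections import Counter

# Sketch only: per-class weights as the inverse of class frequency.
# Assumes `train_labels` is a list/array of integer class ids (0 .. num_classes-1).
counts = Counter(train_labels)
num_classes = len(counts)
freq = torch.tensor([counts[c] for c in range(num_classes)], dtype=torch.float)
class_weights = freq.sum() / (freq * num_classes)  # rarer classes get larger weights
if torch.cuda.is_available():
    class_weights = class_weights.cuda()
criterion = nn.CrossEntropyLoss(weight=class_weights)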
Network:
class LSTMModel(nn.Module):
    def init_hidden(self, batch_size):
        self.batch_size = batch_size
        if torch.cuda.is_available():
            # Initialize hidden state
            hn = torch.zeros(self.layer_dim, self.batch_size, self.hidden_dim).cuda()
            # Initialize cell state
            cn = torch.zeros(self.layer_dim, self.batch_size, self.hidden_dim).cuda()
        else:
            hn = torch.zeros(self.layer_dim, self.batch_size, self.hidden_dim)
            # Initialize cell state
            cn = torch.zeros(self.layer_dim, self.batch_size, self.hidden_dim)
        return hn, cn

    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, seq_dim):
        super(LSTMModel, self).__init__()
        # Hidden dimensions
        self.hidden_dim = hidden_dim
        # Number of hidden layers
        self.layer_dim = layer_dim
        self.input_dim = input_dim
        # Building your LSTM
        # batch_first=True causes input/output tensors to be of shape
        # (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(self.input_dim, hidden_dim, layer_dim, batch_first=True)
        # Readout layer
        self.fc = nn.Linear(hidden_dim, output_dim)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        self.seq_dim = seq_dim

    def forward(self, inputs):
        # Hidden and cell state are passed in from outside (stateful across batches)
        input, (hn, cn) = inputs
        input = input.view(-1, self.seq_dim, self.input_dim)
        # Run the LSTM over all time steps
        out, (hn, cn) = self.lstm(input, (hn, cn))
        # Index hidden state of last time step
        out = self.fc(out[:, -1, :])
        out = self.softmax(out)
        return out, (hn, cn)
One issue you may be running into is that CrossEntropyLoss combines the log-softmax operation with the negative log-likelihood loss, but you are applying a softmax in your model. You should pass the raw logits from your last layer to CrossEntropyLoss.
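For example, a minimal sketch of that change, keeping the rest of the forward pass from the question unchanged, would return the raw output of self.fc and let CrossEntropyLoss handle the log-softmax internally:

# Sketch only: return raw logits; nn.CrossEntropyLoss applies log-softmax + NLL itself.
def forward(self, inputs):
    input, (hn, cn) = inputs
    input = input.view(-1, self.seq_dim, self.input_dim)
    out, (hn, cn) = self.lstm(input, (hn, cn))
    logits = self.fc(out[:, -1, :])  # no softmax here during training
    return logits, (hn, cn)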
Also, I can't say for certain without seeing the model's forward pass, but it looks like you are applying the softmax on dimension 1 to a tensor that (I'm inferring) has shape batch_size, sequence_length, output_dim, when you should be applying it along the output dim.
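If you still need probabilities at evaluation time (for example, to compute the per-class accuracy), one option is to apply the softmax outside the model along the last (output) dimension. A rough sketch, reusing the model, input, hn and cn names from the question:

import torch
import torch.nn.functional as F

# Sketch only: softmax applied at inference, along the output (class) dimension.
with torch.no_grad():
    logits, (hn, cn) = model((input, (hn, cn)))
    probs = F.softmax(logits, dim=-1)  # dim=-1 is the output_dim axis
    predictions = probs.argmax(dim=-1)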