Pytorch Lightning Automatic Logging - AttributeError: 'NoneType' object has no attribute '_results'

Question

I can't use automatic logging (self.log) when calling training_step() in PyTorch Lightning. What am I missing? Here is a minimal example:

import pytorch_lightning as pl
import torch
import torch.nn as nn
import torch.nn.functional as F

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(100, 4)

    def forward(self, x):
        return torch.relu(self.l1(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y.long())
        self.log("train_loss", loss) # <-- error
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.02)

pl_model = LitModel()
x = torch.rand((10,100))
y = torch.randint(0,4, size=(10,))
batch = (x,y)
loss = pl_model.training_step(batch, 0)

Error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-34-b9419bfca30f> in <module>
     25 y = torch.randint(0,4, size=(10,))
     26 batch = (x,y)
---> 27 loss = pl_model.training_step(batch, 0)

<ipython-input-34-b9419bfca30f> in training_step(self, batch, batch_idx)
     14         y_hat = self(x)
     15         loss = F.cross_entropy(y_hat, y.long())
---> 16         self.log("train_loss", loss)
     17         return loss
     18 

D:\programs\anaconda3\lib\site-packages\pytorch_lightning\core\lightning.py in log(self, name, value, prog_bar, logger, on_step, on_epoch, reduce_fx, tbptt_reduce_fx, tbptt_pad_token, enable_graph, sync_dist, sync_dist_op, sync_dist_group, add_dataloader_idx, batch_size, metric_attribute, rank_zero_only)
    405         on_epoch = self.__auto_choose_log_on_epoch(on_epoch)
    406 
--> 407         results = self.trainer._results
    408         assert results is not None
    409         assert self._current_fx_name is not None

AttributeError: 'NoneType' object has no attribute '_results'

Answer 1

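This is not the correct usage of the LightningModule class. You can't call a hook (here, .training_step()) manually and expect everything to work.

You need to set up a Trainer, as PyTorch Lightning suggests at the very start of its tutorial; it is a requirement. The functions (or hooks) that you define in a LightningModule only tell Lightning "what to do" in a specific situation (in this case, at each training step). It is the Trainer that actually "orchestrates" the training, by instantiating the necessary environment (including the logging functionality) and feeding it into the Lightning module whenever needed. In your example no Trainer is attached, so self.trainer is None, and self.log fails when it reaches self.trainer._results.

So, do it the way Lightning suggests and it will work.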
