TypeError: 'Vocab' object is not callable

Question

I'm following the torchtext transformer tutorial published for PyTorch 1.9. However, because I'm working on a Tegra TX2, I'm stuck with torchtext 0.6.0 rather than 0.10.0 (which is what I assume the tutorial uses).

Following the tutorial, the code below throws an error:

def data_process(raw_text_iter):
    data = [torch.tensor(vocab(tokenizer(item)), dtype=torch.long) for item in raw_text_iter]
    return torch.cat(tuple(filter(lambda t: t.numel() > 0, data)))

The error is:

TypeError: 'Vocab' object is not callable

I understand what the error means, but I don't know what calling Vocab is expected to return in this case.

Looking at the documentation for torchtext 0.6.0, I see that Vocab has these attributes:

stoi
itos
freqs
vectors

Does the example expect the vectors from Vocab?

Edit:

I checked the 0.10.0 documentation, and it doesn't list __call__ either.

Answer 1

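Looking at the source of Vocab as implemented in 0.10.0, it is a subclass of torch.nn.Module, which means it inherits __call__ from there (calling the object is roughly equivalent to calling its forward() method, with some additional machinery to implement hooks and so on).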

We can also see that it wraps an underlying VocabPyBind object (the equivalent of the Vocab class in older versions), and that its forward() method just calls its lookup_indices method.
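To make the dispatch chain concrete, here is a minimal pure-Python sketch of the mechanism described above. TinyModule and TinyVocab are hypothetical stand-ins written for illustration; they are not the real torch.nn.Module or torchtext Vocab, and the hook machinery is left out entirely.

```python
class TinyModule:
    """Simplified stand-in for torch.nn.Module (hypothetical, no hooks)."""

    def __call__(self, *args, **kwargs):
        # Calling the object simply forwards to forward().
        return self.forward(*args, **kwargs)


class TinyVocab(TinyModule):
    """Hypothetical vocab that mirrors the 0.10.0 call chain in spirit."""

    def __init__(self, tokens):
        # Map each token to its position in the list.
        self._stoi = {tok: i for i, tok in enumerate(tokens)}

    def lookup_indices(self, tokens):
        return [self._stoi[tok] for tok in tokens]

    def forward(self, tokens):
        # forward() only delegates to lookup_indices().
        return self.lookup_indices(tokens)


vocab_demo = TinyVocab(["<unk>", "hello", "world"])
ids = vocab_demo(["hello", "world"])  # __call__ -> forward -> lookup_indices
```

A class without __call__ (directly or inherited) cannot be used this way, which is why the 0.6.0 Vocab raises the TypeError from the question.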

In short, the equivalent in older versions of the library appears to be calling vocab.lookup_indices(tokenizer(item)).

Update: Apparently the Vocab class in 0.6.0 doesn't even have a lookup_indices method, but reading the source of that method shows it is simply equivalent to:

[vocab[token] for token in tokenizer(item)]
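
As a rough illustration of that per-token lookup, the sketch below uses a hypothetical LegacyStyleVocab class and a whitespace tokenizer as stand-ins. It assumes, as the 0.6.0 source suggests, that indexing the vocab with an unknown token falls back to the index of the unknown-token entry; verify this against your installed version.

```python
class LegacyStyleVocab:
    """Hypothetical stand-in: dict-style lookup with an <unk> fallback."""

    UNK = "<unk>"

    def __init__(self, tokens):
        # itos: index -> token, stoi: token -> index (names follow the 0.6.0 docs).
        self.itos = [self.UNK] + list(tokens)
        self.stoi = {tok: i for i, tok in enumerate(self.itos)}

    def __getitem__(self, token):
        # Unknown tokens map to the <unk> index instead of raising KeyError.
        return self.stoi.get(token, self.stoi[self.UNK])


def tokenizer(text):
    # Stand-in tokenizer (assumption): lowercase, then split on whitespace.
    return text.lower().split()


legacy_vocab = LegacyStyleVocab(["the", "cat", "sat"])
item = "The cat sat quietly"
token_ids = [legacy_vocab[token] for token in tokenizer(item)]
```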

Until you are able to upgrade, for the sake of forward compatibility you could write a wrapper like this:

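from torchtext.vocab import Vocab as _Vocab

class Vocab(_Vocab):
    # Mimic the newer API on top of the 0.6.0 class.
    def lookup_indices(self, tokens):
        # Index self, not a global vocab, so the wrapper works on any instance.
        return [self[token] for token in tokens]

    def __call__(self, tokens):
        # Makes vocab(tokens) work as it does in the tutorial.
        return self.lookup_indices(tokens)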
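
To see how such a wrapper slots into the tutorial's preprocessing step, here is a self-contained sketch that avoids importing torchtext. _BaseVocab, CallableVocab, and encode_lines are hypothetical names; _BaseVocab only imitates the dict-style lookup of the old class, and encode_lines returns a plain list where the tutorial would build tensors.

```python
from itertools import chain


class _BaseVocab:
    """Hypothetical stand-in for the 0.6.0-style Vocab (not the torchtext class)."""

    def __init__(self, stoi, unk_index=0):
        self.stoi = stoi
        self.unk_index = unk_index

    def __getitem__(self, token):
        return self.stoi.get(token, self.unk_index)


class CallableVocab(_BaseVocab):
    """Same wrapper idea as above, applied to the stand-in base class."""

    def lookup_indices(self, tokens):
        return [self[token] for token in tokens]

    def __call__(self, tokens):
        return self.lookup_indices(tokens)


def encode_lines(raw_text_iter, vocab, tokenizer):
    # Look up every line, drop lines that produced no tokens, then flatten.
    per_line = (vocab(tokenizer(item)) for item in raw_text_iter)
    return list(chain.from_iterable(ids for ids in per_line if ids))


wrapped = CallableVocab({"<unk>": 0, "a": 1, "b": 2})
flat_ids = encode_lines(["a b", "", "b z"], wrapped, str.split)
```

In the question's code, each per-line list would instead be passed to torch.tensor(..., dtype=torch.long), with empty tensors filtered out before torch.cat.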
