Why does theano scan work differently when the code is nearly the same?

Question

The code is as follows:

import theano
import numpy as np
from theano import tensor as T
h1=T.as_tensor_variable(np.zeros((1, 20), dtype=theano.config.floatX))
s1=T.as_tensor_variable(np.zeros((1, 20), dtype=theano.config.floatX))

def forward(input, h, s):
    return h, s
result, update=theano.scan(fn=forward, sequences=[T.arange(10)], outputs_info=[h1, s1], go_backwards=False)
print(result[0].shape.eval())

It raises this error:

TypeError: Cannot convert Type TensorType(float32, 3D) (of Variable IncSubtensor{Set;:int64:}.0) into Type TensorType(float32, (False, True, False)). You can try to manually convert IncSubtensor{Set;:int64:}.0 into a TensorType(float32, (False, True, False)).

But when I change the 1 to any other number, for example:

h1=T.as_tensor_variable(np.zeros((2, 20), dtype=theano.config.floatX))
s1=T.as_tensor_variable(np.zeros((2, 20), dtype=theano.config.floatX))

it works fine.

I don't understand what is happening here. Can anyone help?

Answer 1

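See this issue: https://github.com/Theano/Theano/issues/2985

Calling theano.scan with a tensor in outputs_info whose shape contains a 1 fails unless you manually unbroadcast the length-1 axes with tensor.unbroadcast. The cause is that the value actually returned by scan's inner function has a different broadcast pattern from the corresponding entry passed through outputs_info. The error message shows this: the expected type has the pattern (False, True, False), so the length-1 axis is marked broadcastable, while the value being converted is a plain 3D tensor.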

Try:

h1=T.unbroadcast(T.as_tensor_variable(np.zeros((1, 20), dtype=theano.config.floatX)), 0)
s1=T.unbroadcast(T.as_tensor_variable(np.zeros((1, 20), dtype=theano.config.floatX)), 0)

This makes the first dimension non-broadcastable.
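
The mismatch can be illustrated without Theano. The sketch below is plain Python and only models the bookkeeping. All three helper functions are hypothetical and are not Theano APIs. It assumes that a constant's length-1 axes are marked broadcastable, that scan adds a non-broadcastable leading time axis to each output, and that the value written back at each step has no broadcastable axes.

```python
# Plain-Python model of broadcast patterns; none of these helpers exist in Theano.

def inferred_broadcast_pattern(shape):
    # Assumption: every length-1 axis of a constant is marked broadcastable.
    return tuple(dim == 1 for dim in shape)


def unbroadcast_pattern(pattern, *axes):
    # Models the effect of T.unbroadcast: force the given axes to non-broadcastable.
    return tuple(False if i in axes else flag for i, flag in enumerate(pattern))


def scan_buffer_pattern(init_pattern):
    # Assumption: scan stacks the steps along a new, non-broadcastable leading axis.
    return (False,) + tuple(init_pattern)


for shape in [(1, 20), (2, 20)]:
    declared = scan_buffer_pattern(inferred_broadcast_pattern(shape))
    # Assumption: the value produced by the steps has no broadcastable axes.
    produced = (False,) * (len(shape) + 1)
    print(shape, "declared:", declared, "produced:", produced,
          "match:", declared == produced)

# After unbroadcasting axis 0 of the (1, 20) initial state, the patterns agree.
fixed = scan_buffer_pattern(
    unbroadcast_pattern(inferred_broadcast_pattern((1, 20)), 0))
print("fixed:", fixed)
```

Under these assumptions, the (1, 20) case declares the pattern (False, True, False), which is the one named in the error message, while the (2, 20) case and the unbroadcast case both give (False, False, False).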

theano