MissingInputError with theano scan function

Question

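I am experimenting with Theano, in particular with the function scan.

I want to use it to apply a linear classifier to a set of feature vectors stored as the columns of a matrix X (I am sure there are better ways to do this; the point is just to get familiar with scan).

Here is my code snippet: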

T_W = T.fmatrix('W')
T_b = T.fmatrix('b')

T_X = T.fmatrix('X')
T_x = T.fmatrix('x')

# this is the linear classifier
T_f = T.dot(T_W, T_x) + T_b
f = theano.function(inputs=[T_x, theano.Param(T_W), theano.Param(T_b)],outputs=T_f)

T_outputs, T_updates = theano.scan(fn=lambda x,W,b : T_f, sequences=[T_X], non_sequences=[T_W,T_b])

F = theano.function(inputs=[T_X, theano.Param(T_W), theano.Param(T_b)],outputs=T_outputs)

Running the snippet in iPython produces the following error (triggered by the last statement):

    MissingInputError: A variable that is an input to the graph was neither provided as an input to the function nor given a value. A chain of variables leading from this input to an output is [x, for{cpu,scan_fn}.0]. This chain may not be unique
Backtrace when the variable is created:
  File "<ipython-input-40-72b539c54ff4>", line 5, in <module>
    T_x = T.fmatrix('x')

Answer 1

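It is not completely clear what you are trying to do here, but my guess is that you are implementing two different versions of a linear classifier, one that does not use scan and one that does.

The code below demonstrates my approach.

To answer your specific question:

The error message appears because the step function in your scan version returns T_f, and T_f depends on T_x, but the function you compile for the scan version does not take T_x as an input. Instead it takes T_X (note the difference in case), which is then never used. Returning T_f from the step function is also odd, and is one reason it is unclear what you are trying to do: the step function does not use any of its own input variables x, W, or b.

Here are a few hints, along with explanations of the differences between your code and mine.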

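It is very helpful to keep things separated into discrete functions. Splitting the code into v1 and v2 ensures that the two implementations cannot interfere with each other.
Using the strict parameter of theano.scan is recommended at all times. It ensures you do not accidentally introduce an error caused by naming conflicts in the step function's parameters. It is not enabled by default because doing so could break older code written before strict existed.
Use a fully fledged function instead of a lambda for scan's step function. Like strict mode, this helps avoid accidental naming conflicts and makes the step code easier to follow. The step function can also be tested in isolation.
Use compute_test_value to check that the computation works on simple sample data. In particular, this identifies shape mismatches (e.g. calling dot with the arguments in the wrong order), and it makes debugging easier because you can print and explore intermediate values while the computation graph is being constructed rather than later, when the computation is executed.
This code encodes each input sample as a row of x instead of a column of x. That requires post-multiplying by w instead of pre-multiplying. Either way works, but pre-multiplying by w makes the addition of b a bit messier (a dimshuffle has to be introduced).
There is no need to use theano.Param unless you need non-standard behaviour with respect to default values and the like.
Avoid names that differ only in case! In general, stick to the Python style guide (i.e. instance variables should be lower case, with words separated by underscores).
The dimshuffle and the selection of the first row are needed in the scan version's step function so that the dot product and the subsequent addition of the bias are dimension compatible. This is not needed in the non-scan version because there we are doing a matrix-matrix dot product.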
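
The point about row versus column layout can be checked with plain NumPy, without Theano. This is only a sketch under assumptions: the shapes (2 samples, 3 inputs, 4 outputs) and the names x_rows, x_cols, y_rows and y_cols are made up for illustration, and NumPy's np.newaxis indexing stands in for what a dimshuffle would do in Theano.

```python
import numpy as np

rng = np.random.RandomState(1)
batch_size, input_size, hidden_size = 2, 3, 4  # illustrative sizes

x_rows = rng.standard_normal((batch_size, input_size))  # one sample per row
w = rng.standard_normal((input_size, hidden_size))
b = rng.standard_normal(hidden_size)

# Row layout: post-multiply by w. The bias vector broadcasts over the
# rows without any reshaping.
y_rows = x_rows.dot(w) + b                    # shape (2, 4)

# Column layout: pre-multiply by w.T. The bias must first be turned into
# a column, which is the extra step a dimshuffle would perform in Theano.
x_cols = x_rows.T                             # one sample per column
y_cols = w.T.dot(x_cols) + b[:, np.newaxis]   # shape (4, 2)

# Both layouts give the same numbers, just transposed.
print(np.allclose(y_rows, y_cols.T))
```

Both layouts are equally valid; the row layout simply lets the bias be added without reshaping it.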

Code:

import numpy
import theano
import theano.tensor as T


def create_inputs(x_value, w_value, b_value):
    x, w = T.matrices(2)
    b = T.vector()
    x.tag.test_value = x_value
    w.tag.test_value = w_value
    b.tag.test_value = b_value
    return x, w, b


def v1(x_value, w_value, b_value):
    x, w, b = create_inputs(x_value, w_value, b_value)
    y = T.dot(x, w) + b
    f = theano.function(inputs=[x, w, b], outputs=y)
    print(f(x_value, w_value, b_value))


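def v2_step(x, w, b):
    # x is a single row of the input matrix (a vector). Turn it into a
    # 1-row matrix, apply the classifier, then take that single row back out.
    return (T.dot(x.dimshuffle('x', 0), w) + b)[0]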


def v2(x_value, w_value, b_value):
    x, w, b = create_inputs(x_value, w_value, b_value)
    y, _ = theano.scan(v2_step, sequences=[x], non_sequences=[w, b], strict=True)
    f = theano.function(inputs=[x, w, b], outputs=y)
    print(f(x_value, w_value, b_value))


def main():
    batch_size = 2
    input_size = 3
    hidden_size = 4
    theano.config.compute_test_value = 'raise'
    numpy.random.seed(1)
    x_value = numpy.random.standard_normal(size=(batch_size, input_size))
    w_value = numpy.random.standard_normal(size=(input_size, hidden_size))
    b_value = numpy.zeros((hidden_size,))
    v1(x_value, w_value, b_value)
    v2(x_value, w_value, b_value)


main()
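
For readers without a working Theano install, the relationship between v1 and v2 can be mimicked in NumPy. The sketch below is not Theano code: the Python loop only approximates what theano.scan does, x[np.newaxis, :] stands in for x.dimshuffle('x', 0), and the helper names step and scan_like are hypothetical.

```python
import numpy as np


def step(x, w, b):
    # x is one sample with shape (input_size,). Make it a 1-row matrix so
    # the dot product is matrix-matrix, add the bias, then take that row.
    return (x[np.newaxis, :].dot(w) + b)[0]


def scan_like(x_matrix, w, b):
    # Apply step to each row in turn and stack the results, roughly what
    # scan does with sequences=[x] and non_sequences=[w, b].
    return np.stack([step(row, w, b) for row in x_matrix])


np.random.seed(1)
x_value = np.random.standard_normal(size=(2, 3))
w_value = np.random.standard_normal(size=(3, 4))
b_value = np.zeros((4,))

batched = x_value.dot(w_value) + b_value        # counterpart of v1
looped = scan_like(x_value, w_value, b_value)   # counterpart of v2

# The row-by-row loop and the single matrix product agree.
print(np.allclose(batched, looped))
```

Because step is an ordinary function, it can be called on a single row to inspect its output shape, which is the same benefit the answer attributes to using a named step function with scan.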
