How does theano's scan function work?

Question

Look at this code:

import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
y = T.dvector('y')

def fun(x,a):
    return x+a

results, updates = theano.scan(fn=fun,sequences=dict(input=x), outputs_info=dict(initial=y, taps=[-3]))

h = [10.,20,30,40,50,60,70]
f = theano.function([x, y], results)
g = theano.function([y], y)

print(f([1],h))

I changed the taps in outputs_info to -2, -3, and so on, but the result stays the same, [11.0]. I don't understand why. Can someone explain?

Another question.

import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
y = T.dvector('y')

def fun(x,a,b):
    return x+a+b

results, updates = theano.scan(fn=fun,sequences=dict(input=x), outputs_info=dict(initial=y, taps=[-5,-3]))

h = [10.,20,30,40,50,60,70]
f = theano.function([x, y], results)
g = theano.function([y], y)

print(f([1,2,3,4],h))

The output is [41, 62, 83, 85]. Where does the 85 come from?

Answer 1

Consider this variation of your code:

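import theano
import theano.tensor as T

x = T.dvector('x')
y = T.dvector('y')

# a (the first tap) is deliberately ignored, so the result shows which element b (the second tap) picks up
def fun(x, a, b):
    return x + b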

results, updates = theano.scan(
    fn=fun,
    sequences=dict(input=x), 
    outputs_info=dict(initial=y, taps=[-5,-3])
)

h = [10.,20,30,40,50,60,70]
f = theano.function([x, y], results)
g = theano.function([y], y)

print(f([1],h))

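Your result will be 31.

Change the taps to [-5, -2] and your result changes to 41.
Change the taps to [-4, -3] and your result changes to 21.

This demonstrates how things work:

The most negative number in taps is treated as h[0].
All other taps are offset from that one.

So with taps [-5, -2], the inputs a and b of fun are 10 and 40 respectively.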
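
The offset rule above can be sketched in plain Python, without Theano. The helper name `first_step_inputs` is hypothetical and introduced only for illustration; it mimics the behaviour described in this answer and makes no claim about how Theano implements it internally.

```python
def first_step_inputs(initial, taps):
    """Return the elements of `initial` that the step function sees on the first iteration."""
    most_negative = min(taps)
    # The most negative tap maps to initial[0]; every other tap is offset from it.
    return [initial[tap - most_negative] for tap in taps]

h = [10., 20., 30., 40., 50., 60., 70.]

for taps in ([-5, -3], [-5, -2], [-4, -3]):
    a, b = first_step_inputs(h, taps)
    # Same arithmetic as fun(x, a, b) = x + b with x = 1
    print(taps, "a =", a, "b =", b, "x + b =", 1 + b)
```

With a single tap such as [-3] or [-2], that tap is also the most negative one, so under this rule it always maps to h[0] = 10. That would explain why the first snippet in the question keeps returning 11 no matter which single tap is used.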

Update for the new question

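taps actually says that the function at time t depends on the output of the function at time t - taps. Because each tap k is negative, the function at time t receives the output from time t + k, that is, |k| steps earlier.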

For example, the Fibonacci sequence is defined by the function

$F_n = F_{n-1} + F_{n-2}$

Here is how you would implement the Fibonacci sequence using theano.scan:

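import numpy as np
import theano
import theano.tensor as T

x = T.ivector('x')
y = T.ivector('y')

# x only sets the number of steps; a and b are the outputs at taps -2 and -1
def fibonacci(x, a, b):
    return a + b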

results, _ = theano.scan(
    fn=fibonacci,
    sequences=dict(input=x), 
    outputs_info=dict(initial=y, taps=[-2,-1])
    )

h = [1,1]
f = theano.function([x, y], results)

print(np.append(h, f(range(10),h)))
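
For comparison, the same recurrence can be written as an ordinary Python loop. This is only a sketch of what the scan above is expected to compute, assuming it produces ten new terms from the two seed values; `fibonacci_terms` is a hypothetical helper name, not part of Theano.

```python
def fibonacci_terms(initial, n_steps):
    """Extend `initial` by n_steps terms, each the sum of the two terms before it."""
    seq = list(initial)
    for _ in range(n_steps):
        # seq[-2] and seq[-1] play the roles of taps -2 and -1
        seq.append(seq[-2] + seq[-1])
    return seq

print(fibonacci_terms([1, 1], 10))
# prints [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
```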

However, theano.scan has a problem: if the function depends on prior outputs, what do you use as the prior output for the first iteration?

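The answer is the initial input, h in your case. But your h is longer than it needs to be: it only needs 5 elements, because the largest tap in your case is -5. Once those 5 elements of h have been used, your function switches to the actual outputs of the function.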

Here is a simplified trace of what happens in your code:

output[0] = x[0] + h[0] + h[2] = 41
output[1] = x[1] + h[1] + h[3] = 62
output[2] = x[2] + h[2] + h[4] = 83
output[3] = x[3] + h[3] + output[0] = 85

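You can see that at time = 4 we already have the function's output from time 4 - 3 = 1, and that output is 41. Since that output exists, we have to use it, because the function is defined in terms of its previous outputs. So the rest of h is simply ignored.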
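
The trace above can be reproduced with a small pure-Python imitation of the scan loop. This is a sketch under the assumptions stated in this answer (only the first 5 values of h seed the history, and real outputs are appended after them); `simulate_scan` is a hypothetical name, not a Theano function.

```python
def simulate_scan(fn, xs, initial, taps):
    """Pure-Python imitation of scan with one recurrent output and negative taps."""
    depth = -min(taps)               # how far back the deepest tap reaches
    history = list(initial[:depth])  # only the first `depth` initial values are used
    for x in xs:
        # tap k (negative) reads the value |k| steps before the one being computed
        past = [history[len(history) + k] for k in taps]
        history.append(fn(x, *past))
    return history[depth:]           # drop the seed values, keep the real outputs

h = [10., 20., 30., 40., 50., 60., 70.]
result = simulate_scan(lambda x, a, b: x + a + b, [1, 2, 3, 4], h, [-5, -3])
print(result)
# prints [41.0, 62.0, 83.0, 85.0]
```

Passing h[:5] instead of h gives the same result in this sketch, which matches the point that the last two elements of h (60 and 70) are never read.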
