Why does this Theano code run successfully without any errors?
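I borrowed the following code from an online tutorial. In the main block of the code I see this line:

    c = broadcasted_add(a, b)

It adds tensor 'a', of shape (2,1,2,2), to tensor 'b', of shape (2,2,2,2). How can the addition work correctly even though we declared broadcastable as False in make_tensor? Shouldn't we have to declare broadcastable as True so that tensors of different shapes can be added? Shouldn't it raise an error saying the dimensions don't match? Is my understanding of broadcasting wrong?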

    import numpy as np
    from theano import function
    import theano.tensor as T


    def make_tensor(dim):
        """
        Returns a new Theano tensor with no broadcastable dimensions.
        dim: the total number of dimensions of the tensor.
        """
        return T.TensorType(broadcastable=tuple([False] * dim), dtype='float32')()


    def broadcasted_add(a, b):
        """
        a: a 3D theano tensor
        b: a 4D theano tensor
        Returns c, a 4D theano tensor, where
        c[i, j, k, l] = a[l, k, i] + b[i, j, k, l]
        for all i, j, k, l
        """
        # 'x' inserts a new axis at position 1; the other axes of a are reversed.
        return a.dimshuffle(2, 'x', 1, 0) + b


    def partial_max(a):
        """
        a: a 4D theano tensor
        Returns b, a theano matrix, where
        b[i, j] = max_{k,l} a[i, k, l, j]
        for all i, j
        """
        return a.max(axis=(1, 2))


    if __name__ == "__main__":
        a = make_tensor(3)
        b = make_tensor(4)
        c = broadcasted_add(a, b)
        d = partial_max(c)
        f = function([a, b], d)
        rng = np.random.RandomState([1, 2, 3])
        a_value = rng.randn(2, 2, 2).astype(a.dtype)
        b_value = rng.rand(2, 2, 2, 2).astype(b.dtype)
        # NumPy reference: same axis permutation, with None adding the new axis.
        c_value = np.transpose(a_value, (2, 1, 0))[:, None, :, :] + b_value
        expected = c_value.max(axis=1).max(axis=1)
        actual = f(a_value, b_value)
        assert np.allclose(actual, expected), (actual, expected)
        print("SUCCESS!")

This works because a new dimension that dimshuffle adds via the 'x' argument is always broadcastable.
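
To make that rule concrete, here is a minimal pure-Python sketch of how the broadcastable pattern could be derived for a dimshuffle result. The helper `dimshuffle_pattern` is hypothetical and written only for illustration; it is not part of Theano's API and only models the behaviour described above.

```python
def dimshuffle_pattern(broadcastable, order):
    """Hypothetical helper (not a Theano function): model the broadcastable
    pattern of the tensor that a dimshuffle with this `order` would return."""
    result = []
    for entry in order:
        if entry == 'x':
            # A newly inserted axis always has length 1, so it is broadcastable.
            result.append(True)
        else:
            # An existing axis keeps the flag it was declared with.
            result.append(broadcastable[entry])
    return tuple(result)


# Pattern that make_tensor(3) declares for a: no broadcastable dimensions.
a_pattern = (False, False, False)

shuffled = dimshuffle_pattern(a_pattern, (2, 'x', 1, 0))
print(shuffled)  # (False, True, False, False)
```

With Theano installed, inspecting `a.dimshuffle(2, 'x', 1, 0).broadcastable` should show the same tuple: only the axis created by 'x' is flagged, regardless of what make_tensor declared.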
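Note that in broadcasted_add, the only dimension that needs to be broadcast is the one dimshuffle adds to a. None of the other dimensions need broadcasting, because their sizes already match those of b.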
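
The same point can be checked with plain NumPy, which the question's script already uses to compute its expected value. The sketch below is only an illustration: the shapes (3, 4, 5) and (5, 6, 4, 3) are assumptions chosen so that every axis has a distinct size, and they do not come from the original post.

```python
import numpy as np

rng = np.random.RandomState(0)
# Illustrative shapes (assumed, not from the post): distinct sizes per axis.
a_value = rng.randn(3, 4, 5).astype('float32')
b_value = rng.rand(5, 6, 4, 3).astype('float32')

# NumPy counterpart of a.dimshuffle(2, 'x', 1, 0):
# reverse the axes, then insert a length-1 axis at position 1.
a_shuffled = np.transpose(a_value, (2, 1, 0))[:, None, :, :]
print(a_shuffled.shape)  # (5, 1, 4, 3)

# Axes on which the two operands disagree in size.
mismatched = [axis
              for axis, (m, n) in enumerate(zip(a_shuffled.shape, b_value.shape))
              if m != n]
print(mismatched)  # [1]

# Only axis 1 is stretched; every other axis is added element by element.
c_value = a_shuffled + b_value
i, j, k, l = 4, 5, 3, 2
assert np.isclose(c_value[i, j, k, l], a_value[l, k, i] + b_value[i, j, k, l])
```

One caveat with this analogy: NumPy stretches any length-1 axis at run time, whereas Theano, as far as I understand it, relies on the broadcastable pattern fixed when the graph is built. That is why the flag on the 'x' axis matters and the False flags on the other axes do not get in the way here.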