What exactly does tf.expand_dims do to a vector, and why can the results be added together even when the matrix shapes are different?

I added together two vectors that I thought had merely been 'reshaped', and got a two-dimensional matrix. I expected some kind of error here, but didn't get one. I think I understand what is happening: it treats each vector as if it had extra copies stacked in the horizontal and vertical directions. But I don't understand why the results for a and b aren't different. If they weren't meant to work together, why does this work?

import tensorflow as tf
import numpy as np

start_vec = np.array((83, 69, 45))   # shape (3,)
a = tf.expand_dims(start_vec, 0)     # shape (1, 3)
b = tf.expand_dims(start_vec, 1)     # shape (3, 1)
ab_sum = a + b                       # result has shape (3, 3)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    a = sess.run(a)
    b = sess.run(b)
    ab_sum = sess.run(ab_sum)

print(a)
print(b)
print(ab_sum)

Output:

[[83 69 45]]

[[83]
 [69]
 [45]]

[[166 152 128]
 [152 138 114]
 [128 114  90]]

This question really comes down to TensorFlow's broadcasting feature, which works the same way as NumPy broadcasting. Broadcasting removes the requirement that tensors must have identical shapes in order to be combined in an operation. Of course, certain conditions still have to be met.

General Broadcasting Rules:

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when

1. they are equal, or

2. one of them is 1
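As a quick illustration of these rules before the TensorFlow examples below, here is a minimal NumPy sketch (the shapes are chosen purely for demonstration):

import numpy as np

# Compatible: the paired dims are 1 vs 4 and 3 vs 1 -- each pair contains a 1
x = np.ones((3, 1))
y = np.ones((1, 4))
print((x + y).shape)            # (3, 4)

# Incompatible: trailing dims 3 vs 3 match, but 2 vs 4 are neither equal nor 1
z = np.ones((2, 3))
w = np.ones((4, 3))
try:
    z + w
except ValueError as e:
    print(e)                    # operands could not be broadcast together ...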

A simple example is multiplying a one-dimensional tensor by a scalar.

import tensorflow as tf

start_vec = tf.constant((83, 69, 45))   # shape (3,)
b = start_vec * 2                       # the scalar is broadcast across all three elements

with tf.Session() as sess:
    print(sess.run(b))

[166 138  90]

Back to the question: tf.expand_dims() inserts a dimension of size 1 into a tensor at the specified axis. Your original data has shape (3,). With axis=0, a = tf.expand_dims(start_vec, 0) has shape (1, 3); with axis=1, b = tf.expand_dims(start_vec, 1) has shape (3, 1).
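Since these shapes are known statically, you can verify them without even running a session (a minimal sketch):

import tensorflow as tf
import numpy as np

start_vec = np.array((83, 69, 45))
print(tf.expand_dims(start_vec, 0).shape)   # (1, 3)
print(tf.expand_dims(start_vec, 1).shape)   # (3, 1)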

Comparing these shapes, (1, 3) and (3, 1), against the broadcasting rules above, you can see that the second condition is satisfied in each dimension. So the operation actually performed is

a broadcast to (3, 3)     b broadcast to (3, 3)     a + b

[[83 69 45]               [[83 83 83]               [[166 152 128]
 [83 69 45]        +       [69 69 69]        =       [152 138 114]
 [83 69 45]]               [45 45 45]]               [128 114  90]]
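To make the implicit expansion concrete, here is a minimal NumPy sketch (NumPy follows the same rules) that materializes the broadcast copies with np.broadcast_to and checks that the sum matches:

import numpy as np

start_vec = np.array((83, 69, 45))
a = np.expand_dims(start_vec, 0)            # shape (1, 3)
b = np.expand_dims(start_vec, 1)            # shape (3, 1)

# Materialize the copies that broadcasting creates implicitly
a_full = np.broadcast_to(a, (3, 3))         # each row is [83 69 45]
b_full = np.broadcast_to(b, (3, 3))         # each column is [83 69 45]

print(np.array_equal(a + b, a_full + b_full))   # True
print(a + b)                                    # same (3, 3) result as ab_sum above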