Softmax of array with zero and non-zero values results in array with only non-zero values

Question

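I have an array in which some values are zero and some are non-zero. I then apply softmax and want the non-zero values alone to sum to 1. Instead, after softmax every value is non-zero and all of them together sum to 1.

Here is what I am trying to do. I have some values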

score[0]

<tf.Tensor: shape=(1, 48), dtype=float32, numpy=
array([[ 2.405819  , 27.748499  , 16.080362  ,  8.780167  , 16.615538  ,
        19.353844  , 19.497992  , 16.051327  ,  5.4946175 , 15.927819  ,
        11.512515  , 19.716702  , 15.100697  , 26.370419  , 21.838608  ,
        10.650975  ,  9.212484  , 17.439907  , 14.322778  , 12.001259  ,
        10.433163  , 10.011807  , 15.847178  , 18.343014  , 26.086296  ,
        26.723047  , 17.28703   , -0.7059817 , 26.380203  , 21.49808   ,
        14.828656  , 13.711437  , 19.565845  ,  5.9418716 , 12.614753  ,
        29.56828   ,  1.1372657 , 25.873251  , 36.031494  , -7.397362  ,
        12.691793  ,  4.3349338 , 15.1586275 , 14.650254  , 14.632486  ,
        18.829857  , 21.885925  ,  0.56010276]], dtype=float32)>

and a mask

mask_test[0]

<tf.Tensor: shape=(1, 48), dtype=int32, numpy=
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 1, 1]])>

I multiply the values by the mask

score = tf.multiply(score, tf.cast(mask_test, tf.float32))
score[0]

<tf.Tensor: shape=(1, 48), dtype=float32, numpy=
array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        , -0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        , -0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        18.829857  , 21.885925  ,  0.56010276]], dtype=float32)>

This works fine. Now I want to apply a softmax so that all non-zero values sum to 1. The zeros should stay 0.

attention_weights = tf.nn.softmax(score, axis=-1)
attention_weights[0]

<tf.Tensor: shape=(1, 48), dtype=float32, numpy=
array([[2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
        2.9859784e-10, 4.4956207e-02, 9.5504379e-01, 5.2280064e-10]],
      dtype=float32)>

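The result contains only non-zero values. I guess this comes from the exponential inside softmax. Is there a way to achieve this with softmax, or is there another approach? The mask is not always the same.

Thanks in advance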

Answer 1

Softmax does not work that way. Take a look at the formula for softmax.
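For reference, this is the standard softmax definition over a vector x of length n, followed by what it implies for an entry that has been zeroed out. The symbols x, n, i and j are notation chosen here, not taken from the thread.

```latex
\operatorname{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}},
\qquad
x_i = 0 \;\Rightarrow\; \operatorname{softmax}(x)_i = \frac{e^{0}}{\sum_{j=1}^{n} e^{x_j}} = \frac{1}{\sum_{j=1}^{n} e^{x_j}} > 0
```

With the masked scores from the question, the denominator is roughly 45 + e^{18.829857} + e^{21.885925} + e^{0.56010276} ≈ 3.35 × 10^9, so each zeroed entry gets about 1 / (3.35 × 10^9) ≈ 2.99 × 10^-10, which matches the 2.9859784e-10 in the printed output.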

You need to define a custom function for this.

A simple way is:

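import numpy as np

def custom_soft_max(arr):
    arr = np.array(arr, dtype=np.float64)  # work on a float copy of the input
    non_zero = arr != 0                    # entries that take part in the softmax
    # Assumes at least one non-zero entry; subtract the max for stability.
    exps = np.exp(arr[non_zero] - arr[non_zero].max())
    arr[non_zero] = exps / exps.sum()      # zeros are left untouched
    return arr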

This excludes every index whose value is 0 and applies softmax only to the non-zero indices.
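A possible variant of the idea above, sketched in NumPy: it takes the mask explicitly instead of testing `arr != 0`, so a kept score that happens to be exactly 0 still gets a weight, and it normalizes along the last axis so batched input like the question's `(1, 48)` tensor works. The name `masked_softmax` and its parameters are hypothetical, and returning all zeros for a fully masked row is an assumption about the desired behavior.

```python
import numpy as np

def masked_softmax(scores, mask, axis=-1):
    """Softmax over entries where mask is non-zero; masked entries get exactly 0.

    Hypothetical helper: scores and mask are assumed to have the same shape.
    """
    scores = np.asarray(scores, dtype=np.float64)
    keep = np.asarray(mask).astype(bool)
    # Largest kept score per row, used to keep exp() from overflowing.
    row_max = np.max(np.where(keep, scores, -np.inf), axis=axis, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)  # fully masked row
    # exp() only where the mask keeps the entry; masked entries contribute 0.
    exps = np.where(keep, np.exp(np.where(keep, scores - row_max, 0.0)), 0.0)
    total = exps.sum(axis=axis, keepdims=True)
    # Assumption: rows with nothing kept stay all-zero instead of becoming NaN.
    return np.divide(exps, total, out=np.zeros_like(exps), where=total > 0)

scores = np.array([[1.0, 2.0, 3.0, 0.0]])
mask = np.array([[0, 1, 1, 1]])
weights = masked_softmax(scores, mask)
print(weights)  # first entry is exactly 0, the rest sum to 1
```

In this example the last score is 0.0 but is kept by the mask, so it receives a small positive weight, whereas a `!= 0` test would have dropped it.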

Answer 2

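There is no need for a custom softmax.

softmax() still processes the 0.0 values and returns the non-zero values that the math predicts: since exp(0) = 1, a zero input always receives a positive weight.

The only way to get a zero output from softmax() is to pass a very small (that is, very large negative) floating-point value. If you set the masked values to the smallest value float64 can represent, softmax() of those values will be zero.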

To get the machine limit of float64, use tf.float64.min, which equals -1.7976931348623157e+308.

Apply the following after tf.multiply() and before softmax. It changes the zeros to the float64 machine limit, and softmax then maps them to 0:

# Keep score where it is not 0, otherwise replace it with the machine limit
tf.where(score != 0, score, tf.float64.min)  # <----

Here, tf.float64.min gives the tf (and numpy) machine limit for float64.
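To illustrate the effect of this replacement, here is a small NumPy sketch rather than TensorFlow code. The `softmax` helper is a hypothetical stand-in for `tf.nn.softmax`, `np.where` plays the role of `tf.where`, and the three non-zero values are taken from the question. Since the question's `score` is float32, the sketch uses the float32 limit via `np.finfo(score.dtype).min`; that choice is an assumption, not what the answer wrote.

```python
import numpy as np

def softmax(x, axis=-1):
    # Hypothetical stand-in for tf.nn.softmax: shift by the max, then normalize.
    shifted = x - x.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)

# Masked scores as they look after the multiplication step (float32, as in the question).
score = np.array([[0.0, 0.0, 18.829857, 21.885925, 0.56010276]], dtype=np.float32)

# Assumption: use the limit of the array's own dtype instead of float64.
limit = np.finfo(score.dtype).min

# Keep score where it is not 0, otherwise replace it with the machine limit.
filled = np.where(score != 0, score, limit)

weights = softmax(filled)
print(weights)  # the two replaced entries come out as exactly 0
```

The exponential of the limit underflows to 0, so the replaced entries drop out of the sum and the remaining weights sum to 1. One caveat of the `score != 0` test is that a kept score that is exactly 0 would also be replaced, so testing the mask itself may be safer.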
