Details about alpha in tf.nn.leaky_relu( features, alpha=0.2, name=None )

I am trying to use leaky_relu as the activation function for my hidden layers. The parameter alpha is documented as:

slope of the activation function at x < 0

What does this mean? How do different values of alpha affect the model's results?
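For context, a minimal sketch of the kind of usage described above, assuming a small tf.keras model; the layer widths and the alpha value of 0.1 are placeholders for illustration only:

```python
import tensorflow as tf

# Minimal sketch: leaky_relu used as the hidden-layer activation.
# Layer widths and alpha=0.1 are illustrative placeholders.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation=lambda x: tf.nn.leaky_relu(x, alpha=0.1)),
    tf.keras.layers.Dense(1),
])
model.summary()
```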

In-depth explanations of ReLU and its variants can be found at the following links:

  1. https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/
  2. https://medium.com/@himanshuxd/activation-functions-sigmoid-relu-leaky-relu-and-softmax-basics-for-neural-networks-and-deep-8d9c70eed91e

With a regular ReLU, the main drawback is that the input to the activation can become negative as a result of the operations performed in the network, which leads to the so-called "dying ReLU" problem:

the gradient is 0 whenever the unit is not active. This could lead to cases where a unit never activates as a gradient-based optimization algorithm will not adjust the weights of a unit that never activates initially. Further, like the vanishing gradients problem, we might expect learning to be slow when training ReLU networks with constant 0 gradients.
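A short sketch of that symptom, assuming TensorFlow 2.x and tf.GradientTape: for negative inputs the gradient of tf.nn.relu is exactly zero, so a gradient-based optimizer has nothing to propagate through those units.

```python
import tensorflow as tf

# Dying-ReLU symptom: the gradient of ReLU is exactly 0 for negative inputs,
# so gradient descent never adjusts the weights feeding a unit stuck below zero.
x = tf.Variable([-2.0, -0.5, 0.5, 2.0])
with tf.GradientTape() as tape:
    y = tf.nn.relu(x)
print(tape.gradient(y, x).numpy())  # [0. 0. 1. 1.]
```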

So Leaky ReLU replaces that zero slope with some small value such as 0.001 (the "alpha" parameter). With leaky ReLU the function becomes f(x) = max(0.001x, x). The gradient on the negative side is now the non-zero value 0.001, so the unit keeps learning instead of hitting a dead end.
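A sketch, again assuming TensorFlow 2.x, comparing the gradient of tf.nn.leaky_relu for a few alpha values; for x < 0 the gradient equals alpha, which is exactly the "slope of the activation function at x < 0" from the docstring:

```python
import tensorflow as tf

# For x < 0 the gradient of leaky_relu equals alpha; for x > 0 it is 1.
x = tf.Variable([-2.0, -0.5, 0.5, 2.0])
for alpha in (0.001, 0.01, 0.2):
    with tf.GradientTape() as tape:
        y = tf.nn.leaky_relu(x, alpha=alpha)
    print(alpha, tape.gradient(y, x).numpy())
# 0.001 [0.001 0.001 1.    1.   ]
# 0.01  [0.01  0.01  1.    1.   ]
# 0.2   [0.2   0.2   1.    1.   ]
```

In practice, alpha controls how much signal leaks through for negative inputs: alpha = 0 recovers plain ReLU, while values close to 1 make the unit nearly linear. Commonly used values are small, roughly 0.01 to 0.3 (tf.nn.leaky_relu defaults to 0.2).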