Trying to understand deep RNN weights
I would like to understand which weights are trained in an RNN. For a SimpleRNN with a single unit it is easy to follow: with an input shape of [50, 3] per sample, there are 3 weights to train (one per feature), plus a bias and a weight for the recurrent (hidden-state) input. But I have trouble understanding how the parameter count becomes 12, 21, 32 as the number of units increases. Thanks for your guidance.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

model = Sequential([
    SimpleRNN(1, return_sequences=False, input_shape=[50, 3]),  # 3 features, 1 unit for Wx and Wy
    Dense(1)
])
model.summary()

model2 = Sequential([
    SimpleRNN(2, return_sequences=False, input_shape=[50, 3]),
    Dense(1)  # the last recurrent layer does not need return_sequences
])
model2.summary()

model3 = Sequential([
    SimpleRNN(3, return_sequences=False, input_shape=[50, 3]),
    Dense(1)  # the last recurrent layer does not need return_sequences
])
model3.summary()

model4 = Sequential([
    SimpleRNN(4, return_sequences=False, input_shape=[50, 3]),
    Dense(1)  # the last recurrent layer does not need return_sequences
])
model4.summary()
Model: "sequential_20"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
simple_rnn_22 (SimpleRNN) (None, 1) 5
_________________________________________________________________
dense_18 (Dense) (None, 1) 2
=================================================================
Total params: 7
Trainable params: 7
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_21"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
simple_rnn_23 (SimpleRNN) (None, 2) 12
_________________________________________________________________
dense_19 (Dense) (None, 1) 3
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_22"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
simple_rnn_24 (SimpleRNN) (None, 3) 21
_________________________________________________________________
dense_20 (Dense) (None, 1) 4
=================================================================
Total params: 25
Trainable params: 25
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_23"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
simple_rnn_25 (SimpleRNN) (None, 4) 32
_________________________________________________________________
dense_21 (Dense) (None, 1) 5
=================================================================
Total params: 37
Trainable params: 37
Non-trainable params: 0
_________________________________________________________________
For your model2:
model2 = Sequential([
    SimpleRNN(2, return_sequences=False, input_shape=[50, 3]),
    Dense(1)  # the last recurrent layer does not need return_sequences
])
Each of the two neurons has 5 weights (3 for the input features and 2 for the recurrent connections to the hidden state), plus 1 bias. So each neuron has 6 parameters, and the total number of parameters is 6 * 2 = 12.
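As a quick check, assuming the model2 defined above, you can inspect the layer's weight arrays directly; a minimal sketch that accounts for all 12 parameters:

# Inspect the SimpleRNN weights of model2 (2 units, 3 input features).
kernel, recurrent_kernel, bias = model2.layers[0].get_weights()
print(kernel.shape)            # (3, 2): input features -> units, 6 weights
print(recurrent_kernel.shape)  # (2, 2): previous hidden state -> units, 4 weights
print(bias.shape)              # (2,):   one bias per unit, 2 weights
# 6 + 4 + 2 = 12 parameters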
The formula for your example is:

h * (3 + h) + h

where (3 + h) is the number of weights per neuron (3 input weights plus h recurrent weights) and the final h adds one bias per neuron.
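A minimal sketch that plugs h = 1, 2, 3, 4 into this formula reproduces the parameter counts from the summaries above:

# Parameter count of a SimpleRNN layer with h units and 3 input features.
for h in (1, 2, 3, 4):
    print(h, h * (3 + h) + h)  # prints 5, 12, 21, 32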