Dimensions of 2 hidden layer neural network not correlating

I am attempting to implement a 2-layer neural network here using numpy alone. The code below only computes the forward propagation.

The training data is two examples, where the input is 5-dimensional and the output is 4-dimensional. When I try to run my network:

# Two Layer Neural network

import numpy as np

M = 2
learning_rate = 0.0001

X_train = np.asarray([[1,1,1,1,1] , [1,1,1,1,1]])
Y_train = np.asarray([[0,0,0,0] , [1,0,0,0]])

X_trainT = X_train.T
Y_trainT = Y_train.T

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))  
    return s

w1=np.zeros((Y_trainT.shape[0], X_trainT.shape[0]))
b1=np.zeros((Y_trainT.shape[0], 1))
A1 = sigmoid(np.dot(w1 , X_trainT))

w2=np.zeros((A1.shape[0], w1.shape[0]))
b2=np.zeros((A1.shape[0], 1))
A2 = sigmoid(np.dot(w2 , A1))

# forward propagation

dw1 =  ( 1 / M ) * np.dot((A1 - A2) , X_trainT.T / M)
db1 =  (A1 - A2).mean(axis=1, keepdims=True)
w1 = w1 - learning_rate * dw1
b1 = b1 - learning_rate * db1

dw2 =  ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)
db2 =  (A2 - Y_trainT).mean(axis=1, keepdims=True)
w2 = w2 - learning_rate * dw2
b2 = b2 - learning_rate * db2

Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)
print(Y_prediction_train.T)

I get the error:

ValueError                                Traceback (most recent call last)
<ipython-input-42-f0462b5940a4> in <module>()
     36 b2 = b2 - learning_rate * db2
     37 
---> 38 Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)
     39 print(Y_prediction_train.T)

ValueError: shapes (4,4) and (2,5) not aligned: 4 (dim 1) != 2 (dim 0)

My linear algebra seems to have gone astray somewhere, but I'm not sure where.

Printing the weights and the corresponding derivatives:

print(w1.shape)
print(w2.shape)
print(dw1.shape)
print(dw2.shape)

prints:

(4, 5)
(4, 4)
(4, 5)
(4, 4)

How can I incorporate the 5-dimensional training examples into this network?

Have I implemented forward propagation correctly?

Following @Imran's answer, now using this network:

# Two Layer Neural network

import numpy as np

M = 2
learning_rate = 0.0001

X_train = np.asarray([[1,0,1,1,1] , [1,1,1,1,1]])
Y_train = np.asarray([[0,1,0,0] , [1,0,0,0]])

X_trainT = X_train.T
Y_trainT = Y_train.T

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))  
    return s

w1=np.zeros((Y_trainT.shape[0], X_trainT.shape[0]))
b1=np.zeros((Y_trainT.shape[0], 1))
A1 = sigmoid(np.dot(w1 , X_trainT))

w2=np.zeros((A1.shape[0], w1.shape[0]))
b2=np.zeros((A1.shape[0], 1))
A2 = sigmoid(np.dot(w2 , A1))

# forward propagation

dw1 =  ( 1 / M ) * np.dot((A1 - A2) , X_trainT.T / M)
db1 =  (A1 - A2).mean(axis=1, keepdims=True)
w1 = w1 - learning_rate * dw1
b1 = b1 - learning_rate * db1

dw2 =  ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M)
db2 =  (A2 - Y_trainT).mean(axis=1, keepdims=True)
w2 = w2 - learning_rate * dw2
b2 = b2 - learning_rate * db2

Y_prediction_train = sigmoid(np.dot(w2 , A1) +b2)
print(Y_prediction_train.T)

prints:

[[ 0.5        0.5        0.4999875  0.4999875]
 [ 0.5        0.5        0.4999875  0.4999875]]

I think dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M) should instead be dw2 = ( 1 / M ) * np.dot((A2 - A1) , A1.T / M) in order to propagate the differences from hidden layer 1 to hidden layer 2. Is this correct?

Y_prediction_train = sigmoid(np.dot(w2 , X_train) +b2)

w2 is the weight matrix for your second hidden layer. It should never be multiplied by your input, X_train.

To get a prediction, you need to break your forward propagation out into a function that accepts an input X, first computes A1 = sigmoid(np.dot(w1 , X)), and then returns the result of A2 = sigmoid(np.dot(w2 , A1) + b2).
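A minimal sketch of such a function, using your variable names (the function name forward is my own, and I have included the biases b1 and b2, which your current forward pass omits):

def forward(X, w1, b1, w2, b2):
    # X has shape (5, M): one column per training example, like X_trainT
    A1 = sigmoid(np.dot(w1, X) + b1)   # layer 1 activations, shape (4, M)
    A2 = sigmoid(np.dot(w2, A1) + b2)  # layer 2 consumes A1, never X
    return A2

Y_prediction_train = forward(X_trainT, w1, b1, w2, b2)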

UPDATE:

I think dw2 = ( 1 / M ) * np.dot((A2 - A1) , Y_trainT.T / M) should instead be dw2 = ( 1 / M ) * np.dot((A2 - A1) , A1.T / M) in order to propagate the differences from hidden layer 1 to hidden layer 2. Is this correct?

Backpropagation propagates errors backwards. The first step is to compute the gradient of your loss function with respect to your output; if you are using mean squared error, that will be A2 - Y. This is then fed into the terms for the gradients of the loss with respect to your layer-2 weights and biases, and so on back to layer 1. You do not want to propagate anything from layer 1 to layer 2 during backpropagation.

It looks like you almost have it right in your updated question, but I think you want:

dW2 = ( 1 / M ) * np.dot((A2 - Y) , A1.T)
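For completeness, here is a sketch of what the full backward pass could look like with your variable names, taking the output-layer error to be A2 - Y_trainT as above and applying the chain rule (with the sigmoid derivative A1 * (1 - A1)) to get back to layer 1; treat this as a sketch, not a drop-in fix:

dZ2 = A2 - Y_trainT                      # error at the output layer
dW2 = (1 / M) * np.dot(dZ2, A1.T)        # layer-2 weight gradient, shape (4, 4)
db2 = dZ2.mean(axis=1, keepdims=True)    # layer-2 bias gradient
dZ1 = np.dot(w2.T, dZ2) * A1 * (1 - A1)  # error pushed back through w2 and the sigmoid
dW1 = (1 / M) * np.dot(dZ1, X_trainT.T)  # layer-1 weight gradient, shape (4, 5)
db1 = dZ1.mean(axis=1, keepdims=True)    # layer-1 bias gradient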

Some more notes:

  1. You are initializing your weights to zeros. This will not allow the neural network to break symmetry during training, and you will end up with the same weights in every neuron. You should try initializing with random weights in the range [-1,1].
  2. You should put your forward and backpropagation steps in a loop, so that you can run them for multiple epochs while your error is still improving (a sketch combining both notes follows this list).
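Putting both notes together, a minimal training-loop sketch might look like this (the epoch count, seed, and uniform initialization range are my own choices):

rng = np.random.RandomState(0)       # arbitrary seed, for reproducibility
w1 = rng.uniform(-1, 1, (4, 5))      # random weights in [-1, 1] break symmetry
b1 = np.zeros((4, 1))
w2 = rng.uniform(-1, 1, (4, 4))
b2 = np.zeros((4, 1))

for epoch in range(1000):            # multiple epochs instead of a single pass
    # forward propagation
    A1 = sigmoid(np.dot(w1, X_trainT) + b1)
    A2 = sigmoid(np.dot(w2, A1) + b2)
    # backpropagation, as sketched above
    dZ2 = A2 - Y_trainT
    dW2 = (1 / M) * np.dot(dZ2, A1.T)
    db2 = dZ2.mean(axis=1, keepdims=True)
    dZ1 = np.dot(w2.T, dZ2) * A1 * (1 - A1)
    dW1 = (1 / M) * np.dot(dZ1, X_trainT.T)
    db1 = dZ1.mean(axis=1, keepdims=True)
    # gradient descent updates
    w1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    w2 -= learning_rate * dW2
    b2 -= learning_rate * db2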