Deeplearning4j - 3 layer neural network not fitting correctly
I am trying to learn the Deeplearning4j library. I am trying to implement a simple 3-layer neural network using sigmoid activations to solve XOR. What configuration or hyperparameters am I missing? I have managed to get accurate output using RELU activations with a softmax output, based on some MLP examples I found online, but with sigmoid activations the network does not seem to want to fit accurately. Can anyone share why my network is not producing the correct output?
DenseLayer inputLayer = new DenseLayer.Builder()
.nIn(2)
.nOut(3)
.name("Input")
.weightInit(WeightInit.ZERO)
.build();
DenseLayer hiddenLayer = new DenseLayer.Builder()
.nIn(3)
.nOut(3)
.name("Hidden")
.activation(Activation.SIGMOID)
.weightInit(WeightInit.ZERO)
.build();
OutputLayer outputLayer = new OutputLayer.Builder()
.nIn(3)
.nOut(1)
.name("Output")
.activation(Activation.SIGMOID)
.weightInit(WeightInit.ZERO)
.lossFunction(LossFunction.MEAN_SQUARED_LOGARITHMIC_ERROR)
.build();
NeuralNetConfiguration.Builder nncBuilder = new NeuralNetConfiguration.Builder();
nncBuilder.iterations(10000);
nncBuilder.learningRate(0.01);
nncBuilder.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT);
NeuralNetConfiguration.ListBuilder listBuilder = nncBuilder.list();
listBuilder.layer(0, inputLayer);
listBuilder.layer(1, hiddenLayer);
listBuilder.layer(2, outputLayer);
listBuilder.backprop(true);
MultiLayerNetwork myNetwork = new MultiLayerNetwork(listBuilder.build());
myNetwork.init();
INDArray trainingInputs = Nd4j.zeros(4, inputLayer.getNIn());
INDArray trainingOutputs = Nd4j.zeros(4, outputLayer.getNOut());
// If 0,0 show 0
trainingInputs.putScalar(new int[]{0,0}, 0);
trainingInputs.putScalar(new int[]{0,1}, 0);
trainingOutputs.putScalar(new int[]{0,0}, 0);
// If 0,1 show 1
trainingInputs.putScalar(new int[]{1,0}, 0);
trainingInputs.putScalar(new int[]{1,1}, 1);
trainingOutputs.putScalar(new int[]{1,0}, 1);
// If 1,0 show 1
trainingInputs.putScalar(new int[]{2,0}, 1);
trainingInputs.putScalar(new int[]{2,1}, 0);
trainingOutputs.putScalar(new int[]{2,0}, 1);
// If 1,1 show 0
trainingInputs.putScalar(new int[]{3,0}, 1);
trainingInputs.putScalar(new int[]{3,1}, 1);
trainingOutputs.putScalar(new int[]{3,0}, 0);
DataSet myData = new DataSet(trainingInputs, trainingOutputs);
myNetwork.fit(myData);
INDArray actualInput = Nd4j.zeros(1,2);
actualInput.putScalar(new int[]{0,0}, 0);
actualInput.putScalar(new int[]{0,1}, 0);
INDArray actualOutput = myNetwork.output(actualInput);
System.out.println("myNetwork Output " + actualOutput);
//Output is producing 1.00. Should be 0.0
So in general, I will link you to:
https://deeplearning4j.org/troubleshootingneuralnets
for some specific tips. Never use zero weight initialization; there is a reason we do not use it in the examples (and I would strongly suggest starting from the examples and tweaking from there, rather than starting from scratch):
https://github.com/deeplearning4j/dl4j-examples
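To see why zero weight initialization is fatal here: hidden units that start with identical weights compute identical activations, receive identical gradients, and therefore can never differentiate from one another. The following is a minimal plain-Java illustration of this symmetry problem (no DL4J; a toy squared-error gradient on a single input, not the library's actual update rule):

```java
import java.util.Arrays;

public class ZeroInitSymmetry {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // Train two hidden units that start with identical (zero) weights on a
    // toy one-example objective; return true if their weights are still
    // identical afterwards, i.e. symmetry was never broken.
    static boolean staysSymmetric() {
        double[] w1 = {0.0, 0.0}; // weights into hidden unit 1
        double[] w2 = {0.0, 0.0}; // weights into hidden unit 2
        double[] x  = {1.0, 1.0}; // one XOR input, target 0
        for (int step = 0; step < 1000; step++) {
            double h1 = sigmoid(w1[0] * x[0] + w1[1] * x[1]);
            double h2 = sigmoid(w2[0] * x[0] + w2[1] * x[1]);
            // per-unit gradient of squared error vs. target 0.0;
            // h1 == h2 at every step, so g1 == g2 at every step
            double g1 = (h1 - 0.0) * h1 * (1 - h1);
            double g2 = (h2 - 0.0) * h2 * (1 - h2);
            for (int i = 0; i < 2; i++) {
                w1[i] -= 0.1 * g1 * x[i];
                w2[i] -= 0.1 * g2 * x[i];
            }
        }
        return Arrays.equals(w1, w2);
    }

    public static void main(String[] args) {
        System.out.println(staysSymmetric()); // prints true
    }
}
```

With both units locked together, the network effectively has one hidden unit, which is not enough to represent XOR; any non-zero random init (e.g. XAVIER) breaks the tie.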
For the output layer, if you want to learn XOR, why not just use binary cross-entropy (XENT):
https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/feedforward/xor/XorExample.java
Note here that the example also turns minibatch off (see the example above); see:
https://deeplearning4j.org/toyproblems
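Putting those tips together, a corrected configuration might look roughly like the sketch below. This is an untested sketch against the same 0.x-era DL4J API the question uses (iterations/learningRate/backprop): it swaps zero init for XAVIER, the loss for XENT, disables minibatch, and collapses the network to the two-layer shape of the linked XorExample. Layer sizes and the learning rate are illustrative choices, not values from the source.

```java
// Sketch only, assuming the same 0.x-era DL4J API as the question's code.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(123)                      // reproducible runs
        .iterations(10000)
        .learningRate(0.1)
        .miniBatch(false)               // tiny dataset: use all 4 rows per step
        .weightInit(WeightInit.XAVIER)  // anything but ZERO, to break symmetry
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .list()
        .layer(0, new DenseLayer.Builder().nIn(2).nOut(4)
                .activation(Activation.SIGMOID).build())
        .layer(1, new OutputLayer.Builder(LossFunction.XENT) // binary cross-entropy
                .nIn(4).nOut(1)
                .activation(Activation.SIGMOID).build())
        .backprop(true)
        .build();
MultiLayerNetwork myNetwork = new MultiLayerNetwork(conf);
myNetwork.init();
```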