The XOR neural network written in DL4J does not work
I started studying neural networks with the DL4J framework, beginning with XOR training. But no matter what I do, I get wrong results.
MultiLayerConfiguration networkConfiguration = new NeuralNetConfiguration.Builder()
        .weightInit(WeightInit.SIGMOID_UNIFORM)
        .list()
        .layer(new DenseLayer.Builder()
                .nIn(2).nOut(2)
                .activation(Activation.SIGMOID)
                .build())
        .layer(new DenseLayer.Builder()
                .nIn(2).nOut(2)
                .activation(Activation.SIGMOID)
                .build())
        .layer(new OutputLayer.Builder()
                .nIn(2).nOut(1)
                .activation(Activation.SIGMOID)
                .lossFunction(LossFunctions.LossFunction.XENT)
                .build())
        .build();

MultiLayerNetwork network = new MultiLayerNetwork(networkConfiguration);
network.setListeners(new ScoreIterationListener(1));
network.init();

// XOR truth table inputs
INDArray input = Nd4j.createFromArray(new double[][]{{0, 1}, {0, 0}, {1, 0}, {1, 1}});
// Expected XOR outputs, computed with the ^ operator
INDArray output = Nd4j.createFromArray(new double[][]{{0 ^ 1}, {0 ^ 0}, {1 ^ 0}, {1 ^ 1}});
// INDArray output = Nd4j.createFromArray(new double[]{0^1,0^0,1^1,1^0});
// DataSet dataSet = new org.nd4j.linalg.dataset.DataSet(input,output);

for (int i = 0; i < 10000; i++) {
    network.fit(input, output);
}

INDArray res = network.output(input, false);
System.out.print(res);
Training results:
[[0.5748],
[0.5568],
[0.4497],
[0.4533]]
This looks like an old example. Where did you get it from? Note that the project does not endorse or support random examples people pull from elsewhere. If this comes from a book, be aware that those examples are several years old at this point and should not be used.

This configuration suffers from what I like to call "toy problem syndrome". DL4J assumes minibatches by default, and therefore by default scales the learning updates relative to the minibatch size of the input examples. 99% of the problems you would set up for real-world work are configured this way.

This means that when you fit on a whole in-memory toy dataset, every step the network takes is not actually the full step it would otherwise take. Our latest examples handle this by turning minibatch mode off for such cases:
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .updater(new Sgd(0.1))
        .seed(seed)
        .biasInit(0) // initialize the bias with 0 - an empirical value, too
        // The networks can process the input more quickly and more accurately by ingesting
        // minibatches of 5-10 elements at a time in parallel.
        // This example runs better without minibatches, because the dataset is smaller than the minibatch size
        .miniBatch(false)
        .list()
        .layer(new DenseLayer.Builder()
                .nIn(2)
                .nOut(4)
                .activation(Activation.SIGMOID)
                // randomly initialize weights with values between 0 and 1
                .weightInit(new UniformDistribution(0, 1))
                .build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .nOut(2)
                .activation(Activation.SOFTMAX)
                .weightInit(new UniformDistribution(0, 1))
                .build())
        .build();
Note the miniBatch(false) call in the configuration.
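For reference, here is a minimal training sketch built around the configuration above; the one-hot label layout, epoch count, and listener frequency are my assumptions rather than part of the original answer. Because the output layer uses SOFTMAX with nOut(2), the XOR targets have to be one-hot encoded as two columns instead of the single 0/1 column used in the question:

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;

// ...

MultiLayerNetwork net = new MultiLayerNetwork(conf); // conf from the block above
net.init();
net.setListeners(new ScoreIterationListener(100));

// XOR truth table inputs, same as in the question
INDArray input = Nd4j.createFromArray(new double[][]{{0, 1}, {0, 0}, {1, 0}, {1, 1}});
// One-hot targets for the two-unit softmax output:
// column 0 = "XOR is 0", column 1 = "XOR is 1"
INDArray labels = Nd4j.createFromArray(new double[][]{{0, 1}, {1, 0}, {0, 1}, {1, 0}});
DataSet dataSet = new DataSet(input, labels);

// Epoch count is a guess; with miniBatch(false) each fit() call takes a full-batch step
for (int i = 0; i < 500; i++) {
    net.fit(dataSet);
}
System.out.println(net.output(input, false));

After training, each output row should move toward its one-hot target, e.g. roughly [0, 1] for the input {0, 1}.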