Different predictions for the same data
I am using Deeplearning4j to classify equipment names. I labeled ~50,000 items with 495 classes, and I use this data to train a neural network.
That is, as input I provide a set of 50,000 vectors consisting of 0s and 1s, together with the expected class (0 to 494) for each vector.
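For illustration, a hypothetical (abbreviated) line of the training CSV would have 3331 binary feature columns followed by the label column, matching labelIndex = 3331 in the code below:

0,1,0,0,1, ... ,1,0,204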
I used the IrisClassifier example as the basis for my code.
I saved the trained model to a file, and now I can use it to predict the equipment class.
As a test, I ran predictions on the same data I used for training (all 50,000 items) and compared the predictions with my labels for that data.
The results were very good: the network's error was ~1%.
After that, I tried predicting with only the first 100 vectors of those 50,000 records, removing the remaining 49,900.
For these 100 vectors, the predictions differ from the predictions the same 100 vectors get when they are part of the full 50,000.
In other words, the less data I feed the trained model, the larger the prediction error, even for exactly the same vectors.
Why does this happen?
My code.
Training:
//First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
int numLinesToSkip = 0;
char delimiter = ',';
RecordReader recordReader = new CSVRecordReader(numLinesToSkip,delimiter);
recordReader.initialize(new FileSplit(new File(args[0])));
//Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
int labelIndex = 3331;
int numClasses = 495;
int batchSize = 4000;
// DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader,batchSize,labelIndex,numClasses);
DataSetIterator iterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize).classification(labelIndex, numClasses).build();
List<DataSet> trainingData = new ArrayList<>();
List<DataSet> testData = new ArrayList<>();
while (iterator.hasNext()) {
    DataSet allData = iterator.next();
    allData.shuffle();
    SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.8); //Use 80% of data for training
    trainingData.add(testAndTrain.getTrain());
    testData.add(testAndTrain.getTest());
}
DataSet allTrainingData = DataSet.merge(trainingData);
DataSet allTestData = DataSet.merge(testData);
//We need to normalize our data. We'll use NormalizeStandardize (which gives us mean 0, unit variance):
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(allTrainingData); //Collect the statistics (mean/stdev) from the training data. This does not modify the input data
normalizer.transform(allTrainingData); //Apply normalization to the training data
normalizer.transform(allTestData); //Apply normalization to the test data. This is using statistics calculated from the *training* set
long seed = 6;
int firstHiddenLayerSize = labelIndex/6;
int secondHiddenLayerSize = firstHiddenLayerSize/4;
//log.info("Build model....");
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .seed(seed)
    .activation(Activation.TANH)
    .weightInit(WeightInit.XAVIER)
    .updater(new Sgd(0.1))
    .l2(1e-4)
    .list()
    .layer(new DenseLayer.Builder().nIn(labelIndex).nOut(firstHiddenLayerSize)
        .build())
    .layer(new DenseLayer.Builder().nIn(firstHiddenLayerSize).nOut(secondHiddenLayerSize)
        .build())
    .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .activation(Activation.SOFTMAX) //Override the global TANH activation with softmax for this layer
        .nIn(secondHiddenLayerSize).nOut(numClasses).build())
    .build();
//run the model
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
//record score once every 100 iterations
model.setListeners(new ScoreIterationListener(100));
for (int i = 0; i < 5000; i++) {
    model.fit(allTrainingData);
}
//evaluate the model on the test set
Evaluation eval = new Evaluation(numClasses);
INDArray output = model.output(allTestData.getFeatures());
eval.eval(allTestData.getLabels(), output);
log.info(eval.stats());
// Save the Model
File locationToSave = new File(args[1]);
model.save(locationToSave, false);
Prediction:
// Open the network file
File locationToLoad = new File(args[0]);
MultiLayerNetwork model = MultiLayerNetwork.load(locationToLoad, false);
model.init();
// First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
int numLinesToSkip = 0;
char delimiter = ',';
// Data to predict
CSVRecordReader recordReader = new CSVRecordReader(numLinesToSkip, delimiter); //skip no lines at the top - i.e. no header
recordReader.initialize(new FileSplit(new File(args[1])));
//Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
int batchSize = 4000;
DataSetIterator iterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize).build();
List<DataSet> dataSetList = new ArrayList<>();
while (iterator.hasNext()) {
    DataSet allData = iterator.next();
    dataSetList.add(allData);
}
DataSet dataSet = DataSet.merge(dataSetList);
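// Note: a *new* NormalizerStandardize is fitted on the prediction data below,
// so its mean/std statistics depend on which rows happen to be in the file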
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(dataSet);
normalizer.transform(dataSet);
// Now use it to classify some data
INDArray output = model.output(dataSet.getFeatures());
// Save result
BufferedWriter writer = new BufferedWriter(new FileWriter(args[2], true));
for (int i = 0; i < output.rows(); i++) {
    writer
        .append(output.getRow(i).argMax().toString())
        .append(" ")
        .append(String.valueOf(i))
        .append(" ")
        .append(output.getRow(i).toString())
        .append('\n');
}
writer.close();
You need to use the same normalization statistics for training and prediction. Otherwise the normalizer transforms your data with the wrong statistics.
The way you are doing it now, you fit a new normalizer on the prediction data itself, so the transformed data ends up looking very different from the training data, and that is why you get such different results.
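A minimal sketch of the effect, where allData and first100 are hypothetical DataSet objects holding all 50,000 rows and just the first 100: two normalizers fitted on the two sets compute different mean/std statistics, so the same raw rows are transformed into different model inputs.

NormalizerStandardize fullNorm = new NormalizerStandardize();
fullNorm.fit(allData);        // statistics over all 50,000 rows
NormalizerStandardize smallNorm = new NormalizerStandardize();
smallNorm.fit(first100);      // statistics over only 100 rows - generally different
DataSet copyA = first100.copy();
DataSet copyB = first100.copy();
fullNorm.transform(copyA);    // normalized with full-data statistics
smallNorm.transform(copyB);   // normalized with subset statistics
// copyA and copyB now hold different feature values for the same raw rows,
// so model.output(...) produces different predictions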
Make sure you save the normalizer together with the model, like this:
import org.nd4j.linalg.dataset.api.preprocessor.serializer.NormalizerSerializer;

NormalizerSerializer SUT = NormalizerSerializer.getDefault();
SUT.write(normalizer, new File("outputFile.bin"));
NormalizerStandardize restored = SUT.restore(new File("outputFile.bin"));
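Then, on the prediction side, a minimal sketch of the fix (assuming the normalizer was saved to "outputFile.bin" as above): restore the saved normalizer and transform the new data with it, instead of fitting a fresh NormalizerStandardize on the prediction file.

MultiLayerNetwork model = MultiLayerNetwork.load(locationToLoad, false);
NormalizerStandardize restored = NormalizerSerializer.getDefault().restore(new File("outputFile.bin"));
restored.transform(dataSet);  // reuse the *training* statistics; do not call fit() here
INDArray output = model.output(dataSet.getFeatures());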