Different predictions for the same data
I am using Deeplearning4j to classify equipment names. I labeled ~50,000 items with 495 classes, and I use this data to train a neural network.
That is, as input I provide a set of 50,000 vectors consisting of 0s and 1s, together with the expected class (0 to 494) for each vector.
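For illustration, a hypothetical (abbreviated) line of the training CSV would have 3331 binary feature columns followed by the label column, matching labelIndex = 3331 in the code below:

0,1,0,0,1, ... ,1,0,204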
I used the IrisClassifier example as the basis for my code.
I saved the trained model to a file, and now I can use it to predict the equipment class.
As a test, I ran predictions on the same data I used for training (all 50,000 items) and compared the predictions with my labels for that data.
The results were very good: the network's error was ~1%.
After that, I tried predicting with only the first 100 vectors of those 50,000 records, removing the remaining 49,900.
For these 100 vectors, the predictions differ from the predictions the same 100 vectors get when they are part of the full 50,000.
In other words, the less data I feed the trained model, the larger the prediction error, even for exactly the same vectors.
Why does this happen?
My code.
Training:
//First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
int numLinesToSkip = 0;
char delimiter = ',';
RecordReader recordReader = new CSVRecordReader(numLinesToSkip,delimiter);
recordReader.initialize(new FileSplit(new File(args[0])));
//Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
int labelIndex = 3331;
int numClasses = 495;
int batchSize = 4000;
// DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader,batchSize,labelIndex,numClasses);
DataSetIterator iterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize).classification(labelIndex, numClasses).build();
List<DataSet> trainingData = new ArrayList<>();
List<DataSet> testData = new ArrayList<>();
while (iterator.hasNext()) {
    DataSet allData = iterator.next();
    allData.shuffle();
    SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.8); //Use 80% of data for training
    trainingData.add(testAndTrain.getTrain());
    testData.add(testAndTrain.getTest());
}
DataSet allTrainingData = DataSet.merge(trainingData);
DataSet allTestData = DataSet.merge(testData);
//We need to normalize our data. We'll use NormalizeStandardize (which gives us mean 0, unit variance):
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(allTrainingData); //Collect the statistics (mean/stdev) from the training data. This does not modify the input data
normalizer.transform(allTrainingData); //Apply normalization to the training data
normalizer.transform(allTestData); //Apply normalization to the test data. This is using statistics calculated from the *training* set
long seed = 6;
int firstHiddenLayerSize = labelIndex/6;
int secondHiddenLayerSize = firstHiddenLayerSize/4;
//log.info("Build model....");
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .seed(seed)
    .activation(Activation.TANH)
    .weightInit(WeightInit.XAVIER)
    .updater(new Sgd(0.1))
    .l2(1e-4)
    .list()
    .layer(new DenseLayer.Builder().nIn(labelIndex).nOut(firstHiddenLayerSize)
        .build())
    .layer(new DenseLayer.Builder().nIn(firstHiddenLayerSize).nOut(secondHiddenLayerSize)
        .build())
    .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .activation(Activation.SOFTMAX) //Override the global TANH activation with softmax for this layer
        .nIn(secondHiddenLayerSize).nOut(numClasses).build())
    .build();
//run the model
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
//record score once every 100 iterations
model.setListeners(new ScoreIterationListener(100));
for (int i = 0; i < 5000; i++) {
    model.fit(allTrainingData);
}
//evaluate the model on the test set
Evaluation eval = new Evaluation(numClasses);
INDArray output = model.output(allTestData.getFeatures());
eval.eval(allTestData.getLabels(), output);
log.info(eval.stats());
// Save the Model
File locationToSave = new File(args[1]);
model.save(locationToSave, false);
Prediction:
// Open the network file
File locationToLoad = new File(args[0]);
MultiLayerNetwork model = MultiLayerNetwork.load(locationToLoad, false);
model.init();
// First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
int numLinesToSkip = 0;
char delimiter = ',';
// Data to predict
CSVRecordReader recordReader = new CSVRecordReader(numLinesToSkip, delimiter); //skip no lines at the top - i.e. no header
recordReader.initialize(new FileSplit(new File(args[1])));
//Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
int batchSize = 4000;
DataSetIterator iterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize).build();
List<DataSet> dataSetList = new ArrayList<>();
while (iterator.hasNext()) {
    DataSet allData = iterator.next();
    dataSetList.add(allData);
}
DataSet dataSet = DataSet.merge(dataSetList);
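// Note: a *new* NormalizerStandardize is fitted on the prediction data below,
// so its mean/std statistics depend on which rows happen to be in the file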
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(dataSet);
normalizer.transform(dataSet);
// Now use it to classify some data
INDArray output = model.output(dataSet.getFeatures());
// Save result
BufferedWriter writer = new BufferedWriter(new FileWriter(args[2], true));
for (int i = 0; i < output.rows(); i++) {
    writer
        .append(output.getRow(i).argMax().toString())
        .append(" ")
        .append(String.valueOf(i))
        .append(" ")
        .append(output.getRow(i).toString())
        .append('\n');
}
writer.close();
You need to use the same normalization statistics for training and prediction. Otherwise the normalizer transforms your data with the wrong statistics.
The way you are doing it now, you fit a new normalizer on the prediction data itself, so the transformed data ends up looking very different from the training data, and that is why you get such different results.
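A minimal sketch of the effect, where allData and first100 are hypothetical DataSet objects holding all 50,000 rows and just the first 100: two normalizers fitted on the two sets compute different mean/std statistics, so the same raw rows are transformed into different model inputs.

NormalizerStandardize fullNorm = new NormalizerStandardize();
fullNorm.fit(allData);        // statistics over all 50,000 rows
NormalizerStandardize smallNorm = new NormalizerStandardize();
smallNorm.fit(first100);      // statistics over only 100 rows - generally different
DataSet copyA = first100.copy();
DataSet copyB = first100.copy();
fullNorm.transform(copyA);    // normalized with full-data statistics
smallNorm.transform(copyB);   // normalized with subset statistics
// copyA and copyB now hold different feature values for the same raw rows,
// so model.output(...) produces different predictions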
Make sure you save the normalizer together with the model, like this:
import org.nd4j.linalg.dataset.api.preprocessor.serializer.NormalizerSerializer;

NormalizerSerializer SUT = NormalizerSerializer.getDefault();
SUT.write(normalizer, new File("outputFile.bin"));
NormalizerStandardize restored = SUT.restore(new File("outputFile.bin"));
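Then, on the prediction side, a minimal sketch of the fix (assuming the normalizer was saved to "outputFile.bin" as above): restore the saved normalizer and transform the new data with it, instead of fitting a fresh NormalizerStandardize on the prediction file.

MultiLayerNetwork model = MultiLayerNetwork.load(locationToLoad, false);
NormalizerStandardize restored = NormalizerSerializer.getDefault().restore(new File("outputFile.bin"));
restored.transform(dataSet);  // reuse the *training* statistics; do not call fit() here
INDArray output = model.output(dataSet.getFeatures());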