Accord.net NaiveBayesLearning "Index was outside the bounds of the array"

I am using Accord.net 3.7.0 on .NET Core 1.1.

The algorithm I'm using is Naive Bayes. The source code of the learning step is as follows:

    public LearningResultViewModel NaiveBayes(int[][] inputs, int[] outputs)
    {
        // Create a new Naive Bayes learning
        var learner = new NaiveBayesLearning();

        // Learn a Naive Bayes model from the examples
        NaiveBayes nb = learner.Learn(inputs, outputs);

        #region test phase
        // Compute the machine outputs
        int[] predicted = nb.Decide(inputs);

        // Use confusion matrix to compute some statistics.
        ConfusionMatrix confusionMatrix = new ConfusionMatrix(predicted, outputs, 1, 0);
        #endregion

        LearningResultViewModel result = new LearningResultViewModel()
        {
            Distributions = nb.Distributions,
            NumberOfClasses = nb.NumberOfClasses,
            NumberOfInputs = nb.NumberOfInputs,
            NumberOfOutputs = nb.NumberOfOutputs,
            NumberOfSymbols = nb.NumberOfSymbols,
            Priors = nb.Priors,
            confusionMatrix = confusionMatrix
        };

        return result;
    }

I tested this code on some data and it worked, but as the data grew, this error occurred:

Index was outside the bounds of the array

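Because I can't step into the Learn method, I don't know how to proceed. A screenshot at run time looks like this:

No additional information, no inner exception, no idea!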

TG.

// UPDATE_1 ***

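The input array is a 180 x 4 jagged array, as shown in the image below:

Each row has 4 columns; I checked this by hand (I can share a video of it if needed).

The output array has 180 elements, as shown below:

It contains only 0s and 1s (I can share a video of this too if needed).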
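The manual check described above can also be done in code. The following is a minimal sketch, assuming the data arrives as `int[][]` and `int[]` exactly as passed to the `NaiveBayes` method; `TrainingDataChecks` and `EnsureShape` are hypothetical names introduced for illustration and are not part of Accord.NET.

```csharp
using System;

public static class TrainingDataChecks
{
    // Hypothetical helper, not part of Accord.NET.
    // Throws if the data does not match the shape described in the post:
    // one output per input row, a fixed column count, and 0/1 labels only.
    public static void EnsureShape(int[][] inputs, int[] outputs, int expectedColumns)
    {
        if (inputs.Length != outputs.Length)
            throw new ArgumentException(
                $"Got {inputs.Length} input rows but {outputs.Length} outputs.");

        for (int row = 0; row < inputs.Length; row++)
        {
            if (inputs[row] == null || inputs[row].Length != expectedColumns)
                throw new ArgumentException(
                    $"Row {row} does not have {expectedColumns} columns.");
        }

        for (int i = 0; i < outputs.Length; i++)
        {
            if (outputs[i] != 0 && outputs[i] != 1)
                throw new ArgumentException(
                    $"Output {i} is {outputs[i]}, expected 0 or 1.");
        }
    }
}
```

Calling `TrainingDataChecks.EnsureShape(inputs, outputs, 4)` before `learner.Learn(inputs, outputs)` would rule out a shape problem. Note that it says nothing about the values inside the rows.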

The documentation for NaiveBayesLearning is here:

NaiveBayesLearning

There are more examples at the bottom of that page:

More examples

The documentation of the Learn method is here:

Learn method doc

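Following the suggestions in the comments, I became suspicious of the values in the matrix, so I investigated:

As the image above shows, some rows contain values below zero. The input matrix is generated by the Codification class, as used in the example here:

NaiveBayes

using the following documentation:

Codification docs

Codification encodes a null value as -1, as in the screenshot below:

So my solution was to replace null values with the string "null". There may be a better solution, though.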
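Since the exception itself gives no hint, it may help to scan the codified matrix for negative symbols before training. A plausible explanation (an assumption, not confirmed from the Accord.NET source) is that each symbol is used as an array index inside Learn, so a -1 would be out of bounds. The sketch below uses only plain C#; `InputChecks` and `FindNegativeSymbols` are hypothetical names, not Accord.NET APIs.

```csharp
using System.Collections.Generic;

public static class InputChecks
{
    // Hypothetical helper, not part of Accord.NET.
    // Returns the { row, column } position of every negative value
    // in the codified input matrix.
    public static List<int[]> FindNegativeSymbols(int[][] inputs)
    {
        var positions = new List<int[]>();

        for (int row = 0; row < inputs.Length; row++)
        {
            for (int col = 0; col < inputs[row].Length; col++)
            {
                if (inputs[row][col] < 0)
                    positions.Add(new[] { row, col });
            }
        }

        return positions;
    }
}
```

If the returned list is not empty, the listed rows point back to the source records that contained null, which makes the failure much easier to trace than the bare exception.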

The calling method, now with the fixed data, looks like this:

    public LearningResultViewModel Learn(EMVDBContext dBContext, string userId, LearningAlgorithm learningAlgorithm)
    {
        var learningDataRaw = dBContext.Mutants
            .Include(mu => mu.MutationOperator)
            .Where(mu => mu.Equivalecy == 0 || mu.Equivalecy == 10);

        string[] featureTitles = new string[] {
            "ChangeType",
            "OperatorName",
            "OperatorBefore",
            "OperatorAfter",
        };

        string[][] learningInputNotCodified = learningDataRaw.Select(ldr => new string[] {
            ldr.ChangeType.ToString(),
            ldr.MutationOperator.Name??"null",
            ldr.MutationOperator.Before??"null",
            ldr.MutationOperator.After??"null",
        }).ToArray();

        int[] learningOutputNotCodified = learningDataRaw.Select(ldr => ldr.Equivalecy == 0 ? 0 : 1).ToArray();

        #region Codification phase
        // Create a new codification codebook to
        // convert strings into discrete symbols
        Codification codebook = new Codification(featureTitles, learningInputNotCodified);

        // Extract input and output pairs to train
        int[][] learningInput = codebook.Transform(learningInputNotCodified);

        switch (learningAlgorithm)
        {
            case LearningAlgorithm.NaiveBayesian:
                return learningService.NaiveBayes(learningInput, learningOutputNotCodified);
            case LearningAlgorithm.SVM:
                // Not implemented yet: the method returns null below.
                break;
            default:
                break;
        }
        #endregion

        return null;
    }
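The null replacement does not have to be repeated per column with `??`. As one possible variation, assuming the rows are already materialized as `string[][]` in memory, it can be factored into a single helper; `NullNormalizer` and `ReplaceNulls` are hypothetical names used only for this sketch.

```csharp
using System.Linq;

public static class NullNormalizer
{
    // Hypothetical helper, not part of Accord.NET.
    // Returns a copy of the rows in which every null cell is replaced
    // by a placeholder string, so that null becomes an ordinary symbol.
    public static string[][] ReplaceNulls(string[][] rows, string placeholder = "null")
    {
        return rows
            .Select(row => row.Select(cell => cell ?? placeholder).ToArray())
            .ToArray();
    }
}
```

With this, the call would be `NullNormalizer.ReplaceNulls(learningInputNotCodified)` before the codebook is built. The placeholder should be a string that cannot occur as a real value in the data, otherwise real values and missing values would share one symbol.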

I hope this helps others who run into the same problem.