How can I train on a data set whose inputs have different lengths in Accord.NET?

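I want to classify some data sets with Accord.NET's ANN and SVM. The problem is that the input arrays in my data set do not all have the same length: each array can hold anywhere from 10 to about 64 elements. Is there a way to handle a data set like this, or do I need to make all the arrays the same size?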

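Does your data set consist of sequences of numbers? If so, you can use hidden Markov models, which accept sequences of any length. If you have a classification problem, you can use a hidden Markov classifier together with Baum-Welch learning to create a sequence classifier.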

For example, consider the following data, in which the samples have different lengths:

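using Accord.Statistics.Models.Markov;
using Accord.Statistics.Models.Markov.Learning;

// Declare some sample sequences of differing lengths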
int[][] inputs = new int[][]
{
    new int[] { 0,1,1,0 },   // Class 0
    new int[] { 0,0,1,0 },   // Class 0
    new int[] { 0,1,1,1,0 }, // Class 0
    new int[] { 0,1,0 },     // Class 0

    new int[] { 1,0,0,1 },   // Class 1
    new int[] { 1,1,0,1 },   // Class 1
    new int[] { 1,0,0,0,1 }, // Class 1
    new int[] { 1,0,1 },     // Class 1
};

int[] outputs = new int[]
{
    0,0,0,0, // First four sequences are of class 0
    1,1,1,1, // Last four sequences are of class 1
};


// We are trying to predict two different classes
int classes = 2;

// Each sequence may have up to two symbols (0 or 1)
int symbols = 2;

Now you can create a hidden Markov classifier for these sequences:

// Nested models will have two states each
int[] states = new int[] { 2, 2 };

// Creates a new Hidden Markov Model Sequence Classifier with the given parameters
HiddenMarkovClassifier classifier = new HiddenMarkovClassifier(classes, states, symbols);

// Create a new learning algorithm to train the sequence classifier
var teacher = new HiddenMarkovClassifierLearning(classifier,

    // Train each model until the log-likelihood changes less than 0.001
    modelIndex => new BaumWelchLearning(classifier.Models[modelIndex])
    {
        Tolerance = 0.001,
        Iterations = 0 // 0 disables the iteration limit, so only Tolerance stops training
    }
);

// Train the sequence classifier using the algorithm
double likelihood = teacher.Run(inputs, outputs);

// Classify the sequences as belonging to one of the classes:
int output = classifier.Decide(new int[] { 1,0,0,1 }); // output should be 1
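
To see why the differing lengths are not a problem here, it may help to look at how a classifier of this kind can reach its decision: each class has its own model, every model scores the sequence one symbol at a time, and the class whose model gives the highest likelihood wins. The following is only a rough sketch of that idea and is not Accord.NET's implementation. ToyHmm, ToyClassifier, BuildDemoModels and all the probabilities in it are made up for illustration, and equal class priors are assumed.

```csharp
using System;

// Hypothetical toy model for illustration only; not an Accord.NET type.
public sealed class ToyHmm
{
    public double[] Initial;     // Initial[i]      = P(first state is i)
    public double[,] Transition; // Transition[i,j] = P(next state is j | current state is i)
    public double[,] Emission;   // Emission[i,k]   = P(observing symbol k | state is i)

    // Scaled forward algorithm: returns log P(sequence | this model).
    // The loop runs once per symbol, so any sequence length is accepted.
    public double LogLikelihood(int[] sequence)
    {
        int states = Initial.Length;
        double[] alpha = new double[states];
        double logLikelihood = 0.0;

        for (int t = 0; t < sequence.Length; t++)
        {
            double[] next = new double[states];
            double scale = 0.0;

            for (int j = 0; j < states; j++)
            {
                // Probability of being in state j just before symbol t is observed
                double reach = 0.0;
                if (t == 0)
                {
                    reach = Initial[j];
                }
                else
                {
                    for (int i = 0; i < states; i++)
                        reach += alpha[i] * Transition[i, j];
                }

                next[j] = reach * Emission[j, sequence[t]];
                scale += next[j];
            }

            // The model cannot produce this sequence at all
            if (scale == 0.0)
                return double.NegativeInfinity;

            // Normalize to avoid underflow and accumulate the log of the scale
            for (int j = 0; j < states; j++)
                next[j] /= scale;

            logLikelihood += Math.Log(scale);
            alpha = next;
        }

        return logLikelihood;
    }
}

// Hypothetical helper: one model per class, pick the most likely one.
public static class ToyClassifier
{
    public static int Classify(ToyHmm[] models, int[] sequence)
    {
        int best = 0;
        double bestScore = double.NegativeInfinity;

        for (int c = 0; c < models.Length; c++)
        {
            double score = models[c].LogLikelihood(sequence);
            if (score > bestScore)
            {
                bestScore = score;
                best = c;
            }
        }

        return best;
    }

    // Hand-picked parameters (an assumption, not learned from data):
    // the class 0 model starts in a state that mostly emits 0,
    // the class 1 model starts in a state that mostly emits 1.
    public static ToyHmm[] BuildDemoModels()
    {
        double[,] transition = { { 0.5, 0.5 }, { 0.5, 0.5 } };
        double[,] emission = { { 0.9, 0.1 }, { 0.1, 0.9 } };

        return new[]
        {
            new ToyHmm { Initial = new[] { 1.0, 0.0 }, Transition = transition, Emission = emission },
            new ToyHmm { Initial = new[] { 0.0, 1.0 }, Transition = transition, Emission = emission },
        };
    }
}
```

With these made-up parameters, ToyClassifier.Classify(ToyClassifier.BuildDemoModels(), new[] { 1, 0, 0, 1 }) returns 1 and the same call with new[] { 0, 1, 0 } returns 0, even though the two sequences differ in length. In the real library the model parameters come from Baum-Welch training rather than being written by hand.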