java.lang.OutOfMemoryError: GC overhead limit exceeded when creating data structure for 1 million elements

When I run the code shown below, I get java.lang.OutOfMemoryError: GC overhead limit exceeded at the line svm_node node = new svm_node(); (line 16). The code runs over an array of roughly 1 million elements, where each element holds 100 shorts.

// read in a problem (in svmlight format)
private void read(SupportVector[] vectors) throws IOException
{
    int length = vectors.length; // Length of training data
    double[] classification = new double[length]; // This is redundant for our one-class SVM.
    svm_node[][] trainingSet = new svm_node[length][]; // The training set.
    for(int i = 0; i < length; i++)
    {
        classification[i] = 1; // Since classifications are redundant in our setup, they all belong to the same class, 1.

        // Each vector has to be one element longer than the actual data,
        // because the implementation needs a terminating node with index -1 at the end.
        svm_node[] vector = new svm_node[vectors[i].getLength() + 1];

        double[] doubles = vectors[i].toDouble(); // The SVM runs on doubles.
        for(int j = 0; j < doubles.length; j++) {
            svm_node node = new svm_node();
            node.index = j;
            node.value = doubles[j];
            vector[j] = node;
        }
        svm_node last = new svm_node();
        last.index = -1;
        vector[vector.length - 1] = last;

        trainingSet[i] = vector;
    }

    svm_problem problem = new svm_problem();
    problem.l = length;
    problem.y = classification;
    problem.x = trainingSet;
}
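
For context, here is a minimal sketch of how such an svm_problem would typically be handed to LIBSVM's trainer. The parameter values (gamma, nu, etc.) are illustrative placeholders, not the ones actually used:

import libsvm.*;

// Sketch: train a one-class SVM on the problem built above.
// Parameter values below are placeholders and would need tuning.
private svm_model train(svm_problem problem)
{
    svm_parameter param = new svm_parameter();
    param.svm_type = svm_parameter.ONE_CLASS;
    param.kernel_type = svm_parameter.RBF;
    param.gamma = 0.01;      // placeholder
    param.nu = 0.1;          // placeholder
    param.cache_size = 100;  // kernel cache in MB
    param.eps = 0.001;

    String error = svm.svm_check_parameter(problem, param);
    if (error != null)
        throw new IllegalArgumentException(error);

    return svm.svm_train(problem, param);
}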

From the exception, I guess the garbage collector cannot properly clean up my new svm_nodes, but I don't see how I could optimize my object creation to avoid producing so many new svm_nodes that just sit in the heap.
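
One small saving that seems safe, assuming LIBSVM only reads svm_node fields and never mutates them, is to reuse a single terminating node (index -1) for every vector instead of allocating one per row. It removes about one million allocations, although the roughly 100 million data nodes still dominate the footprint:

// Sketch: share one read-only terminator node across all vectors.
// Assumption: LIBSVM treats svm_node instances as read-only.
svm_node terminator = new svm_node();
terminator.index = -1;
for (int i = 0; i < length; i++)
{
    svm_node[] vector = new svm_node[vectors[i].getLength() + 1];
    double[] doubles = vectors[i].toDouble();
    for (int j = 0; j < doubles.length; j++)
    {
        svm_node node = new svm_node();
        node.index = j;
        node.value = doubles[j];
        vector[j] = node;
    }
    vector[vector.length - 1] = terminator; // shared instead of newly allocated
    trainingSet[i] = vector;
}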

I cannot change the data structure, because it is what LIBSVM uses as input to its support vector machine.

My question is: is this error related to the garbage collector being unable to collect my svm_nodes, or am I simply trying to build a data structure with too many elements?

PS: I have already set the heap size to the maximum for my 32-bit application (2 GB).
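
For reference, the relevant JVM switches look roughly like this; -Xmx sets the maximum heap, and -XX:-UseGCOverheadLimit only disables the "GC overhead limit exceeded" check, which usually just postpones a plain OutOfMemoryError rather than fixing anything. The jar name is a placeholder:

# 32-bit JVM: heap is effectively capped around 1.5-2 GB depending on the OS
java -Xmx2g -jar app.jar

# 64-bit JVM with a larger heap
java -Xmx4g -jar app.jar

# Disable only the overhead-limit check (typically just delays the OOM)
java -Xmx2g -XX:-UseGCOverheadLimit -jar app.jar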

I launched the application in a 64-bit environment and increased the heap beyond 2 GB, which solved the problem. I still suspect there is some odd GC quirk that I couldn't track down, but increasing the heap resolved the issue as well.
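
A rough back-of-envelope estimate also explains why 2 GB was never going to be enough. The numbers assume a typical HotSpot JVM, where an svm_node (an int plus a double) takes roughly 24-32 bytes including object header and padding, and each array slot holds a 4-8 byte reference:

nodes       ≈ 1,000,000 vectors × 101 svm_node per vector ≈ 1.0 × 10^8 objects
node size   ≈ 12-16 B header + 4 B int + 8 B double (padded) ≈ 24-32 B
node total  ≈ 10^8 × 24-32 B ≈ 2.4-3.2 GB
arrays      ≈ 10^8 references × 4-8 B ≈ 0.4-0.8 GB
total       ≈ 3-4 GB, well beyond a 2 GB 32-bit heap

So the garbage collector was behaving correctly: every node stays strongly reachable through trainingSet, and the data simply does not fit in 2 GB.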