Appending two files in Weka

Question

I have two datasets. Both are .arff files.

Fold1.arff contains:

@relation iris

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}

@data
5.1,3.5,1.4,0.2,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5,3.5,1.3,0.3,Iris-setosa
7,3.2,4.7,1.4,Iris-versicolor
5,2,3.5,1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.3,3.3,6,2.5,Iris-virginica
6.5,3.2,5.1,2,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica

Fold2.arff contains:

@relation iris

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}

@data
4.9,3,1.4,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
6.4,3.2,4.5,1.5,Iris-versicolor
5.9,3,4.2,1.5,Iris-versicolor
6.1,2.8,4,1.3,Iris-versicolor
5.5,2.4,3.7,1,Iris-versicolor
6.1,3,4.6,1.4,Iris-versicolor
5.8,2.7,5.1,1.9,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
5.6,2.8,4.9,2,Iris-virginica
7.9,3.8,6.4,2,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica

Now I am trying to append them with this command:

java weka.core.Instances append d:\fold1.arff d:\fold2.arff > d:\result.arff

I run the command from the input field of the Weka Simple CLI.

I get this error:

Usage:
weka.core.Instances help
    Prints this help
weka.core.Instances <filename>
    Outputs dataset statistics
weka.core.Instances merge <filename1> <filename2>
    Merges the datasets (must have same number of rows).
    Generated dataset gets output on stdout.
weka.core.Instances append <filename1> <filename2>
    Appends the second dataset to the first (must have same number of attributes).
    Generated dataset gets output on stdout.
weka.core.Instances headers <filename1> <filename2>
    Compares the structure of the two datasets and outputs whether they
    differ or not.
weka.core.Instances randomize <seed> <filename>
    Randomizes the dataset and outputs it on stdout.

Both files have the same number of rows, as you can see from the listings above. So why is result.arff not created?

Thanks.

Answer 1

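I can see that both of your files have the same attributes and the same number of them (5 in total).

According to the official documentation, the current syntax for append is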

    weka.core.Instances append <filename1> <filename2>

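This appends filename2 to filename1. No output file is passed as an argument: as the usage text states, the generated dataset is written to stdout, and the input files themselves are left unchanged.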
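To make the append semantics concrete, here is a minimal sketch in plain Java that does not use the Weka API at all. It assumes both files are dense ARFF files without comment lines, compares the headers as text (which is stricter than Weka's own check), and writes the combined rows to a separate output path. The class name `ArffAppend` and its helper methods are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ArffAppend {

    // Index of the "@data" line, or -1 if there is none.
    static int dataIndex(List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).trim().equalsIgnoreCase("@data")) {
                return i;
            }
        }
        return -1;
    }

    // Non-blank, trimmed lines that precede "@data" (relation and attributes).
    static List<String> header(List<String> lines) {
        List<String> out = new ArrayList<>();
        int end = dataIndex(lines);
        if (end < 0) {
            end = lines.size();
        }
        for (String line : lines.subList(0, end)) {
            if (!line.trim().isEmpty()) {
                out.add(line.trim());
            }
        }
        return out;
    }

    // Returns the first file's lines followed by the second file's data rows.
    static List<String> append(List<String> first, List<String> second) {
        if (dataIndex(first) < 0 || dataIndex(second) < 0) {
            throw new IllegalArgumentException("missing @data section");
        }
        if (!header(first).equals(header(second))) {
            throw new IllegalArgumentException("headers differ");
        }
        List<String> result = new ArrayList<>(first);
        // Drop trailing blank lines so the appended rows follow directly.
        while (!result.isEmpty() && result.get(result.size() - 1).trim().isEmpty()) {
            result.remove(result.size() - 1);
        }
        for (String row : second.subList(dataIndex(second) + 1, second.size())) {
            if (!row.trim().isEmpty()) {
                result.add(row);
            }
        }
        return result;
    }

    // Usage: java ArffAppend <first.arff> <second.arff> <result.arff>
    public static void main(String[] args) throws IOException {
        List<String> merged = append(
                Files.readAllLines(Paths.get(args[0])),
                Files.readAllLines(Paths.get(args[1])));
        Files.write(Paths.get(args[2]), merged);
    }
}
```

With the two fold files from the question, the result would keep the header once and contain all 30 data rows. For real work the Weka command is preferable, since it understands the full ARFF format.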

Note: it is better to pass the file names in quotes (single or double), especially when a file name contains whitespace (for example, fold 1.arff rather than fold1.arff).

      java weka.core.Instances append "d:\fold1.arff" "d:\fold2.arff"
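If the command is launched from a Java program rather than typed into a shell, the quoting question largely disappears, because each argument is passed as its own list element. The sketch below assumes a `weka.jar` at a hypothetical location and hypothetical file paths; `RunAppend` and `command` are names invented for this example. It also sends the tool's stdout to a file, which is where the usage text says the generated dataset goes.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RunAppend {

    // Builds the argument list. Each element reaches the JVM as one argument,
    // so a path containing spaces needs no extra quoting here.
    static List<String> command(String wekaJar, String first, String second) {
        return Arrays.asList("java", "-cp", wekaJar,
                "weka.core.Instances", "append", first, second);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical locations; adjust them to your installation.
        List<String> cmd = command("d:\\weka\\weka.jar",
                "d:\\fold 1.arff", "d:\\fold 2.arff");
        Process process = new ProcessBuilder(cmd)
                .redirectOutput(new File("d:\\result.arff")) // capture stdout in a file
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show errors on the console
                .start();
        System.exit(process.waitFor());
    }
}
```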

In some languages the backslash is an escape character, so it has to be written twice:

      java weka.core.Instances append "d:\\fold1.arff" "d:\\fold2.arff"
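Java string literals are one case where this matters. As a small illustration (the class and method names are hypothetical), a single backslash followed by `f` is read as a form-feed character, so the path silently changes, while the doubled backslash yields the intended path:

```java
public class BackslashDemo {

    // Doubled backslash: the string holds one literal backslash.
    static String escaped() {
        return "d:\\fold1.arff";
    }

    // Single backslash: "\f" is the form-feed escape, so the backslash
    // and the letter 'f' are replaced by one control character.
    static String unescaped() {
        return "d:\fold1.arff";
    }

    public static void main(String[] args) {
        System.out.println(escaped());                        // prints d:\fold1.arff
        System.out.println(escaped().length());               // prints 13
        System.out.println(unescaped().length());             // prints 12
        System.out.println(unescaped().indexOf('\f') == 2);   // prints true
    }
}
```

Whether doubling is needed on a command line depends on the shell or tool that parses it, so it is worth trying the single-backslash form first there.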
