ML.NET v1.4: "expected Boolean, got Single" exception

I want to train a binary classifier. I upgraded from ML.NET 0.9 to ML.NET 1.4, and my code now looks like this:

var mlContext = new MLContext();
var trainData = mlContext.Data.LoadFromTextFile<CancerData>("Cancer-train.csv", hasHeader: true, separatorChar: ';');
var pipeline = mlContext.Transforms.NormalizeMinMax("Features")
    .AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Target", featureColumnName: "Features"));

var model = pipeline.Fit(trainData);

My test data looks like this:

B;11.49;14.59;73.99;404.9;0.1046;0.08228;0.05308;0.01969;0.1779;0.06574;0.2034;1.166;1.567;14.34;0.004957;0.02114;0.04156;0.008038;0.01843;0.003614;12.4;21.9;82.04;467.6;0.1352;0.201;0.2596;0.07431;0.2941;0.0918
M;16.25;19.51;109.8;815.8;0.1026;0.1893;0.2236;0.09194;0.2151;0.06578;0.3147;0.9857;3.07;33.12;0.009197;0.0547;0.08079;0.02215;0.02773;0.006355;17.39;23.05;122.1;939.7;0.1377;0.4462;0.5897;0.1775;0.3318;0.09136
B;12.16;18.03;78.29;455.3;0.09087;0.07838;0.02916;0.01527;0.1464;0.06284;0.2194;1.19;1.678;16.26;0.004911;0.01666;0.01397;0.005161;0.01454;0.001858;13.34;27.87;88.83;547.4;0.1208;0.2279;0.162;0.0569;0.2406;0.07729

CancerData class:

class CancerData
{
    [LoadColumn(1, 30), ColumnName("Features")]
    public float[] FeatureVector { get; set; }

    [LoadColumn(31)]
    public float Target { get; set; }
}

With the code above, I get this error:

System.ArgumentOutOfRangeException: 'Schema mismatch for label column '': expected Boolean, got Single Arg_ParamName_Name'

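I believe this happens because my first column contains B/M rather than true/false values. How can I elegantly convert these values to true/false so that the trainer can fit without throwing? Does ML.NET provide a solution for this scenario? Or am I wrong, and the problem is elsewhere in my code?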

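First of all, your target column isn't column 31 but column 0, right? Each row has 31 fields (the B/M label followed by 30 features), so the valid indices are 0 through 30.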
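A quick way to confirm the layout is to split one raw row and look at the indices. This is only a sketch in plain C#; the class and member names (ColumnLayoutCheck, SampleRow, CountFields) are made up for illustration, and the row is the first sample row from the question:

```csharp
using System;

static class ColumnLayoutCheck
{
    // First sample row from the question, copied verbatim.
    public const string SampleRow =
        "B;11.49;14.59;73.99;404.9;0.1046;0.08228;0.05308;0.01969;0.1779;0.06574;" +
        "0.2034;1.166;1.567;14.34;0.004957;0.02114;0.04156;0.008038;0.01843;0.003614;" +
        "12.4;21.9;82.04;467.6;0.1352;0.201;0.2596;0.07431;0.2941;0.0918";

    // Splits one raw row on the separator and returns the number of fields.
    public static int CountFields(string row, char separator = ';')
        => row.Split(separator).Length;

    static void Main()
    {
        string[] fields = SampleRow.Split(';');

        // Total number of fields and the highest valid zero-based index.
        Console.WriteLine($"fields: {fields.Length}, last index: {fields.Length - 1}");

        // The label sits in the first field.
        Console.WriteLine($"column 0: {fields[0]}");
    }
}
```

If the last index printed is 30, then LoadColumn(31) points past the end of the row, and the label has to come from LoadColumn(0).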

I would just read it as text, and then use MapValue to convert it to bool:
class CancerData
{
    [LoadColumn(1, 30), ColumnName("Features")]
    public float[] FeatureVector { get; set; }

    // The B/M label is the first field (index 0); read it as text.
    [LoadColumn(0)]
    public string Target { get; set; }
}

// ...

var trainData = mlContext.Data.LoadFromTextFile<CancerData>("Cancer-train.csv", hasHeader: true, separatorChar: ';');

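// Map the text labels to the Boolean values the trainer expects.
var targetMap = new Dictionary<string, bool> { { "M", true }, { "B", false } };

// Overwrite "Target" with its Boolean equivalent before normalizing and training.
var pipeline = mlContext.Transforms.Conversion.MapValue("Target", targetMap)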
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Target", featureColumnName: "Features"));
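If you want to check the mapping on its own before training, something along these lines should work. It is a minimal sketch, assuming the Microsoft.ML 1.4 NuGet package is referenced; LabelRow, LabelMappingCheck and MapLabels are hypothetical names, and the in-memory rows only stand in for the label column of the CSV:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type holding only the text label.
class LabelRow
{
    public string Target { get; set; }
}

static class LabelMappingCheck
{
    // Applies only the MapValue step and returns the mapped labels in row order.
    public static bool[] MapLabels(IEnumerable<string> rawLabels)
    {
        var mlContext = new MLContext();

        // In-memory stand-in for the label column loaded from the CSV.
        IDataView data = mlContext.Data.LoadFromEnumerable(
            rawLabels.Select(label => new LabelRow { Target = label }));

        var targetMap = new Dictionary<string, bool> { { "M", true }, { "B", false } };

        // Fit and apply the mapping, overwriting "Target" in place.
        IDataView mapped = mlContext.Transforms.Conversion
            .MapValue("Target", targetMap)
            .Fit(data)
            .Transform(data);

        // Print the column type the trainer would see.
        Console.WriteLine(mapped.Schema["Target"].Type);

        return mapped.GetColumn<bool>("Target").ToArray();
    }

    static void Main()
    {
        bool[] labels = MapLabels(new[] { "B", "M", "B" });
        Console.WriteLine(string.Join(", ", labels));
    }
}
```

Once the printed type is Boolean rather than Single or Text, the full pipeline above can be fitted with pipeline.Fit(trainData) as in the question.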