Error when multiclass classification label is of type string
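I've just started using ML.NET and find myself confused by how quickly the API is evolving and by examples written against different API versions.
My goal is to read in several numeric feature columns plus one text column that designates the label ("Brand"), but I get an error on the last line of this snippet: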
    var trainingDataView = mlContext.Data.ReadFromTextFile<PurchaseData>(
        path: trainDataPath, hasHeader: true, separatorChar: ',');

    var dataProcessPipeline = mlContext.Transforms
        .Concatenate(DefaultColumnNames.Features,
            nameof(PurchaseData.AgeBracket),
            nameof(PurchaseData.Gender),
            nameof(PurchaseData.IncomeBracket))
        .Append(mlContext.Transforms.CopyColumns("Label", nameof(PurchaseData.Brand)))
        .AppendCacheCheckpoint(mlContext);

    var trainer = mlContext.MulticlassClassification.Trainers
        .StochasticDualCoordinateAscent(featureColumn: DefaultColumnNames.Features);

    var trainingPipeline = dataProcessPipeline.Append(trainer);

    // The exception is thrown here.
    var trainedModel = trainingPipeline.Fit(trainingDataView);
'Schema mismatch for label column 'Label': expected float, double or KeyType, got Text'
Why isn't a text label expected or allowed, and how can I fix this?
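You need to convert the Label to a key type. The trainer needs numbers as input, which is why the error asks for float, double or KeyType rather than Text.
Replace:
    .Append(mlContext.Transforms.CopyColumns("Label", nameof(PurchaseData.Brand)))
with:
    .Append(mlContext.Transforms.Conversion.MapValueToKey(
        outputColumnName: DefaultColumnNames.Label,
        inputColumnName: nameof(PurchaseData.Brand)))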
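To make the idea of a key type more concrete, here is a rough sketch in plain C# (no ML.NET dependency) of what a value-to-key mapping does conceptually: each distinct string gets a small integer key. It is not ML.NET's implementation. The class name KeyMappingSketch, the helper BuildKeyMap and the brand values are all hypothetical, and the 1-based, order-of-first-appearance numbering is an assumption chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class KeyMappingSketch
{
    // Hypothetical helper: assigns each distinct label a 1-based integer key,
    // in order of first appearance (an assumption made for this illustration).
    public static Dictionary<string, uint> BuildKeyMap(IEnumerable<string> labels)
    {
        var map = new Dictionary<string, uint>();
        foreach (var label in labels)
        {
            if (!map.ContainsKey(label))
            {
                map[label] = (uint)(map.Count + 1);
            }
        }
        return map;
    }

    public static void Main()
    {
        // Made-up brand values, not taken from the original data set.
        var brands = new[] { "Acme", "Globex", "Acme", "Initech" };
        var keyMap = BuildKeyMap(brands);

        // Prints:
        // Acme -> 1
        // Globex -> 2
        // Acme -> 1
        // Initech -> 3
        foreach (var brand in brands)
        {
            Console.WriteLine($"{brand} -> {keyMap[brand]}");
        }
    }
}
```

The point is that the trainer only ever sees the integer keys. The original strings stay in the mapping, so a predicted key can be translated back to a brand name afterwards.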