Retrieve category names from the prediction column in the result table of an ML model
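I built an ML model (logistic regression) with Spark 2.4.3 and Java. It predicts the WorkType (label) of an email from the keywords in its Subject (input). I trained the model on the training data and applied it to the test data as follows: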
LogisticRegressionModel lrModel = lr.fit(training);
Dataset<Row> result = lrModel.transform(testing);
result.select("WorkType","Subject","probability","label","prediction")
.orderBy(org.apache.spark.sql.functions.col("probability").desc())
.show(100, 30);
The result I get looks like this:
+------------------------+------------------------------+------------------------------+-----+----------+
| WorkType| Subject| probability|label|prediction|
+------------------------+------------------------------+------------------------------+-----+----------+
| Cancellation|Automatic reply: Ticket #72...|[0.8562867173211978,0.02423...| 0.0| 0.0|
| Cancellation|Ticket #72827 Cancelling Po...|[0.8244896056944511,0.03953...| 0.0| 0.0|
| Cancellation|Ticket #72827 Cancelling Po...|[0.8127553003889683,0.04411...| 0.0| 0.0|
| Cancellation|Ticket #72616 Daily Cancell...|[0.8115900852592474,0.03392...| 0.0| 0.0|
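To train the model, WorkType was converted to a numeric label. Is there a way to convert the prediction column in the result back so that it shows the WorkType string instead? Please help. Thanks!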
If you encoded the labels with LabelEncoder, calling le.inverse_transform([0.0]) gives you the string back.
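For Spark ML in Java, the usual counterpart is IndexToString. This assumes WorkType was turned into label by a StringIndexer, which the question does not actually say. In that case something like new IndexToString().setInputCol("prediction").setOutputCol("predictedWorkType").setLabels(indexerModel.labels()).transform(result) should add the string column, where indexerModel is the fitted StringIndexerModel and predictedWorkType is just a column name picked for illustration. The sketch below is plain Java with no Spark dependency and only shows the lookup that step performs for each row. PredictionDecoder is a hypothetical helper, and any label other than "Cancellation" is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: it mirrors the per-row lookup that
// an index-to-string step performs on the prediction column.
final class PredictionDecoder {
    private PredictionDecoder() {}

    // labels[i] is the string that was encoded as index i during training
    // (assumption: the same array the label indexer produced when it was fitted).
    static String decode(String[] labels, double prediction) {
        int index = (int) prediction;
        // Reject NaN, fractional values, and indices outside the label array.
        if (index != prediction || index < 0 || index >= labels.length) {
            throw new IllegalArgumentException("No label for prediction " + prediction);
        }
        return labels[index];
    }

    // Decodes a whole column of predictions, keeping the row order.
    static List<String> decodeAll(String[] labels, double[] predictions) {
        List<String> decoded = new ArrayList<>(predictions.length);
        for (double prediction : predictions) {
            decoded.add(decode(labels, prediction));
        }
        return decoded;
    }
}
```

With labels = {"Cancellation", "Other"}, a prediction of 0.0 decodes to "Cancellation". The important part is that the label array comes from the same indexer that produced the label column, otherwise the indices will not line up.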