How do I get the best features with a Spark LinearSVC model?
I am trying to use ChiSqSelector to determine the best features for a Spark 2.2 LinearSVC model, like so:
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.ChiSqSelector
val chiSelector = new ChiSqSelector()
  .setNumTopFeatures(5)
  .setFeaturesCol("features")
  .setLabelCol("label")
  .setOutputCol("selectedFeatures")
val pipeline = new Pipeline().setStages(Array(labelIndexer, monthIndexer, hashingTF,
  idf, va, featureIndexer, chiSelector, lsvc, labelConverter))
val model = pipeline.fit(training)
// First attempt: ask the fitted pipeline directly
val importantFeatures = model.selectedFeatures
// Second attempt: cast a pipeline stage to LinearSVCModel
import org.apache.spark.ml.classification.LinearSVCModel
val LSVCModel = model.stages(6).asInstanceOf[org.apache.spark.ml.classification.LinearSVCModel]
val importantFeatures = LSVCModel.selectedFeatures
This gives the error:
<console>:180: error: value selectedFeatures is not a member of
org.apache.spark.ml.classification.LinearSVCModel
val importantFeatures = LSVCModel.selectedFeatures
Can ChiSqSelector be used with this model? If not, is there an alternative?
LinearSVC does not do any feature selection. You should extract the ChiSqSelectorModel from the pipeline, not the LinearSVCModel.
import org.apache.spark.ml.feature.ChiSqSelectorModel
// Index 6 is the position of chiSelector in the stages array passed to the Pipeline
val chiSqModel = model.stages(6).asInstanceOf[ChiSqSelectorModel]
val importantFeatures = chiSqModel.selectedFeatures
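To make the extraction less dependent on a hard-coded stage index, the fitted stage can also be looked up by type. The following is a minimal sketch rather than the asker's actual pipeline: the toy data, the two-stage pipeline, `setNumTopFeatures(2)`, and the names `svcModel` and `weighted` are assumptions made for illustration. Pairing the selected indices with the LinearSVC coefficients is an extra step not mentioned in the answer above, and it assumes the selector's output vector keeps the selected features in ascending index order. It can be pasted into spark-shell with `:paste`.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.{LinearSVC, LinearSVCModel}
import org.apache.spark.ml.feature.{ChiSqSelector, ChiSqSelectorModel}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Reuses the shell's session if one exists, otherwise starts a local one.
val spark = SparkSession.builder().master("local[1]").appName("chisq-sketch").getOrCreate()
import spark.implicits._

// Hypothetical toy data: binary label plus four small categorical-like features.
val training = Seq(
  (0.0, Vectors.dense(0.0, 1.0, 3.0, 0.0)),
  (0.0, Vectors.dense(0.0, 2.0, 3.0, 1.0)),
  (0.0, Vectors.dense(0.0, 1.0, 4.0, 0.0)),
  (1.0, Vectors.dense(1.0, 2.0, 3.0, 1.0)),
  (1.0, Vectors.dense(1.0, 1.0, 4.0, 0.0)),
  (1.0, Vectors.dense(1.0, 2.0, 4.0, 1.0))
).toDF("label", "features")

val chiSelector = new ChiSqSelector()
  .setNumTopFeatures(2)
  .setFeaturesCol("features")
  .setLabelCol("label")
  .setOutputCol("selectedFeatures")

// The classifier trains on the selector's output column, not on the raw features.
val lsvc = new LinearSVC()
  .setFeaturesCol("selectedFeatures")
  .setLabelCol("label")
  .setMaxIter(10)
  .setRegParam(0.1)

val model = new Pipeline().setStages(Array(chiSelector, lsvc)).fit(training)

// Look the fitted stages up by type instead of by position.
val chiSqModel = model.stages
  .collectFirst { case m: ChiSqSelectorModel => m }
  .getOrElse(sys.error("no ChiSqSelectorModel in pipeline"))
val svcModel = model.stages
  .collectFirst { case m: LinearSVCModel => m }
  .getOrElse(sys.error("no LinearSVCModel in pipeline"))

// Indices into the original "features" vector, sorted to match the
// assumed column order of "selectedFeatures".
val importantFeatures: Array[Int] = chiSqModel.selectedFeatures.sorted

// One coefficient per selected feature.
val weighted = importantFeatures.zip(svcModel.coefficients.toArray)
weighted.foreach { case (idx, w) => println(s"feature $idx -> weight $w") }
```

The type-based lookup keeps working if stages are added or reordered, whereas `model.stages(6)` silently starts pointing at a different stage. The coefficient magnitudes are only comparable across features when the features are on similar scales.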