Extracting weights from FlinkML Multiple Linear Regression
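I am running the example multiple linear regression for Flink (0.10-SNAPSHOT). I can't figure out how to extract the weights (e.g. slope and intercept, beta0-beta1, whatever you want to call them). I'm not very experienced with Scala, which is probably half my problem.
Thanks for any help anyone can give.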
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector
import org.apache.flink.ml.regression.MultipleLinearRegression

object Job {
  def main(args: Array[String]): Unit = {
    // set up the execution environment
    val env = ExecutionEnvironment.getExecutionEnvironment

    val survival = env.readCsvFile[(String, String, String, String)]("/home/danger/IdeaProjects/quickstart/docs/haberman.data")

    // first three columns are the features, the fourth is the label
    val survivalLV = survival
      .map { tuple =>
        val list = tuple.productIterator.toList
        val numList = list.map(_.asInstanceOf[String].toDouble)
        LabeledVector(numList(3), DenseVector(numList.take(3).toArray))
      }

    val mlr = MultipleLinearRegression()
      .setStepsize(1.0)
      .setIterations(100)
      .setConvergenceThreshold(0.001)

    mlr.fit(survivalLV)

    println(mlr.toString()) // This doesn't do anything productive...
    println(mlr.weightsOption) // Neither does this.
  }
}
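The problem is that you have only constructed the Flink job (DAG) that will calculate the weights, but it has not been executed yet. The easiest way to trigger execution is to call the collect method, which retrieves the result of the DataSet back to your client.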
mlr.fit(survivalLV)

// collect() triggers execution of the job and brings the result to the client
val weights = mlr.weightsOption match {
  case Some(weights) => weights.collect()
  case None => throw new Exception("Could not calculate the weights.")
}

println(weights)
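Since the question asks specifically for the slope and intercept, here is a minimal sketch of unpacking the collected result. It assumes that weightsOption holds a DataSet[WeightVector] and that WeightVector is a case class with a weights vector and an intercept, which is how I understand FlinkML of that era; please check this against your version. The object name ExtractWeights, the toy data, and the step size are hypothetical and only there to keep the example self-contained.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.{LabeledVector, WeightVector}
import org.apache.flink.ml.math.DenseVector
import org.apache.flink.ml.regression.MultipleLinearRegression

object ExtractWeights {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Hypothetical toy data (label, single feature), for illustration only.
    val training = env.fromCollection(Seq(
      LabeledVector(3.0, DenseVector(Array(1.0))),
      LabeledVector(5.0, DenseVector(Array(2.0))),
      LabeledVector(7.0, DenseVector(Array(3.0)))
    ))

    val mlr = MultipleLinearRegression()
      .setStepsize(0.1)
      .setIterations(100)

    // fit only adds the training step to the job graph; nothing runs yet.
    mlr.fit(training)

    // collect() executes the job and returns the result to the client.
    // Assumption: the DataSet element type is WeightVector.
    val collected: Seq[WeightVector] = mlr.weightsOption match {
      case Some(ws) => ws.collect()
      case None => throw new Exception("Could not calculate the weights.")
    }

    // Pattern match on the case class to separate intercept and slopes.
    collected.foreach { case WeightVector(slopes, intercept) =>
      println(s"intercept (beta0): $intercept")
      println(s"slopes (beta1, ...): $slopes")
    }
  }
}
```

The pattern match is just a convenience; if the field names are as assumed, collected.head.intercept and collected.head.weights should give the same values.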