Flink: PageRank type mismatch error

I want to compute PageRank from a CSV edge file in the following format:

12,13,1.0
12,14,1.0
12,15,1.0
12,16,1.0
12,17,1.0
...

My code:

var filename = "<filename>.csv"

val graph = Graph.fromCsvReader[Long,Double,Double]( 
                   env = env, 
                   pathEdges = filename, 
                   readVertices = false, 
                   hasEdgeValues = true, 
                   vertexValueInitializer = new MapFunction[Long, Double] { 
                           def map(id: Long): Double = 0.0 } )

val ranks = new PageRank[Long](0.85, 20).run(graph)

I get the following error in the Flink Scala Shell:

error: type mismatch;
 found   : org.apache.flink.graph.scala.Graph[Long,_23,_24] where type _24 >: Double with _22, type _23 >: Double with _21
 required: org.apache.flink.graph.Graph[Long,Double,Double]
            val ranks = new PageRank[Long](0.85, 20).run(graph)
                                                         ^

What am I doing wrong?

(Are the initial values of 0.0 for each vertex and 1.0 for each edge correct?)

The problem is that you are passing the Scala org.apache.flink.graph.scala.Graph to PageRank.run, which expects the Java org.apache.flink.graph.Graph.

To run a GraphAlgorithm on a Scala Graph object, you have to call the Scala Graph's run method and pass it the GraphAlgorithm:

graph.run(new PageRank[Long](0.85, 20))
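
To make the direction of the call clearer, here is a minimal sketch of the wrapper pattern involved. It does not use Flink at all: JGraph, SGraph, Algorithm and EdgeCount are hypothetical stand-ins invented for this illustration, on the assumption that the Scala Graph holds a Java Graph internally and forwards run to it.

```scala
object WrapperDemo {
  // Hypothetical stand-in for the Java graph type (not a Flink class).
  final class JGraph[K](val edges: Seq[(K, K)])

  // Hypothetical stand-in for an algorithm written against the Java graph type.
  trait Algorithm[K, R] {
    def run(graph: JGraph[K]): R
  }

  // Hypothetical stand-in for the Scala wrapper: it is not a JGraph itself,
  // it only holds one and hands it to the algorithm.
  final class SGraph[K](private val wrapped: JGraph[K]) {
    def run[R](algorithm: Algorithm[K, R]): R = algorithm.run(wrapped)
  }

  // Toy algorithm used only to have something to run.
  final class EdgeCount[K] extends Algorithm[K, Int] {
    def run(graph: JGraph[K]): Int = graph.edges.size
  }

  def main(args: Array[String]): Unit = {
    val graph = new SGraph[Long](new JGraph(Seq((12L, 13L), (12L, 14L), (12L, 15L))))

    // new EdgeCount[Long].run(graph)  // would not compile: SGraph is not a JGraph

    // Going through the wrapper's own run method type-checks.
    println(graph.run(new EdgeCount[Long]))
  }
}
```

In this toy model the algorithm only knows the Java-side type, so the wrapper has to be the one that initiates the call, which mirrors graph.run(new PageRank[Long](0.85, 20)) above.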

Update

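In the case of the PageRank algorithm, it is important to note that the algorithm expects an instance of type Graph[K, java.lang.Double, java.lang.Double]. Since Java's Double type is different from Scala's Double type (as far as type checking is concerned), this has to be taken into account.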
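
The distinction between the two Double types can be seen without Flink. The following is a small sketch in plain Scala; BoxedDoubleDemo and sumBoxed are hypothetical names made up for this illustration, with sumBoxed standing in for any API that is declared in terms of java.lang.Double.

```scala
object BoxedDoubleDemo {
  // Shaped like a Java-facing API: the element type is java.lang.Double,
  // not scala.Double.
  def sumBoxed(values: java.util.List[java.lang.Double]): Double = {
    var total = 0.0
    val it = values.iterator()
    while (it.hasNext) total += it.next().doubleValue()
    total
  }

  def main(args: Array[String]): Unit = {
    val boxed = new java.util.ArrayList[java.lang.Double]()
    boxed.add(1.0) // a single value is boxed by Predef's implicit conversion
    boxed.add(2.0)
    println(sumBoxed(boxed))

    // A container parameterised with scala.Double is a different type,
    // so the compiler rejects it with a type mismatch:
    // val unboxed = new java.util.ArrayList[Double]()
    // sumBoxed(unboxed)
  }
}
```

Individual values convert implicitly, but a type parameter does not, which is why the type arguments of the graph itself have to be spelled as java.lang.Double.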

For the example code, this means:

val graph = Graph.fromCsvReader[Long,java.lang.Double,java.lang.Double]( 
  env = env, 
  pathEdges = filename, 
  readVertices = false, 
  hasEdgeValues = true, 
  vertexValueInitializer = new MapFunction[Long, java.lang.Double] { 
         def map(id: Long): java.lang.Double = 0.0 } )
  .asInstanceOf[Graph[Long, java.lang.Double, java.lang.Double]]