Scala - Spark : return vertex properties from particular node
I have a graph and I want to compute the maximum degree. In particular, I want to know all the properties of the vertex with the maximum degree.
This is the code snippet:
def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
if (a._2 > b._2) a else b
}
val maxDegrees : (VertexId, Int) = graphX.degrees.reduce(max)
max: (a: (org.apache.spark.graphx.VertexId, Int), b: (org.apache.spark.graphx.VertexId, Int))(org.apache.spark.graphx.VertexId, Int)
maxDegrees: (org.apache.spark.graphx.VertexId, Int) = (2063726182,56387)
val startVertexRDD = graphX.vertices.filter{case (hash_id, (id, state)) => hash_id == maxDegrees._1}
startVertexRDD.collect()
But it returns this exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 1 times, most recent failure: Lost task 0.0 in stage 145.0 (TID 5380, localhost, executor driver): scala.MatchError: (1009147972,null) (of class scala.Tuple2)
How can I fix it?
I think this is where the problem is:
val startVertexRDD = graphX.vertices.filter{case (hash_id, (id, state)) => hash_id == maxDegrees._1}
The pattern in the filter expects every vertex to look like this:
(hash_id, (id, state))
that is, a Tuple2 whose second element is itself a Tuple2(id, state). The partial function destructures the vertex attribute, so any vertex whose attribute does not have that shape raises scala.MatchError.
Also note this:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 1 times, most recent failure: Lost task 0.0 in stage 145.0 (TID 5380, localhost, executor driver): scala.MatchError: (1009147972,null) (of class scala.Tuple2)
Specifically here:
scala.MatchError: (1009147972,null)
The attribute of vertex 1009147972 is null rather than an (id, state) pair, so the match fails as soon as the filter reaches that vertex.
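A minimal plain-Scala sketch of the same failure and a safe alternative (no Spark needed; the ids and attribute values below are made up for illustration):

```scala
// A vertex RDD conceptually holds (VertexId, attribute) pairs; here the
// attribute of one vertex is null instead of an (id, state) tuple.
val vertices: Seq[(Long, (String, String))] =
  Seq((2063726182L, ("someId", "someState")), (1009147972L, null))

// The original pattern destructures the attribute, so the null vertex
// raises scala.MatchError as soon as the predicate is evaluated on it:
//   vertices.filter { case (hashId, (id, state)) => hashId == 2063726182L }

// Matching only on the id leaves the attribute opaque, so null is harmless:
val safe = vertices.filter { case (hashId, _) => hashId == 2063726182L }
// safe == Seq((2063726182L, ("someId", "someState")))
```

The same `case (hash_id, _) => hash_id == maxDegrees._1` pattern should work directly in `graphX.vertices.filter`, so null-attributed vertices are simply filtered out instead of crashing the job.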
Hope this helps.