Create graph from specific vertices (GraphX, Spark)
I want to build a graph from the training dataset. Here is my code:
val vertices = df.rdd.flatMap(row => row.getAs[Seq[Row]](3)
.map(element => (element.getLong(0),element.getBoolean(1),element.getBoolean(2))))
val verticesTrain = vertices.filter{case(id,test,validation) => (test==false)&&(validation==false)}.map(_._1)
val edges = df.rdd.flatMap(row => row.getAs[Seq[Row]](1)
.map(element => (element.getLong(0),element.getLong(1))))
val graph = Graph.apply(verticesTrain.map(vertex => (vertex,1.0)),edges.map{case(s,d)=>Edge(s,d,1.0)})
However, when I count the graph's vertices, I seem to get all of the vertices, not just the ones from verticesTrain:
graph.vertices.count()
Out: Long
56944
verticesTrain.count()
Out: Long
44906
How can I build the graph using only verticesTrain as its vertices?
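Using subgraph works.
Use this function when you want to filter edges or vertices out of a graph.
Here is the code I used to solve this particular problem: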
val graph = Graph.apply(verticesTrain.map(vertex => (vertex,1.0)),edges.map{case(s,d)=>Edge(s,d,1.0)})
val filtered = graph.subgraph(vpred = (vid,vd)=>vd!=null.asInstanceOf[Double])
filtered.vertices.count()
Out: Long
44906
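The filter above appears to rely on two details: Graph.apply also creates a vertex for every ID that occurs only in the edges, giving it the default attribute (null.asInstanceOf[VD] when none is passed), and null.asInstanceOf[Double] evaluates to 0.0 in Scala. A training vertex whose attribute happened to be 0.0 would therefore be dropped as well. The following is a minimal sketch of a more explicit variant, assuming a sentinel value that never occurs as a real attribute; the names TrainSubgraphSketch, Missing, and build and the sample data are hypothetical and not from the original post.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object TrainSubgraphSketch {
  // Hypothetical sentinel: assumed never to be used as a real vertex attribute.
  val Missing: Double = -1.0

  def build(trainIds: RDD[VertexId],
            edgePairs: RDD[(VertexId, VertexId)]): Graph[Double, Double] = {
    val vertices = trainIds.map(id => (id, 1.0))
    val edges = edgePairs.map { case (s, d) => Edge(s, d, 1.0) }
    // Vertices that appear only in the edges receive `Missing`
    // instead of the implicit default attribute.
    val full = Graph(vertices, edges, Missing)
    // Keep only the explicitly supplied vertices; subgraph also drops
    // every edge with an endpoint that fails the vertex predicate.
    full.subgraph(vpred = (_, attr) => attr != Missing)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("train-subgraph-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Made-up sample data: vertices 4 and 5 occur only in the edges.
    val trainIds = sc.parallelize(Seq(1L, 2L, 3L))
    val edgePairs = sc.parallelize(Seq((1L, 2L), (2L, 3L), (3L, 4L), (4L, 5L)))

    val g = build(trainIds, edgePairs)
    println(g.vertices.count()) // 3 with this sample data
    println(g.edges.count())    // 2: edges touching vertex 4 or 5 are removed
    spark.stop()
  }
}
```

With the original data, build(verticesTrain, edges) would be the corresponding call. Because subgraph removes edges whose endpoints were filtered out, the result contains only edges between training vertices.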