Structural operators between graphs
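This question is a "sequel" to a previous one. I am new to Spark GraphX and Scala, and I would like to know how to perform the operation below.
How can I merge two graphs into a new graph so that the new graph has the following property:
The attributes of the edges common to both graphs are averaged (or, more generally, combined with an averaging function; edge attributes are of type Double).
We consider an edge common when it has the same srcId and the same dstId; vertices and edges are unique.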
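Assuming you have only two graphs, and both contain the same set of vertices and no duplicate edges, you can take the union of the edges and use the groupEdges method on the new graph: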
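import org.apache.spark.graphx.{Graph, PartitionStrategy}
val graph1: Graph[T, Double] = ???
val graph2: Graph[T, Double] = ???
// groupEdges only merges edges that sit in the same partition, so repartition first
Graph(graph1.vertices, graph1.edges.union(graph2.edges))
  .partitionBy(PartitionStrategy.EdgePartition2D)
  .groupEdges((val1, val2) => (val1 + val2) / 2.0)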
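One caveat with the pairwise reducer `(val1 + val2) / 2.0`: it gives the mean only when at most two edges share the same srcId and dstId, which is what the two-graph assumption guarantees. A minimal sketch in plain Scala collections (no Spark needed; the three attribute values are made up for illustration) shows what would happen with a third parallel edge:

```scala
// Three parallel edges between the same pair of vertices (hypothetical values).
val attrs = List(1.0, 2.0, 6.0)

// Folding with the pairwise average, as a reducer would:
// ((1.0 + 2.0) / 2 + 6.0) / 2 = 3.75
val pairwise = attrs.reduce((a, b) => (a + b) / 2.0)

// Carrying (sum, count) and dividing once at the end gives the mean: 9.0 / 3 = 3.0
val (sum, count) = attrs
  .map(a => (a, 1.0))
  .reduce((x, y) => (x._1 + y._1, x._2 + y._2))
val mean = sum / count
```

Accumulating a sum and a count, then dividing once, avoids this problem.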
Or, a bit more generally:
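Graph(graph1.vertices, graph1.edges.union(graph2.edges))
  // PartitionStrategy is in org.apache.spark.graphx; groupEdges needs this step
  .partitionBy(PartitionStrategy.EdgePartition2D)
  .mapEdges(e => (e.attr, 1.0))  // (sum, count)
  .groupEdges((val1, val2) => (val1._1 + val2._1, val1._2 + val2._2))
  .mapEdges(e => e.attr._1 / e.attr._2)  // mean = sum / count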
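To try the sum-and-count variant end to end, here is a small sketch that runs against a local SparkContext. The object name `MergeGraphsExample`, the vertex labels and the edge values are all made up for illustration. It also calls `partitionBy` before `groupEdges`, since `groupEdges` only merges parallel edges that are located in the same partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

object MergeGraphsExample {
  // Builds two tiny graphs over the same vertices and averages their common edges.
  def run(sc: SparkContext): Array[(Long, Long, Double)] = {
    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val graph1 = Graph(vertices, sc.parallelize(Seq(Edge(1L, 2L, 1.0), Edge(2L, 3L, 4.0))))
    val graph2 = Graph(vertices, sc.parallelize(Seq(Edge(1L, 2L, 3.0), Edge(3L, 1L, 5.0))))

    val merged = Graph(graph1.vertices, graph1.edges.union(graph2.edges))
      .partitionBy(PartitionStrategy.EdgePartition2D) // co-locate parallel edges
      .mapEdges(e => (e.attr, 1.0))                   // (sum, count)
      .groupEdges((a, b) => (a._1 + b._1, a._2 + b._2))
      .mapEdges(e => e.attr._1 / e.attr._2)           // mean = sum / count

    merged.edges
      .map(e => (e.srcId, e.dstId, e.attr))
      .collect()
      .sortBy(t => (t._1, t._2))
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("merge-graphs").setMaster("local[2]"))
    try run(sc).foreach(println)
    finally sc.stop()
  }
}
```

Edge 1 -> 2 appears in both graphs, so its attribute becomes (1.0 + 3.0) / 2 = 2.0. The edges 2 -> 3 and 3 -> 1 each appear once and keep 4.0 and 5.0.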
If that is not enough, you can combine the values yourself and create a new graph from scratch:
import org.apache.spark.graphx.{Edge, Graph}
// Key each edge by its (srcId, dstId) pair
def edgeToPair(e: Edge[Double]) = ((e.srcId, e.dstId), e.attr)
val pairs1 = graph1.edges.map(edgeToPair)
val pairs2 = graph2.edges.map(edgeToPair)
// Combine edges: accumulate (sum, count) per key, then divide
val newEdges = pairs1.union(pairs2)
  .aggregateByKey((0.0, 0.0))(
    (acc, e) => (acc._1 + e, acc._2 + 1.0),
    (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)
  ).map { case ((srcId, dstId), (acc, count)) => Edge(srcId, dstId, acc / count) }
// Combine vertices assuming there are no conflicts
// like different labels
val newVertices = graph1.vertices.union(graph2.vertices).distinct()
// Create new graph
val newGraph = Graph(newVertices, newEdges)
where aggregateByKey can be replaced by groupByKey followed by a mapping that needs all the values at once, such as the median.
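As a rough sketch of that groupByKey variant, the following computes the median attribute per (srcId, dstId) pair. The helper names `median` and `medianEdges` are hypothetical and not part of GraphX. Note that groupByKey collects all the values for a key in one place, which is fine for a handful of parallel edges but can be expensive for heavily duplicated keys.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

// Median of a non-empty collection of doubles (hypothetical helper).
def median(values: Iterable[Double]): Double = {
  val sorted = values.toVector.sorted
  val mid = sorted.length / 2
  if (sorted.length % 2 == 1) sorted(mid)
  else (sorted(mid - 1) + sorted(mid)) / 2.0
}

// Edges of both graphs, with one median attribute per (srcId, dstId) pair.
def medianEdges[T](graph1: Graph[T, Double],
                   graph2: Graph[T, Double]): RDD[Edge[Double]] =
  graph1.edges.union(graph2.edges)
    .map(e => ((e.srcId, e.dstId), e.attr))
    .groupByKey()
    .map { case ((srcId, dstId), attrs) => Edge(srcId, dstId, median(attrs)) }
```

The resulting RDD can be passed to `Graph(newVertices, ...)` in place of `newEdges`.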