Get all the nodes connected to a node in Apache Spark GraphX
Suppose we have the following input in Apache Spark GraphX:
Vertex RDD:
val vertexArray = Array(
(1L, "Alice"),
(2L, "Bob"),
(3L, "Charlie"),
(4L, "David"),
(5L, "Ed"),
(6L, "Fran")
)
Edge RDD:
val edgeArray = Array(
Edge(1L, 2L, 1),
Edge(2L, 3L, 1),
Edge(3L, 4L, 1),
Edge(5L, 6L, 1)
)
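I need the connected components of the graph, i.e. each group of nodes that are connected to one another, keyed by a node in the group: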
1,[1,2,3,4]
5,[5,6]
You can use ConnectedComponents, which returns
"a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex."
Then reshape the result:
graph.connectedComponents.vertices.map(_.swap).groupByKey
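For completeness, here is a minimal end-to-end sketch of the one-liner above applied to the question's data. The object name `ConnectedNodes`, the helpers `exampleGraph` and `componentMembers`, and the local `SparkContext` settings are assumptions made for illustration; in spark-shell you would use the existing `sc` instead of creating one.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object ConnectedNodes {

  // Builds the graph from the question (hypothetical helper).
  def exampleGraph(sc: SparkContext): Graph[String, Int] = {
    val vertices = sc.parallelize(Seq(
      (1L, "Alice"), (2L, "Bob"), (3L, "Charlie"),
      (4L, "David"), (5L, "Ed"), (6L, "Fran")
    ))
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 4L, 1), Edge(5L, 6L, 1)
    ))
    Graph(vertices, edges)
  }

  // Maps each component id (the lowest vertex id in the component)
  // to the sorted ids of all vertices in that component.
  def componentMembers(graph: Graph[String, Int]): Map[VertexId, Seq[VertexId]] =
    graph.connectedComponents().vertices // (vertexId, componentId)
      .map(_.swap)                       // (componentId, vertexId)
      .groupByKey()                      // (componentId, all vertexIds)
      .mapValues(_.toSeq.sorted)
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for running the sketch on one machine.
    val conf = new SparkConf().setAppName("ConnectedNodes").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      componentMembers(exampleGraph(sc)).toSeq.sortBy(_._1).foreach {
        case (id, members) => println(s"$id,[${members.mkString(",")}]")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Note that `collect()` pulls every component to the driver, which is fine for a small example like this one; for a large graph you would probably keep the grouped result as an RDD. Edge direction is ignored when components are computed, so 4 ends up in the same group as 1 even though the edges only point from 1 towards 4.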