Graph learning in R, igraph, tidygraph
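I have a graph in which every node has a value (shown in red).
I would like to do the following two things (I guess 1 is a special case of 2):
1. Each node should be assigned the mean of the values of the direct peers pointing to it. For example, node #5 gets (1+2)/2 = 1.5 and node #3 gets (0+2+0)/3 = 2/3.
2. Instead of only the direct neighbors, include all connected nodes, but weight each contribution by 1/n, where n is the distance to the node. The further away the information comes from, the weaker the signal.
I looked through the igraph functions but could not find one that does this (though I may have overlooked it). How can I do this computation?
Below is the code for a sample network with random values.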
library(tidyverse)
library(tidygraph)
library(ggraph)

set.seed(6)

q <- tidygraph::play_erdos_renyi(6, p = 0.2) %>%
  mutate(id = row_number(),
         value = sample(0:3, size = 6, replace = TRUE))

q %>%
  ggraph(layout = "with_fr") +
  geom_edge_link(arrow = arrow(length = unit(0.2, "inches"),
                               type = "closed")) +
  geom_node_label(aes(label = id)) +
  geom_node_text(aes(label = value), color = "red", size = 7,
                 nudge_x = 0.2, nudge_y = 0.2)
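Edit: found a solution to 1
q %>%
  # mindist = 1 excludes the node itself; mode = "in" keeps only peers pointing to it
  mutate(value_smooth = map_local_dbl(order = 1, mindist = 1, mode = "in",
    .f = function(neighborhood, ...) {
      mean(as_tibble(neighborhood, active = "nodes")$value)
    }))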
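Edit 2: a solution to 2, probably not the most elegant
q %>%
  mutate(value_smooth = map_local_dbl(
    # order = 1 would only reach direct neighbors (every d would be 1);
    # the number of nodes is an upper bound on any path length
    order = igraph::gorder(q), mindist = 0, mode = "in",
    .f = function(neighborhood, node, ...) {
      ne <- neighborhood
      # distance from every node in the neighborhood to the focal node
      ne <- ne %>%
        mutate(d = node_distance_to(which(as_tibble(ne,
                                                    active = "nodes")$id == node)))
      as_tibble(ne, active = "nodes") %>%
        filter(d != 0) %>%                # drop the focal node itself
        mutate(helper = value / d) %>%    # weight each value by 1/distance
        summarise(m = mean(helper)) %>%   # mean of the weighted values
        pull(m)
    }))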
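Edit 3: a faster alternative to map_local_dbl
map_local loops over all nodes of the graph, which takes a long time for large graphs. For computing plain means this is not necessary. A faster alternative is to use the adjacency matrix and some matrix multiplication.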
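# q_adj[i, j] is 1 when there is an edge i -> j
q_adj <- q %>%
  igraph::as_adjacency_matrix()
# out: mean value of the nodes each node points to
(q_adj %*% as_tibble(q)$value) / Matrix::rowSums(q_adj)
# in: mean value of the nodes pointing to each node (division by 0 gives NaN)
(t(q_adj) %*% as_tibble(q)$value) / Matrix::colSums(q_adj)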
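The square of the adjacency matrix gives the second-order adjacency (its entries count walks of length two), and so on for higher powers. A solution to problem 2 can therefore be built the same way.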
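As a rough sketch of how problem 2 might be computed without a per-node loop, the code below uses igraph::distances() instead of matrix powers, because it returns shortest-path lengths directly. The small graph `g` and its values are made up for illustration and are not the graph from the question. Averaging value/d over every node that can reach the target is an assumption about what problem 2 intends, mirroring the `helper = value/d` step above.

```r
library(igraph)

# Hypothetical toy graph: 1 -> 2, 2 -> 3, 1 -> 3, 3 -> 4
g <- make_graph(c(1, 2, 2, 3, 1, 3, 3, 4), directed = TRUE)
V(g)$value <- c(3, 1, 2, 0)

# d[i, j] = shortest-path length from i to j following edge direction
# (Inf when j cannot be reached from i, 0 on the diagonal)
d <- distances(g, mode = "out")

# Weight 1/d for reachable nodes, 0 for the node itself and unreachable nodes
w <- ifelse(is.finite(d) & d > 0, 1 / d, 0)

# For each target j: mean of value_i / d(i, j) over all i that can reach j
n_sources <- colSums(w > 0)
value_smooth <- as.vector(t(w) %*% V(g)$value) / n_sources
value_smooth
```

Nodes that no other node can reach end up as NaN (0/0), as in the loop-based versions. Whether to divide by the number of contributing nodes or by the sum of the weights is a design choice the question leaves open.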
Each node should be assigned the mean of the value of the direct peers
directing to it.
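I guess you mean:
Each node should be assigned the mean of the values of the direct peers pointing to it, BEFORE any node values are changed.
This seems trivial. Maybe I am missing something?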
Loop over nodes
    Sum values of adjacent nodes
    Calculate mean and store in vector by node index
Loop over nodes
    Set node value to mean stored in previous loop
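The two passes above could be sketched in R roughly as follows. This is an illustration only: the toy graph `g` and the names `old_value` and `new_value` are assumptions, and "adjacent" is read as "in-neighbors" to match the question.

```r
library(igraph)

# Hypothetical toy graph: 1 -> 2, 2 -> 3, 1 -> 3, 3 -> 4
g <- make_graph(c(1, 2, 2, 3, 1, 3, 3, 4), directed = TRUE)
V(g)$value <- c(3, 1, 2, 0)

# First pass: compute every mean from the unchanged values
old_value <- V(g)$value
new_value <- numeric(gorder(g))
for (v in seq_len(gorder(g))) {
  nb <- as_ids(neighbors(g, v, mode = "in"))  # nodes pointing to v
  new_value[v] <- mean(old_value[nb])         # NaN when v has no in-neighbors
}

# Second pass: write all results back at once
V(g)$value <- new_value
```

Keeping the means in a separate vector until every node has been processed is what guarantees that no mean is computed from an already updated value.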
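Maybe you can try the code below
q %>%
  igraph::set_vertex_attr(
    name = "value",
    value = sapply(
      # one vertex sequence per node: its in-neighbors, excluding the node itself
      igraph::ego(., mode = "in", mindist = 1),
      function(x) mean(x$value)
    )
  )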
which gives
# A tbl_graph: 6 nodes and 7 edges
#
# A directed simple graph with 1 component
#
# Node Data: 6 x 2 (active)
     id   value
  <int>   <dbl>
1     1   0.5
2     2 NaN
3     3   0.667
4     4 NaN
5     5   1.5
6     6 NaN
#
# Edge Data: 7 x 2
   from    to
  <int> <int>
1     3     1
2     6     1
3     1     3
# ... with 4 more rows