Graph learning in R, igraph, tidygraph

I have a graph in which each node has a value (shown in red).

I would like to do the following two things (I guess 1 is a special case of 2):

  1. Each node should be assigned the mean of the values of the direct peers pointing to it. For example, node #5: (1+2)/2 = 1.5, or node #3: (0+2+0)/3 = 2/3.

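  2. Instead of only the direct neighbors, include all connected nodes, but weight each one by 1/n, where n is its distance to the node. The further away the information comes from, the weaker the signal.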

I looked through the igraph functions but could not find one that does this (though I may have overlooked it). How can I compute this?

Below is the code for a sample network with random values.

library(tidyverse)
library(tidygraph)
library(ggraph)

set.seed(6)
q <- tidygraph::play_erdos_renyi(6, p = 0.2) %>% 
  mutate(id = row_number(),
         value = sample(0:3, size = 6, replace = T))
q %>% 
  ggraph(layout = "with_fr") +
  geom_edge_link(arrow = arrow(length = unit(0.2, "inches"), 
                               type = "closed")) +
  geom_node_label(aes(label = id)) +
  geom_node_text(aes(label = value), color = "red", size = 7, 
                 nudge_x = 0.2, nudge_y = 0.2)

Edit: found a solution for 1

q %>% 
  mutate(value_smooth = map_local_dbl(order = 1, mindist = 1, mode = "in", 
                                      .f = function(neighborhood, ...) {
    mean(as_tibble(neighborhood, active = 'nodes')$value)
  }))

Edit 2: a solution for 2, though I guess not the most elegant one

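q %>% 
  # order = number of nodes, so the neighborhood covers every node that can reach the focal node
  mutate(value_smooth = map_local_dbl(order = igraph::gorder(q), mindist = 0, mode = "in", 
                                      .f = function(neighborhood, node, ...) {
    ne <- neighborhood
    
    # d = distance from each node of the neighborhood to the focal node
    ne <- ne %>%
      mutate(d = node_distance_to(which(as_tibble(ne, 
                                                  active = "nodes")$id == node)))
    
    as_tibble(ne, active = 'nodes') %>% 
      filter(d != 0) %>%               # drop the focal node itself
      mutate(helper = value/d) %>%     # weight each value by 1/distance
      summarise(m = mean(helper)) %>%  # average the weighted values, not the raw ones
      pull(m)
    }))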

Edit 3: a faster alternative to map_local_dbl

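map_local loops over all nodes of the graph, which takes a long time for large graphs. For computing plain means this is not necessary. A faster alternative is to use the adjacency matrix and some matrix multiplication.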

q_adj <- q %>% 
  igraph::as_adjacency_matrix()

# out: mean value of the nodes each node points to
(q_adj %*% as_tibble(q)$value) / Matrix::rowSums(q_adj)

# in: mean value of the nodes pointing to each node
(t(q_adj) %*% as_tibble(q)$value) / Matrix::colSums(q_adj)
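To check the matrix formula by hand, here is a minimal sketch on a toy graph. The graph, the values and the names `g`, `A`, `in_mean` and `in_mean_filled` are made up for illustration and are not the network from the question. It also shows that nodes without incoming edges end up as NaN (0/0), and one possible way to deal with that, namely keeping their original value.

```r
library(igraph)

# Hypothetical toy graph (not the one from the question): 1 -> 3, 2 -> 3, 3 -> 4
g <- make_graph(edges = c(1, 3, 2, 3, 3, 4), directed = TRUE)
value <- c(1, 2, 0, 5)

# Dense adjacency matrix: A[i, j] = 1 if there is an edge i -> j
A <- as_adjacency_matrix(g, sparse = FALSE)

in_sum  <- as.vector(t(A) %*% value)  # sum of the values of each node's in-neighbors
in_deg  <- colSums(A)                 # number of in-neighbors of each node
in_mean <- in_sum / in_deg            # 0 / 0 gives NaN for nodes without in-neighbors

# One possible choice (an assumption, not part of the question):
# keep the original value where there is nothing to average
in_mean_filled <- ifelse(in_deg > 0, in_mean, value)
```

Here node 3 gets (1 + 2) / 2 and node 4 gets the value of node 3, while nodes 1 and 2 have no in-neighbors.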

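The square of the adjacency matrix is the second-order adjacency matrix (its entries count the walks of length 2 between two nodes), and so on. A solution for problem 2 can therefore be built the same way.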
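A possible matrix version of problem 2 is sketched below. Instead of taking powers of the adjacency matrix it uses the shortest-path distance matrix from `igraph::distances()`, which is a swapped-in technique, not the approach described above. The toy graph and the names `D`, `W`, `reach` and `smooth` are assumptions for illustration. The normalization (dividing by the number of contributing nodes) mirrors the mean of value/d used in Edit 2; dividing by the sum of the weights would be an equally plausible reading of the question.

```r
library(igraph)

# Hypothetical toy graph: a chain 1 -> 2 -> 3, plus 4 -> 3
g <- make_graph(edges = c(1, 2, 2, 3, 4, 3), directed = TRUE)
value <- c(4, 2, 0, 6)

# D[i, j] = length of the shortest directed path from i to j (Inf if there is none)
D <- distances(g, mode = "out")

W <- 1 / D      # weight 1/n for a node at distance n; unreachable pairs get 1/Inf = 0
diag(W) <- 0    # 1/0 is Inf on the diagonal: a node does not contribute to itself

reach  <- colSums(W > 0)                      # how many nodes can reach each node
smooth <- as.vector(t(W) %*% value) / reach   # mean of value / distance; NaN if reach is 0
```

For node 3 this gives (4/2 + 2/1 + 6/1) / 3. Note that `distances()` returns a dense n x n matrix, so for very large graphs memory becomes the limiting factor.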

Each node should be assigned the mean of the value of the direct peers directing to it.

I guess you mean

Each node should be assigned the mean of the values of the direct peers pointing to it, BEFORE any node values are changed

This seems trivial - perhaps I am missing something?

Loop over nodes
    Sum values of adjacent nodes
    Calculate mean and store in vector by node index
Loop over nodes
    Set node value to mean stored in previous loop
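The two loops above could look roughly like this in base R. The edge list, the values and the names `edges`, `value` and `new_value` are hypothetical and only serve to show why the means are stored first and written back afterwards.

```r
# Hypothetical toy graph as an edge list: 1 -> 2, 3 -> 2, 2 -> 3
edges <- data.frame(from = c(1, 3, 2), to = c(2, 2, 3))
value <- c(3, 1, 5)

# Loop 1: compute every mean from the unchanged values
new_value <- numeric(length(value))
for (v in seq_along(value)) {
  peers <- edges$from[edges$to == v]   # nodes pointing to v
  new_value[v] <- mean(value[peers])   # NaN if no node points to v
}

# Loop 2 (vectorized here): only now overwrite the node values
value <- new_value
```

Node 3 receives the old value of node 2, not its freshly computed mean. Updating in place inside the first loop would give node 3 a different result.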

Perhaps you can try the code below

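# set_vertex_attr() and ego() come from igraph, which the question's code does not attach
q %>%
    igraph::set_vertex_attr(
        name = "value",
        value = sapply(
            # in-neighbors of every node, excluding the node itself (mindist = 1)
            igraph::ego(., mode = "in", mindist = 1),
            function(x) mean(x$value)
        )
    )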

which gives

# A tbl_graph: 6 nodes and 7 edges
#
# A directed simple graph with 1 component
#
# Node Data: 6 x 2 (active)
     id   value
  <int>   <dbl>
1     1   0.5
2     2 NaN
3     3   0.667
4     4 NaN
5     5   1.5
6     6 NaN
#
# Edge Data: 7 x 2
   from    to
  <int> <int>
1     3     1
2     6     1
3     1     3
# ... with 4 more rows