Define and categorise separate networks in R

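I have a problem that I haven't been able to optimise. I'm sure igraph or tidygraph must already have this function, or there must be a better way of doing it. I'm using R and igraph for this, but tidygraph could possibly do the job as well.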

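The problem: how do I split a list of over 2 million edges (node 1 - links to - node 2) into their own separate networks, and then label each network with the category of its highest-weighted node?

The data:

Edges: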

from to
1 2
3 4
5 6
7 6
8 6

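This creates 3 networks. N.B. In the real example we have loops and multiple edges going in and out of nodes (this is why I used igraph, as it handles these easily).

The data: node categories: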

id  cat               weight
1   traffic accident  10
2   abuse             50
3   abuse             50
4   speeding          5
5   murder            100
6   abuse             50
7   speeding          5
8   abuse             50

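Final table: the final table categorises each node and labels each network with the maximum category of its nodes (the networkcat column below holds that maximum weight).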

id  idcat             networkid  networkcat
1   traffic accident  1          50
2   abuse             1          50
3   abuse             2          50
4   speeding          2          50
5   murder            3          100
6   abuse             3          100
7   speeding          3          100
8   abuse             3          100

Current iterative solution and code: if there's no better solution to this, then perhaps we can speed up the iteration?

library(tidyverse)
library(igraph)
library(purrr) # might be an answer
library(tidygraph) # might be an answer

from <- c(1,3,5,7,8)
to <- c(2,4,6,6,6)
edges <- data.frame(from,to)

id <- c(1,2,3,4,5,6,7,8)
cat <- c("traffic accident","abuse","abuse","speeding","murder","abuse","speeding","abuse")
weight <- c(10,50,50,5,100,50,5,50)

details <- data.frame(id,cat,weight) 

# We could add the vertex details here as well, with
# g <- graph_from_data_frame(edges, vertices = details), but we join these in later.
g <- graph_from_data_frame(edges)
plot(g)

dg <- decompose(g) # decomposing the network defines the separate networks

networks <- data.frame(id=as.integer(),
                   network_id=as.integer())

# This is likely too many to do at once. As the networks are already defined,
# we can split this into chunks; there is a case here for parallelisation.
for (i in seq_along(dg)) {
  # Using the decomposed list of graphs from igraph. The nodes come back as an
  # index (row names), and I can't find an easier way to get them out than
  # converting to a data frame and reading the row names.
  n <- dg[[i]][1] %>%
    as.data.frame() %>%
    row.names() %>% # this returns a vector
    as.data.frame() %>%
    rename(id = 1) %>%
    mutate(network_id = i,
           id = as.integer(id))

  networks <- bind_rows(n, networks)
}

networks <- networks %>% 
  inner_join(details) # one way to bring in details

n_weight <- networks %>%
  group_by(network_id) %>% 
  summarise(network_weight=max(weight))

networks <- networks %>% 
  inner_join(n_weight)

networks # final answer

filtered_n_id <- networks %>% 
  filter(network_weight == 100) %>% 
  select(network_id) %>% 
  distinct() # this brings out just the network IDs of whatever we happen to want

filtered_n <- networks %>% 
  filter(network_id %in% filtered_n_id$network_id)

edges %>% 
  filter(from %in% filtered_n$id | to %in% filtered_n$id ) %>% 
  graph_from_data_frame() %>% 
  plot() # returns only the network/s that we want to view

Here is a solution using only igraph and base R.

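# Component membership, reordered to follow `id`
# (the vertex order in g need not match the order of id).
networkid <- unname(components(g)$membership[match(as.character(id), V(g)$name)])
Table <- aggregate(weight, list(networkid), max) # highest weight per network
networkcat <- Table$x[networkid]
Final <- data.frame(id, idcat=cat, networkid, networkcat)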

Final
  id            idcat networkid networkcat
1  1 traffic accident         1         50
2  2            abuse         1         50
3  3            abuse         2         50
4  4         speeding         2         50
5  5           murder         3        100
6  6            abuse         3        100
7  7         speeding         3        100
8  8            abuse         3        100
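
For anyone who prefers to stay in the tidyverse, the same idea can probably be written with dplyr on top of `components()`. The following is only a sketch under a few assumptions: the example data from the question is rebuilt inline, `network_cat` is a hypothetical extra column (not in the question's final table) that names the category of the highest-weighted node, and ties are broken by taking the first such node. It also shows one way to pull out the selected networks with `induced_subgraph()` rather than re-filtering the edge list.

```r
library(igraph)
library(dplyr)

# Example data from the question, rebuilt so the sketch is self-contained.
edges <- data.frame(from = c(1, 3, 5, 7, 8), to = c(2, 4, 6, 6, 6))
details <- data.frame(
  id     = 1:8,
  cat    = c("traffic accident", "abuse", "abuse", "speeding",
             "murder", "abuse", "speeding", "abuse"),
  weight = c(10, 50, 50, 5, 100, 50, 5, 50)
)

# Passing `vertices` attaches the details and fixes the vertex order.
g <- graph_from_data_frame(edges, vertices = details)

# One call labels every vertex with its (weakly) connected component,
# so no loop over decompose(g) is needed.
membership <- components(g)$membership

networks <- details %>%
  # Look membership up by vertex name so the result follows the rows of `details`.
  mutate(network_id = as.integer(membership[match(as.character(id), V(g)$name)])) %>%
  group_by(network_id) %>%
  mutate(
    network_weight = max(weight),
    network_cat    = cat[which.max(weight)] # hypothetical column: top node's category
  ) %>%
  ungroup()

# Keep only the networks whose highest weight is 100.
keep  <- networks$id[networks$network_weight == 100]
sub_g <- induced_subgraph(g, as.character(keep))
```

`plot(sub_g)` should then draw only the selected network(s). Since `components()` runs once over the whole graph, this ought to scale far better than growing a data frame inside a loop, though I have not timed it on 2 million edges.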