Function in R which returns ancestors and children in a network

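I want to create a function "f" in R that takes as input a data.frame of edges between individuals and one individual (for example, A2), and returns another data.frame containing only the "ancestors" and "children" of A2, together with the ancestors of those ancestors and the children of those children.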

To illustrate the problem:

 library(visNetwork)
 nodes <- data.frame(id = c(paste0("A",1:5),paste0("B",1:3)),
                label = c(paste0("A",1:5),paste0("B",1:3)))
 edges <- data.frame(from = c("A1","A1","A2","A3","A4","B1","B2"),
                to = c("A2","A3","A4","A4","A5","B3","B3"))
 visNetwork(nodes, edges) %>% 
   visNodes(font = list(size=45)) %>% 
    visHierarchicalLayout(direction = "LR", levelSeparation = 500)

In this example, the data.frame contains 2 separate, independent networks: one made of the "A" nodes and one made of the "B" nodes.

I would like to implement a function f(data=edges, indiv="A2") that returns a data.frame containing every row of the edge data.frame that belongs to the "A" network:

f(edges,"A2") would return this extract of the edges data.frame:

 head(f(edges,"A2"))
 #  from to
 #1   A1 A2
 #2   A1 A3
 #3   A2 A4
 #4   A3 A4
 #5   A4 A5

I hope you can help me.

Thanks a lot!

This works for me:

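library(igraph)
g <- graph_from_literal(A1--A2, A1--A3, A2--A4, A3--A4, A4--A5, B1--B3, B2--B3)
# g is undirected, so collect every vertex reachable from A2 in either direction
sg_a2 <- subcomponent(g, 'A2', 'all')
# induced_subgraph() takes vertices (subgraph.edges() expects edge ids instead)
as_data_frame(induced_subgraph(g, sg_a2))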

It gives:

#  from to
#1   A1 A2
#2   A1 A3
#3   A2 A4
#4   A3 A4
#5   A4 A5
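
Since the question asks for a function that takes the edge data.frame itself, the same idea could be wrapped roughly as follows. This is only a sketch: the name f_igraph is made up for illustration, and it assumes the columns are called from and to as in the question.

```r
library(igraph)

edges <- data.frame(from = c("A1","A1","A2","A3","A4","B1","B2"),
                    to = c("A2","A3","A4","A4","A5","B3","B3"),
                    stringsAsFactors = FALSE)

# f_igraph is a hypothetical helper name, not part of igraph
f_igraph <- function(data, indiv) {
  g <- graph_from_data_frame(data, directed = TRUE)
  # mode = "all" ignores edge direction, so this collects the whole
  # connected component that contains indiv
  members <- as_ids(subcomponent(g, indiv, mode = "all"))
  # keep the original rows whose endpoints both lie in that component
  data[data$from %in% members & data$to %in% members, , drop = FALSE]
}

res_a <- f_igraph(edges, "A2")  # rows of the "A" network
res_b <- f_igraph(edges, "B1")  # rows of the "B" network
```

Filtering the original rows, rather than converting the subgraph back, keeps the row order and any extra columns of the input.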

You can try keeping only the nodes that are connected to A2 (i.e. those whose distance to A2 is not Inf):

library(tidygraph)
edges <- data.frame(from = c("A1","A1","A2","A3","A4","B1","B2"),
                    to = c("A2","A3","A4","A4","A5","B3","B3"))
as_tbl_graph(edges) %>% 
  filter(is.finite(node_distance_to(name=="A2", mode="all")))

This gives:

# A tbl_graph: 5 nodes and 5 edges
#
# A directed acyclic simple graph with 1 component
#
# Node Data: 5 x 1 (active)
   name
  <chr>
1    A1
2    A2
3    A3
4    A4
5    A5
#
# Edge Data: 5 x 2
   from    to
  <int> <int>
1     1     2
2     1     3
3     2     4
# ... with 2 more rows
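
If a plain data.frame of edges with node names is needed, as in the question, one possible way is to hand the filtered tbl_graph to igraph's as_data_frame(), since a tbl_graph is also an igraph object. A sketch, where the variable names sub_a2 and edges_a2 are only illustrative:

```r
library(tidygraph)

edges <- data.frame(from = c("A1","A1","A2","A3","A4","B1","B2"),
                    to = c("A2","A3","A4","A4","A5","B3","B3"))

# keep only the nodes at a finite (direction-ignoring) distance from A2
sub_a2 <- as_tbl_graph(edges) %>%
  filter(is.finite(node_distance_to(name == "A2", mode = "all")))

# a tbl_graph inherits from igraph, so igraph's converter applies;
# what = "edges" returns from/to as node names rather than row indices
edges_a2 <- igraph::as_data_frame(sub_a2, what = "edges")
```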

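I wrote a simple algorithm that finds the whole family connected to an individual (I'm sure it can be improved). As @romles suggested, you can do the same thing with R packages such as igraph. However, in this case my function appears to perform better than the igraph option.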

edges <- data.frame(from = c("A1","A1","A2","A3","A4","B1","B2"),
                    to = c("A2","A3","A4","A4","A5","B3","B3"),
                    stringsAsFactors = FALSE)
f <- function(data, indiv){
    children_ancestors <- function(indiv){
        # Find children and ancestors of an indiv
        c(data[data[,"from"]==indiv,"to"],data[data[,"to"]==indiv,"from"])
    }
    family <- indiv
    new_people <- children_ancestors(indiv) # New people to inspect
    while(length(diff_new_p <- setdiff(new_people,family)) > 0){
        # if the new people aren't yet in the family :
        family <- c(family, diff_new_p)
        new_people <- unlist(sapply(diff_new_p, children_ancestors))
        new_people <- unique(new_people)
    }
    data[(data[,1] %in% family) | (data[,2] %in% family),]
}

f(edges, "A2") gives the expected result. Comparison with the igraph function:

library(igraph)
library(microbenchmark)
edges2 <- graph_from_data_frame(edges, directed = FALSE)
microbenchmark(simple_function = f(edges,"A2"),
               igraph_option = as_data_frame(subgraph.edges(edges2, subcomponent(edges2, 'A2', 'in')))
               )
#Unit: microseconds
#            expr      min       lq     mean   median       uq      max neval
# simple_function  874.411  968.323 1206.037 1123.515 1325.075 2957.931   100
#   igraph_option 1239.896 1451.364 1802.341 1721.227 1984.380 3907.089   100