Get Edge Data from the tidygraph package

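This should be simple, but I am stuck on it. I want to extract the Edge Data block (23,502 x 3) with the node names included. In short, I need to know the weight of each pair of nodes, identified by name.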

The graph:

# A tbl_graph: 11539 nodes and 23502 edges
#
# An undirected simple graph with 2493 components
#
# Node Data: 11,539 x 3 (active)
  name            neighbors groups
  <chr>               <dbl>  <int>
1 CHANSATITPORN N         1   1540
2 POBKEEREE V             1   1540
3 SAINIS G                4    361
4 HARITOS G               4    361
5 KRIEMADIS T             4    361
6 PAPASOLOMOU I           3    361
# … with 11,533 more rows
#
# Edge Data: 23,502 x 3
   from    to weight
  <int> <int>  <dbl>
1     1     2      1
2     3     4      2
3     3     5      2
# … with 23,499 more rows

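You can extract the edge information by activating the edges and calling data.frame() on the result. Replace my example tidygraph object, named tg, with the name of your own tidygraph object and the code below should work for you.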

library(igraph)
library(tidygraph)
library(tibble)

# https://tidygraph.data-imaginist.com/reference/tbl_graph.html
rstat_nodes <- data.frame(name = c("Hadley", "David", "Romain", "Julia"))
rstat_edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                          to = c(2, 3, 4, 1, 1, 2, 1, 2, 3),
                          weight = c(1:9))
tg <- tbl_graph(nodes = rstat_nodes, edges = rstat_edges)
tg
#> # A tbl_graph: 4 nodes and 9 edges
#> #
#> # A directed simple graph with 1 component
#> #
#> # Node Data: 4 x 1 (active)
#>   name  
#>   <fct> 
#> 1 Hadley
#> 2 David 
#> 3 Romain
#> 4 Julia 
#> #
#> # Edge Data: 9 x 3
#>    from    to weight
#>   <int> <int>  <int>
#> 1     1     2      1
#> 2     1     3      2
#> 3     1     4      3
#> # ... with 6 more rows


# Get edge information ----
edge_list <-
  tg %>%
  activate(edges) %>%
  data.frame()
edge_list
#>   from to weight
#> 1    1  2      1
#> 2    1  3      2
#> 3    1  4      3
#> 4    2  1      4
#> 5    3  1      5
#> 6    3  2      6
#> 7    4  1      7
#> 8    4  2      8
#> 9    4  3      9

If you also want the names in there, here is some code that extracts the node information and joins it to the edge data.

# Separate out edges and node data frames
tg_nodes <-
  tg %>%
  activate(nodes) %>%
  data.frame() %>%
  rownames_to_column("rowid") %>%
  mutate(rowid = as.integer(rowid))
tg_edges <-
  tg %>%
  activate(edges) %>%
  data.frame()

named_edge_list <-
  tg_edges %>%
  # Rename from nodes
  left_join(tg_nodes, by = c("from" = "rowid")) %>%
  select(-from) %>%  # Remove unneeded column
  rename(from = name) %>%  # Rename column with names now
  
  # Rename to nodes
  left_join(tg_nodes, by = c("to" = "rowid")) %>%
  select(-to) %>%  # Remove unneeded column
  rename(to = name) %>%  # Rename column with names now

  # Cleaning up
  select(from, to, weight)


named_edge_list
#>     from     to weight
#> 1 Hadley  David      1
#> 2 Hadley Romain      2
#> 3 Hadley  Julia      3
#> 4  David Hadley      4
#> 5 Romain Hadley      5
#> 6 Romain  David      6
#> 7  Julia Hadley      7
#> 8  Julia  David      8
#> 9  Julia Romain      9

Created on 2020-09-21 by the reprex package (v0.3.0)

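Load the igraph package and apply get.edgelist() to the graph with its edges activated. get.edgelist() returns a matrix, so apply data.frame() afterwards to get the output in the right shape. The result is an edge list that uses the node names.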

library(igraph)
edge_list <-
  tg %>%
  activate(edges) %>%
  get.edgelist() %>%
  data.frame()

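Extracting the edges and then joining them with the nodes to get the names, as in the accepted answer, is intuitive but takes a lot of steps.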

The approach using igraph::get.edgelist (the second answer) loses the additional information stored on the edges (in the question: the weight).
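To illustrate the point about lost edge attributes, here is a minimal sketch on the toy graph from the accepted answer. It assumes a recent igraph, where as_edgelist() is the current name for get.edgelist(), and it reattaches the weights by hand with E(); the object names el and named_edges are only for illustration.

```r
library(igraph)
library(tidygraph)

# Toy graph, same data as in the accepted answer
nodes <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                    to = c(2, 3, 4, 1, 1, 2, 1, 2, 3),
                    weight = 1:9)
tg <- tbl_graph(nodes = nodes, edges = edges)

# The edge list is a two-column matrix of node names: no weight column
el <- igraph::as_edgelist(tg, names = TRUE)
dim(el)

# Edge attributes have to be put back manually; E() returns edges in the
# same order as the rows of the edge list
named_edges <- data.frame(from = el[, 1],
                          to = el[, 2],
                          weight = igraph::E(tg)$weight)
named_edges
```

This works, but every extra edge attribute needs its own column added by hand, which is what the solution below avoids.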

Here is a solution that should work.

your_tbl_graph %>% 
  activate(edges) %>% 
  mutate(to_name = .N()$name[to], 
         from_name = .N()$name[from]) %>% 
  as_tibble() %>% 
  select(from = from_name, to = to_name, weight)
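For anyone who wants to try this without their own data, the same pipeline can be run on the toy graph from the accepted answer. This is a sketch under the assumption that the node table has a column called name; .N() is the tidygraph accessor that gives access to the node data while the edges are active.

```r
library(dplyr)
library(tidygraph)

# Toy graph, same data as in the accepted answer
nodes <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                    to = c(2, 3, 4, 1, 1, 2, 1, 2, 3),
                    weight = 1:9)
tg <- tbl_graph(nodes = nodes, edges = edges)

named_edges <- tg %>%
  activate(edges) %>%
  # from/to are integer node indices, so they can index the name column
  mutate(from_name = .N()$name[from],
         to_name = .N()$name[to]) %>%
  as_tibble() %>%
  # keep the names under the usual from/to headings, plus the weight
  select(from = from_name, to = to_name, weight)

named_edges
```

Any other edge columns survive the mutate() step as well, so they can simply be added to the final select().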