Cross join of different data elements to create a bipartite graph in R

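I want to create a bipartite graph of conditions and treaters (in R) based on real-world events. I can do this easily if I can get my data into the right format, as shown in the table below: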

           Physio GP Chemist Psych Dentist
BackPain        1  1       1     0       0
Depression      0  1       1     1       0
Flu             0  1       1     0       0
Anxiety         0  1       0     1       0
Toothache       0  0       0     0       1

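To explain further: the row labels in the leftmost column are the "conditions" and the columns are the "treaters" (obviously fictitious). A cell is 1 if that treater was visited for that condition.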

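My data is in the long format below. It consists of a patient id, a term holding the condition or treater name, and a type indicating whether the term is a Condition or a Treater.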

df <-
  data.frame(
    id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
    term = c("BackPain", "Physio", "GP", "Chemist", "Depression", "GP", "Chemist", "Psych", "Flu", "GP", "Chemist", "Anxiety", "GP", "Psych", "Toothache", "Dentist"),
    type = c("Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater")
  )

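I suspect I need a clever pivot_wider-style solution, or to bypass the structure above entirely and go straight from my source data to an igraph bipartite format. I have searched everywhere and cannot find similar questions/answers where the data is in long format.

Any ideas on how to: 1) convert the long format to wide format, or 2) go directly from the long format to an igraph bipartite graph?

Much appreciated! Thanks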

To convert the data to wide format, you can use:

library(tidyr)
library(dplyr)

df %>% 
  filter(type == "Treater") %>% 
  mutate(type = 1 * (type == "Treater")) %>% 
  pivot_wider(names_from = "term", values_from = "type", values_fill = 0) %>% 
  left_join(df %>% filter(type == "Condition"), by = "id") %>% 
  select(Condition = term, Physio, GP, Chemist, Psych, Dentist)

which returns

# A tibble: 5 x 6
  Condition  Physio    GP Chemist Psych Dentist
  <chr>       <dbl> <dbl>   <dbl> <dbl>   <dbl>
1 BackPain        1     1       1     0       0
2 Depression      0     1       1     1       0
3 Flu             0     1       1     0       0
4 Anxiety         0     1       0     1       0
5 Toothache       0     0       0     0       1
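
To get from this wide table to a graph, one option is to turn it into a numeric matrix with the conditions as row names and hand that to igraph. This is a minimal sketch, assuming the igraph package is installed. The names wide, inc and g_wide are illustrative only, and the table is rebuilt by hand so the snippet runs on its own. Recent igraph releases call this function graph_from_biadjacency_matrix(), so the older name used here may emit a deprecation warning.

```r
library(igraph)

# The wide table from above, rebuilt by hand so this sketch is self-contained.
wide <- data.frame(
  Condition = c("BackPain", "Depression", "Flu", "Anxiety", "Toothache"),
  Physio  = c(1, 0, 0, 0, 0),
  GP      = c(1, 1, 1, 1, 0),
  Chemist = c(1, 1, 1, 0, 0),
  Psych   = c(0, 1, 0, 1, 0),
  Dentist = c(0, 0, 0, 0, 1)
)

# Drop the label column and use it as row names to get a numeric matrix.
inc <- as.matrix(wide[, -1])
rownames(inc) <- wide$Condition

# Rows become one vertex type (type = FALSE), columns the other (type = TRUE).
g_wide <- graph_from_incidence_matrix(inc)
plot(g_wide, layout = layout_as_bipartite)
```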

Another option is to build a long edge-list data.frame and go directly to the bipartite graph, without an incidence matrix:

library(igraph)
#create long edges data.frame
d <- do.call(rbind, tapply(df$term, df$id, 
    function(x) data.frame(Condition=x[1L], Treater=x[-1L])))

#create graph
g <- graph_from_data_frame(d, directed=FALSE)

#see https://rpubs.com/pjmurphy/317838 by Phil Murphy & Brendan Knapp
V(g)$type <- bipartite_mapping(g)$type
plot(g, layout=layout_as_bipartite)
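
The edge list above relies on the condition being the first row for each id, and lets bipartite_mapping() infer the two vertex sets. A hedged variant, assuming only base R's merge() plus igraph, pairs conditions and treaters by id regardless of row order and sets the vertex type explicitly from the type column. The names cond, treat, edges and g2 are illustrative.

```r
library(igraph)

# Same long-format data as in the question.
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
  term = c("BackPain", "Physio", "GP", "Chemist", "Depression", "GP", "Chemist", "Psych",
           "Flu", "GP", "Chemist", "Anxiety", "GP", "Psych", "Toothache", "Dentist"),
  type = c("Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Treater",
           "Condition", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater")
)

# Split into the two roles, then join them on id: one row per condition/treater pair.
cond  <- df[df$type == "Condition", c("id", "term")]
treat <- df[df$type == "Treater",   c("id", "term")]
edges <- merge(cond, treat, by = "id", suffixes = c(".condition", ".treater"))

# Build an undirected graph from the two term columns.
g2 <- graph_from_data_frame(edges[, c("term.condition", "term.treater")], directed = FALSE)

# Mark treaters as TRUE and conditions as FALSE instead of inferring the split.
V(g2)$type <- V(g2)$name %in% treat$term
plot(g2, layout = layout_as_bipartite)
```

Setting the type from the data keeps the two sides stable even when the graph has several disconnected components, as it does here (Toothache and Dentist form their own component).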
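
Another option, using data.table: join each patient's treaters to their condition, cast the result to an incidence matrix, then build the graph from it:

library(igraph)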
library(magrittr)
library(data.table)
data.table::setDT(df)
DT <- df[!type == "Condition", ][df[type == "Condition", ], on = .(id)]
DT.wide <- dcast(DT, term ~ i.term, value.var = "id", fun.aggregate = length)
graph_from_incidence_matrix(as.matrix(DT.wide, rownames = 1)) %>%
  add_layout_(as_bipartite()) %>%
  plot()

A data.table option:

library(data.table)

table(setDT(df)[, expand.grid(split(term, type)), id][, id := NULL])

gives

            Treater
Condition    Physio GP Chemist Psych Dentist
  BackPain        1  1       1     0       0
  Depression      0  1       1     1       0
  Flu             0  1       1     0       0
  Anxiety         0  1       0     1       0
  Toothache       0  0       0     0       1
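
The one-liner above packs a per-patient cross join into a single call. As a rough illustration of what happens inside it, the base R sketch below splits one patient's terms by type, crosses them with expand.grid(), and then repeats that for every id before tabulating. The names one, parts, pairs_one and pairs_all are hypothetical, and note that table() on character columns sorts the labels alphabetically, so the row and column order differs from the output above.

```r
# Same long-format data as in the question.
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
  term = c("BackPain", "Physio", "GP", "Chemist", "Depression", "GP", "Chemist", "Psych",
           "Flu", "GP", "Chemist", "Anxiety", "GP", "Psych", "Toothache", "Dentist"),
  type = c("Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Treater",
           "Condition", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater")
)

# One patient: split the terms into a list with a Condition and a Treater element.
one <- df[df$id == 1, ]
parts <- split(one$term, one$type)

# Cross join: every condition paired with every treater for this patient.
pairs_one <- expand.grid(parts, stringsAsFactors = FALSE)

# Repeat for every patient and stack the pairs.
pairs_all <- do.call(rbind, lapply(split(df, df$id), function(p) {
  expand.grid(split(p$term, p$type), stringsAsFactors = FALSE)
}))

# Count each Condition/Treater pair to get the incidence table.
inc_tab <- table(pairs_all)
inc_tab
```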

If you want a plot, you can add two more lines as below (this needs igraph loaded, plus magrittr for %>%):

table(setDT(df)[, expand.grid(split(term, type)), id][, id := NULL]) %>%
    graph_from_incidence_matrix() %>%
    plot(layout = layout_as_bipartite)

This gives a plot of the bipartite graph, with the two vertex types laid out in two rows.