Cross join of different data elements to create a Bipartite graph in R
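I want to create a bipartite graph of conditions and treaters (in R) based on actual events. I can do this easily if I can get my data into the right format, as in the table below: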
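| Condition | Physio | GP | Chemist | Psych | Dentist |
|---|---|---|---|---|---|
| BackPain | 1 | 1 | 1 | 0 | 0 |
| Depression | 0 | 1 | 1 | 1 | 0 |
| Flu | 0 | 1 | 1 | 0 | 0 |
| Anxiety | 0 | 1 | 0 | 1 | 0 |
| Toothache | 0 | 0 | 0 | 0 | 1 |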
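To explain further, the rows in the leftmost column are the "Conditions" and the columns are the "Treaters" (obviously fictional); an intersection equals 1 if the treater was visited for that particular condition.
My data is in the long format below. It consists of a patient ID, a term holding the Condition/Treater word, and a type indicating whether the term is a Condition or a Treater.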
df <-
  data.frame(
    id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
    term = c("BackPain", "Physio", "GP", "Chemist", "Depression", "GP", "Chemist", "Psych", "Flu", "GP", "Chemist", "Anxiety", "GP", "Psych", "Toothache", "Dentist"),
    type = c("Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater")
  )
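I suspect I need a clever pivot_wider-type solution, or to bypass the above structure entirely and go straight from my source data to the igraph bipartite format. I have searched everywhere and cannot find similar questions/answers where the data is in long format.
Any ideas on: 1) converting the long format to wide, or 2) going directly from the long format to an igraph bipartite graph?
Much appreciated! Thanks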
To convert the data to wide format, you can use:
library(tidyr)
library(dplyr)
df %>%
  filter(type == "Treater") %>%
  mutate(type = 1 * (type == "Treater")) %>%
  pivot_wider(names_from = "term", values_from = "type", values_fill = 0) %>%
  left_join(df %>% filter(type == "Condition"), by = "id") %>%
  select(Condition = term, Physio, GP, Chemist, Psych, Dentist)
which returns
# A tibble: 5 x 6
  Condition  Physio    GP Chemist Psych Dentist
  <chr>       <dbl> <dbl>   <dbl> <dbl>   <dbl>
1 BackPain        1     1       1     0       0
2 Depression      0     1       1     1       0
3 Flu             0     1       1     0       0
4 Anxiety         0     1       0     1       0
5 Toothache       0     0       0     0       1
Another option that goes directly from the long data.frame to a bipartite graph, without building an incidence matrix:
library(igraph)
#create long edges data.frame
d <- do.call(rbind, tapply(df$term, df$id,
    function(x) data.frame(Condition=x[1L], Treater=x[-1L])))
#create graph
g <- graph_from_data_frame(d, directed=FALSE)
#see https://rpubs.com/pjmurphy/317838 by Phil Murphy & Brendan Knapp
V(g)$type <- bipartite_mapping(g)$type
plot(g, layout=layout_as_bipartite)
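The edge list above takes the first term of each id as the Condition, so it depends on row order. A minimal sketch of a variant that reads the type column instead is shown below. It assumes each id has exactly one Condition and that no term is used as both a Condition and a Treater; the names conditions, treaters and edges are illustrative only.

```r
library(igraph)

df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
  term = c("BackPain", "Physio", "GP", "Chemist", "Depression", "GP", "Chemist", "Psych",
           "Flu", "GP", "Chemist", "Anxiety", "GP", "Psych", "Toothache", "Dentist"),
  type = c("Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Treater",
           "Condition", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater")
)

# Split the long table by role, using the type column rather than row position
conditions <- df[df$type == "Condition", ]
treaters   <- df[df$type == "Treater", ]

# Look up each treater's condition by id (assumption: one Condition per id)
edges <- data.frame(
  Condition = conditions$term[match(treaters$id, conditions$id)],
  Treater   = treaters$term
)

g <- graph_from_data_frame(edges, directed = FALSE)

# Set the vertex type from the data: TRUE for treaters, FALSE for conditions
V(g)$type <- V(g)$name %in% treaters$term

plot(g, layout = layout_as_bipartite)
```

Setting V(g)$type from the data, rather than inferring it with bipartite_mapping(), keeps all conditions on the same side of the layout even if the graph is disconnected.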
# Alternative: build the incidence matrix with data.table, then plot it
library(igraph)
library(magrittr)
library(data.table)
data.table::setDT(df)
DT <- df[!type == "Condition", ][df[type == "Condition", ], on = .(id)]
DT.wide <- dcast(DT, term ~ i.term, value.var = "id", fun.aggregate = length)
graph_from_incidence_matrix(as.matrix(DT.wide, rownames = 1)) %>%
  add_layout_(as_bipartite()) %>%
  plot()
A data.table option
table(setDT(df)[, expand.grid(split(term, type)), id][, id := NULL])
gives
            Treater
Condition    Physio GP Chemist Psych Dentist
  BackPain        1  1       1     0       0
  Depression      0  1       1     1       0
  Flu             0  1       1     0       0
  Anxiety         0  1       0     1       0
  Toothache       0  0       0     0       1
If you want the plot as well, you can add two more lines, as below
table(setDT(df)[, expand.grid(split(term, type)), id][, id := NULL]) %>%
  graph_from_incidence_matrix() %>%
  plot(layout = layout_as_bipartite)
This produces the bipartite plot (image not included).
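For readers who prefer not to depend on data.table, the same cross-tabulation can be sketched in base R with merge() and table(). This is only an illustration under the assumption that the id column links each Condition to its Treaters; the names conditions, treaters, pairs and inc are hypothetical.

```r
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5),
  term = c("BackPain", "Physio", "GP", "Chemist", "Depression", "GP", "Chemist", "Psych",
           "Flu", "GP", "Chemist", "Anxiety", "GP", "Psych", "Toothache", "Dentist"),
  type = c("Condition", "Treater", "Treater", "Treater", "Condition", "Treater", "Treater", "Treater",
           "Condition", "Treater", "Treater", "Condition", "Treater", "Treater", "Condition", "Treater")
)

conditions <- df[df$type == "Condition", c("id", "term")]
treaters   <- df[df$type == "Treater",   c("id", "term")]

# Cross join within each id: every Condition is paired with every Treater of that id
pairs <- merge(conditions, treaters, by = "id",
               suffixes = c(".condition", ".treater"))

# Explicit factor levels keep rows and columns in order of first appearance
inc <- table(
  Condition = factor(pairs$term.condition, levels = unique(conditions$term)),
  Treater   = factor(pairs$term.treater,   levels = unique(treaters$term))
)
inc
```

Unlike a lookup that expects a single Condition per id, the merge() step also handles ids with several Conditions, and the counts in inc then record how often each Condition/Treater pair occurs.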