Create graph from data frame - rows as vertices and common column values as edges?
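I'm having trouble using graph_from_data_frame correctly. I get the error "... the data frame should contain at least two columns", even though it already does.
I have a data frame; take a group of students as an example.
Each row is a student name, plus several columns of metadata, most of which are irrelevant. I want to use one specific column, "Class", which indicates which class each student is in (say 15 classes of 30 students each). I want to build a graph in which every student is a vertex, and students with the same value in the "Class" column are joined by an undirected edge.
What would that command look like?
Update to add some context: the number of nodes/edges I hope to plot is incredibly large (it is not literally classes of students), to the point that the one-to-one representation used in my example code is not feasible. So I have been looking for a more efficient way to encode the edges.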
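One possible compact encoding, as a rough sketch rather than a definitive answer: treat each class as a vertex of its own and connect every student to their class. That needs one edge per student instead of one edge per pair of classmates. The "class_" prefix and the names membership, g2 and students_only are made up for illustration. The sketch assumes student names are unique and that none of them collide with the prefixed class labels.

```r
library(igraph)

df <- data.frame(
  class = c("1","1","1","2","2","2","3","3","3"),
  name  = c("a","b","c","d","e","f","g","h","i"),
  stringsAsFactors = FALSE
)

# One row per student: student -> class node.
# The "class_" prefix is an assumption, used only to keep class labels
# from clashing with student names.
membership <- data.frame(
  from = df$name,
  to   = paste0("class_", df$class),
  stringsAsFactors = FALSE
)

g2 <- graph_from_data_frame(membership, directed = FALSE)

# Mark the two vertex kinds: TRUE for class nodes, FALSE for students
V(g2)$type <- V(g2)$name %in% membership$to

# If the student-to-student graph is needed after all, it can be
# recovered from the two-mode graph on demand
students_only <- bipartite_projection(g2, which = "false")
```

For 15 classes of 30 students this stores 450 edges rather than 15 * 435 = 6525, and two students are classmates exactly when they share a class node as a neighbour.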
library(tidyverse)
library(igraph)

df = tibble(
  class = c("1","1","1","2","2","2","3","3","3"),
  name = c("a","b","c","d","e","f","g","h","i")
)

# Vertex list: one row per student
names = df %>% select(name)

# Duplicate the name column so each class can be crossed with itself
relations = df %>%
  mutate(name2 = name)

# Start empty instead of testing for i == 1, which only works
# when the first class label happens to be "1"
edge_list = NULL
for (i in unique(df$class)) {
  from = relations %>%
    filter(class == i) %>%
    select(name)
  to = relations %>%
    filter(class == i) %>%
    select(name2)
  # Form relationships between all students in each class
  # (bind_rows() ignores the initial NULL)
  edge_list = bind_rows(edge_list, tidyr::crossing(from, to))
}

# Prevent self-loop edges and duplicate relationships
edge_list = edge_list %>% filter(name != name2)
edge_list = edge_list[!duplicated(t(apply(edge_list, 1, sort))), ]

plot(graph_from_data_frame(edge_list, directed = FALSE, vertices = names))
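For comparison, the same pairwise edge list can probably be built without the loop by joining the data frame to itself on class. This is only a sketch under the assumption that student names are unique. The names pairs and g and the "_from"/"_to" suffixes are illustrative. It uses base merge() rather than a dplyr join to keep dependencies small.

```r
library(igraph)

df <- data.frame(
  class = c("1","1","1","2","2","2","3","3","3"),
  name  = c("a","b","c","d","e","f","g","h","i"),
  stringsAsFactors = FALSE
)

# Self-join on class: every student is paired with every student
# in the same class (including themselves, in both orders)
pairs <- merge(df, df, by = "class", suffixes = c("_from", "_to"))

# Keeping only name_from < name_to drops self-loops and keeps each
# unordered pair once (this relies on names being unique)
pairs <- pairs[pairs$name_from < pairs$name_to,
               c("name_from", "name_to", "class")]

# The first two columns are the edge endpoints; class becomes an edge
# attribute. Passing vertices keeps students with no classmates.
g <- graph_from_data_frame(pairs, directed = FALSE, vertices = df["name"])
```

The edge count still grows quadratically with class size, so this tidies up the construction but does not shrink the graph itself.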