Friendship network identification in R

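I want to identify networks such that everyone in the same network is connected, directly or indirectly, through friendship nominations, while students from different networks are not connected at all.

I am using the Add Health data. Each student nominates up to 10 friends. For example, sample data might look like this: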

ID    FID_1  FID_2  FID_3  FID_4  FID_5  FID_6  FID_7  FID_8  FID_9  FID_10
1     2      6      7      9      10     NA     NA     NA     NA     NA
2     5      9      12     45     13     90     87     6      NA     NA
3     1      2      4      7      8      9      10     14     16     18
100   110    120    122    125    169    178    190    200    500    520
500   100    110    122    125    169    178    190    200    500    520
700   800    789    900    NA     NA     NA     NA     NA     NA     NA
1000  789    2000   820    900    NA     NA     NA     NA     NA     NA

There are about 85,000 individuals. Can anyone tell me how to get the network ID? I would like the data to look like this:

ID   network_ID           ID  network_ID
1     1                   700   3  
2     1                   789   3
3     1                   800   3
4     1                   820   3
5     1                   900   3
6     1                  1000   3
7     1                  2000   3
8     1
9     1
10    1
12    1
13    1
14    1
16    1
18    1
45    1
90    1
87    1
100   2
110   2
120   2
122   2
125   2
169   2
178   2
190   2
200   2
500   2
520   2

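So everyone connected to ID 1, directly or indirectly, belongs to network 1. 2 is a friend of 1, so everyone connected to 2, directly or indirectly, is also in 1's network, and so on. 700 is not connected to 1, to 1's friends, to 1's friends' friends, and so on, so 700 is in a different network, namely network 3.

Any help would be much appreciated...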

Update

library(igraph)
library(dplyr)
library(data.table)

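setDT(df) %>%
    # reshape to long format: one row per (ID, nominated friend)
    melt(id.vars = "ID", variable.name = "FID", value.name = "ID2") %>%
    na.omit() %>%                             # drop unused nomination slots
    setcolorder(c("ID", "ID2", "FID")) %>%    # first two columns define the edges
    graph_from_data_frame() %>%
    components() %>%                          # weakly connected components by default
    membership() %>%
    stack() %>%
    setNames(c("Network_ID", "ID")) %>%
    rev() %>%                                 # put ID before Network_ID
    type.convert(as.is = TRUE) %>%
    arrange(Network_ID, ID)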

which gives

     ID Network_ID
1     1          1
2     2          1
3     3          1
4     4          1
5     5          1
6     6          1
7     7          1
8     8          1
9     9          1
10   10          1
11   12          1
12   13          1
13   14          1
14   16          1
15   18          1
16   45          1
17   87          1
18   90          1
19  100          2
20  110          2
21  120          2
22  122          2
23  125          2
24  169          2
25  178          2
26  190          2
27  200          2
28  500          2
29  520          2
30  700          3
31  789          3
32  800          3
33  820          3
34  900          3
35 1000          3
36 2000          3
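With about 85,000 individuals, a natural follow-up is to check how large each network is and to look up the network of the students who actually made nominations. A minimal base-R sketch, assuming the result above has been stored in a data frame; the names `res`, `network_size`, `respondents` and `respondent_network` are hypothetical and do not appear in the answer:

```r
# Result of the pipeline above, re-entered by hand so the sketch runs on its own.
# `res` is a hypothetical name; the answer does not assign the result.
res <- data.frame(
  ID = c(1:10, 12L, 13L, 14L, 16L, 18L, 45L, 87L, 90L,
         100L, 110L, 120L, 122L, 125L, 169L, 178L, 190L, 200L, 500L, 520L,
         700L, 789L, 800L, 820L, 900L, 1000L, 2000L),
  Network_ID = rep(1:3, times = c(18, 11, 7))
)

# Number of members in each network
network_size <- table(res$Network_ID)

# Network of each student in the ID column of the sample data
respondents <- c(1L, 2L, 3L, 100L, 500L, 700L, 1000L)
respondent_network <- res$Network_ID[match(respondents, res$ID)]
```

`match()` returns `NA` for any ID that is missing from `res`, so unmatched students would show up as `NA` rather than being silently dropped.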

Data

> dput(df)
structure(list(ID = c(1L, 2L, 3L, 100L, 500L, 700L, 1000L), FID_1 = c(2L,
5L, 1L, 110L, 100L, 800L, 789L), FID_2 = c(6L, 9L, 2L, 120L,
110L, 789L, 2000L), FID_3 = c(7L, 12L, 4L, 122L, 122L, 900L,
820L), FID_4 = c(9L, 45L, 7L, 125L, 125L, NA, 900L), FID_5 = c(10L,
13L, 8L, 169L, 169L, NA, NA), FID_6 = c(NA, 90L, 9L, 178L, 178L,
NA, NA), FID_7 = c(NA, 87L, 10L, 190L, 190L, NA, NA), FID_8 = c(NA,
6L, 14L, 200L, 200L, NA, NA), FID_9 = c(NA, NA, 16L, 500L, 500L,
NA, NA), FID_10 = c(NA, NA, 18L, 520L, 520L, NA, NA)), class = "data.frame", row.names = c(NA,
-7L))
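To make explicit what "directly or indirectly connected" means here, the same grouping can be sketched in base R with a simple union-find, ignoring the direction of nominations. This is not the method used in the answer (which relies on igraph); `edges`, `ids`, `parent`, `find_root` and `result` are hypothetical names, and the loop is meant as an illustration rather than something tuned for 85,000 individuals:

```r
# Sample data from the dput() output above
df <- structure(list(ID = c(1L, 2L, 3L, 100L, 500L, 700L, 1000L), FID_1 = c(2L,
5L, 1L, 110L, 100L, 800L, 789L), FID_2 = c(6L, 9L, 2L, 120L,
110L, 789L, 2000L), FID_3 = c(7L, 12L, 4L, 122L, 122L, 900L,
820L), FID_4 = c(9L, 45L, 7L, 125L, 125L, NA, 900L), FID_5 = c(10L,
13L, 8L, 169L, 169L, NA, NA), FID_6 = c(NA, 90L, 9L, 178L, 178L,
NA, NA), FID_7 = c(NA, 87L, 10L, 190L, 190L, NA, NA), FID_8 = c(NA,
6L, 14L, 200L, 200L, NA, NA), FID_9 = c(NA, NA, 16L, 500L, 500L,
NA, NA), FID_10 = c(NA, NA, 18L, 520L, 520L, NA, NA)), class = "data.frame",
row.names = c(NA, -7L))

# Long edge list: one row per (student, nominated friend), unused slots dropped
edges <- data.frame(
  from = rep(df$ID, times = ncol(df) - 1),
  to   = unlist(df[-1], use.names = FALSE)
)
edges <- edges[!is.na(edges$to), ]

# Every ID that appears anywhere starts as the root of its own group
ids    <- sort(unique(c(edges$from, edges$to)))
parent <- seq_along(ids)

# Hypothetical helper: follow parent links until a root is reached
find_root <- function(i) {
  while (parent[i] != i) i <- parent[i]
  i
}

# Merge the groups of the two endpoints of every nomination
from_idx <- match(edges$from, ids)
to_idx   <- match(edges$to, ids)
for (k in seq_along(from_idx)) {
  a <- find_root(from_idx[k])
  b <- find_root(to_idx[k])
  if (a != b) parent[max(a, b)] <- min(a, b)  # keep the smaller index as root
}

# Number the groups in order of their smallest ID
roots  <- vapply(seq_along(ids), find_root, numeric(1))
result <- data.frame(ID = ids, Network_ID = match(roots, unique(roots)))
```

On the sample data this should agree with the igraph result: three networks, with 45 in network 1 alongside 1, 2 and 3.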

Are you looking for something like this?

library(igraph)      # needed for graph_from_data_frame() and E()
library(data.table)
library(dplyr)

setDT(df) %>%
    melt(id.var = "ID", variable.name = "FID", value.name = "ID2") %>%
    na.omit() %>%
    setcolorder(c("ID", "ID2", "FID")) %>%
    graph_from_data_frame() %>%
    plot(edge.label = E(.)$FID)


Data

structure(list(ID = 1:3, FID_1 = c(2L, 5L, 1L), FID_2 = c(6L,
9L, 2L), FID_3 = c(7L, 12L, 4L), FID_4 = c(9L, 45L, 7L), FID_5 = c(10L,
12L, 8L), FID_6 = c(NA, 90L, 9L), FID_7 = c(NA, 87L, 10L), FID_8 = c(NA,
6L, 14L), FID_9 = c(NA, NA, 16L), FID_10 = c(NA, NA, 18L)), class = "data.frame", row.names = c(NA,
-3L))