Need to set cutoffs before creating an adjacency matrix

Here is a small part of my dataset:

      Winner    Player 1    Player 2    Player 3
       Susan    Archie      Heck         Jay
       Archie   Brown       Susan        Jay
       Heck     Archie      Jay          Brown
       Jay      Brown       Archie       Susan
       Brown    Susan       Archie       Jay
       Archie   Brown       Susan        Heck
       Susan    Heck        Jay          Brown
       Jay      Heck        Susan        Brown
       Susan    Archie      Heck         Brown
       Lee      Susan       Jay          Heck
       Kyle     Heck        Jay          Susan

I converted it to an adjacency matrix with the following code:

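   d <- read.csv("res.csv")
   # Use one shared set of levels so every table has the same dimensions
   lvs <- sort(as.character(unique(unlist(d))))
   d[] <- lapply(d, factor, levels = lvs)
   # Rows: player, columns: winner of the game that player took part in
   res <- table(d[c("Player.1", "Winner")]) +
     table(d[c("Player.2", "Winner")]) +
     table(d[c("Player.3", "Winner")])
   diag(res) <- 0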

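What I need to do is set a cutoff, so that the only people included in the matrix are players who have played at least 2 games.

The output should be an adjacency matrix that contains only players who have played at least twice. The original matrix looks like this: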

          Winner    Susan   Archie  Heck    Jay     Brown   Lee     Kyle
          Susan     0       2       0       2       1       1       1
          Archie    2       0       1       1       1       0       0
          Heck      3       1       0       1       0       1       1
          Jay       2       1       1       0       1       1       1
          Brown     2       2       1       2       0       0       0
          Lee       0       0       0       0       0       0       0
          Kyle      0       0       0       0       0       0       0

But after removing the players who played only once, the resulting matrix should be:

          Winner    Susan   Archie  Heck    Jay     Brown   Lee     Kyle
          Susan     0       2       0       2       1       0       0
          Archie    2       0       1       1       1       0       0
          Heck      3       1       0       1       0       0       0
          Jay       2       1       1       0       1       0       0
          Brown     2       2       1       2       0       0       0
          Lee       0       0       0       0       0       0       0
          Kyle      0       0       0       0       0       0       0

We can do this more easily by converting to 'long' format with gather:

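library(tidyverse)
# lvs is the sorted vector of all names, as defined in the question
out <- gather(d, key, val, -Winner) %>%   # one row per (Winner, player) pair
  select(-key) %>%
  mutate(val = factor(val, levels = lvs)) %>%
  table %>%                               # rows: Winner, columns: player
  t                                       # transpose so rows are players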

Then set the columns of the players whose row sum is 0 to 0:

out[, names(which(!rowSums(out)))] <- 0
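
The line above zeroes the columns of players who never appear in a Player column, which matches the example but does not apply an explicit cutoff. As a minimal base R sketch, assuming "played at least 2 games" means appearing at least twice anywhere in the data (as winner or as player), one could count appearances and zero out the rows and columns of anyone below a threshold. The names `min_games`, `players`, `games`, `drop` and `adj` are illustrative and do not come from the original code; the data frame repeats the example data with plain character columns.

```r
# Example data from the question, rebuilt as character columns
d <- data.frame(
  Winner  = c("Susan", "Archie", "Heck", "Jay", "Brown", "Archie",
              "Susan", "Jay", "Susan", "Lee", "Kyle"),
  Player1 = c("Archie", "Brown", "Archie", "Brown", "Susan", "Brown",
              "Heck", "Heck", "Archie", "Susan", "Heck"),
  Player2 = c("Heck", "Susan", "Jay", "Archie", "Archie", "Susan",
              "Jay", "Susan", "Heck", "Jay", "Jay"),
  Player3 = c("Jay", "Jay", "Brown", "Susan", "Jay", "Heck",
              "Brown", "Brown", "Brown", "Heck", "Susan"),
  stringsAsFactors = FALSE
)

min_games <- 2  # assumed cutoff: minimum number of appearances to keep

# All names that occur anywhere, used as shared factor levels
players <- sort(unique(unlist(d)))

# Long format: one (player, winner) pair per non-winner slot
long <- data.frame(
  player = factor(unlist(d[-1], use.names = FALSE), levels = players),
  winner = factor(rep(d$Winner, times = ncol(d) - 1), levels = players)
)
adj <- table(long)  # rows: player, columns: winner

# Count how many games each person appears in, in any column
games <- table(factor(unlist(d), levels = players))
drop <- names(games)[games < min_games]

# Zero the rows and columns of everyone below the cutoff
adj[drop, ] <- 0
adj[, drop] <- 0
adj
```

With this data only Lee and Kyle fall below the cutoff, so the result agrees with `out` above; changing `min_games` applies a different threshold without touching the rest of the code.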

Data

d <- structure(list(Winner = structure(c(7L, 1L, 3L, 4L, 2L, 1L, 7L, 
4L, 7L, 6L, 5L), .Label = c("Archie", "Brown", "Heck", "Jay", 
"Kyle", "Lee", "Susan"), class = "factor"), Player1 = structure(c(1L, 
2L, 1L, 2L, 7L, 2L, 3L, 3L, 1L, 7L, 3L), .Label = c("Archie", 
"Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), class = "factor"), 
    Player2 = structure(c(3L, 7L, 4L, 1L, 1L, 7L, 4L, 7L, 3L, 
    4L, 4L), .Label = c("Archie", "Brown", "Heck", "Jay", "Kyle", 
    "Lee", "Susan"), class = "factor"), Player3 = structure(c(4L, 
    4L, 2L, 7L, 4L, 3L, 2L, 2L, 2L, 3L, 7L), .Label = c("Archie", 
    "Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), 
 class = "factor")), row.names = c(NA, 
-11L), class = "data.frame")