Need to set cutoffs before creating an adjacency matrix
Here is a small subset of my dataset:
Winner Player 1 Player 2 Player 3
Susan Archie Heck Jay
Archie Brown Susan Jay
Heck Archie Jay Brown
Jay Brown Archie Susan
Brown Susan Archie Jay
Archie Brown Susan Heck
Susan Heck Jay Brown
Jay Heck Susan Brown
Susan Archie Heck Brown
Lee Susan Jay Heck
Kyle Heck Jay Susan
I convert it to an adjacency matrix with the following code:
d = read.csv("res.csv")
lvs <- sort(as.character(unique(unlist(d))))
d[] <- lapply(d, factor, levels = lvs)
res <- table(d[c("Player.1","Winner")]) +
table(d[c("Player.2","Winner")]) +
table(d[c("Player.3","Winner")])
diag(res) <- 0
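What I need to do is apply a cutoff: the only people who should be included in the matrix are players who have played at least 2 matches.
The output should be an adjacency matrix that contains only players who have played at least twice. The original matrix looks like this: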
Winner Susan Archie Heck Jay Brown Lee Kyle
Susan 0 2 0 2 1 1 1
Archie 2 0 1 1 1 0 0
Heck 3 1 0 1 0 1 1
Jay 2 1 1 0 1 1 1
Brown 2 2 1 2 0 0 0
Lee 0 0 0 0 0 0 0
Kyle 0 0 0 0 0 0 0
After removing the players who appear in only one match, the resulting matrix looks like this:
Winner Susan Archie Heck Jay Brown Lee Kyle
Susan 0 2 0 2 1 0 0
Archie 2 0 1 1 1 0 0
Heck 3 1 0 1 0 0 0
Jay 2 1 1 0 1 0 0
Brown 2 2 0 2 0 0 0
Lee 0 0 0 0 0 0 0
Kyle 0 0 0 0 0 0 0
We can do this more easily by converting to 'long' format with gather:
library(tidyverse)
out <- gather(d, key, val, -Winner) %>%
select(-key) %>%
mutate(val = factor(val, levels = lvs)) %>%
table %>%
t
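Since tidyr 1.0.0, gather is superseded by pivot_longer, so the same reshaping can be sketched with the newer verb. This is only an illustration, not part of the original answer: the data is retyped from the question's table with the column names Player1 to Player3, and selecting val before Winner stands in for the final t call.

```r
library(dplyr)
library(tidyr)

# Data retyped from the question's table (column names are an assumption).
d <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Winner Player1 Player2 Player3
Susan  Archie  Heck    Jay
Archie Brown   Susan   Jay
Heck   Archie  Jay     Brown
Jay    Brown   Archie  Susan
Brown  Susan   Archie  Jay
Archie Brown   Susan   Heck
Susan  Heck    Jay     Brown
Jay    Heck    Susan   Brown
Susan  Archie  Heck    Brown
Lee    Susan   Jay     Heck
Kyle   Heck    Jay     Susan
")

# Every name that occurs anywhere, so the table comes out square.
lvs <- sort(unique(unlist(d, use.names = FALSE)))

out <- d %>%
  # One row per (Winner, player) pair.
  pivot_longer(-Winner, names_to = "key", values_to = "val") %>%
  # Keep val first so rows are players and columns are winners.
  transmute(val = factor(val, levels = lvs),
            Winner = factor(Winner, levels = lvs)) %>%
  table()
```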
Then set to 0 the columns of the players whose rows sum to 0:
out[, names(which(!rowSums(out)))] <- 0
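The line above zeroes the columns of players who never appear in a Player column. If the cutoff should instead be an explicit number of matches, a base R sketch might look like the following. It assumes that a "match" means any appearance in any column, winner included; min_games, games, drop, and adj are illustrative names that do not come from the original post, and the data is retyped from the question's table.

```r
# Data retyped from the question's table (column names are an assumption).
d <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Winner Player1 Player2 Player3
Susan  Archie  Heck    Jay
Archie Brown   Susan   Jay
Heck   Archie  Jay     Brown
Jay    Brown   Archie  Susan
Brown  Susan   Archie  Jay
Archie Brown   Susan   Heck
Susan  Heck    Jay     Brown
Jay    Heck    Susan   Brown
Susan  Archie  Heck    Brown
Lee    Susan   Jay     Heck
Kyle   Heck    Jay     Susan
")

lvs <- sort(unique(unlist(d, use.names = FALSE)))

# Long format in base R: one row per (player, winner) pair.
# unlist() stacks the player columns one after another, so the
# Winner column is repeated once per player column to line up.
long <- data.frame(
  player = factor(unlist(d[-1], use.names = FALSE), levels = lvs),
  winner = factor(rep(d$Winner, times = ncol(d) - 1), levels = lvs)
)
adj <- table(long)  # rows = player, columns = winner
diag(adj) <- 0

# Number of appearances per player across every column.
games <- table(factor(unlist(d, use.names = FALSE), levels = lvs))

# Hypothetical cutoff: players below it get their row and column zeroed.
min_games <- 2
drop <- names(games)[games < min_games]
adj[drop, ] <- 0
adj[, drop] <- 0
```

With min_games <- 2 and this sample, drop contains Kyle and Lee, each of whom appears exactly once. Zeroing instead of deleting keeps the matrix square with the same dimnames; to remove those players entirely, subset with adj[!rownames(adj) %in% drop, !colnames(adj) %in% drop] instead.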
Data
d <- structure(list(Winner = structure(c(7L, 1L, 3L, 4L, 2L, 1L, 7L,
4L, 7L, 6L, 5L), .Label = c("Archie", "Brown", "Heck", "Jay",
"Kyle", "Lee", "Susan"), class = "factor"), Player1 = structure(c(1L,
2L, 1L, 2L, 7L, 2L, 3L, 3L, 1L, 7L, 3L), .Label = c("Archie",
"Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), class = "factor"),
Player2 = structure(c(3L, 7L, 4L, 1L, 1L, 7L, 4L, 7L, 3L,
4L, 4L), .Label = c("Archie", "Brown", "Heck", "Jay", "Kyle",
"Lee", "Susan"), class = "factor"), Player3 = structure(c(4L,
4L, 2L, 7L, 4L, 3L, 2L, 2L, 2L, 3L, 7L), .Label = c("Archie",
"Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"),
class = "factor")), row.names = c(NA,
-11L), class = "data.frame")