How to calculate an overall score based on matrix positions?
I have a matrix with 12 columns, each listing the participants who placed in the top 5. It looks like this:
> top_5
4 5 8 9 11 12 15 16 19 20 22 23
[1,] "Nia" "Hung" "Hanaaa" "Ramziyya" "Marissa" "Jaelyn" "Shyanne" "Jaabir" "Dionicio" "Nia" "Shyanne" "Roger"
[2,] "Razeena" "Husni" "Bradly" "Marissa" "Bradly" "Muhsin" "Razeena" "Dionicio" "Magnus" "Kelsey" "Nia" "Schyler"
[3,] "Shyanne" "Schyler" "Necko" "Johannah" "Tatiana" "Glenn" "Nia" "Jaelyn" "Shyanne" "Hanaaa" "Mildred" "German"
[4,] "Schyler" "German" "Hung" "Lubaaba" "Johannah" "Magnus" "Dionicio" "German" "German" "Razeena" "Dionicio" "Jaabir"
[5,] "Husni" "Necko" "Razeena" "Afeefa" "Schyler" "Dionicio" "Jaabir" "Roger" "Johannah" "Remy" "Jaabir" "Jaelyn"
(It can be recreated with the following):
structure(c("Nia", "Razeena", "Shyanne", "Schyler", "Husni",
"Hung", "Husni", "Schyler", "German", "Necko", "Hanaaa", "Bradly",
"Necko", "Hung", "Razeena", "Ramziyya", "Marissa", "Johannah",
"Lubaaba", "Afeefa", "Marissa", "Bradly", "Tatiana", "Johannah",
"Schyler", "Jaelyn", "Muhsin", "Glenn", "Magnus", "Dionicio",
"Shyanne", "Razeena", "Nia", "Dionicio", "Jaabir", "Jaabir",
"Dionicio", "Jaelyn", "German", "Roger", "Dionicio", "Magnus",
"Shyanne", "German", "Johannah", "Nia", "Kelsey", "Hanaaa", "Razeena",
"Remy", "Shyanne", "Nia", "Mildred", "Dionicio", "Jaabir", "Roger",
"Schyler", "German", "Jaabir", "Jaelyn"), .Dim = c(5L, 12L), .Dimnames = list(
NULL, c("4", "5", "8", "9", "11", "12", "15", "16", "19",
"20", "22", "23")))
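A participant in the top row placed first in that column (so in column 1, "Nia" is first, "Razeena" is second, and so on). First place earns 5 points, second place 4 points, and so on down to 1 point for fifth place. I want to calculate the total points for every participant in the matrix.
My goal is to get the top 5 participants by overall score. How can I do this?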
One option is to split the reversed row indices by the matrix values into a list, then loop over the list with sapply and take the sum of each list element:
out <- sapply(split(row(top_5)[nrow(top_5):1, ], top_5), sum)
out
#Afeefa Bradly Dionicio German Glenn Hanaaa Hung Husni Jaabir Jaelyn Johannah Kelsey Lubaaba Magnus Marissa Mildred Muhsin
# 1 8 14 9 3 8 7 5 9 9 6 4 2 6 9 3 4
# Necko Nia Ramziyya Razeena Remy Roger Schyler Shyanne Tatiana
# 4 17 5 11 1 6 10 16 3
head(out[order(-out)], 5)
# Nia Shyanne Dionicio Razeena Schyler
# 17 16 14 11 10
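To see why the reversed row index works as the points value, here is a minimal sketch on a tiny matrix. The 3 x 2 matrix `m` and its names are made up for illustration only; `nrow(m) + 1 - row(m)` is just a variant of the reversal above that does not hard-code the number of rows.

```r
# Hypothetical 3 x 2 ranking matrix (filled column by column)
m <- matrix(c("Ann", "Bob", "Cy",
              "Bob", "Ann", "Dee"), nrow = 3)

# row(m) holds the row index of every cell, so this gives
# 3 points to row 1, 2 points to row 2 and 1 point to row 3
pts <- nrow(m) + 1 - row(m)

# Group the points by the name in the same cell and add them up
totals <- sapply(split(pts, m), sum)
totals
# Ann Bob  Cy Dee
#   5   5   1   1
```

Ann is first in column 1 (3 points) and second in column 2 (2 points), which gives the total of 5.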
Another option is rowsum:
rowsum(c(row(top_5)[nrow(top_5):1, ]), group = c(top_5))
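Note that rowsum returns a one-column matrix with the names as row names, so one more step is needed to get the top 5. A possible sketch, assuming the data from the question (the variable names `scores` and `top` are mine):

```r
top_5 <- structure(c("Nia", "Razeena", "Shyanne", "Schyler", "Husni",
  "Hung", "Husni", "Schyler", "German", "Necko", "Hanaaa", "Bradly",
  "Necko", "Hung", "Razeena", "Ramziyya", "Marissa", "Johannah",
  "Lubaaba", "Afeefa", "Marissa", "Bradly", "Tatiana", "Johannah",
  "Schyler", "Jaelyn", "Muhsin", "Glenn", "Magnus", "Dionicio",
  "Shyanne", "Razeena", "Nia", "Dionicio", "Jaabir", "Jaabir",
  "Dionicio", "Jaelyn", "German", "Roger", "Dionicio", "Magnus",
  "Shyanne", "German", "Johannah", "Nia", "Kelsey", "Hanaaa", "Razeena",
  "Remy", "Shyanne", "Nia", "Mildred", "Dionicio", "Jaabir", "Roger",
  "Schyler", "German", "Jaabir", "Jaelyn"), .Dim = c(5L, 12L),
  .Dimnames = list(NULL, c("4", "5", "8", "9", "11", "12", "15", "16",
                           "19", "20", "22", "23")))

# Points per cell: 5 for row 1 down to 1 for row 5
pts <- nrow(top_5) + 1 - row(top_5)

# One-column matrix of totals, one row per participant
scores <- rowsum(c(pts), group = c(top_5))

# Drop to a named vector, sort descending, keep the best five
top <- head(sort(scores[, 1], decreasing = TRUE), 5)
top
#      Nia  Shyanne Dionicio  Razeena  Schyler
#       17       16       14       11       10
```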
Using tidyverse functions:
library(tidyr)
library(dplyr)
top_5 %>%
as.data.frame %>%
head(.,5) %>%
mutate(rank = nrow(.):1) %>%
pivot_longer(., -c(rank), values_to = "name", names_to = "col") %>%
group_by(name) %>%
summarise_at(vars(rank), list(points = sum))
#> # A tibble: 26 x 2
#> name points
#> <fct> <int>
#> 1 Husni 5
#> 2 Nia 17
#> 3 Razeena 11
#> 4 Schyler 10
#> 5 Shyanne 16
#> 6 German 9
#> 7 Hung 7
#> 8 Necko 4
#> 9 Bradly 8
#> 10 Hanaaa 8
#> # ... with 16 more rows
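The pipeline above stops at the per-participant totals. A sketch of one way to finish it and keep only the top 5, assuming dplyr 1.0.0 or later (for `slice_max()`) and the data from the question; the name `top_scores` is mine, and `summarise()` is swapped in for `summarise_at()`:

```r
library(dplyr)
library(tidyr)

top_5 <- structure(c("Nia", "Razeena", "Shyanne", "Schyler", "Husni",
  "Hung", "Husni", "Schyler", "German", "Necko", "Hanaaa", "Bradly",
  "Necko", "Hung", "Razeena", "Ramziyya", "Marissa", "Johannah",
  "Lubaaba", "Afeefa", "Marissa", "Bradly", "Tatiana", "Johannah",
  "Schyler", "Jaelyn", "Muhsin", "Glenn", "Magnus", "Dionicio",
  "Shyanne", "Razeena", "Nia", "Dionicio", "Jaabir", "Jaabir",
  "Dionicio", "Jaelyn", "German", "Roger", "Dionicio", "Magnus",
  "Shyanne", "German", "Johannah", "Nia", "Kelsey", "Hanaaa", "Razeena",
  "Remy", "Shyanne", "Nia", "Mildred", "Dionicio", "Jaabir", "Roger",
  "Schyler", "German", "Jaabir", "Jaelyn"), .Dim = c(5L, 12L),
  .Dimnames = list(NULL, c("4", "5", "8", "9", "11", "12", "15", "16",
                           "19", "20", "22", "23")))

top_scores <- as.data.frame(top_5) %>%
  mutate(points = n():1) %>%                 # row 1 = 5 points ... row 5 = 1
  pivot_longer(-points, names_to = "col", values_to = "name") %>%
  group_by(name) %>%
  summarise(points = sum(points)) %>%        # total per participant
  slice_max(points, n = 5)                   # five highest totals, descending

top_scores
```

With this data the five rows should be Nia (17), Shyanne (16), Dionicio (14), Razeena (11) and Schyler (10), matching the base R answers. `slice_max()` keeps ties by default, so it can return more than five rows when several participants share fifth place.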
Here is a "convert to long, then summarise by group" approach similar to M--'s answer, but using data.table:
library(data.table)
df <- as.data.table(top_5)[, points := .N:1]
total_points <- melt(df, 'points')[, .(points = sum(points)), value]
setorder(total_points, -points)
head(total_points, 5)
# value points
# 1: Nia 17
# 2: Shyanne 16
# 3: Dionicio 14
# 4: Razeena 11
# 5: Schyler 10
Or an idea very similar to akrun's, using tapply instead of sapply + split:
out <- sort(tapply(c(6 - row(top_5)), c(top_5), sum), decreasing = TRUE)
head(out, 5)
# Nia Shyanne Dionicio Razeena Schyler
# 17 16 14 11 10