R: Group Over Multiple Rows and Columns

I am working with the R programming language. I have the following dataset:

set.seed(123)

Game = c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16,17,17,18,18,19,19,20,20)

id = c(3,4,3,4,3,4,3,4,3,4,3,4, 3,4,3,4,3,4,3,4)

c <- c("1", "2")    
coin <- sample(c, 20, replace=TRUE, prob=c(0.5,0.5))

  
winner <- c("win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win", "win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win", "win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win", "win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win")

my_data = data.frame(Game, id, coin, winner)
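Note that `id` and `coin` above have 20 elements while `Game` and `winner` have 40, so `data.frame()` recycles the shorter vectors to fill all 40 rows. A minimal sketch of that behaviour, using a hypothetical 4-row miniature (`toy` is not part of the original data):

```r
# Hypothetical miniature of the construction above:
# a length-2 vector is recycled to match a length-4 vector.
Game <- c(1, 1, 2, 2)
id   <- c(3, 4)                  # shorter vector, recycled twice

toy <- data.frame(Game, id)
print(toy)

# Recycling only works when the longer length is a whole multiple
# of the shorter one; otherwise data.frame() raises an error.
```

Because of the recycling, the coin values for games 11 to 20 repeat those for games 1 to 10.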

The data ("my_data") looks like this:

  Game id coin winner
1    1  3    2    win
2    1  4    1    win
3    2  3    2   lose
4    2  4    1   lose
5    3  3    1    tie
6    3  4    2    tie

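For this dataset ("my_data"), I want to do the following:

I tried to do this with the following code:

Part 1: (Manually) find the unique coin combination for each game (e.g. 1,1 OR 1,2 OR 2,1 OR 2,2)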

for (i in 1:19) {
  for (j in 2:20) {
    my_data$comb = ifelse(my_data[i,3] == "1" & my_data[j,3] == "1", "one,one",
                   ifelse(my_data[i,3] == "2" & my_data[j,3] == "1", "two, one",
                   ifelse(my_data[i,3] == "1" & my_data[j,3] == "2", "one,two", "two,two")))
  }
}

Part 2: (If Part 1 works) find the Win/Tie/Loss breakdown for each unique combination from Part 1:

library(dplyr)

my_data %>% group_by(comb) %>% summarise(percent = n() )

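The desired output should look like this (note: 1,2 = 2,1):

Currently, I am importing "my_data" into Microsoft Excel, but can someone please show me how to do this in R?

Can someone please show me how to obtain the table above?

Thanks!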

You can use the following code:

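library(tidyverse)

# Sort the coins within each game so that 1,2 and 2,1 give the same label,
# then paste the pair into one string per game
ff <- my_data %>%
  group_by(Game) %>%
  arrange(Game, coin) %>%
  do(as.data.frame(t(combn(.[["coin"]], 2)))) %>%
  mutate(coin = paste(V1, V2, sep = ",")) %>%
  select(Game, coin)

# Keep one row per game (Game, winner) and attach the coin combination
my_data <- my_data %>% select(Game, winner) %>% distinct() %>% left_join(ff, by = "Game")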

Then your desired output can be obtained with:

my_data %>%
  group_by(coin, winner) %>%
  summarise(n = n()) %>%
  mutate(p = 100 * n / sum(n, na.rm = TRUE))

# A tibble: 5 x 4
# Groups:   coin [3]
  coin  winner     n     p
  <chr> <chr>  <int> <dbl>
1 1,1   lose       4 100  
2 1,2   lose       2  14.3
3 1,2   tie        4  28.6
4 1,2   win        8  57.1
5 2,2   lose       2 100  
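For comparison, the same breakdown can be sketched in base R without `do()`. This is only an illustration under stated assumptions: `toy` is a hypothetical four-game dataset (not the asker's data), and `per_game`, `counts` and `pct` are names introduced here.

```r
# Hypothetical toy data: 4 games, two rows (players) per game.
toy <- data.frame(
  Game   = c(1, 1, 2, 2, 3, 3, 4, 4),
  coin   = c("2", "1", "1", "2", "1", "1", "2", "1"),
  winner = c("win", "win", "lose", "lose", "lose", "lose", "tie", "tie"),
  stringsAsFactors = FALSE
)

# One row per game: the sorted coins pasted together (so 2,1 becomes 1,2)
# and the result of that game (identical on both rows, so take the first).
per_game <- data.frame(
  coin = vapply(split(toy$coin, toy$Game),
                function(x) paste(sort(x), collapse = ","), character(1)),
  winner = vapply(split(toy$winner, toy$Game),
                  function(x) x[1], character(1)),
  stringsAsFactors = FALSE
)

# Cross-tabulate combination by result, then convert each row to percentages.
counts <- table(per_game$coin, per_game$winner)
pct <- 100 * prop.table(counts, margin = 1)
print(round(pct, 1))
```

Here `prop.table(..., margin = 1)` divides each count by its row total, which plays the same role as `n / sum(n)` within each `coin` group above.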

Here is my approach. I am not sure whether this is the intended way:

library(dplyr)

my_data %>% 
  group_by(Game) %>% 
  mutate(combinations = toString(coin)) %>% 
  distinct(combinations, .keep_all = TRUE) %>% 
  ungroup() %>% 
  group_by(combinations, winner) %>% 
  summarise(n = n()) %>% 
  mutate(freq = n/sum(n))

  combinations winner     n  freq
  <chr>        <chr>  <int> <dbl>
1 1, 1         lose       4 1    
2 1, 2         tie        2 0.333
3 1, 2         win        4 0.667
4 2, 1         lose       2 0.25 
5 2, 1         tie        2 0.25 
6 2, 1         win        4 0.5  
7 2, 2         lose       2 1
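The question states that 1,2 = 2,1, while the output above lists "1, 2" and "2, 1" separately. One possible way to merge them is to sort the coins within each game before pasting them together. A minimal sketch, assuming a hypothetical four-game dataset `toy` (not the asker's data) and dplyr 1.0 or later for the `.groups` argument:

```r
library(dplyr)

# Hypothetical toy data: 4 games, two rows (players) per game.
toy <- data.frame(
  Game   = c(1, 1, 2, 2, 3, 3, 4, 4),
  coin   = c("2", "1", "1", "2", "1", "1", "2", "1"),
  winner = c("win", "win", "lose", "lose", "lose", "lose", "tie", "tie"),
  stringsAsFactors = FALSE
)

result <- toy %>%
  group_by(Game) %>%
  # sort() puts the coins in a fixed order, so "2, 1" is written as "1, 2"
  summarise(combinations = toString(sort(coin)),
            winner = first(winner),
            .groups = "drop") %>%
  count(combinations, winner) %>%      # games per combination and result
  group_by(combinations) %>%
  mutate(freq = n / sum(n)) %>%        # share within each combination
  ungroup()

print(result)
```

With the sorted labels, games with coins (1,2) and (2,1) fall into the same "1, 2" group, so the frequencies are computed over both together.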