How to calculate the mean for all pairwise combinations in R

I have a data frame (DF) like this:

             x                y
 1 " Accession of China"    0.401     
 2 " Afghanistan"           0.486     
 3 " Albania"               0.581     
 4 " Algeria"               0.431     
 5 " Andean Community"      0.341     
 6 " Andorra"               0.378   

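It contains a list of countries (x) and a value associated with each country (y). I need to compute every possible pair of countries and the mean of the two values in each pair.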

Example:

                  x                       y
1 "Accession of China - Afghanistan" (0.401 + 0.486)/2
2 "Accession of China - Albania"     (0.401 + 0.581)/2

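This should be done for every possible combination, with no repeated pairs. The challenge I'm facing is finding a way to do this with the tidyverse.

Thanks a lot :)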

You can use combn():

library(dplyr) # needs dplyr >= 1.0.0; from dplyr 1.1.0, multi-row summarise() is deprecated in favour of reframe()

result <- DF %>%
           summarise(x = combn(x, 2, paste0, collapse = '-'), 
                     y = combn(y, 2, mean))

result

#                                       x      y
#1        Accession of China- Afghanistan 0.4435
#2            Accession of China- Albania 0.4910
#3            Accession of China- Algeria 0.4160
#4   Accession of China- Andean Community 0.3710
#5            Accession of China- Andorra 0.3895
#6                   Afghanistan- Albania 0.5335
#7                   Afghanistan- Algeria 0.4585
#8          Afghanistan- Andean Community 0.4135
#9                   Afghanistan- Andorra 0.4320
#10                      Albania- Algeria 0.5060
#11             Albania- Andean Community 0.4610
#12                      Albania- Andorra 0.4795
#13             Algeria- Andean Community 0.3860
#14                      Algeria- Andorra 0.4045
#15             Andean Community- Andorra 0.3595
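
On newer dplyr versions, the same idea can probably be written with reframe(), which is the replacement for summarise() calls that return more than one row. The sketch below is a variation on the answer, not part of it: the name result_reframe is hypothetical, and the trimws() call and the " - " separator are assumptions added so the labels match the format shown in the question.

```r
library(dplyr)  # reframe() assumes dplyr >= 1.1.0

# Same data as in the question
DF <- data.frame(
  x = c(" Accession of China", " Afghanistan", " Albania",
        " Algeria", " Andean Community", " Andorra"),
  y = c(0.401, 0.486, 0.581, 0.431, 0.341, 0.378)
)

result_reframe <- DF %>%
  mutate(x = trimws(x)) %>%                      # drop the leading spaces in the names
  reframe(
    x = combn(x, 2, paste, collapse = " - "),    # one label per unordered pair
    y = combn(y, 2, mean)                        # mean of the two values in each pair
  )

result_reframe
```

Because combn() enumerates pairs in the same order for both columns, each label lines up with its mean.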

This can also be done in base R:

result <- data.frame(x = combn(DF$x, 2, paste0, collapse = '-'),
                     y = combn(DF$y, 2, mean))

Data

DF <- structure(list(x = c(" Accession of China", " Afghanistan", " Albania", 
" Algeria", " Andean Community", " Andorra"), y = c(0.401, 0.486, 
0.581, 0.431, 0.341, 0.378)), class = "data.frame", row.names = c(NA, -6L))
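
If the two countries of each pair are more useful as separate columns than as one pasted label, one possible base R sketch is to let combn() generate the row-index pairs once and index both columns with them. The names idx, pairs, x1 and x2 are hypothetical, and trimming the leading spaces with trimws() is an assumption about the desired output.

```r
# Same data as above, repeated so the snippet runs on its own
DF <- data.frame(
  x = c(" Accession of China", " Afghanistan", " Albania",
        " Algeria", " Andean Community", " Andorra"),
  y = c(0.401, 0.486, 0.581, 0.431, 0.341, 0.378)
)

# 2-row matrix: each column holds the row indices of one unordered pair
idx <- combn(nrow(DF), 2)

pairs <- data.frame(
  x1 = trimws(DF$x[idx[1, ]]),                   # first country of the pair
  x2 = trimws(DF$x[idx[2, ]]),                   # second country of the pair
  y  = (DF$y[idx[1, ]] + DF$y[idx[2, ]]) / 2     # mean of the two values
)

# Sanity check: n countries should give choose(n, 2) pairs
stopifnot(nrow(pairs) == choose(nrow(DF), 2))
```

This avoids calling a function once per pair, which may matter for long country lists, since the number of pairs grows quadratically with the number of rows.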