Calculate grouped pairwise correlations between two data frames in R
I have two data frames with the same structure:
df1 <- data.frame(group1=c("A","A","A","B","B","C","C","C"),
group2 = c(1,1,2,1,1,2,2,1),
col1 = c(1,2,3,4,5,6,7,8),
col2 = c(3,5,7,4,3,7,2,7))
df2 <- data.frame(group1=c("A","A","A","B","B","C","C","C"),
group2 = c(1,1,2,1,1,2,2,1),
col1 = c(6,2,7,5,2,5,7,7),
col2 = c(7,2,5,21,6,9,4,2))
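The first two columns are identical in both data frames. I want to calculate the correlation between columns that share the same name (i.e., the correlation between col1 in df1 and col1 in df2).
Expected result: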
group1 | group2 | col1 correlation | col2 correlation |
---|---|---|---|
A | 1 | 0.1 | 0.5 |
A | 2 | 0.05 | 0.04 |
B | 1 | 0.46 | 0.2 |
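The code below should do the job. However, the real data frames have many more than two columns to correlate, and typing out every column name is painful. Is there a smarter way to do this? Thanks in advance!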
library(dplyr)
# Naming the arguments prefixes the columns: df1.group1, df1.col1, df2.col1, ...
df <- data.frame(df1 = df1, df2 = df2) %>%
  group_by(df1.group1, df1.group2) %>%
  summarise(col1_cor = cor(df1.col1, df2.col1),
            col2_cor = cor(df1.col2, df2.col2), .groups = "drop")
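You can row-bind the two data frames with an id variable to tell them apart, then compute the correlation for each col column within each group.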
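library(dplyr)
# Stack both data frames; id is "1" for rows from df1 and "2" for rows from df2
bind_rows(df1, df2, .id = 'id') %>%
  group_by(group1, group2) %>%
  # Within each group, correlate the df1 rows with the df2 rows of every col* column
  summarise(across(starts_with('col'),
                   ~cor(.x[id == 1], .x[id == 2]), .names = '{col}_cor'),
            .groups = 'drop')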
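For comparison, here is a minimal base R sketch of the same idea, in case dplyr is not available. It assumes, as the approach above does, that the rows of df1 and df2 are in the same order, so that row i of one frame pairs with row i of the other. The names keys, value_cols, idx, and safe_cor are hypothetical helpers introduced only for this illustration; safe_cor returns NA where a Pearson correlation is undefined (fewer than two pairs, or zero variance).

```r
df1 <- data.frame(group1 = c("A","A","A","B","B","C","C","C"),
                  group2 = c(1,1,2,1,1,2,2,1),
                  col1 = c(1,2,3,4,5,6,7,8),
                  col2 = c(3,5,7,4,3,7,2,7))
df2 <- data.frame(group1 = c("A","A","A","B","B","C","C","C"),
                  group2 = c(1,1,2,1,1,2,2,1),
                  col1 = c(6,2,7,5,2,5,7,7),
                  col2 = c(7,2,5,21,6,9,4,2))

# Assumption: every column that is not a grouping key should be correlated.
keys <- c("group1", "group2")
value_cols <- setdiff(names(df1), keys)

# Row indices per (group1, group2) combination; empty combinations are dropped.
idx <- split(seq_len(nrow(df1)), df1[keys], drop = TRUE)

# Hypothetical helper: NA when the correlation is undefined.
safe_cor <- function(x, y) {
  if (length(x) < 2 || sd(x) == 0 || sd(y) == 0) return(NA_real_)
  cor(x, y)
}

res <- do.call(rbind, lapply(idx, function(i) {
  # One correlation per value column, pairing df1 rows with df2 rows by position.
  cors <- vapply(value_cols,
                 function(cn) safe_cor(df1[[cn]][i], df2[[cn]][i]),
                 numeric(1))
  names(cors) <- paste0(value_cols, "_cor")
  cbind(df1[i[1], keys, drop = FALSE], as.data.frame(as.list(cors)))
}))
rownames(res) <- NULL
res
```

Note that in the sample data every group has at most two rows, so each correlation is either +1, -1, or NA; the real data would need more rows per group to produce values like those in the expected table.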