How to subset a column from each data frame when calculating a correlation coefficient (R) in R?
I have two data frames, Vobs and Vest. See the examples below:
dput(head(Vobs,20))
structure(list(ID = c("LAM_1", "LAM_2", "LAM_3", "LAM_4", "LAM_5",
"LAM_6", "LAM_7", "AUR_1", "AUR_2", "AUR_3", "AUR_4", "AUR_5",
"AUR_6"), SOS = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
26), EOS = c(3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27)), row.names = c(NA,
-13L), class = c("tbl_df", "tbl", "data.frame"))
dput(head(Vest,30))
structure(list(ID = c("LAM", "LAM", "LAM", "LAM", "LAM", "AUR",
"AUR", "AUR", "AUR", "AUR", "AUR", "P0", "P01", "P01", "P02",
"P1", "P2", "P3", "P4", "P13", "P14", "P15", "P17", "P18", "P19",
"P20", "P22", "P23", "P24"), EVI_SOS = c(2, 6, 10, 14, NA, 20,
24, 28, 32, 36, NA, 42, 42, NA, 48, 48, 52, 56, 60, 64, 68, NA,
NA, 72, NA, 78, 82, 86, 90), EVI_EOS = c(3, 7, 11, 15, NA, 21,
25, 29, 33, 37, NA, 43, 43, NA, 49, 49, 53, 57, 61, 65, 69, NA,
NA, 73, NA, 79, 83, 87, 91), NDVI_SOS = c(4, 8, 12, 16, 18, 22,
26, 30, 34, 38, 40, 44, 44, 46, 50, 50, 54, 58, 62, 66, 70, NA,
NA, 74, 76, 80, 84, 88, 92), NDVI_EOS = c(5, 9, 13, 17, 19, 23,
27, 31, 35, 39, 41, 45, 45, 47, 51, 51, 55, 59, 63, 67, 71, NA,
NA, 75, 77, 81, 85, 89, 93)), row.names = c(NA, -29L), class = c("tbl_df",
"tbl", "data.frame"))
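I want to compute the correlation coefficient (R) between the two data frames. For example, I intend to compute R between the SOS column of Vobs and the EVI_SOS column of Vest for the LAM ID (which is present in both data frames).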
In other words, I want to subset the data for the IDs of interest. In this example, I am interested in the LAM ID for Vest and in LAM_3 to LAM_7 (i.e. LAM_3, LAM_4, LAM_5, LAM_6, LAM_7) for Vobs.
I have been using this code:
cor(Vobs$SOS, Vest$EVI_SOS, use = "complete.obs")
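But it lacks the ID subsetting for the two columns from the two different data frames. How can I add the subsetting to this code?
Any help would be much appreciated.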
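In your particular case, to subset a character variable whose values end in sequential numeric suffixes, try using sprintf() to append the numbers and then subset on the result, like this: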
sprintf("LAM_%s",3:7)
[1] "LAM_3" "LAM_4" "LAM_5" "LAM_6" "LAM_7"
So:
Vobs[Vobs$ID %in% sprintf("LAM_%s",3:7),"SOS"]
# SOS
# <dbl>
# 1 6
# 2 8
# 3 10
# 4 12
# 5 14
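The subsetting step above can be sketched in base R as follows. This is only an illustration under stated assumptions: it rebuilds a toy copy of Vobs as a plain data.frame (the original is a tibble), and the names wanted, keep and sos_lam are hypothetical. It extracts the column as a plain numeric vector rather than a one-column tibble.

```r
# Toy copy of the question's Vobs (base data.frame; the original is a tibble)
Vobs <- data.frame(
  ID  = c(paste0("LAM_", 1:7), paste0("AUR_", 1:6)),
  SOS = seq(2, 26, by = 2)
)

# Build the IDs of interest; "%d" formats the integers 3:7
wanted <- sprintf("LAM_%d", 3:7)

# %in% returns a logical mask with one entry per row of Vobs
keep <- Vobs$ID %in% wanted

# Index the column itself so the result is a plain numeric vector
sos_lam <- Vobs$SOS[keep]
print(sos_lam)
```

Indexing Vobs$SOS directly gives the same kind of result for a tibble and a data.frame, whereas Vobs[rows, "SOS"] keeps a tibble as a one-column tibble.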
Because the Vest dataset uses the single ID LAM for these observations, its subset is simpler. Try:
cor(Vobs[Vobs$ID %in% sprintf("LAM_%s",3:7),"SOS"],
Vest[Vest$ID %in% "LAM","EVI_SOS"], use = "complete.obs")
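As a self-contained check of the call above, here is a minimal sketch that works on plain vectors. It assumes toy copies of the relevant rows (base data.frames, not the original tibbles) and that rows are paired by position after subsetting, which neither data frame guarantees by itself; obs, est and r are hypothetical names.

```r
# Toy copies of the relevant columns from the question's data
Vobs <- data.frame(
  ID  = c(paste0("LAM_", 1:7), paste0("AUR_", 1:6)),
  SOS = seq(2, 26, by = 2)
)
Vest <- data.frame(
  ID      = c(rep("LAM", 5), rep("AUR", 6)),
  EVI_SOS = c(2, 6, 10, 14, NA, 20, 24, 28, 32, 36, NA)
)

# Subset each column to the IDs of interest, as plain numeric vectors
obs <- Vobs$SOS[Vobs$ID %in% sprintf("LAM_%d", 3:7)]
est <- Vest$EVI_SOS[Vest$ID == "LAM"]

# cor() pairs values by position, so both subsets must have the same length
stopifnot(length(obs) == length(est))

# "complete.obs" drops any pair in which either value is NA
r <- cor(obs, est, use = "complete.obs")
print(r)
```

If the two subsets had different lengths, cor() would stop with an error, so the explicit length check is only there to make the pairing assumption visible.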