How to compare multiple specific columns in R
I don't have much experience with R, so I'd appreciate your help. I have a dataset like this:
library(tibble)

df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10),
  Adam = rnorm(10),
  Aaron = rnorm(10),
  Abby = rnorm(10),
  Brett = rnorm(10),
  Bobby = rnorm(10),
  Blaine = rnorm(10),
  Cate = rnorm(10),
  Camila = rnorm(10),
  Calvin = rnorm(10),
  Dana = rnorm(10),
  Debbie = rnorm(10),
  Derek = rnorm(10)
)
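I am trying to calculate the cosine similarity between column a and the columns whose names start with A (Adam, Aaron, Abby), and similarly between column b and the columns whose names start with B (Brett, Bobby, Blaine), and so on. I tried using map from the purrr package, but I couldn't quite figure it out.
Thanks in advance.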
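We can split the dataset into a list of datasets based on the first character of the column names, then loop over the list with map and apply cosine.similarity (from the tcR package) between the first column of each list element ('a', 'b', 'c', 'd') and all of the other columns in that element.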
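library(tcR)
library(dplyr)
library(purrr)

df %>%
  # split the columns into groups by the upper-cased first letter of their names
  split.default(toupper(substr(names(.), 1, 1))) %>%
  map_dfc(~ {
    nm1 <- names(.x)[1]  # first column of the group: 'a', 'b', 'c' or 'd'
    .x %>%
      summarise_at(-1, ~ cosine.similarity(!!rlang::sym(nm1), .))
  })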
# A tibble: 1 x 12
# Adam Aaron Abby Brett Bobby Blaine Cate Camila Calvin Dana Debbie Derek
#* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 -0.0444 0.110 0.356 -0.00975 -0.0277 -0.0297 0.270 -0.222 -0.364 0.172 -0.0108 -0.0498
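If tcR is not available, the same split-then-compare idea can be sketched in base R. This is only an illustration: cosine_sim is a hypothetical helper written for this example (it is not a tcR function), the data frame is a cut-down version of the one in the question, and set.seed is added only to make the run reproducible. tcR's cosine.similarity may preprocess its inputs, so the numbers are not guaranteed to match its output.

```r
# Hypothetical helper (not part of tcR): plain cosine similarity of two numeric vectors.
cosine_sim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

set.seed(42)  # assumption: fixed seed, only so the example is reproducible
df <- data.frame(
  a = rnorm(10), b = rnorm(10),
  Adam = rnorm(10), Aaron = rnorm(10),
  Brett = rnorm(10), Bobby = rnorm(10)
)

# Group the columns by the upper-cased first letter of their names.
groups <- split.default(df, toupper(substr(names(df), 1, 1)))

# Within each group, compare the single-letter column with every other column.
res <- lapply(groups, function(g) {
  ref <- names(g)[nchar(names(g)) == 1]  # reference column, e.g. "a"
  others <- setdiff(names(g), ref)       # e.g. "Adam", "Aaron"
  vapply(g[others], cosine_sim, numeric(1), y = g[[ref]])
})

# Flatten to one named vector: one similarity per name column.
result <- unlist(unname(res))
print(result)
```

Picking the reference column by name length rather than by position means the sketch does not depend on the single-letter column coming first in each group, which the pipeline above does rely on.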