Using recode() with variable names generated through paste()

I am trying to recode variables that contain answer labels as character strings into numeric values. For this I am using dplyr's recode().

To automate this, I want to generate the variable names with paste(), but recode() apparently cannot take the output of paste().

I have tried noquote() and as.name(), but for both of them R tells me that recode cannot be used on objects of class "noquote"/"name".

Example:

item1 <- c("Don't agree at all", "Totally agree")
item2 <- c("Indifferent", "Totally agree")

for (i in 1:2) {
recode(paste("item", i, sep=""), "Totally agree"=1, "Indifferent"=2, "Don't agree at all"=3)
}

I would expect

> item1
  [1] 3 1

How can I solve this?

Update

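I found a workaround: I first extract the relevant columns into a separate data frame and then apply recode() to them with sapply(). After that I can merge the data frames again.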

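Regarding "but recode() apparently cannot take the output of paste()": this has nothing to do with recode; (almost) any R function works this way. paste returns a character string that holds the variable's name, whereas recode expects the vector itself as its first argument. (A notable exception, among others, is library, to which we can pass the package name either as a string or as an unquoted name.)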
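To make the distinction concrete, here is a minimal base R sketch. It uses get() to go from a name to the object it refers to; get() is brought in here only for illustration and is not the approach used in the rest of this answer.

```r
item1 <- c("Don't agree at all", "Totally agree")

# paste0() only builds the *name* of the variable, as a string
nm <- paste0("item", 1)
is.character(nm) && length(nm) == 1

# get() looks up the object that the string names
vec <- get(nm)
identical(vec, item1)
```

Passing nm to recode() would therefore try to recode the string "item1" itself, not the contents of the vector item1.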

If you insist on the "for loop" approach, what you can do is combine assign with eval(sym("a string")):

item1 <- c("Don't agree at all", "Totally agree")
item2 <- c("Indifferent", "Totally agree")


library(dplyr)
for (i in 1:2) {
  assign(paste("item", i, sep="") , recode(eval(sym(paste("item", i, sep=""))), "Totally agree"=1, "Indifferent"=2, "Don't agree at all"=3))
}

This results in:

item1

[1] 3 1

item2

[1] 2 1

Edit:

A possibly more direct and more "dplyr"-y approach would be something like:

tdd <- data.frame(item1, item2) %>% 
  mutate_at(vars(starts_with("item")), ~recode(., "Totally agree"=1, "Indifferent"=2, "Don't agree at all"=3))

tdd is now:

tdd
  item1 item2
1     3     2
2     1     1
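In recent dplyr releases the scoped verbs such as mutate_at() are marked as superseded in favour of across(). The following is a sketch of the same recoding written that way, assuming dplyr 1.0.0 or later; the name tdd2 is only used here to keep it apart from tdd above.

```r
library(dplyr)

item1 <- c("Don't agree at all", "Totally agree")
item2 <- c("Indifferent", "Totally agree")

# Recode every column whose name starts with "item"
tdd2 <- data.frame(item1, item2, stringsAsFactors = FALSE) %>%
  mutate(across(
    starts_with("item"),
    ~ recode(.x, "Totally agree" = 1, "Indifferent" = 2, "Don't agree at all" = 3)
  ))

tdd2
```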

You could simply put the named vector v together with a list of the variables obtained from mget into Map and subset it.

v <- c("Totally agree"=1, "Indifferent"=2, "Don't agree at all"=3)

Map(function(x, y) unname(y[x]), mget(ls(pattern="^item")), list(v))
# $item1
# [1] 3 1
# 
# $item2
# [1] 2 1
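The Map call above relies on plain name-based indexing of a named vector. A small sketch of that single step, with x as a stand-in for one of the item vectors:

```r
v <- c("Totally agree" = 1, "Indifferent" = 2, "Don't agree at all" = 3)
x <- c("Don't agree at all", "Totally agree")

# Indexing by a character vector looks each label up by name
looked_up <- v[x]
names(looked_up)

# unname() drops the labels and leaves only the codes
codes <- unname(looked_up)
codes
# [1] 3 1
```

A label that does not occur in names(v) yields NA, so misspelled answer labels show up as missing values rather than raising an error.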

Or, assuming you have a data frame like this,

head(dat1)
#   id         item1              item2         x
# 1  1 Totally agree      Totally agree 0.0356312
# 2  2 Totally agree      Totally agree 1.3149588
# 3  3 Totally agree        Indifferent 0.9781675
# 4  4 Totally agree        Indifferent 0.8817912
# 5  5   Indifferent        Indifferent 0.4822047
# 6  6   Indifferent Don't agree at all 0.9657529

then you could do this in a similar way. We can even simplify the code, since we no longer need Map to return unnamed objects.

v1 <- c("Totally agree"=1, "Indifferent"=2, "Don't agree at all"=3)

item_nm <- c("item1", "item2")
dat1[item_nm] <- Map(`[`, list(v1), dat1[item_nm])
dat1
#    id item1 item2          x
# 1   1     1     1  0.0356312
# 2   2     1     1  1.3149588
# 3   3     1     2  0.9781675
# 4   4     1     2  0.8817912
# 5   5     2     2  0.4822047
# 6   6     2     3  0.9657529
# 7   7     2     3 -0.8145709
# 8   8     1     1  0.2839578
# 9   9     3     1 -0.1616986
# 10 10     3     3  1.9355718

The second argument of Map (here list(v1)) is recycled across the iterations, i.e. list(v1, v1) would work as well.
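A quick way to convince yourself of the recycling is to compare both spellings on a toy data frame. This is only a sketch; d is a hypothetical two-column example, not the dat1 from above.

```r
v1 <- c("Totally agree" = 1, "Indifferent" = 2, "Don't agree at all" = 3)

# Hypothetical toy data with two item columns
d <- data.frame(
  item1 = c("Totally agree", "Indifferent"),
  item2 = c("Don't agree at all", "Totally agree"),
  stringsAsFactors = FALSE
)

# One lookup vector, recycled over both columns
recycled <- Map(`[`, list(v1), d)

# The same lookup vector spelled out once per column
explicit <- Map(`[`, list(v1, v1), d)

identical(recycled, explicit)
```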

More generally, add one more vector to the list in the second argument of Map for each column you want to recode:

head(dat2)
#   id         item1  item2         x
# 1  1 Totally agree Always 0.0356312
# 2  2 Totally agree Always 1.3149588
# 3  3 Totally agree   Both 0.9781675
# 4  4 Totally agree   Both 0.8817912
# 5  5   Indifferent   Both 0.4822047
# 6  6   Indifferent  Never 0.9657529

v2 <- c("Always"=1, "Both"=2, "Never"=3)

dat2[item_nm] <- Map(`[`, list(v1, v2), dat2[item_nm])
dat2
#    id item1 item2          x
# 1   1     1     1  0.0356312
# 2   2     1     1  1.3149588
# 3   3     1     2  0.9781675
# 4   4     1     2  0.8817912
# 5   5     2     2  0.4822047
# 6   6     2     3  0.9657529
# 7   7     2     3 -0.8145709
# 8   8     1     1  0.2839578
# 9   9     3     1 -0.1616986
# 10 10     3     3  1.9355718

Data:

dat1 <- structure(list(id = 1:10, item1 = c("Totally agree", "Totally agree", 
"Totally agree", "Totally agree", "Indifferent", "Indifferent", 
"Indifferent", "Totally agree", "Don't agree at all", "Don't agree at all"
), item2 = c("Totally agree", "Totally agree", "Indifferent", 
"Indifferent", "Indifferent", "Don't agree at all", "Don't agree at all", 
"Totally agree", "Totally agree", "Don't agree at all"), x = c(0.0356311982051355, 
1.31495884897891, 0.978167526364279, 0.881791226863203, 0.482204688262918, 
0.965752878105794, -0.814570938270238, 0.283957806364306, -0.161698647607024, 
1.93557176599585)), class = "data.frame", row.names = c(NA, -10L
))

dat2 <- structure(list(id = 1:10, item1 = c("Totally agree", "Totally agree", 
"Totally agree", "Totally agree", "Indifferent", "Indifferent", 
"Indifferent", "Totally agree", "Don't agree at all", "Don't agree at all"
), item2 = c("Always", "Always", "Both", "Both", "Both", "Never", 
"Never", "Always", "Always", "Never"), x = c(0.0356311982051355, 
1.31495884897891, 0.978167526364279, 0.881791226863203, 0.482204688262918, 
0.965752878105794, -0.814570938270238, 0.283957806364306, -0.161698647607024, 
1.93557176599585)), class = "data.frame", row.names = c(NA, -10L
))