Recode subset of variables using case when in R

I'm trying to recode some survey data in R. Here is some sample data similar to what I actually have.

df <- data.frame(
  A = rep("Y",5),
  B=seq(as.POSIXct("2014-01-13"), as.POSIXct("2014-01-17"), by="days"),
  C = c("Neither agree nor disagree",
        "Somewhat agree",
        "Somewhat disagree",
        "Strongly agree",
        "Strongly disagree"),
  D=c("Neither agree nor disagree",
         "Somewhat agree",
         "Somewhat disagree",
         "Strongly agree",
         "Strongly disagree")
)



I looked through some other posts and wrote the following code:

init2 <- df %>%
  mutate_at(vars(c(1:4)), function(x) case_when(
    x == "Neither agree nor disagree" ~ 3,
    x == "Somewhat agree"             ~ 4,
    x == "Somewhat disagree"          ~ 2,
    x == "Strongly agree"             ~ 5,
    x == "Strongly disagree"          ~ 1
  ))

But this throws an error:

Error: Problem with `mutate()` column `B`.
i `B = (function (x) ...`.
x character string is not in a standard unambiguous format

Run `rlang::last_error()` to see where the error occurred. 

The dates I entered are POSIXct. Should I change their format? What is the fix for this problem? Thanks.

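Trying to recode the POSIXt column into your Likert scale makes no sense; recoding the "Y" column makes no sense to me either, but at least that one does not raise an error.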
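The error message comes from the comparison itself: when a POSIXct value is compared with a character string, R tries to parse the string as a date-time first. A minimal base-R sketch of that behaviour (the variable names `stamp` and `msg` are only for illustration):

```r
# Base R only; the time zone is set explicitly just to keep the sketch reproducible.
stamp <- as.POSIXct("2014-01-13", tz = "UTC")

# Comparing a POSIXct value with a string makes R try to parse the string
# as a date-time. tryCatch() captures the resulting error message instead
# of stopping the script.
msg <- tryCatch(
  as.character(stamp == "Somewhat agree"),
  error = function(e) conditionMessage(e)
)

msg
```

This is why the fix is to keep column B out of the recoding rather than to change the format of the dates.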

I suggest that you do one of the following:

  1. Explicitly mutate the columns you want:

    df %>%
      mutate(across(c(C, D), ~ case_when(
        . == "Neither agree nor disagree" ~ 3,
        . == "Somewhat agree"             ~ 4,
        . == "Somewhat disagree"          ~ 2,
        . == "Strongly agree"             ~ 5,
        . == "Strongly disagree"          ~ 1
      )))
    #   A          B C D
    # 1 Y 2014-01-13 3 3
    # 2 Y 2014-01-14 4 4
    # 3 Y 2014-01-15 2 2
    # 4 Y 2014-01-16 5 5
    # 5 Y 2014-01-17 1 1
    
  2. Explicitly exclude the columns you do not want:

    df %>%
      mutate(across(-c(A, B), ~ case_when(
        . == "Neither agree nor disagree" ~ 3,
        . == "Somewhat agree"             ~ 4,
        . == "Somewhat disagree"          ~ 2,
        . == "Strongly agree"             ~ 5,
        . == "Strongly disagree"          ~ 1
      )))
    
  3. Select them conditionally with some filter (though this is not foolproof):

    df %>%
      mutate(across(where(~ all(grepl("agree", .))), ~ case_when(
        . == "Neither agree nor disagree" ~ 3,
        . == "Somewhat agree"             ~ 4,
        . == "Somewhat disagree"          ~ 2,
        . == "Strongly agree"             ~ 5,
        . == "Strongly disagree"          ~ 1
      )))
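
To illustrate why the pattern filter in option 3 is not foolproof, here is a sketch with a made-up free-text column `E` whose values all happen to contain "agree". It also shows one possible stricter filter. The names `likert_levels` and `is_likert` are hypothetical, and the recoding uses a named-vector lookup instead of `case_when()`; treat it as one way to do it under those assumptions, not the only one.

```r
library(dplyr)

df <- data.frame(
  A = rep("Y", 5),
  C = c("Neither agree nor disagree", "Somewhat agree", "Somewhat disagree",
        "Strongly agree", "Strongly disagree"),
  # Made-up free-text column: not a Likert item, but every value contains "agree"
  E = c("I agree", "We disagree", "agree", "disagree", "agreeable"),
  stringsAsFactors = FALSE
)

# Loose filter from option 3: picks up E as well as C
loose <- df %>% select(where(~ all(grepl("agree", .x)))) %>% names()

# Hypothetical lookup table: label -> score
likert_levels <- c(
  "Strongly disagree"          = 1,
  "Somewhat disagree"          = 2,
  "Neither agree nor disagree" = 3,
  "Somewhat agree"             = 4,
  "Strongly agree"             = 5
)

# Hypothetical predicate: character column whose non-missing values
# are all known Likert labels
is_likert <- function(x) {
  is.character(x) && all(x[!is.na(x)] %in% names(likert_levels))
}

strict <- df %>% select(where(is_likert)) %>% names()

# Recode only the columns that pass the strict predicate
out <- df %>%
  mutate(across(where(is_likert), ~ unname(likert_levels[.x])))
```

With the loose filter, `E` would be passed to `case_when()` and come back as all `NA`; the stricter predicate leaves it untouched.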
    

FYI, according to https://dplyr.tidyverse.org/reference/mutate_all.html (as of 2021-11-07):

Scoped verbs (_if, _at, _all) have been superseded by the use of across() in an existing verb. See vignette("colwise") for details.

across() pairs nicely with where(), which is provided (somewhat covertly) by the tidyselect package.
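
As a rough sketch of what that migration looks like, the following compares a superseded `mutate_at()` call with its `across()` counterpart, then selects columns by type with `where()`. The small data frame and the use of `toupper` are placeholders chosen only to keep the example short.

```r
library(dplyr)

df <- data.frame(
  A = rep("Y", 3),
  C = c("Somewhat agree", "Strongly agree", "Strongly disagree"),
  D = c("Strongly agree", "Somewhat agree", "Somewhat agree"),
  stringsAsFactors = FALSE
)

# Superseded scoped verb
old_style <- df %>% mutate_at(vars(C, D), toupper)

# Same transformation written with across() inside mutate()
new_style <- df %>% mutate(across(c(C, D), toupper))

# where() selects columns by a predicate instead of by name
by_type <- df %>% mutate(across(where(is.character), toupper))
```

Here `old_style` and `new_style` should hold the same data, while `by_type` also passes column A through `toupper`, which is a reminder that a type-based predicate can select more columns than intended.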