R: Recoding variables using recode, mutate and case_when

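I would like to recode the values of the variables listed in core.vars in my dataset as follows: < 4 becomes -1, 4 becomes 0, and > 4 becomes 1, while keeping the rest of the variables in the data frame.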

library(tidyverse)

# Move the row names into a column before as_tibble(), which drops row names
temp.df <- mtcars %>%
  rownames_to_column(var = "cars_id") %>%
  as_tibble()
other.vars <- c('hp', 'drat', 'wt')
core.vars <- c('mpg', 'cyl', 'disp')
temp.df <- temp.df %>% mutate_if(is.integer, as.numeric)

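I have tried many ways to achieve this using case_when, mutate, and recode, but with no luck. recode requires a vector, so my idea was to use case_when and mutate to create a vector for each variable of interest and then recode the values. All of the attempts below failed.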

temp.df <- temp.df %>% 
           mutate_at(.vars %in% (core.vars)), '< 4' = "-1", '4' = "0", '> 4' = "1")

Error: unexpected ',' in "temp.df <- temp.df %>% mutate_at(.vars %in% (core.vars)),"

temp.df <- temp.df %>% 
           mutate_at(vars(one_of(core.vars)), '< 4' = "-1", '4' = "0", '> 4' = "1")

Error in inherits(x, "fun_list") : argument ".funs" is missing, with no default

 temp.df <- temp.df %>% 
            mutate (temp.df, case_when (vars(one_of(core.vars)), recode ('< 4' = "-1", '4' = "0", '> 4' = "1")))

Error in mutate_impl(.data, dots) : Column temp.df is of unsupported class data.frame

 temp.df <- temp.df %>% 
            case_when (vars(one_of(core.vars)), recode ('< 4' = "-1", '4' = "0", '> 4' = "1"))

Error in recode.character(< 4 = "-1", 4 = "0", > 4 = "1") : argument ".x" is missing, with no default

temp.df <- temp.df %>% rowwise() %>% mutate_at(vars (core.vars),
                                            funs (case_when (
                                                recode(., '< 4' = -1, '0' = 0, '>4' = 1)
                                            ))) %>%
 ungroup()

Error in mutate_impl(.data, dots) : Evaluation error: Case 1 (recode(mpg,< 4= -1,0= 0,>4= 1)) must be a two-sided formula, not a double. In addition: Warning message: In recode.numeric(mpg, < 4 = -1, 0 = 0, >4 = 1) : NAs introduced by coercion

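Previous questions on the forum cover how to do this for a single variable, but as mentioned, I have 100 variables and 300 samples, so I cannot type them out one by one.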

Ideally, I would prefer not to create a separate data frame and then join it back, or to create multiple separate variables the way mutate does.

I am sure there is a for loop and/or ifelse approach, but I am trying to use the tidyverse to achieve this. Any suggestions would be helpful.

temp.df %>%
  mutate_at(vars(one_of(core.vars)), 
            function(x) case_when(
              x < 4 ~ -1,
              x == 4 ~ 0,
              x > 4 ~ 1
            ))

Output

# A tibble: 32 x 12
   cars_id             mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <chr>             <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 Mazda RX4             1     1     1   110  3.9   2.62  16.5     0     1     4     4
 2 Mazda RX4 Wag         1     1     1   110  3.9   2.88  17.0     0     1     4     4
 3 Datsun 710            1     0     1    93  3.85  2.32  18.6     1     1     4     1
 4 Hornet 4 Drive        1     1     1   110  3.08  3.22  19.4     1     0     3     1
 5 Hornet Sportabout     1     1     1   175  3.15  3.44  17.0     0     0     3     2
 6 Valiant               1     1     1   105  2.76  3.46  20.2     1     0     3     1
 7 Duster 360            1     1     1   245  3.21  3.57  15.8     0     0     3     4
 8 Merc 240D             1     0     1    62  3.69  3.19  20       1     0     4     2
 9 Merc 230              1     0     1    95  3.92  3.15  22.9     1     0     4     2
10 Merc 280              1     1     1   123  3.92  3.44  18.3     1     0     4     4
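
For readers on a newer dplyr, the same recoding can probably be written with across(), which supersedes mutate_at(). The following is a minimal sketch, assuming dplyr 1.0.0 or later; the names recoded and recoded_sign are illustrative and not part of the original answer. It also shows that, for numeric columns and this particular threshold, sign(x - 4) should give the same -1 / 0 / 1 coding without case_when.

```r
library(dplyr)
library(tibble)

core.vars <- c("mpg", "cyl", "disp")

# Move the row names into a column first, since as_tibble() drops them
temp.df <- mtcars %>%
  rownames_to_column(var = "cars_id") %>%
  as_tibble()

# Apply the same case_when() recoding to every column named in core.vars;
# all other columns are left untouched
recoded <- temp.df %>%
  mutate(across(all_of(core.vars), ~ case_when(
    .x < 4  ~ -1,
    .x == 4 ~ 0,
    .x > 4  ~ 1
  )))

# Alternative sketch: sign(x - 4) is -1 below 4, 0 at 4, and 1 above 4
recoded_sign <- temp.df %>%
  mutate(across(all_of(core.vars), ~ sign(.x - 4)))
```

all_of() is used here instead of one_of() so that a misspelled name in core.vars raises an error rather than being skipped with a warning. The sign() variant only works because the three cases are split around a single cut point; for anything more irregular, case_when remains the more general choice.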