R: Recoding variables using recode, mutate and case_when
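I would like to recode the values of the variables listed in core.vars as follows: < 4 = -1, 4 = 0, > 4 = 1, while still keeping the remaining variables in the data frame.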
library(tidyverse)

# rownames = "cars_id" keeps the car names as a column;
# current tibble versions drop row names by default
temp.df <- as_tibble(mtcars, rownames = "cars_id")
other.vars <- c('hp', 'drat', 'wt')
core.vars <- c('mpg', 'cyl', 'disp')
temp.df <- temp.df %>% mutate_if(is.integer, as.numeric)
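I have tried many ways to achieve this, using case_when, mutate and recode, but without luck. recode requires a vector, so my idea was to use case_when or mutate to create a vector for each variable of interest and then recode the values. All of these attempts failed.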
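To make the single-vector idea concrete, here is a minimal sketch of case_when applied to one plain numeric vector. The vector x is a made-up example, not data from the question; the point is only that case_when takes two-sided formulas (condition ~ value) rather than the name = value pairs that recode expects.

```r
library(dplyr)

# Hypothetical example vector, not taken from the question's data
x <- c(2, 4, 6, NA)

# Each condition goes on the left of `~`, the replacement on the right.
# All right-hand sides are doubles, so the result is a double vector.
recoded <- case_when(
  x < 4  ~ -1,
  x == 4 ~ 0,
  x > 4  ~ 1
)

recoded
```

Values that match none of the conditions (here the NA) stay NA, which may or may not be what is wanted for real data.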
temp.df <- temp.df %>%
mutate_at(.vars %in% (core.vars)), '< 4' = "-1", '4' = "0", '> 4' = "1")
Error: unexpected ',' in "temp.df <- temp.df %>% mutate_at(.vars %in% (core.vars)),"
temp.df <- temp.df %>%
mutate_at(vars(one_of(core.vars)), '< 4' = "-1", '4' = "0", '> 4' = "1")
Error in inherits(x, "fun_list") : argument ".funs" is missing, with no default
temp.df <- temp.df %>%
mutate (temp.df, case_when (vars(one_of(core.vars)), recode ('< 4' = "-1", '4' = "0", '> 4' = "1")))
Error in mutate_impl(.data, dots) : Column `temp.df` is of unsupported class data.frame
temp.df <- temp.df %>%
case_when (vars(one_of(core.vars)), recode ('< 4' = "-1", '4' = "0", '> 4' = "1"))
Error in recode.character(`< 4` = "-1", `4` = "0", `> 4` = "1") : argument ".x" is missing, with no default
temp.df <- temp.df %>% rowwise() %>% mutate_at(vars (core.vars),
                funs (case_when (
                  recode(., '< 4' = -1, '0' = 0, '>4' = 1)
                  ))) %>%
  ungroup()
Error in mutate_impl(.data, dots) : Evaluation error: Case 1 (`recode(mpg, `< 4` = -1, `0` = 0, `>4` = 1)`) must be a two-sided formula, not a double.
In addition: Warning message: In recode.numeric(mpg, `< 4` = -1, `0` = 0, `>4` = 1) : NAs introduced by coercion
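Previous questions on the forum cover how to do this for a single variable, but as mentioned I have 100 variables and 300 samples, so typing them in one by one is not feasible.
Ideally, I would prefer not to create a separate data frame and then join it back, or to create multiple separate variables the way a plain mutate would.
I'm sure there is a for loop and/or ifelse approach, but I am trying to use the tidyverse to achieve this. Any suggestions would be helpful.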
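For comparison, the for loop / ifelse route mentioned above might look like the following in base R. This is only a sketch: the setup rebuilds temp.df as a plain data frame from mtcars so the block runs on its own, and nested ifelse is one of several ways to express the three cases.

```r
# Base R only; the data setup mirrors the question but uses a plain data frame
temp.df <- data.frame(cars_id = rownames(mtcars), mtcars, row.names = NULL)
core.vars <- c("mpg", "cyl", "disp")

# Loop over the selected columns and overwrite each one in place
for (v in core.vars) {
  x <- temp.df[[v]]
  # < 4 becomes -1, exactly 4 becomes 0, everything else (> 4) becomes 1
  temp.df[[v]] <- ifelse(x < 4, -1, ifelse(x == 4, 0, 1))
}

head(temp.df[c("cars_id", core.vars)], 3)
```

Because this particular rule is just the sign of x - 4, replacing the ifelse line with sign(x - 4) would give the same result.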
temp.df %>%
  mutate_at(vars(one_of(core.vars)),
            function(x) case_when(
              x < 4 ~ -1,
              x == 4 ~ 0,
              x > 4 ~ 1
            ))
Output
# A tibble: 32 x 12
   cars_id             mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <chr>             <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 Mazda RX4             1     1     1   110  3.9   2.62  16.5     0     1     4     4
 2 Mazda RX4 Wag         1     1     1   110  3.9   2.88  17.0     0     1     4     4
 3 Datsun 710            1     0     1    93  3.85  2.32  18.6     1     1     4     1
 4 Hornet 4 Drive        1     1     1   110  3.08  3.22  19.4     1     0     3     1
 5 Hornet Sportabout     1     1     1   175  3.15  3.44  17.0     0     0     3     2
 6 Valiant               1     1     1   105  2.76  3.46  20.2     1     0     3     1
 7 Duster 360            1     1     1   245  3.21  3.57  15.8     0     0     3     4
 8 Merc 240D             1     0     1    62  3.69  3.19  20       1     0     4     2
 9 Merc 230              1     0     1    95  3.92  3.15  22.9     1     0     4     2
10 Merc 280              1     1     1   123  3.92  3.44  18.3     1     0     4     4
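In more recent dplyr releases the scoped verbs such as mutate_at are superseded by across(). A minimal sketch of the same recoding in that style, assuming dplyr 1.0.0 or later, could look like this; recoded.df is a name introduced here for illustration, and the setup is repeated so the block runs on its own.

```r
library(dplyr)
library(tibble)

core.vars <- c("mpg", "cyl", "disp")

# rownames = "cars_id" keeps the car names as a regular column
temp.df <- as_tibble(mtcars, rownames = "cars_id")

# all_of() selects exactly the columns named in the character vector;
# the lambda is applied to each selected column in turn
recoded.df <- temp.df %>%
  mutate(across(all_of(core.vars), ~ case_when(
    .x < 4  ~ -1,
    .x == 4 ~ 0,
    .x > 4  ~ 1
  )))

recoded.df
```

Columns outside core.vars are left untouched, so there is no need for a separate data frame or a join.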