Recoding a continuous variable into categorical with *specific* categories, in R using Tidyverse

Question

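I found this helpful answer to almost the same question, but it doesn't quite do what I need.

I have respondents' ages, which is a continuous variable, and I want to recode it as a categorical variable using the tidyverse. The linked answer explains the functions cut_number(), cut_interval(), and cut_width(), but these don't work for me because I want to recode into categories that I have already decided on in advance, namely the ranges 18-34, 35-54, and 55+. None of the cut functions allow me to do this (or at least I didn't see how).

I was able to get my code to run without the tidyverse, using: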

data$age[data$"Age(Self-report)"<35] <- "18-34"
data$age[data$"Age(Self-report)">34 & data$"Age(Self-report)"<55] <- "35-54"
data$age[data$"Age(Self-report)">=55] <- "55+"

But I'm trying to be consistent in my coding style, and I'd like to learn how to do this in the tidyverse. Thanks for any and all help!

Answer 1

The tidyverse approach would use dplyr::case_when to recode the variable, like so:

library(dplyr)

data %>% 
  mutate(age = case_when(
    `Age(Self-report)` < 35 ~ "18-34",
    `Age(Self-report)` > 34 & `Age(Self-report)` < 55 ~ "35-54",
    `Age(Self-report)` >= 55 ~ "55+"  # >= so that age 55 is not left as NA
  ))
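
If a factor with ordered levels is preferred over a character column, base R's cut() with explicit breaks is one possible alternative that also works inside mutate(). The following is only a minimal sketch: the sample data frame is hypothetical, and the break points assume the three ranges from the question, with ages under 18 ending up as NA.

```r
library(dplyr)

# Hypothetical sample data; only the column name is taken from the question
data <- data.frame(
  "Age(Self-report)" = c(18, 34, 35, 54, 55, 80),
  check.names = FALSE  # keep the non-syntactic column name unchanged
)

data <- data %>%
  mutate(age = cut(
    `Age(Self-report)`,
    breaks = c(18, 35, 55, Inf),          # intervals [18,35), [35,55), [55,Inf)
    labels = c("18-34", "35-54", "55+"),
    right = FALSE                         # left-closed, so 35 and 55 start a new bin
  ))
```

Because right = FALSE makes each interval include its lower bound, the boundary ages 35 and 55 fall into the upper category, which matches the labels.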

Tags: r, categorical-data, data-cleaning, tidyverse