New variable based on value range of another column in R

Question

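I know there are many similar questions, but I could not find an answer.

What I need to do is split a numeric variable into three levels.

Among other things, I tried the following: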


data_long$average_success_grouped <- recode(data_long$average_success, <0.5 = no success, >0.5 & <0.9 = little success, >0.9 = success)

My values lie between 0 and 1, and I need to cut them into three groups at 0.5 and 0.9.

Can anyone help?

Current error: unexpected '<' in "data_long$averagre_success <- recode(data_long$average_success, <"

dput(data_long_migraine)
structure(list(average_success = c(0.333333333333333, 0.416666666666667, 0, 0.25, 
0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 
0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 
0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 
1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 
0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 
0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 
0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 
0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 
0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 
0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 
0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 
0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 
0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 
0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 
0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 
0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 
0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 
0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 
0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 
1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 
0.583333333333333, 0.194444444444444), month = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("bad_days_1_month", 
"bad_days_2_month", "bad_days_3_month", "bad_days_4_month"
), class = "factor"), bad_days = c(5, 3, 8, 5, 0, 13, 2, 
3, 10, 13, 7, 3, 2, 23, 5, 4, 6, 17, 4, 3, 13, 10, 4, 8, 15, 
18, 2, 7, 7, 10, 1, 2, 10, 3, 0, 3, 16, 8, 4, 4, 26, 2, 6, 10, 
25, 5, 3, 11, 7, 4, 6, 11, 18, 4, 5, 7, 6, 7, 2, 11, 6, 0, 5, 
20, 4, 2, 4, 20, 0, 2, 2, 24, 6, 4, 4, 5, 3, 7, 8, 6, 2, 9, 8, 
8, 7, 3, 8, 6, 0, 5, 20, 9, 8, 2, 22, 1, 1, 5, 25, 3, 1, 6, 3, 
3, 4, 8, 11, 0), average_success_grouped = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, 
-108L), class = "data.frame")

I tried something else earlier, which is how I ended up with average_success_grouped containing only "2", but I can't remember exactly what I did.

Answer 1

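Load the tidyverse library and the data

It sounds like you need an ifelse statement. First, load the tidyverse package, which is used below to add a new variable for the levels: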

library(tidyverse)

I first saved your input into an object called df:

df <- structure(list(average_success = c(0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444), month = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L), .Label = c("bad_days_1_month", "bad_days_2_month", "bad_days_3_month", "bad_days_4_month" ), class = "factor"), bad_days = c(5, 3, 8, 5, 0, 13, 2, 3, 10, 13, 7, 3, 2, 23, 5, 4, 6, 17, 4, 3, 13, 10, 4, 8, 15, 18, 2, 7, 7, 10, 1, 2, 10, 3, 0, 3, 16, 8, 4, 4, 26, 2, 6, 10, 25, 5, 3, 11, 7, 4, 6, 11, 18, 4, 5, 7, 6, 7, 2, 11, 6, 0, 5, 20, 4, 2, 4, 20, 0, 2, 2, 24, 6, 4, 4, 5, 3, 7, 8, 6, 2, 9, 8, 8, 7, 3, 8, 6, 0, 5, 20, 9, 8, 2, 22, 1, 1, 5, 25, 3, 1, 6, 3, 3, 4, 8, 11, 0), average_success_grouped = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, -108L), class = "data.frame")

New data frame:

Then create the new variable using mutate with nested ifelse calls, which act as if/then statements:

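df2 <- df %>%
  # > 0.9 is "high success", < 0.5 is "no success"; everything in between
  # (including exactly 0.5 and 0.9) is "little"
  mutate(success_level = ifelse(average_success > 0.9, "high success",
                                ifelse(average_success < 0.5, "no success", "little")))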
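
As a possible alternative to nested ifelse calls, base R's cut() can bin a numeric vector at fixed break points in one step. The sketch below is only an illustration: the vector x is made-up toy data standing in for df$average_success, and the labels are copied from the ifelse version. Note that the boundaries behave slightly differently here: with right = FALSE the intervals are [0, 0.5), [0.5, 0.9) and [0.9, 1], so a value of exactly 0.9 lands in "high success", whereas the ifelse version puts it in "little".

```r
# Toy vector standing in for df$average_success (illustrative values only)
x <- c(0, 0.25, 0.5, 0.75, 0.9, 1)

# Break at 0.5 and 0.9.
# right = FALSE makes each interval closed on the left: [a, b)
# include.lowest = TRUE closes the last interval so that 1 is included: [0.9, 1]
grp <- cut(x,
           breaks = c(0, 0.5, 0.9, 1),
           labels = c("no success", "little", "high success"),
           right = FALSE,
           include.lowest = TRUE)

print(data.frame(x, grp))
```

The result is a factor whose levels are ordered from "no success" to "high success", which can be convenient for plotting or tabulating. If the boundary values should go to the lower group instead, right = TRUE flips the closed side of each interval.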

View the result

If you now run View(df2), you will see the new data frame with the added success_level column.
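
For a quick non-interactive check, the new levels can also be tabulated. The sketch below assumes a small made-up data frame called toy (not the real data) and applies the same nested ifelse rule in base R, so it runs without any packages; the values were chosen to sit on and around the 0.5 and 0.9 cut points.

```r
# Toy data standing in for df (values chosen to hit each boundary)
toy <- data.frame(average_success = c(0, 0.3, 0.5, 0.7, 0.9, 0.95, 1))

# Same nested ifelse rule as in the answer, written in base R
toy$success_level <- ifelse(toy$average_success > 0.9, "high success",
                            ifelse(toy$average_success < 0.5, "no success", "little"))

# Count the rows that fall into each level
counts <- table(toy$success_level)
print(counts)

# Inspect the rows sitting exactly on the cut points
print(toy[toy$average_success %in% c(0.5, 0.9), ])
```

Looking at the boundary rows confirms which group the values 0.5 and 0.9 end up in, which is worth checking because the question does not say on which side they belong.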
