Troubleshooting the case_when function using tidyverse in R
Simple question: I don't understand how case_when works. In the example below I expect season to have 4 levels, but I only get two.
Thanks
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 | day > 335 ~ "winter",
      day > 60 | day <= 151 ~ "spring",
      day > 151 | day <= 242 ~ "summer",
      day > 242 | day <= 335 ~ "autumn"
    )
  )
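Expressions 2 to 4 should use & instead of |. With |, a condition such as day > 60 | day <= 151 is TRUE for every day, so the ranges overlap: every day not already labeled "winter" matches the second condition and becomes "spring", and the last two conditions are never reached.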
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 | day > 335 ~ "winter",
      day > 60 & day <= 151 ~ "spring",
      day > 151 & day <= 242 ~ "summer",
      day > 242 & day <= 335 ~ "autumn"
    )
  )
Checking:
> n_distinct(data$season)
[1] 4
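To see why the version in the question collapses to two levels, it may help to evaluate the second condition on its own. The sketch below works on a plain vector (the names day and original are only for illustration) and assumes dplyr is installed:

```r
library(dplyr)

day <- 1:366

# Second condition from the question: every day is either > 60 or <= 151,
# so this condition is TRUE for all 366 days.
all(day > 60 | day <= 151)

# Re-run the conditions from the question on the plain vector.
original <- case_when(
  day <= 60 | day > 335 ~ "winter",
  day > 60 | day <= 151 ~ "spring",
  day > 151 | day <= 242 ~ "summer",
  day > 242 | day <= 335 ~ "autumn"
)

# Only "winter" and "spring" appear; "summer" and "autumn" are never reached.
table(original)
```

Because case_when uses the first condition that matches, anything the "winter" line does not catch is taken by the always-TRUE "spring" line.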
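In fact you can shorten this case_when() statement a little, because case_when stops at the first condition that matches. Once the days that are less than or equal to 60 or greater than 335 have been handled, the next condition only needs to check day <= 151: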
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 | day > 335 ~ "winter",
      day <= 151 ~ "spring",
      day <= 242 ~ "summer",
      day <= 335 ~ "autumn"
    )
  )
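As a quick check that the shortened conditions give the same labels as the explicit & version, the two can be compared directly. This is a minimal sketch on a plain vector; explicit and shortened are illustrative names, not part of the answers above:

```r
library(dplyr)

day <- 1:366

# Explicit version: each range is bounded on both sides.
explicit <- case_when(
  day <= 60 | day > 335 ~ "winter",
  day > 60 & day <= 151 ~ "spring",
  day > 151 & day <= 242 ~ "summer",
  day > 242 & day <= 335 ~ "autumn"
)

# Shortened version: relies on case_when taking the first match,
# so the lower bounds are implied by the earlier conditions.
shortened <- case_when(
  day <= 60 | day > 335 ~ "winter",
  day <= 151 ~ "spring",
  day <= 242 ~ "summer",
  day <= 335 ~ "autumn"
)

# TRUE if both approaches label every day identically.
identical(explicit, shortened)
```

The shortened form depends on the order of the conditions, so rearranging the lines would change the result.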
You can also use a TRUE case, which catches every value that did not match any of the earlier conditions:
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 ~ "winter",
      day <= 151 ~ "spring",
      day <= 242 ~ "summer",
      day <= 335 ~ "autumn",
      TRUE ~ "winter"
    )
  )
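In recent dplyr releases the same fallback can be written with the .default argument instead of TRUE ~. The sketch below assumes dplyr 1.1.0 or later, where .default was introduced; with_true and with_default are illustrative names:

```r
library(dplyr)

day <- 1:366

# Fallback written as a final TRUE condition.
with_true <- case_when(
  day <= 60 ~ "winter",
  day <= 151 ~ "spring",
  day <= 242 ~ "summer",
  day <= 335 ~ "autumn",
  TRUE ~ "winter"
)

# Same fallback via .default (assumes dplyr >= 1.1.0).
with_default <- case_when(
  day <= 60 ~ "winter",
  day <= 151 ~ "spring",
  day <= 242 ~ "summer",
  day <= 335 ~ "autumn",
  .default = "winter"
)

# TRUE if both spellings produce the same labels.
identical(with_true, with_default)
```

On older dplyr versions only the TRUE ~ form is available.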
Stop using case_when and use cut instead. cut bins a numeric vector by a set of breakpoints:
library(dplyr)
tibble(day = 1:366) |>
  mutate(
    season = cut(day,
      c(0, 60, 151, 242, 335, 366),
      c("winter", "spring", "summer", "autumn", "winter")
    )
  )
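One difference worth noting is that cut returns a factor, whereas case_when with string outputs returns a character vector. The sketch below compares the two on a plain vector; season_cut and season_cw are illustrative names, and the repeated "winter" label assumes R 3.4.0 or later, where factor() accepts duplicated labels:

```r
library(dplyr)

day <- 1:366

# cut() with right-closed intervals (0,60], (60,151], ..., (335,366].
# The label "winter" is used for both the first and the last interval.
season_cut <- cut(
  day,
  breaks = c(0, 60, 151, 242, 335, 366),
  labels = c("winter", "spring", "summer", "autumn", "winter")
)

# The result is a factor whose levels are the distinct labels.
class(season_cut)
levels(season_cut)

# The case_when() equivalent returns a character vector.
season_cw <- case_when(
  day <= 60 | day > 335 ~ "winter",
  day <= 151 ~ "spring",
  day <= 242 ~ "summer",
  day <= 335 ~ "autumn"
)

# After converting the factor to character, the labels agree.
identical(as.character(season_cut), season_cw)
```

If a character column is wanted, wrapping the cut call in as.character() is one option.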