Can you combine case_when and startsWith in R for complex groupings?

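I am trying to group together some complex categories whose strings share a similar beginning.

Here is an example of the first case_when clause; as you can see, it is very long (I have edited the strings for brevity).

Is there a way to write a case_when statement that groups all values beginning with 'Conditions'? (The same would apply to the remaining clauses, which begin with 'Mental health etc etc'.)

Thanks, everyone!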

mutate(condition = case_when(
  health_conditions == 'Conditions ABC' |
    health_conditions == 'Conditions DEF' |
    health_conditions == 'HIJ' |
    health_conditions == 'Conditions KLM, Parkinsons)' |
    health_conditions == 'Conditions NOP' ~ 'Conditions'
))

We can use a regular expression with grepl/str_detect to collapse these cases:

library(dplyr)
library(stringr)
df1 %>%
   mutate(condition = case_when(str_detect(health_conditions, 
      "^Conditions")|health_conditions == "HIJ" ~ 'Conditions'))

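To extend this to the other prefixes mentioned in the question, each group can get its own clause. A minimal sketch, assuming made-up sample values (the real strings were abbreviated in the question) and an assumed "Other" fallback label that is not part of the original post:

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; the real category strings are not shown in the question
df1 <- data.frame(
  health_conditions = c("Conditions ABC", "HIJ", "Mental health XYZ", "Something else"),
  stringsAsFactors = FALSE
)

out <- df1 %>%
  mutate(condition = case_when(
    # "^" anchors the pattern to the start of the string
    str_detect(health_conditions, "^Conditions") | health_conditions == "HIJ" ~ "Conditions",
    str_detect(health_conditions, "^Mental health") ~ "Mental health",
    # Assumed fallback label; without this clause unmatched rows become NA
    TRUE ~ "Other"
  ))
```

case_when() evaluates the clauses in order and uses the first one that matches, so more specific patterns should come before more general ones.
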
Another option is startsWith:

df1 %>%
   mutate(condition = case_when(startsWith(health_conditions, 
      "Conditions")|health_conditions == "HIJ" ~ "Conditions"))

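Two details about startsWith() may matter here: it matches a literal prefix rather than a regular expression, and it expects a character vector, so a factor column would need as.character() first. A small sketch on made-up values (none of these strings come from the real data):

```r
# Made-up values for illustration only
f <- factor(c("Conditions ABC", "HIJ", "Mental health XYZ"))

# startsWith() does not accept factors, so convert to character first
starts_cond <- startsWith(as.character(f), "Conditions")

# The prefix is taken literally: "." is not a regex wildcard here
literal <- startsWith(c("a.b", "axb"), "a.")

# NA input gives NA, which case_when() treats as "no match"
na_result <- startsWith(NA_character_, "Conditions")
```

If health_conditions contains missing values, those rows simply fall through to the later clauses of case_when().
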
case_when() is a bit of a code smell. The following base R idiom, using %in%, is simpler:

conds <- c("Conditions DEF", "HIJ") # add further values as required
df1$condition[df1$health_conditions %in% conds] <- "Conditions"

Alternatively, as suggested in the other answer, a regular expression may help:

# "^" restricts the match to values that start with "Conditions"
df1$condition[grepl("^Conditions", df1$health_conditions)] <- "Conditions"
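
Putting the base R idioms together, a minimal sketch on hypothetical data might look like the following. The sample values and the explicit NA initialisation of the new column are assumptions for illustration; the pattern is anchored with ^ so that a value such as "Other Conditions" is not picked up.

```r
# Hypothetical sample data; the real category strings are not shown in the question
df1 <- data.frame(
  health_conditions = c("Conditions ABC", "HIJ", "Other Conditions", "Mental health XYZ"),
  stringsAsFactors = FALSE
)

# Start with an all-NA column so unmatched rows are explicit
df1$condition <- NA_character_

# Prefix match via an anchored regex, plus the one exact value that lacks the prefix
is_cond <- grepl("^Conditions", df1$health_conditions) | df1$health_conditions %in% c("HIJ")
df1$condition[is_cond] <- "Conditions"

# The same idiom repeated for the next group
df1$condition[grepl("^Mental health", df1$health_conditions)] <- "Mental health"
```

Rows that match none of the patterns keep NA in condition, which makes leftovers easy to spot with is.na().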