Ifelse for Multiple Columns in DataFrame

I have the following dataset:

ID Winter Spring Summer Fall
1 high NA high low
2 low high NA low
3 low NA NA low
4 low high NA low

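I want to add a calculated column so that if any of the Winter, Spring, Summer, and Fall columns contains "high", the row gets a 1, as shown below. Otherwise it gets a 0.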

ID Winter Spring Summer Fall calculated_column
1 high NA high low 1
2 low high NA low 1
3 low NA NA low 0
4 low high NA low 1

So far I have something like this, which I know is incorrect. I'm not sure how to specify multiple columns instead of just one:

df$calculated_column <- ifelse(c(2:5)=="High",1,0)

We can use if_any:

library(dplyr)
df1 <- df1 %>%
     mutate(calculated_column = +(if_any(-ID, ~ . %in% 'high')))

-output

df1
 ID Winter Spring Summer Fall calculated_column
1  1   high   <NA>   high  low                 1
2  2    low   high   <NA>  low                 1
3  3    low   <NA>   <NA>  low                 0
4  4    low   high   <NA>  low                 1
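
A note on why %in% is used there rather than ==: comparing NA with == yields NA, while %in% yields FALSE. The sketch below uses plain base R vectors (the name x and the two result names are made up for illustration) to show the difference on the values from row 3:

```r
# Values of row 3 (Winter, Spring, Summer, Fall), ID left out
x <- c("low", NA, NA, "low")

# == propagates NA, so any() cannot decide and returns NA
eq_any <- any(x == "high")

# %in% never returns NA, so any() returns a clean FALSE
in_any <- any(x %in% "high")

print(eq_any)   # NA
print(+in_any)  # 0
```

With ==, a row that has no "high" but does have missing values would likely end up as NA instead of 0, which is not what the question asks for.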

Or, if we want to use base R, create the logical condition with rowSums on a logical matrix:

df1$calculated_column <-  +(rowSums(df1[-1] == "high", na.rm = TRUE) > 0)
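
To make that one-liner easier to follow, here it is split into steps. This is only a sketch: the data frame is rebuilt inline so the snippet runs on its own, and the intermediate names is_high, n_high and flag are made up for illustration.

```r
# Same data as in the question, rebuilt here so the snippet is self-contained
df1 <- data.frame(
  ID     = 1:4,
  Winter = c("high", "low", "low", "low"),
  Spring = c(NA, "high", NA, "high"),
  Summer = c("high", NA, NA, NA),
  Fall   = c("low", "low", "low", "low")
)

# Step 1: compare every column except ID; gives a logical matrix with NAs
is_high <- df1[-1] == "high"

# Step 2: count the TRUE values per row, skipping the NAs
n_high <- rowSums(is_high, na.rm = TRUE)

# Step 3: turn "at least one high" into 1/0 (unary + coerces logical to integer)
flag <- +(n_high > 0)

df1$calculated_column <- flag
print(df1)
```

The na.rm = TRUE in step 2 is what keeps rows with missing seasons from becoming NA.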

data

df1 <- structure(list(ID = 1:4, Winter = c("high", "low", "low", "low"
), Spring = c(NA, "high", NA, "high"), Summer = c("high", NA, 
NA, NA), Fall = c("low", "low", "low", "low")), 
class = "data.frame", row.names = c(NA, 
-4L))

You could also do:

df1$calculated_column = +grepl('high', do.call(paste, df1))
df1
  ID Winter Spring Summer Fall calculated_column
1  1   high   <NA>   high  low                 1
2  2    low   high   <NA>  low                 1
3  3    low   <NA>   <NA>  low                 0
4  4    low   high   <NA>  low                 1
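
One caveat with the paste/grepl approach: grepl matches substrings, so a value such as "higher" would also count as a hit. That does not matter for the data in the question, but if other values are possible, a word-boundary pattern may be safer. A small sketch, where the row strings are hypothetical examples of what do.call(paste, df1) could produce:

```r
# Hypothetical pasted rows; the second one contains "higher", not "high"
rows <- c("1 high NA high low", "2 higher NA NA low", "3 low NA NA low")

# Plain substring match: the "higher" row is flagged too
loose <- +grepl("high", rows)

# Word boundaries restrict the match to the whole word "high"
strict <- +grepl("\\bhigh\\b", rows)

print(loose)   # 1 1 0
print(strict)  # 1 0 0
```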

Here is a base R solution:

calculated_column = (apply(df1,1,function(x) sum(grepl("high",x)))>0)*1

cbind(df1, calculated_column) 
  ID Winter Spring Summer Fall calculated_column
1  1   high   <NA>   high  low                 1
2  2    low   high   <NA>  low                 1
3  3    low   <NA>   <NA>  low                 0
4  4    low   high   <NA>  low                 1