R: create a new column that takes the 1st column's value or, if a condition is met, a value from the 2nd/3rd column

           a         b      c    d
1     boiler     maker   <NA> <NA>
2      clerk assistant   <NA> <NA>
3     senior   machine setter <NA>
4   operated      <NA>   <NA> <NA>
5 consultant     Legal   <NA> <NA>

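How can I create a new column that takes the value from column 'a', unless any of the other columns contains Legal or assistant, in which case it takes that value instead?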
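
For anyone who wants to run the answers, the sample data can be rebuilt roughly as follows. This is only a sketch: it assumes the `<NA>` entries are genuine missing values (not the string "<NA>"), that the columns are plain character vectors rather than factors, and that the keyword is spelled `Legal` with a capital L, as the answers test for it. The name `df` matches the first two answers; the last one calls the same data `mydf`.

```r
# Sample data rebuilt from the table in the question.
# Assumption: <NA> stands for a real missing value (NA).
df <- data.frame(
  a = c("boiler", "clerk", "senior", "operated", "consultant"),
  b = c("maker", "assistant", "machine", NA, "Legal"),
  c = c(NA, NA, "setter", NA, NA),
  d = NA_character_,          # recycled to all five rows
  stringsAsFactors = FALSE    # keep character columns as character
)

str(df)
```

If your real data was read in as factors, set `stringsAsFactors = TRUE` instead to mimic that.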

Try this:

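library("dplyr")

# %in% returns FALSE for NA, so rows with missing values fall through to `a`
# (a plain == would return NA there and leave `new` as NA)
df %>%
    mutate(new = ifelse(b %in% "Legal" | c %in% "Legal" | d %in% "Legal", "Legal",
                 ifelse(b %in% "assistant" | c %in% "assistant" | d %in% "assistant", "assistant",
                        as.character(a))))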

The as.character call is needed if the values are factors. If they are not, it is unnecessary.
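
To illustrate why the conversion matters, here is a small sketch on made-up vectors (`a` and `hit` are hypothetical toy objects, not the question's data). When `ifelse()` is handed a factor, the result typically carries the factor's internal integer codes rather than its labels, which is what `as.character()` avoids.

```r
# Hypothetical toy vectors, not the data from the question.
a   <- factor(c("boiler", "clerk", "consultant"))
hit <- c(FALSE, FALSE, TRUE)

# Without conversion: the factor's integer codes leak into the result.
without_conv <- ifelse(hit, "Legal", a)

# With conversion: the labels are kept.
with_conv <- ifelse(hit, "Legal", as.character(a))

print(without_conv)
print(with_conv)
```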

Here is a base-R solution. We use apply and any to test all columns of each row at once.

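df$col <- as.character(df$a)
# overwrite rows where any column equals the keyword; na.rm = TRUE treats missing values as non-matches
df$col[apply(df == "Legal", 1, any, na.rm = TRUE)] <- "Legal"
df$col[apply(df == "assistant", 1, any, na.rm = TRUE)] <- "assistant"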
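
To see why this works, it can help to look at the intermediate objects. The sketch below uses a hypothetical two-row data frame `toy`, not the question's data, and passing `na.rm = TRUE` is an assumption about how missing values should be treated (as "no match").

```r
# Hypothetical toy data frame for illustration only.
toy <- data.frame(
  a = c("clerk", "boiler"),
  b = c("assistant", NA),
  stringsAsFactors = FALSE
)

# Comparing a data frame with a string gives a logical matrix,
# one cell per value; cells that were NA stay NA.
m <- toy == "assistant"
print(m)

# apply(..., 1, any) collapses each row to a single logical.
# na.rm = TRUE turns "no match, but some NA" into FALSE instead of NA.
rows <- apply(m, 1, any, na.rm = TRUE)
print(rows)
```

The resulting logical vector has one entry per row, so it can be used directly as a row index on the new column.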

A base R alternative to @scoa's answer:

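# 1 = no keyword, 2 = "Legal", 3 = "assistant"; na.rm = TRUE keeps rows with NAs at 1
indx <- apply(mydf == "Legal", 1, any, na.rm = TRUE) + apply(mydf == "assistant", 1, any, na.rm = TRUE) * 2 + 1L
# pick column indx[i] of row i from a lookup matrix whose first column holds the values of a
mydf$col <- cbind(as.character(mydf$a), "Legal", "assistant")[cbind(seq_len(nrow(mydf)), indx)]

Or in one go:

mydf$col <- cbind(as.character(mydf$a), "Legal", "assistant")[cbind(seq_len(nrow(mydf)), apply(mydf == "Legal", 1, any, na.rm = TRUE) + apply(mydf == "assistant", 1, any, na.rm = TRUE) * 2 + 1L)]

which gives:

> mydf
           a         b      c    d       col
1     boiler     maker   <NA> <NA>    boiler
2      clerk assistant   <NA> <NA> assistant
3     senior   machine setter <NA>    senior
4   operated      <NA>   <NA> <NA>  operated
5 consultant     Legal   <NA> <NA>     Legal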
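
The index arithmetic in this answer relies on R treating TRUE as 1 and FALSE as 0. A minimal sketch with hand-written logical vectors (`is_legal`, `is_assistant` and `labels` are hypothetical names used only here) shows how the two tests combine into a position in a lookup vector:

```r
# Hypothetical per-row test results, written out by hand.
is_legal     <- c(FALSE, TRUE, FALSE)
is_assistant <- c(FALSE, FALSE, TRUE)

# FALSE/FALSE -> 1, Legal only -> 2, assistant only -> 3
indx <- is_legal + is_assistant * 2 + 1L
print(indx)

# Use the positions to pick from a lookup vector.
labels <- c("none", "Legal", "assistant")
print(labels[indx])
```

Note that a row matching both keywords would get index 4, which a three-entry lookup does not cover, so this approach assumes each row contains at most one of the two keywords.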