R: create a new column that takes the 1st column's value or, if a condition is met, a value from the 2nd/3rd column
           a         b      c    d
1     boiler     maker   <NA> <NA>
2      clerk assistant   <NA> <NA>
3     senior   machine setter <NA>
4   operated      <NA>   <NA> <NA>
5 consultant     Legal   <NA> <NA>
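How can I create a new column that takes the value from column 'a', unless any of the other columns contains Legal or assistant, in which case it takes that value instead?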
Try this:
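library("dplyr")
# %in% returns FALSE (not NA) for missing values, so rows without a match fall through to column a
df %>%
  mutate(new = ifelse(b %in% "Legal" | c %in% "Legal" | d %in% "Legal", "Legal",
               ifelse(b %in% "assistant" | c %in% "assistant" | d %in% "assistant", "assistant",
                      as.character(a))))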
The as.character call is needed if the values are factors. If they are not, it is unnecessary.
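To see why the conversion matters, here is a small sketch using a made-up factor `f` and a made-up logical vector `hit` (neither is from the question). When `ifelse` fills in values from a factor, it uses the underlying integer level codes rather than the labels, so the labels are lost unless the factor is converted first.

```r
# Hypothetical inputs, for illustration only
f   <- factor(c("clerk", "boiler", "senior"))
hit <- c(TRUE, FALSE, FALSE)

# Without as.character(): the non-matching rows come back as level codes, not labels
no_conv <- ifelse(hit, "assistant", f)

# With as.character(): the labels are kept
with_conv <- ifelse(hit, "assistant", as.character(f))

print(no_conv)
print(with_conv)
```

If the columns are already character vectors, both calls return the same result and the conversion is harmless.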
Here is a base-R solution. We use apply and any to test all the columns of each row at once.
df$col <- as.character(df$a)
df$col[apply(df == "Legal",1,any)] <- "Legal"
df$col[apply(df == "assistant",1,any)] <- "assistant"
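As a self-contained sketch of this approach, the data below is rebuilt by hand from the question's table. It assumes character columns with real NA values and the spelling `Legal` used in the answers' code. Passing `na.rm = TRUE` to `any` is an addition here: it makes the row test return FALSE rather than NA when a row has missing cells and no match.

```r
# Data rebuilt by hand from the question (assumption: character columns, real NA values)
df <- data.frame(
  a = c("boiler", "clerk", "senior", "operated", "consultant"),
  b = c("maker", "assistant", "machine", NA, "Legal"),
  c = c(NA, NA, "setter", NA, NA),
  d = NA_character_,
  stringsAsFactors = FALSE
)

# Start from column a, then overwrite rows where any cell equals a keyword
df$col <- as.character(df$a)
# na.rm = TRUE: rows with NA cells and no match yield FALSE instead of NA
df$col[apply(df == "Legal", 1, any, na.rm = TRUE)] <- "Legal"
df$col[apply(df == "assistant", 1, any, na.rm = TRUE)] <- "assistant"

print(df$col)
```

Because the second assignment runs last, a row containing both keywords would end up as assistant.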
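A base R alternative to @scoa's answer:
# indx is 1 for no match, 2 for "Legal", 3 for "assistant" (assumes no row contains both keywords)
indx <- apply(mydf == "Legal", 1, any, na.rm = TRUE) + apply(mydf == "assistant", 1, any, na.rm = TRUE) * 2 + 1L
# Pick, per row, column indx from a lookup matrix whose first column holds the values of a
mydf$col <- cbind(as.character(mydf$a), "Legal", "assistant")[cbind(seq_len(nrow(mydf)), indx)]
Or in one go:
mydf$col <- cbind(as.character(mydf$a), "Legal", "assistant")[cbind(seq_len(nrow(mydf)), apply(mydf == "Legal", 1, any, na.rm = TRUE) + apply(mydf == "assistant", 1, any, na.rm = TRUE) * 2 + 1L)]
This gives:
> mydf
           a         b      c    d       col
1     boiler     maker   <NA> <NA>    boiler
2      clerk assistant   <NA> <NA> assistant
3     senior   machine setter <NA>    senior
4   operated      <NA>   <NA> <NA>  operated
5 consultant     Legal   <NA> <NA>     Legal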
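The answers above hard-code two keywords. A possible generalisation, sketched here with a hypothetical helper `pick_keyword` (the name, its arguments, and the rule that the first listed keyword wins are all assumptions, not part of any answer), loops over a keyword vector and falls back to a chosen column:

```r
# Hypothetical helper: returns, per row, the first keyword found in any column,
# or the value of the fallback column when no keyword is present
pick_keyword <- function(data, keywords, fallback = "a") {
  out <- as.character(data[[fallback]])
  # Iterate in reverse so that earlier keywords overwrite later ones
  for (k in rev(keywords)) {
    hit <- apply(data == k, 1, any, na.rm = TRUE)
    out[hit] <- k
  }
  out
}

# Data rebuilt by hand from the question (assumption: character columns, real NA values)
mydf <- data.frame(
  a = c("boiler", "clerk", "senior", "operated", "consultant"),
  b = c("maker", "assistant", "machine", NA, "Legal"),
  c = c(NA, NA, "setter", NA, NA),
  d = NA_character_,
  stringsAsFactors = FALSE
)

mydf$col <- pick_keyword(mydf, c("Legal", "assistant"))
print(mydf$col)
```

Matching is exact and case-sensitive, so `legal` would not match `Legal`.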