Adding column values based on initial character strings in another column

I am trying to add "YES"/"NO" character values to a column based on whether a specific character value is present in another column.

Here is an example:

     V2.x          Clitic
1    can could     NA
2    d should      NA

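If the value in the first column, example$V2.x, starts with d or ll (that is, it matches ^d or ^ll), the value in example$Clitic should be YES; otherwise it should be NO.

So in the df above, example[1,2] should be NO and example[2,2] should be YES.

I would like to automate this on a data set with hundreds of rows and a dozen columns. I'm not sure how to go about it, although grepl() seems useful. Any help is much appreciated.

Structure: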

example <- structure(list(V2.x = structure(c(1L, 19L), .Label = c("can could", 
"can cud", "can may", "can might", "can should", "can will", 
"can would", "could can", "could may", "could might", "could should", 
"could used to", "could will", "d can", "d could", "d may", "d might", 
"d must", "d should", "d used to", "d will", "have to should", 
"have to will", "ll can", "ll could", "ll may", "ll might", "ll must", 
"ll shall", "ll should", "ll used to", "ll would", "may can", 
"may might", "may must", "may shall", "may should", "may used to", 
"may will", "may would", "might can", "might could", "might may", 
"might must", "might shall", "might should", "might will", "might would", 
"might wud", "must can", "must will", "must would", "shall can", 
"shall will", "should can", "should could", "should may", "should might", 
"should must", "should will", "should would", "used to could", 
"will can", "will could", "will kin", "will may", "will might", 
"will must", "will shall", "will should", "will would", "would can", 
"would could", "would may", "would might", "would must", "would should", 
"would will"), class = "factor"), Clitic = c(NA, NA)), row.names = 1:2, class = "data.frame")

You already have the regular expression you need; use it in grepl, which returns logical values.

grepl('^(d|ll)', example$V2.x)
#[1] FALSE  TRUE
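One thing worth checking is how strict the pattern needs to be. `^(d|ll)` matches any string that begins with "d" or "ll", including a full word such as "do". The factor levels shown in the question only contain "d " and "ll " as prefixes, so this makes no difference there, but if the real data may contain other words, a stricter pattern could be safer. The sketch below is an assumption about the intent, and the value "do can" is a hypothetical input that does not appear in the original data:

```r
# Sample values; "do can" is hypothetical and not part of the original data
x <- c("can could", "d should", "ll may", "do can")

# Original pattern: anything starting with "d" or "ll"
loose <- grepl("^(d|ll)", x)
loose
# FALSE  TRUE  TRUE  TRUE

# Stricter variant (assumption): the clitic must be followed by a space
strict <- grepl("^(d|ll) ", x)
strict
# FALSE  TRUE  TRUE FALSE
```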

To get "Yes"/"No" values, plug this into ifelse:

example$Clitic <- ifelse(grepl('^(d|ll)', example$V2.x), 'Yes', 'No')
#Without ifelse
#example$Clitic <- c('No', 'Yes')[grepl('^(d|ll)', example$V2.x) + 1]
example

#       V2.x Clitic
#1 can could     No
#2  d should    Yes
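Because grepl and ifelse are both vectorized, the same line works unchanged on hundreds of rows. If the check has to be repeated, it can be wrapped in a small function. This is a minimal sketch: the name `flag_clitic` and the stand-in data frame `df` are hypothetical, with values taken from the factor levels listed in the question.

```r
# Hypothetical helper: returns "Yes"/"No" for each element of x
flag_clitic <- function(x, pattern = "^(d|ll)") {
  # as.character() makes the factor-to-string conversion explicit
  ifelse(grepl(pattern, as.character(x)), "Yes", "No")
}

# Small stand-in for the larger data set
df <- data.frame(
  V2.x = factor(c("can could", "d should", "ll may", "might could"))
)

# One vectorized call fills the whole column
df$Clitic <- flag_clitic(df$V2.x)
df
```

The `pattern` argument is exposed so that a different regular expression can be passed in without editing the function.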

We can use str_detect from stringr together with dplyr:

library(stringr)
library(dplyr)
example %>%
   mutate(Clitic = case_when(str_detect(V2.x, "^(d|ll)") ~ "Yes", TRUE ~ "No"))