R: detect words and punctuation marks in text
I have some naturally occurring text:
text="word1 word2 word3. word4, word5 word6 word7"
and some elements that I would like to detect in that text:
elements=c("word2","word6 word7",".",",")
However,
elements[sapply(paste0("\\<",elements,"\\>"),grepl,text)]
only returns the unigram "word2" and the bigram "word6 word7". The period and the comma in the text are not detected.
How can I detect those as well?
elements[sapply(paste0("[",elements,"]"),grepl,text)] does the job.
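You don't need the square brackets: in a regular expression they are special metacharacters that denote a character class, so "[word2]" matches any single one of the characters w, o, r, d or 2 rather than the whole word. Match the elements as literal strings with fixed=TRUE instead: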
> text="word1 word2 word3. word4, word5 word6 word7"
> elements=c("word2","word6 word7",".",",")
> elements[sapply(elements,grepl,text, fixed=TRUE)]
[1] "word2" "word6 word7" "." ","
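To see how the approaches above differ, here is a small sketch that runs them on a few extra candidates. The entries "word", "2d" and "!" and the helper to_pattern() are made up for illustration and are not part of the question. The last variant is only one possible way to keep word boundaries for words while still matching punctuation literally; it assumes perl=TRUE so that \Q...\E quotes the element.

```r
text <- "word1 word2 word3. word4, word5 word6 word7"
elements <- c("word2", "word6 word7", ".", ",")

# Hypothetical extra candidates that do not occur in the text as whole tokens
candidates <- c(elements, "word", "2d", "!")

# Character class: "[2d]" matches any single "2" or "d", so "2d" is reported
char_class <- vapply(paste0("[", candidates, "]"), grepl, logical(1), text,
                     USE.NAMES = FALSE)

# Literal substring: "word" is reported because it is a substring of "word1"
fixed_match <- vapply(candidates, grepl, logical(1), text, fixed = TRUE,
                      USE.NAMES = FALSE)

# Hypothetical helper: quote the element literally with \Q...\E and add a
# word boundary only on a side that starts or ends with a word character
to_pattern <- function(el) {
  left  <- if (grepl("^\\w", el)) "\\b" else ""
  right <- if (grepl("\\w$", el)) "\\b" else ""
  paste0(left, "\\Q", el, "\\E", right)
}

hybrid <- vapply(candidates,
                 function(el) grepl(to_pattern(el), text, perl = TRUE),
                 logical(1), USE.NAMES = FALSE)

candidates[char_class]   # the four elements plus "word" and "2d"
candidates[fixed_match]  # the four elements plus "word"
candidates[hybrid]       # only the four elements
```

If partial matches such as "word" inside "word1" are acceptable for your data, the plain fixed=TRUE version is simpler and is probably enough.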