Filtering in R with punctuation characters

I have a column in a data set that looks like this:

$abc.MSFT

$MSFT

$msft

$abcMSFTxyz

I want the following output:

$MSFT

$msft

My filtering attempts:

dplyr::filter(Tweets, grepl("\bMMM$\b", ignore.case = TRUE, V2))

returns:

$abc.MSFT

$MSFT

$msft

dplyr::filter(Tweets,grepl("^$MMM$", ignore.case = TRUE, V2))

returns no rows.

One way to handle it:

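x <- c("$abc.MSFT", "$MSFT", "$msft", "$abcMSFTxyz")
Tweets <- data.frame(V2 = x, stringsAsFactors = FALSE)
Tweets
#           V2
#1   $abc.MSFT
#2       $MSFT
#3       $msft
#4 $abcMSFTxyz

# your way: inside an R string "\b" is a backspace character, not a word
# boundary, and the unescaped $ is an end-of-line anchor, so nothing matches
dplyr::filter(Tweets, grepl("\bMMM$\b", ignore.case = TRUE, V2))
#[1] V2
#<0 rows> (or 0-length row.names)

# another way: escape the dollar sign; the backslash is doubled because it
# sits inside an R string literal
dplyr::filter(Tweets, grepl("^\\$msft$", ignore.case = TRUE, V2))
#     V2
#1 $MSFT
#2 $msft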
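
The same filter can be sketched in base R, which also makes it easier to see why a word-boundary pattern keeps `$abc.MSFT`. This is only an illustration under the sample data above; the variable names `keep_regex`, `keep_exact` and `boundary` are made up for the example.

```r
x <- c("$abc.MSFT", "$MSFT", "$msft", "$abcMSFTxyz")
Tweets <- data.frame(V2 = x, stringsAsFactors = FALSE)

# Regex route: "\\$" reaches the regex engine as \$, a literal dollar sign.
keep_regex <- grepl("^\\$msft$", Tweets$V2, ignore.case = TRUE)

# Non-regex route: compare the lower-cased strings directly.
keep_exact <- tolower(Tweets$V2) == "$msft"

# A word boundary (\\b) also matches right after the ".", which is why
# a \\bMSFT\\b style pattern lets "$abc.MSFT" through.
boundary <- grepl("\\bMSFT\\b", Tweets$V2, ignore.case = TRUE)

Tweets[keep_regex, , drop = FALSE]   # rows 2 and 3
identical(keep_regex, keep_exact)    # both routes select the same rows
```

Anchoring with `^` and `$` is what excludes `$abc.MSFT`; the word boundary alone is not enough.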

From the regex help:

...there are 12 characters with special meanings: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), the opening square bracket [, and the opening curly brace {. These special characters are often called "metacharacters".

The fix:

If you want to use any of these characters as a literal in a regex, you need to escape them with a backslash. If you want to match 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has a special meaning.
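
In R the quoted rule needs one extra step, because the pattern is written as a string literal and the backslash must itself be doubled. A small sketch of the options, using base R `grepl` only (the variable names are illustrative):

```r
# The regex 1\+1=2 is written "1\\+1=2" in R source code.
escaped   <- grepl("1\\+1=2", "1+1=2")

# Without the escape, + is a quantifier ("one or more 1s"), so the
# literal text "1+1=2" is not matched.
unescaped <- grepl("1+1=2", "1+1=2")

# fixed = TRUE switches regex syntax off and matches the text as-is.
literal   <- grepl("1+1=2", "1+1=2", fixed = TRUE)

# A character class is another way to make $ literal without backslashes.
in_class  <- grepl("^[$]msft$", "$MSFT", ignore.case = TRUE)

c(escaped, unescaped, literal, in_class)
# TRUE FALSE TRUE TRUE
```

`fixed = TRUE` is convenient for plain substring checks, but it cannot be combined with anchors or `ignore.case`, so the escaped or character-class forms are the ones that fit the question here.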

Study regular expressions. They are worth the time to learn in whatever language you plan to use.