Matching special characters in R

Hello, I have the following data.

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", 
                   "appple+20gfree", 
                   "BELI HG MSWAT ALA +VAT T 100g BAR WR", 
                   "TOOLAIT CASSE+LSST+SSSRE 40g SAC MDC")

As a second step, I remove all spaces from shopping_list.

library(stringr)
shopping_list_trim <- str_replace_all(shopping_list, fixed(" "), "")
print(shopping_list_trim)
[1] "applesx4" "bagofflour" "bagofsugar"             
[4] "milkx2" "appple+20gfree" "BELIHGMSWATALA+VATT100gBARWR"
[7] "TOOLAITCASSE+LSST+SSSRE40gSACMDC"

To extract the strings that do not contain a plus sign, I use the following code.

str_extract(shopping_list_trim, "^[^+]+$")
[1] "applesx4"   "bagofflour" "bagofsugar" "milkx2"  NA  NA NA     

I would like help extracting the strings that do contain a plus sign. The output I am hoping for looks like this.

NA NA NA NA   "appple+20gfree" 
"BELIHGMSWATALA+VATT100gBARWR" "TOOLAITCASSE+LSST+SSSRE40gSACMDC"

Does anyone know how to extract only the strings that contain a plus sign?

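This should do it. Note that inside an R string literal the backslash itself has to be escaped, so the regex \+ is written as "\\+".

> str_extract(shopping_list_trim, "^(?=.*\\+)(.+)$")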
[1] NA                                
[2] NA                                
[3] NA                                
[4] NA                                
[5] "appple+20gfree"                  
[6] "BELIHGMSWATALA+VATT100gBARWR"    
[7] "TOOLAITCASSE+LSST+SSSRE40gSACMDC"

Regex breakdown

^(?=.*\+) #Lookahead that checks the string contains at least one plus sign
(.+)$ #Capture the whole string if the lookahead succeeds
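
The same two-step logic (check for a plus sign, then keep the string if the check passes) can also be sketched without any regex. This is only an illustration in base R, not part of the original answer; the names has_plus and with_plus are made up for the example.

```r
# Same data as in the question, after the spaces have been removed.
shopping_list_trim <- c("applesx4", "bagofflour", "bagofsugar", "milkx2",
                        "appple+20gfree",
                        "BELIHGMSWATALA+VATT100gBARWR",
                        "TOOLAITCASSE+LSST+SSSRE40gSACMDC")

# fixed = TRUE treats "+" as a literal character, so no escaping is needed.
has_plus <- grepl("+", shopping_list_trim, fixed = TRUE)

# Keep the string where a plus sign was found, NA everywhere else.
with_plus <- ifelse(has_plus, shopping_list_trim, NA_character_)
print(with_plus)
```

This keeps the vector the same length as the input, with NA in the positions that have no plus sign, which matches the shape of the output asked for in the question.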

If you don't want to use lookarounds, try

^.*\+.*$

It matches anything, followed by a +, followed by anything :)
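
As a rough sketch of how the pattern without lookarounds could be used from R with stringr (the variable names below are hypothetical): the regex is written with a single backslash above, but in an R string literal that backslash has to be doubled. A character class such as [+] is an alternative that avoids the backslash entirely.

```r
library(stringr)

# Same data as in the question, after the spaces have been removed.
shopping_list_trim <- c("applesx4", "bagofflour", "bagofsugar", "milkx2",
                        "appple+20gfree",
                        "BELIHGMSWATALA+VATT100gBARWR",
                        "TOOLAITCASSE+LSST+SSSRE40gSACMDC")

# The regex ^.*\+.*$ becomes "^.*\\+.*$" inside an R string literal.
escaped <- str_extract(shopping_list_trim, "^.*\\+.*$")

# Inside a character class the plus sign is literal, so no backslash is needed.
bracketed <- str_extract(shopping_list_trim, "^.*[+].*$")

print(escaped)
print(bracketed)
```

Both calls should return NA for the first four elements and the full string for the last three.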

You can see it working on regex101.

Regards