Removing characters after a euro symbol in R

I have a euro symbol saved in the variable "euro":

euro <- "\u20AC"
euro
#[1] "€"

The variable "eurosearch" contains "services as defined in this SOW at a price of € 15,896.80 (if executed fro":

eurosearch
[1] "services as defined in this SOW at a price of € 15,896.80 (if executed fro"

I want the characters after the euro symbol, "15,896.80 (if executed fro". I am using this code:

gsub("^.*[euro]","",eurosearch)

But I get an empty result. How can I obtain the expected output?

Use regmatches from base R, or str_extract from stringr, etc.

> x <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
> regmatches(x, regexpr("(?<=€ )\\S+", x, perl=TRUE))
[1] "15,896.80"

> gsub("€ (\\S+)|.", "\\1", x)
[1] "15,896.80"

Using a variable:

euro <- "\u20AC"
gsub(paste(euro, "(\\S+)|."), "\\1", x)

If the variable-based version does not work for you, you need to set the encoding of the input string:

gsub(paste(euro, "(\\S+)|."), "\\1", `Encoding<-`(x, "UTF-8"))
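
As a rough sketch of why the attempt in the question returns an empty string, and of one way to keep everything after the symbol rather than only the first token: inside square brackets, "euro" is not the variable but a character class of the four letters e, u, r and o. The name "rest" is only illustrative, and the symbol is written as \u20AC so the snippet does not depend on the encoding of the source file.

```r
euro <- "\u20AC"
x <- "services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro"

# "[euro]" matches one of the letters e, u, r, o. The greedy "^.*" runs up
# to the LAST such letter, which here is the final "o", so the whole
# string is matched and removed.
gsub("^.*[euro]", "", x)

# Build the pattern from the variable instead, and also drop the
# whitespace that follows the symbol.
rest <- sub(paste0("^.*", euro, "\\s*"), "", x)
rest
# [1] "15,896.80 (if executed fro"
```

This only works as written because the euro sign has no special meaning in a regular expression; a symbol such as "$" would need escaping first.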

You can use a variable in the pattern simply by concatenating strings with paste0:
euro <- "€"
eurosearch <- "services as defined in this SOW at a price of € 15,896.80 (if executed fro"
sub(paste0("^.*", gsub("([^A-Za-z_0-9])", "\\\\\\1", euro), "\\s*(\\S+).*"), "\\1", eurosearch)

euro <- "$"
eurosearch <- "services as defined in this SOW at a price of $ 25,196.4 (if executed fro"
sub(paste0("^.*", gsub("([^A-Za-z_0-9])", "\\\\\\1", euro), "\\s*(\\S+).*"), "\\1", eurosearch)

CodingGround demo

Note that with gsub("([^A-Za-z_0-9])", "\\\\\\1", euro) I am escaping every non-word character, so that $ is treated as a literal rather than as a special regex metacharacter (taken from this SO post).
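
If escaping the symbol feels fragile, one possible alternative is to locate it as a fixed string and cut the text with substring, so that no metacharacter handling is needed at all. The helper below is a minimal sketch; "token_after" is a hypothetical name, not something from the answers above or from any package.

```r
# Hypothetical helper: return the first whitespace-delimited token that
# follows a literal symbol, or NA if the symbol does not occur.
token_after <- function(text, symbol) {
  # fixed = TRUE treats `symbol` as plain text, so "$" needs no escaping
  hit <- regexpr(symbol, text, fixed = TRUE)
  pos <- as.integer(hit)
  if (pos == -1L) return(NA_character_)
  # Keep everything after the symbol, then trim leading whitespace
  rest <- substring(text, pos + attr(hit, "match.length"))
  rest <- sub("^\\s+", "", rest)
  # Drop everything from the first remaining whitespace onwards
  sub("\\s.*$", "", rest)
}

token_after("services as defined in this SOW at a price of \u20AC 15,896.80 (if executed fro", "\u20AC")
# [1] "15,896.80"
token_after("services as defined in this SOW at a price of $ 25,196.4 (if executed fro", "$")
# [1] "25,196.4"
```

Unlike the sub call above, this uses the first occurrence of the symbol rather than the last, which only matters when the symbol appears more than once in the text.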