Invalid regular expression in gsub
Why does this email regex fail with the following error?
invalid regular expression '^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$', reason 'Invalid character range'
blogs.smpl <- "mail:mami@yahoo.com: subject:Lorem Ipsum body: is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s"
# The dot is escaped as \\. inside the R string; this call raises the error above
blogs.smpl <- gsub("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$", "", blogs.smpl)
The cause is this part:
[a-zA-Z0-9-.]
Try putting the hyphen at the end, like this:
[a-zA-Z0-9.-]
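A literal - should only appear at the start or the end of a character class. Anywhere else, it is read as a range between the symbols before and after it.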
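A minimal sketch of the difference, assuming R's default regex engine (no perl=TRUE). The variable names bad, good, and unanchored are illustrative and not from the question. The last pattern is an extra assumption: it drops the ^ and $ anchors, because the anchored pattern only matches a string that consists entirely of an email address, which the sample text does not.

```r
# Pattern from the question: hyphen in the middle of the last class
bad  <- "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"
# Same pattern with the hyphen moved to the end of the last class
good <- "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+$"

# Capture the compilation error as a condition object instead of stopping
bad_attempt <- tryCatch(
  suppressWarnings(gsub(bad, "", "mami@yahoo.com")),
  error = function(e) e
)

# The corrected pattern compiles and removes a string that is only an address
good_result <- gsub(good, "", "mami@yahoo.com")

# Assumption: to strip an address embedded in longer text, drop the anchors
unanchored <- "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+"
sample_text <- "mail:mami@yahoo.com: subject:Lorem Ipsum"
anchored_result   <- gsub(good, "", sample_text)        # no match, text unchanged
unanchored_result <- gsub(unanchored, "", sample_text)  # address removed
```

Here bad_attempt holds the error condition, good_result is an empty string, and only the unanchored pattern changes sample_text.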
The last character class is wrong: [a-zA-Z0-9-.]. It must be changed to [a-zA-Z0-9.-].
Note: in R, you cannot escape a hyphen inside a character class to match a literal hyphen unless you use perl=TRUE.
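As a small illustration of the perl=TRUE route, the sketch below escapes the hyphen in the middle of a class and passes perl = TRUE so that the PCRE engine is used. The input strings and variable names are made up for the example.

```r
# With perl = TRUE, an escaped hyphen is a literal hyphen anywhere in the class
x <- "a-b.c"
stripped <- gsub("[a\\-.]", "", x, perl = TRUE)

# The email pattern with the hyphen escaped in place rather than moved
email_pcre <- "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9\\-.]+$"
is_email <- grepl(email_pcre, "mami@yahoo.com", perl = TRUE)
```

Moving the hyphen to the end of the class, as suggested above, works with either engine, so it is the more portable of the two fixes.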
Also, see the R String Manipulation PDF for more on R character classes (page 2) and on regular expressions in general. Here is an excerpt:
Here is a set of rules on how to match characters as regular characters inside a character class:
To match ] inside a character class put it first.
To match - inside a character class put it first or last.
To match ^ inside a character class put it anywhere, but first.
To match any other character or metacharacter (but \) inside a character class put it anywhere.
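The placement rules in the excerpt can be tried directly with gsub. This is only a sketch with made-up inputs; each call deletes every character that the class matches, using R's default engine.

```r
# ] placed first in the class is a literal ]
r_bracket <- gsub("[]x]", "", "x]y")

# - placed first or last in the class is a literal -
r_hyphen_first <- gsub("[-x]", "", "x-y")
r_hyphen_last  <- gsub("[x-]", "", "x-y")

# ^ placed anywhere except first is a literal ^
r_caret <- gsub("[x^]", "", "x^y")

# Other metacharacters such as . and + are literal inside a class
r_meta <- gsub("[.+]", "", "a.b+c")
```

In each of the first four calls only "y" is left; the last call leaves "abc".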