Meaning of regular expressions like - \\d, \\D, ^, $ etc

Question

What do these expressions mean? Where can I learn about their usage?

\d 
\D 
\s 
\S 
\w 
\W
\t 
\n 
^   
$   
\   
|  etc.

I need to use the stringr package, but I have absolutely no idea how to use these.

Answer 1

From ?regexp, in the Extended Regular Expressions section:

The caret ‘^’ and the dollar sign ‘$’ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols ‘\<’ and ‘\>’ match the empty string at the beginning and end of a word. The symbol ‘\b’ matches the empty string at either edge of a word, and ‘\B’ matches the empty string provided it is not at an edge of a word. (The interpretation of ‘word’ depends on the locale and implementation: these are all extensions.)
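As a rough illustration of the anchors described in that passage, here is a minimal sketch in base R. The sample vector `x` is made up for the example, and each element of a character vector is treated as its own "line".

```r
# Hypothetical sample data, not from the question
x <- c("apple pie", "pineapple", "an apple")

# ^ anchors the match at the start of each string
starts <- grepl("^apple", x)

# $ anchors the match at the end of each string
ends <- grepl("apple$", x)

# \\b matches the empty string at either edge of a word,
# so "pineapple" is rejected: there is no word edge before "apple"
whole_word <- grepl("\\bapple\\b", x)

print(starts)      # TRUE FALSE FALSE
print(ends)        # FALSE TRUE TRUE
print(whole_word)  # TRUE FALSE TRUE
```

With stringr, `str_detect(x, "^apple")` should behave the same way for the `^`, `$` and `\\b` patterns shown here.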

From Perl-like Regular Expressions:

The escape sequences ‘\d’, ‘\s’ and ‘\w’ represent any decimal digit, space character and ‘word’ character (letter, digit or underscore in the current locale: in UTF-8 mode only ASCII letters and digits are considered) respectively, and their upper-case versions represent their negation. Vertical tab was not regarded as a space character in a ‘C’ locale before PCRE 8.34 (included in R 3.0.3). Sequences ‘\h’, ‘\v’, ‘\H’ and ‘\V’ match horizontal and vertical space or the negation. (In UTF-8 mode, these do match non-ASCII Unicode code points.)
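The character classes in that passage can be tried out as follows. This is only a sketch in base R with `perl = TRUE`; the input string is invented, and the `+` quantifier ("one or more") is added for convenience and is not part of the quoted text. Backslashes are doubled because the patterns are written as R strings.

```r
# Hypothetical input string
x <- "Order 66 shipped on 2024-05-01"

# \\d matches a decimal digit; \\d+ matches a run of digits
digits <- regmatches(x, gregexpr("\\d+", x, perl = TRUE))[[1]]

# \\D is the negation of \\d: delete everything that is not a digit
only_digits <- gsub("\\D", "", x, perl = TRUE)

# \\s matches a space character: replace each one with an underscore
underscored <- gsub("\\s", "_", x, perl = TRUE)

# \\w matches a 'word' character (letter, digit or underscore)
words <- regmatches(x, gregexpr("\\w+", x, perl = TRUE))[[1]]

print(digits)       # "66" "2024" "05" "01"
print(only_digits)  # "6620240501"
print(underscored)  # "Order_66_shipped_on_2024-05-01"
print(words)        # "Order" "66" "shipped" "on" "2024" "05" "01"
```

The stringr counterparts would be along the lines of `str_extract_all(x, "\\d+")` and `str_replace_all(x, "\\s", "_")`.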

Note that backslashes usually need to be doubled/protected in R input, e.g. you would use "\\h" to match horizontal space.
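To see why the doubling is needed, the sketch below checks what an R string literal actually contains and then uses a few doubled-backslash patterns. The inputs are invented for the example, and the `|` (alternation) and `\\.` (literal dot) lines go slightly beyond the quoted help text to cover the `|` and `\` items in the question.

```r
# The R literal "\\d" holds two characters: a backslash and "d".
# That two-character text is what the regex engine receives.
pattern_length <- nchar("\\d")
cat("\\d\n")   # prints \d

# \\h matches horizontal space (PCRE, hence perl = TRUE)
squeezed <- gsub("\\h+", " ", "a \t  b", perl = TRUE)

# A backslash removes the special meaning of a metacharacter:
# "." alone matches any character, "\\." only a literal dot
literal_dot <- grepl("a\\.b", c("a.b", "axb"))

# | means "or": match either alternative
either <- grepl("cat|dog", c("hotdog", "bird"))

print(pattern_length)  # 2
print(squeezed)        # "a b"
print(literal_dot)     # TRUE FALSE
print(either)          # TRUE FALSE
```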

From ?Quotes:

Backslash is used to start an escape sequence inside character constants. Escaping a character not in the following table is an error.
\n newline
\r carriage return
\t tab
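Unlike `\\d` or `\\s`, these escapes are handled by R itself when it reads the string, before any regular expression is involved. A small sketch with a made-up string `s`:

```r
# Hypothetical two-line, tab-separated text
s <- "col1\tcol2\nrow1\trow2"

# "\t" is a single tab character inside the string
tab_length <- nchar("\t")

# cat() shows the string with the escapes interpreted
cat(s, "\n")

# Split on the newline character, then on the tab character
lines <- strsplit(s, "\n", fixed = TRUE)[[1]]
fields <- strsplit(lines[1], "\t", fixed = TRUE)[[1]]

# The regex escape \\t (backslash + t, interpreted by PCRE)
# also matches a tab character
has_tab <- grepl("\\t", s, perl = TRUE)

print(tab_length)  # 1
print(lines)       # "col1\tcol2" "row1\trow2"
print(fields)      # "col1" "col2"
print(has_tab)     # TRUE
```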

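As others commented above, if you are just starting out with regular expressions you will probably need more help. This is a little off-topic for Stack Overflow (links to off-site resources), but there are some links to regular expression resources at the bottom of the gsubfn package overview. Or Google "regular expression tutorial" ...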

regex

r

gsub

stringr