JavaScript regex engine: Word boundaries not matching at start of string for non-word characters

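I thought \b matches the transition between a word character and a non-word character, or the beginning or end of the string. So this should produce a match: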

'#abc'.match(/\b#/)

But it returns null, at least in Firefox and Chrome. Any idea why?

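\b is roughly equivalent to (^\w|\w$|\W\w|\w\W), except that \b is zero-width: it matches a position and consumes no characters. You may have read the following in the Mozilla documentation: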

A word boundary matches the position between a word character followed by a non-word character, or between a non-word character followed by a word character, or the beginning of the string, or the end of the string.

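That is poorly worded. It should specify that the beginning or end of the string counts as a word boundary only when it is adjacent to a word character. This is the problem with writing long sentences instead of bullet points when explaining something fairly algorithmic: it is hard to read, and therefore hard to proofread.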
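To make the corrected rule concrete, here is a minimal sketch that rewrites the four cases of the alternation as zero-width lookarounds and compares the result with \b. The names `boundaryLike` and `markPositions` are made up for this illustration, and the sketch assumes an engine with lookbehind support (ES2018).

```javascript
// Zero-width rewrite of (^\w|\w$|\W\w|\w\W): the lookarounds inspect the
// neighbouring characters without consuming them.
// Assumption: the engine supports lookbehind (ES2018).
const boundaryLike = /^(?=\w)|(?<=\w)$|(?<=\W)(?=\w)|(?<=\w)(?=\W)/g;

// Hypothetical helper: insert '|' at every position where `re` matches.
const markPositions = (s, re) => s.replace(re, '|');

console.log(markPositions('#abc', /\b/g));        // #|abc|
console.log(markPositions('#abc', boundaryLike)); // #|abc|
console.log(markPositions('abc', /\b/g));         // |abc|
```

Neither pattern puts a marker before the '#', because the start of the string is followed by a non-word character there.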


An example of a correct definition, from a source other than Mozilla:

There are three different positions that qualify as word boundaries:

  • Before the first character in the string, if the first character is a word character.
  • After the last character in the string, if the last character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.
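The three rules above can be sketched as a single comparison: treat the positions just outside the string as non-word characters, and a boundary exists wherever the characters on either side differ in "wordness". The helpers `isWordChar`, `wordBoundaryPositions` and `regexBoundaryPositions` are hypothetical names for this illustration, not part of any library.

```javascript
// Hypothetical helper: true if `ch` is a word character (\w).
// An index outside the string yields undefined, which counts as non-word.
const isWordChar = (ch) => ch !== undefined && /^\w$/.test(ch);

// Collect every index i where the characters before and after position i
// differ in wordness. This covers all three rules in the list.
function wordBoundaryPositions(s) {
  const positions = [];
  for (let i = 0; i <= s.length; i++) {
    if (isWordChar(s[i - 1]) !== isWordChar(s[i])) positions.push(i);
  }
  return positions;
}

// Cross-check against the engine: a sticky /\b/ only tests position lastIndex.
function regexBoundaryPositions(s) {
  const re = /\b/y;
  const positions = [];
  for (let i = 0; i <= s.length; i++) {
    re.lastIndex = i;
    if (re.test(s)) positions.push(i);
  }
  return positions;
}

console.log(wordBoundaryPositions('#abc'));  // [1, 4]
console.log(regexBoundaryPositions('#abc')); // [1, 4]
console.log(wordBoundaryPositions('abc'));   // [0, 3]
```

For '#abc', position 0 is not in the result, which is why /\b#/ has nothing to match.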

'#' is not a word character, so there is no word boundary to match at the start of the string. It's that simple.

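If you remove the '#' so that the string is just 'abc', then \b correctly matches at the start of the string, because the first character is now a word character.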
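A quick way to check this in a browser console or Node; the variable names are only for illustration:

```javascript
// '#' is a non-word character, so no word boundary precedes it at index 0.
const withHash = '#abc'.match(/\b#/);
console.log(withHash); // null

// Without the '#', the string starts with a word character, so \b matches there.
const withoutHash = 'abc'.match(/\babc/);
console.log(withoutHash.index); // 0

// In '#abc' the boundary sits between '#' and 'a', so \b belongs after the '#'.
const afterHash = '#abc'.match(/#\babc/);
console.log(afterHash[0]); // #abc
```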