php preg_replace using arrays - first or last letter having an accented character does not work

In this example I have the word así, which ends with an accented i.

 $str = "A string containing the word así which should be changed to color purple";

  $prac[] = "/\basí\b/i";
  $prac2[] = "<span class='readword'  style='color:purple'>$0 </span>";

 $str= preg_replace($prac,$prac2,$str);

 echo $str;

The string is not changed. However, a word that neither begins nor ends with an accented character is replaced as expected. For example:

 $str = "A string containing another word which should be changed to color purple";
  $prac[] = "/\banother word\b/i";
  $prac2[] = "<span class='readword'  style='color:purple'>$0 </span>";

 $str= preg_replace($prac,$prac2,$str);

 echo $str;
 ?>

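If the accent is in the middle of the word, it also always works. I also tested this word with the array on its own and with preg_replace on its own, and neither seems to have a problem with it. It only fails when I pass the arrays as arguments to preg_replace.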

Please help, I cannot find any information about this anywhere.

Thanks

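Apparently PHP does not treat the accented character as a word character here. Without the u modifier the subject is matched byte by byte, and in the default locale the bytes that make up a UTF-8 í are not word characters. \b matches at three kinds of position: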

  • Before the first character in the string, if the first character is a word character.
  • After the last character in the string, if the last character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.

Source: https://www.regular-expressions.info/wordboundaries.html

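So when you use /\basí\b/i to match así in the string, it does not match, because none of the three conditions quoted above holds for the closing \b. The first and second obviously do not apply, since así is in the middle of the string. The third says that \b inside the string needs two adjacent characters, one a word character and the other not. Here we have í and a space, and neither is a word character.

That said, I am not sure my understanding is correct.

As a workaround, you can replace the regular expression with /\basí(\b|\s+)/i. Note that this also consumes the whitespace after the word, and it still does not match when así is followed by punctuation or ends the string.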
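The reasoning above can be checked by listing the byte offsets at which \b matches, with and without the u modifier. This is only a minimal sketch: boundary_offsets() is a hypothetical helper that is not part of the original post, the subject is assumed to be UTF-8 (built with a "\u{...}" escape, which needs PHP 7 or later), and the default locale is assumed.

```php
<?php
// Hypothetical helper for illustration only: returns the byte offsets at
// which \b matches in $subject, optionally with extra pattern modifiers.
function boundary_offsets(string $subject, string $modifiers = ''): array
{
    preg_match_all('/\b/' . $modifiers, $subject, $matches, PREG_OFFSET_CAPTURE);
    // Each entry of $matches[0] is [matched text, byte offset].
    return array_column($matches[0], 1);
}

// "así x" as UTF-8: "í" (U+00ED) occupies two bytes, so the space sits at
// byte offset 4.
$word    = "as\u{00ED}";
$subject = $word . " x";

// Byte mode: offset 4 (right after "í") should not be listed, because the
// last byte of "í" and the space are both non-word characters.
print_r(boundary_offsets($subject));

// Unicode mode: "í" counts as a letter, so offset 4 should be listed.
print_r(boundary_offsets($subject, 'u'));

// The pattern from the question, without and with the u modifier.
var_dump(preg_match('/\b' . $word . '\b/i', $subject));
var_dump(preg_match('/\b' . $word . '\b/iu', $subject));
```

If the explanation holds, only the second preg_match call reports a match, which is consistent with the behaviour described in the question.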

Also see: Regex word boundary issue when angle brackets are adjacent to the boundary

http://php.net/manual/en/function.preg-replace.php#89471

Use the Unicode flag (the u modifier):

$str = "A string containing the word así which should be changed to color purple";
$prac[] = "/\basí\b/iu";
#             here __^
$prac2[] = "<span class='readword'  style='color:purple'>$0 </span>";
$str= preg_replace($prac,$prac2,$str);
echo $str;

Result for the given example:

A string containing the word <span class='readword'  style='color:purple'>así </span> which should be changed to color purple
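Since the question builds its patterns in arrays, the same fix can be applied to a whole word list. The following is a rough sketch rather than a definitive implementation: highlight_words() is a hypothetical helper that is not from the original post, and it assumes that both the text and the words are valid UTF-8.

```php
<?php
/**
 * Hypothetical helper (not from the original post): wraps every whole-word,
 * case-insensitive occurrence of the given words in a purple <span>.
 */
function highlight_words(string $text, array $words): string
{
    $patterns = [];
    foreach ($words as $word) {
        // preg_quote() escapes regex metacharacters in the word; the "u"
        // modifier makes \b and case-insensitive matching Unicode-aware.
        $patterns[] = '/\b' . preg_quote($word, '/') . '\b/iu';
    }

    // A single replacement string is reused for every pattern in the array.
    $replacement = "<span class='readword' style='color:purple'>\$0</span>";

    // preg_replace() returns null on error (for example, invalid UTF-8 in
    // the subject when "u" is set); fall back to the unchanged text.
    return preg_replace($patterns, $replacement, $text) ?? $text;
}

echo highlight_words(
    "La palabra as\u{00ED} y another word.",
    ["as\u{00ED}", "another word"]
), "\n";
```

One caveat with this approach: the patterns are applied one after another to the already modified text, so a word such as "color" or "span" in the list would also match inside markup inserted by an earlier pattern.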