PHP preg_replace using arrays: a word whose first or last letter is an accented character does not match
In this example I have the word así, which ends with an accented i character.
$str = "A string containing the word así which should be changed to color purple";
$prac[] = "/\basí\b/i";
$prac2[] = "<span class='readword' style='color:purple'>$0 </span>";
$str= preg_replace($prac,$prac2,$str);
echo $str;
Nothing changes. However, if I have a word that does not begin or end with an accented character, it does change. For example:
$str = "A string containing another word which should be changed to color purple";
$prac[] = "/\banother word\b/i";
$prac2[] = "<span class='readword' style='color:purple'>$0 </span>";
$str= preg_replace($prac,$prac2,$str);
echo $str;
?>
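If the accent is in the middle of the word, it always works as well. I have also tested the arrays by themselves and preg_replace by itself with this word, and the word seems to cause no problem with either. It only fails when I use arrays as arguments to preg_replace.
Please help, I can't find any information about this anywhere.
Thank you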
Apparently PHP does not treat the accented character as a word character here. There are 3 positions where the word boundary \b matches:
- Before the first character in the string, if the first character is a word character.
- After the last character in the string, if the last character is a word character.
- Between two characters in the string, where one is a word character and the other is not a word character.
Source: https://www.regular-expressions.info/wordboundaries.html
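To make the three conditions concrete, here is a small sketch that lists the byte offsets where \b matches. The helper name boundaryOffsets is made up for this illustration, and the snippet assumes the file is saved as UTF-8, so that í is stored as the two bytes C3 AD.

```php
<?php
// Hypothetical helper: returns the byte offsets at which $pattern matches.
function boundaryOffsets(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    // Each entry is [matched text, byte offset]; keep only the offset.
    return array_column($matches[0], 1);
}

// Plain ASCII: boundaries before "h", after "i", before "t", after the final "e".
print_r(boundaryOffsets('/\b/', "hi there"));

// Assumes this file is saved as UTF-8, so "í" is the two bytes C3 AD.
// Without the u modifier neither byte counts as a word character,
// so the only boundaries are before "a" and after "s".
print_r(boundaryOffsets('/\b/', "así"));
```

For "así" there is no boundary after the í, which is consistent with the behaviour described in the question.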
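So when you use /\basí\b/i to match así in the string, the match fails because none of the 3 conditions quoted above is met after the í. The first and second clearly do not apply, because así is in the middle of the string. The third says that \b needs two adjacent characters, one a word character and the other not; here we have í and a space, and neither of them is treated as a word character.
I'm not sure my understanding is entirely correct, though.
As a solution, you can replace the regexp with /\basí(\b|\s+)/i
Also check: Regex word boundary issue when angle brackets are adjacent to the boundary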
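A quick way to compare the original pattern with the workaround is to look at what each one actually matches. This is only a sketch: firstMatch is a made-up helper, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8.
$original   = '/\basí\b/i';
$workaround = '/\basí(\b|\s+)/i';

// Hypothetical helper: returns the matched text, or null when nothing matches.
function firstMatch(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

// Original pattern: no boundary is found after "í", so there is no match.
var_dump(firstMatch($original, "the word así here"));

// Workaround: \s+ consumes the space, so the match is "así " (with the space).
var_dump(firstMatch($workaround, "the word así here"));

// Workaround at the end of the string: neither \b nor \s+ can match there.
var_dump(firstMatch($workaround, "the word is así"));
```

Two things to keep in mind with the workaround: the trailing whitespace becomes part of $0, so it ends up inside the span, and the word is still missed when it is the last thing in the string or is followed by punctuation.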
Use the unicode flag:
$str = "A string containing the word así which should be changed to color purple";
$prac[] = "/\basí\b/iu";
# here __^
$prac2[] = "<span class='readword' style='color:purple'>$0 </span>";
$str= preg_replace($prac,$prac2,$str);
echo $str;
Result for the given example:
A string containing the word <span class='readword' style='color:purple'>así </span> which should be changed to color purple
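Since the question is about passing arrays to preg_replace, here is a sketch of how the u modifier could be applied to a whole list of words. The function highlightWords and the sample sentence are invented for this illustration and are not part of the original code; the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): wraps each listed word in a span.
function highlightWords(string $text, array $words): string
{
    $patterns = [];
    foreach ($words as $word) {
        // preg_quote escapes regex metacharacters and the "/" delimiter.
        // The u modifier makes \b and the i flag work on UTF-8 characters.
        $patterns[] = '/\b' . preg_quote($word, '/') . '\b/iu';
    }
    $replacement = "<span class='readword' style='color:purple'>$0</span>";

    // A single replacement string is applied to every pattern in the array.
    $result = preg_replace($patterns, $replacement, $text);

    // preg_replace returns null on error (for example, invalid UTF-8 input
    // with the u modifier); fall back to the unchanged text in that case.
    return $result ?? $text;
}

echo highlightWords("Está así y después", ["así", "está"]), "\n";
```

Note that preg_replace applies the patterns one after another to the already modified text, so a word list containing something like "span" or "color" would also match inside the markup inserted by an earlier pattern.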