What are all the kinds of whitespace variants?
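I want to replace every kind of whitespace with an underscore.
My problem is that there are many kinds of whitespace. So far I have found:
- non-breaking space
- en space
- em space
- thin space
I am using
preg_replace("/\p{Z}/", "_", $text);
I would like a list of all the kinds of whitespace characters.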
You can use
preg_replace("/\s/u", "_", $text);
The u modifier makes \s Unicode-aware, so it matches any Unicode whitespace character.
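To make the difference from the \p{Z} pattern in the question concrete, here is a minimal sketch. The sample string and variable names are made up for illustration, and it assumes PHP 7.0 or later for the \u{...} escapes. \p{Z} only covers the Unicode separator categories, so control characters such as tab and line feed are left alone, while \s with the u modifier replaces them too.

```php
<?php
// Assumed sample data: NO-BREAK SPACE, EM SPACE, then a tab and a line feed.
$sample = "a\u{00A0}b\u{2003}c\td\ne";

// \p{Z} matches only the separator categories (Zs, Zl, Zp),
// so the tab and line feed (control characters) stay untouched.
$separatorsOnly = preg_replace('/\p{Z}/u', '_', $sample);

// \s with the u modifier also matches tab, line feed, and similar controls.
$allWhitespace = preg_replace('/\s/u', '_', $sample);

var_dump($separatorsOnly === "a_b_c\td\ne"); // bool(true)
var_dump($allWhitespace === "a_b_c_d_e");    // bool(true)
```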
Here is a test on a string made up of U+0020, U+00A0, U+1680, U+2000, U+2001, U+2002, U+2003, U+2004, U+2005, U+2006, U+2007, U+2008, U+2009, U+200A, U+202F, U+205F, U+3000, U+2028, U+2029, the word Text, U+000B, \r, \n and \t:
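// The whitespace characters are written as \u{...} escapes (PHP 7.0+) so they stay visible.
$text = "\u{0020}\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}"
      . "\u{2007}\u{2008}\u{2009}\u{200A}\u{202F}\u{205F}\u{3000}\u{2028}\u{2029}"
      . "Text\x0B\r\n\t";
$res = preg_replace("/\s/u", "_", $text);
echo $res; // => ___________________Text____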
U+0020 SPACE
U+00A0 NO-BREAK SPACE
U+1680 OGHAM SPACE MARK
U+2000 EN QUAD
U+2001 EM QUAD
U+2002 EN SPACE
U+2003 EM SPACE
U+2004 THREE-PER-EM SPACE
U+2005 FOUR-PER-EM SPACE
U+2006 SIX-PER-EM SPACE
U+2007 FIGURE SPACE
U+2008 PUNCTUATION SPACE
U+2009 THIN SPACE
U+200A HAIR SPACE
U+202F NARROW NO-BREAK SPACE
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE
U+2028 LINE SEPARATOR
U+2029 PARAGRAPH SEPARATOR
U+000A LINE FEED
U+000B LINE TABULATION
U+000D CARRIAGE RETURN (CR)
U+0009 CHARACTER TABULATION
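If you would rather derive such a list from your own PHP build than rely on a fixed table, a rough sketch is to probe each code point against /\s/u. The helper names codepointToUtf8 and listWhitespaceCodepoints are hypothetical, and the scan is limited to the Basic Multilingual Plane on the assumption that this is where the whitespace characters live. The exact result depends on the PCRE version bundled with PHP, so it may contain a few entries beyond the ones listed above.

```php
<?php
// Hypothetical helper: encode a BMP code point (U+0000..U+FFFF) as UTF-8.
function codepointToUtf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: collect every BMP code point that /\s/u matches.
function listWhitespaceCodepoints(): array
{
    $found = [];
    for ($cp = 0; $cp <= 0xFFFF; $cp++) {
        // Surrogates cannot be encoded as valid UTF-8, so skip them.
        if ($cp >= 0xD800 && $cp <= 0xDFFF) {
            continue;
        }
        if (preg_match('/\s/u', codepointToUtf8($cp)) === 1) {
            $found[] = sprintf('U+%04X', $cp);
        }
    }
    return $found;
}

$whitespace = listWhitespaceCodepoints();
echo implode("\n", $whitespace), "\n";
```

The output can then be compared against the list above to see what your installation actually treats as whitespace.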