Perl regex presumably removing non ASCII characters

Question

I came across some code with a regex that claims to strip any non-ASCII characters from text. The code is written in Perl, and the part that does this is:

$sentence =~ tr/[=10=]0-13-46-71-53-7//d;

I want to understand how this regex works, so I tried it in regexr. I found out that the octal escapes in it (\000, \011, \013, ...) stand for individual characters such as NULL, TAB, VERTICAL TAB ... But I still do not get why the "-" symbols are used in the regex. Do they really mean a literal dash, as regexr shows, or something else? And is this regex really suitable for removing non-ASCII characters?

Answer 1

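That's not really a regex; tr/// is Perl's transliteration operator. The dash indicates a character range, just as it does inside a regex character class such as [a-z].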
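
To make the range behaviour concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not taken from the code in the question.

```perl
use strict;
use warnings;

# tr/// takes a list of characters, not a pattern; "a-c" is the range a, b, c.
my $text = "abcdef [x-y]";

# With /d and an empty replacement list, every listed character is deleted.
(my $no_abc = $text) =~ tr/a-c//d;
print "$no_abc\n";         # def [x-y]

# Square brackets have no special meaning in tr///; they are ordinary
# characters in the list, so this deletes "[", "]" and the range d..f.
(my $no_brackets = $text) =~ tr/[d-f]//d;
print "$no_brackets\n";    # abc x-y

# A dash at the start or end of the list is a literal dash.
(my $no_dash = $text) =~ tr/-//d;
print "$no_dash\n";        # abcdef [xy]
```

This is why a regex tester such as regexr gives a misleading reading: it parses the list as a pattern, whereas Perl treats it as a plain set of characters and ranges.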

The expression also deletes some ASCII characters (mainly whitespace) and spares a range of characters that are not ASCII; the full ASCII range is simply \000-\177.
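
If the goal really is to drop everything outside ASCII, one possible approach (a sketch, not the original author's code) is to list the ASCII range and add the /c flag, which complements the list, alongside /d. The sample sentence is an assumption chosen for illustration.

```perl
use strict;
use warnings;

# Sample input with three non-ASCII characters, written as escapes so the
# script does not depend on the encoding of the source file.
my $sentence = "na\x{ef}ve caf\x{e9}\tcosts 5 \x{20ac}\n";

# /c complements the list (everything NOT in \000-\177) and /d deletes
# those characters. All of ASCII, including tab and newline, is kept.
(my $ascii_only = $sentence) =~ tr/\000-\177//cd;

print $ascii_only;    # "nave caf", a tab, "costs 5 ", then a newline
```

Note that this keeps ASCII control characters as well; whether those should also go depends on what the surrounding code expects.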

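To spell this out, the d flag deletes every character that is listed between the first pair of slashes and has no counterpart between the second pair. Here the second list is empty, so every listed character is deleted. See the documentation for more.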
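
A small sketch of how /d interacts with the replacement list may help; the input word and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $word = "abcabc";

# Without /d, a short replacement list is padded with its last character,
# so a, b and c all become "A".
(my $padded = $word) =~ tr/a-c/A/;      # AAAAAA

# With /d, "a" still maps to "A", but "b" and "c" have no counterpart
# in the replacement list and are deleted.
(my $deleted = $word) =~ tr/a-c/A/d;    # AA

# With /d and an empty replacement list, every listed character is deleted.
(my $emptied = $word) =~ tr/ab//d;      # cc

print "$padded $deleted $emptied\n";
```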

perl

transliteration