Perl: split a string using a string-based delimiter

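I'm new to Perl. From the documentation I've read, the split function seems to require a regex pattern rather than a string delimiter as its first argument, yet I found that something like print +(split(' ', $string))[0] still splits the string correctly.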

Based on that, I tried using a variable as the delimiter (for example print +(split($var, $string))[0] where $var = ' '), but found that it doesn't work. What am I doing wrong?

Thanks!

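Edit: Sorry for the terrible question. I was running this against a string with leading whitespace and found that the split function doesn't like leading whitespace. For example:

my $var = ' '; print +(split($var, ' abc ddddd'))[0] gives blank output. Is $var being interpreted as /$var/ inside the split function?

versus

print +(split(' ', ' abc ddddd'))[0] prints abc

So when I read the documentation, I assumed my variable would be treated as a literal string, when in fact it is not, and so the leading whitespace is not stripped.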

Explanation

When you split on a literal space

split ' '

you invoke the special case described in the documentation. When you use a variable

my $var = ' ';
split $var;

then in Perl versions before 5.18 this is the same as putting that variable into a regex:

split /$var/;

That splits on a single space character, which is not the same thing. For example, if you have this code:

my $string = "foo bar   baz";
my @literal = split ' ', $string;
my @space = split / /, $string;

then @literal will contain "foo", "bar", "baz", while @space will contain "foo", "bar", "", "", "baz". The empty fields come from splitting on each single space.
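To make the difference visible, here is a minimal sketch that runs the three relevant patterns against the same input. The sample string (with one leading space and a run of three spaces) and the variable names are assumptions chosen for illustration; each field is wrapped in brackets so that empty fields show up.

```perl
use strict;
use warnings;

# Illustrative input: one leading space and a run of three spaces.
my $string = " foo bar   baz";

my @awk_mode = split ' ', $string;    # special case: leading whitespace skipped, split on whitespace runs
my @single   = split / /, $string;    # split on every single space character
my @ws_regex = split /\s+/, $string;  # split on whitespace runs, leading empty field kept

# Wrap each field in brackets so empty fields are visible.
print join('', map { "[$_]" } @awk_mode), "\n";  # [foo][bar][baz]
print join('', map { "[$_]" } @single),   "\n";  # [][foo][bar][][][baz]
print join('', map { "[$_]" } @ws_regex), "\n";  # [][foo][bar][baz]
```

The leading empty field in the last two results is why taking element [0] prints nothing when the input starts with a space.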


Documentation

The documentation describes it like this:

As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a literal string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator. However, this special treatment can be avoided by specifying the pattern / / instead of the string " ", thereby allowing only a single space character to be a separator. In earlier Perls this special case was restricted to the use of a plain " " as the pattern argument to split, in Perl 5.18.0 and later this special case is triggered by any expression which evaluates as the simple string " ".

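Workaround

Note that if you are looking for a way to emulate the ' ' split dynamically using a variable, you can use /\s+/ instead. It is not exactly the same, because it does not strip leading whitespace, but otherwise it should work as expected.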
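If the leading whitespace matters, one way to close that gap is to strip it yourself before splitting. The following is only a sketch under that assumption: the names $pattern, $trimmed and @fields are made up for the example, and the pattern is stored as a compiled regex with qr// so it can be passed around in a variable.

```perl
use strict;
use warnings;

my $pattern = qr/\s+/;        # pattern held in a variable
my $string  = ' abc ddddd';   # input with leading whitespace, as in the question

# Copy the string and strip leading whitespace, which is what split ' ' does implicitly.
(my $trimmed = $string) =~ s/^\s+//;

my @fields = split $pattern, $trimmed;
print "$fields[0]\n";         # prints "abc"
```

Because the trimming is explicit, this should behave the same regardless of how a given Perl version treats a variable that holds a single space.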

I think your code works fine:

my $text = "botolo";
my $separator = "o";
print +(split($separator, $text))[0];  
# the ugly leading + is needed so that Perl does not parse the parentheses as print's argument list

That said, at the cost of one extra line, I would rather write the last line as:

my @parts = split($separator, $text);
print $parts[0];
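One caveat when the separator comes from a variable: split treats it as a regex, so a separator containing metacharacters may not behave like a literal string. The sketch below, using an assumed input of "a.b.c" and a "." separator, compares interpolating the variable directly with quoting it via \Q...\E.

```perl
use strict;
use warnings;

my $text      = "a.b.c";
my $separator = ".";   # a regex metacharacter when interpolated unescaped

my @unescaped = split /$separator/, $text;      # "." matches every character, so only empty fields remain
my @escaped   = split /\Q$separator\E/, $text;  # \Q...\E makes the separator match literally

print scalar(@unescaped), "\n";  # prints 0 (trailing empty fields are dropped)
print "@escaped\n";              # prints "a b c"
```

For a plain letter such as "o" the two forms give the same result, so the quoting only matters for separators like ".", "|" or "+".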