Better way to reconstruct first instance of a possibly-repeated character within a string without using preg_* functions
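A function I'm working on needs to detect the first instance of a character and, if that character is repeated, reconstruct the repeated substring. For example: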
$x = 'fhdfhbc::::dcdcdcuttr482rdvcjv:ducvdk:::chjvdbj'; // ---> function should extract ::::
I don't want to use any preg_* functions, since I avoid them wherever possible (because they are slow). My current solution is:
$char = ":"; // this would be set as necessary
$char_substring = str_repeat($char, strspn(strstr($x, $char), $char)); // yields ---> ::::
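The one-liner above is easiest to read from the inside out. A minimal sketch that splits it into named steps follows; the intermediate variable names $tail and $length are only for illustration and are not part of the original code.

```php
<?php
$x = 'fhdfhbc::::dcdcdcuttr482rdvcjv:ducvdk:::chjvdbj';
$char = ':';

// strstr() returns the part of $x that starts at the first $char
// (or false if $char does not occur at all).
$tail = strstr($x, $char);

// strspn() counts how many leading bytes of $tail consist only of $char.
$length = strspn($tail, $char);

// str_repeat() rebuilds the run from the character and its length.
$char_substring = str_repeat($char, $length);

echo $length, "\n";          // prints 4
echo $char_substring, "\n";  // prints ::::
```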
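Note that you can't use strrpos here, because (in this case) there may be more colons toward the other end of the string. You could use explode and then run a for or foreach loop, concatenating the empty elements, or some variation of this: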
$explode = explode($char, $x);
$substring = $char; // explode array should have 1 less empty member than the repeated character, so need to start with char count of 1
$emptyEncountered = false;
for ($i = 0, $count = count($explode); $i < $count; $i++) {
    if ($explode[$i]) {
        if ($emptyEncountered) break;
    } else {
        $emptyEncountered = true;
        $substring .= $char;
    }
}
echo $substring; // ---> ::::
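The comment about the array having one fewer empty member than the run has characters can be checked directly. This is a small sketch on a made-up input string ('ab::::cd' is an assumption chosen for brevity, not taken from the question).

```php
<?php
// Four consecutive colons split the string into five pieces,
// three of which are empty strings.
$parts = explode(':', 'ab::::cd');

// Keep only the empty pieces.
$empties = array_filter($parts, function ($piece) {
    return $piece === '';
});

echo count($parts), "\n";    // prints 5
echo count($empties), "\n";  // prints 3
```

Because the loop only counts empty pieces, it appears to skip a lone first occurrence: for an input such as 'a:b::c' it would return '::' even though the first instance of the character is a single ':'.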
Is there a better way than using preg_*, a for/foreach loop, or str_repeat(strspn(strstr()))?
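A proper preg_* implementation will outperform your explode approach, not only in time but also in memory consumption and in the number of allocations required.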
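For comparison, a preg_* version might look like the sketch below. The function name first_run_preg is hypothetical, and the + quantifier is a choice made here so that a single, non-repeated occurrence is also returned.

```php
<?php
// Hypothetical helper: returns the first run of $char in $haystack,
// or an empty string when $char does not occur.
function first_run_preg(string $haystack, string $char): string
{
    // preg_quote() escapes characters that are special in a pattern,
    // such as '.' or '+', so they are matched literally.
    $pattern = '/' . preg_quote($char, '/') . '+/';

    // preg_match() stops at the first (leftmost) match and returns 1 on success.
    return preg_match($pattern, $haystack, $matches) === 1 ? $matches[0] : '';
}

echo first_run_preg('fhdfhbc::::dcdcdcuttr482rdvcjv:ducvdk:::chjvdbj', ':'), "\n"; // prints ::::
```

Only one array ($matches) is allocated here, whereas explode creates one array element per piece of the input.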
The only efficient implementation I can think of that meets your constraints is a do-while loop:
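$substring = '';
// Assumes $needle occurs in $haystack; strpos() returns false otherwise.
$i = strpos($haystack, $needle);
do {
    $substring .= $needle;
    ++$i;
}
// Square brackets: the $haystack{$i} offset syntax was removed in PHP 8.0.
while (isset($haystack[$i]) && $haystack[$i] === $needle);
return $substring;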
However, you already have the most efficient implementation yourself:
return str_repeat($needle, strspn(strstr($haystack, $needle), $needle));
It is also functional in nature.
Your implementations are missing error handling, and so is my while implementation. Adding it is definitely required in my opinion, but I have left it out because you did.
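As an illustration of what that error handling could look like, here is a minimal sketch. The function name first_run, the choice to throw on an invalid needle, and the choice to return an empty string when the needle is absent are all assumptions, not part of the original code. It also passes the position to strspn as an offset instead of calling strstr, which avoids copying the tail of the string.

```php
<?php
// Hypothetical helper: returns the first run of the single-byte $needle
// in $haystack, or an empty string when $needle does not occur.
function first_run(string $haystack, string $needle): string
{
    if (strlen($needle) !== 1) {
        throw new InvalidArgumentException('needle must be exactly one byte');
    }

    $pos = strpos($haystack, $needle);
    if ($pos === false) {
        return ''; // needle not found
    }

    // The third argument of strspn() is the offset at which counting starts.
    return str_repeat($needle, strspn($haystack, $needle, $pos));
}

echo first_run('fhdfhbc::::dcdcdcuttr482rdvcjv:ducvdk:::chjvdbj', ':'), "\n"; // prints ::::
```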
Results on an i7 machine running Windows 10 with PHP 7.1 (TS, x64):
$ bench 10000
0.0040609836578369 # str_repeat
0.0044500827789307 # preg_match
0.0046060085296631 # while
0.0050818920135498 # for
0.0052239894866943 # preg_match + preg_quote
0.0079050064086914 # explode
#!/usr/bin/env php
<?php
function bench(callable $cb): void {
    global $argv;
    // Iteration count: first CLI argument, then the LOOP environment variable, else 1000.
    $limit = 1000;
    if (isset($argv[1]) && is_numeric($argv[1])) {
        $limit = (int) $argv[1];
    }
    elseif (isset($_ENV['LOOP']) && is_numeric($_ENV['LOOP'])) {
        $limit = (int) $_ENV['LOOP'];
    }
    // Keep the garbage collector out of the timed section.
    gc_collect_cycles();
    gc_disable();
    $start = microtime(true);
    for ($i = 0; $i < $limit; ++$i) {
        $cb();
    }
    $end = microtime(true);
    gc_enable();
    gc_collect_cycles();
    echo $end - $start, "\n";
}
$haystack = 'fhdfhbc::::dcdcdcuttr482rdvcjv:ducvdk:::chjvdbj';
$needle = ':';
// str_repeat
bench(function () use ($haystack, $needle) {
    return str_repeat($needle, strspn(strstr($haystack, $needle), $needle));
});
// preg_match
bench(function () use ($haystack, $needle) {
    preg_match("/{$needle}{2,}/", $haystack, $matches);
    return $matches[0] ?? '';
});
// while (square-bracket offsets: the $haystack{$i} syntax was removed in PHP 8.0)
bench(function () use ($haystack, $needle) {
    $substring = '';
    $i = strpos($haystack, $needle);
    do {
        $substring .= $needle;
        ++$i;
    }
    while (isset($haystack[$i]) && $haystack[$i] === $needle);
    return $substring;
});
// for
bench(function () use ($haystack, $needle) {
    $substring = '';
    for ($i = strpos($haystack, $needle); isset($haystack[$i]) && $haystack[$i] === $needle; ++$i) {
        $substring .= $needle;
    }
    return $substring;
});
// preg_match + preg_quote
bench(function () use ($haystack, $needle) {
    $needle = preg_quote($needle, '/');
    preg_match("/{$needle}{2,}/", $haystack, $matches);
    return $matches[0] ?? '';
});
// explode
bench(function () use ($haystack, $needle) {
    $explode = explode($needle, $haystack);
    $substring = $needle;
    $empty = false;
    for ($i = 0, $count = count($explode); $i < $count; $i++) {
        if ($explode[$i]) {
            if ($empty) {
                break;
            }
        }
        else {
            $empty = true;
            $substring .= $needle;
        }
    }
    return $substring;
});