Regular expression for nested brackets that contain a symbol
I need to replace with square brackets only those parentheses that contain a comma, at whatever nesting level they occur.
Example of the original string:
start (one, two, three(*)), some text (1,2,3), and (4, 5(*)), another
(four), interesting (five (6, 7)), text (six($)), here is (seven)
Expected result:
start [one, two, three(*)], some text [1,2,3], and [4, 5(*)], another
(four), interesting (five [6, 7]), text (six($)), here is (seven)
The best I could come up with does not handle the parts that contain nested parentheses:
preg_replace('~ \( ( [^()]+ (\([^,]+\))? , [^()]+ )+ \) ~x', '[$1]', $string);
// start (one, two, three(*)), some text [1,2,3], and (4, 5(*)), another (four), interesting (five [6, 7]), text (six($)), here is (seven)
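OK, this is not a regular expression, but if you cannot find one, the following algorithm is your plan B, heavily commented (it might be useful to someone, which is what Stack Overflow is for):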
$str = 'start (one, two, three(*)), some text (1,2,3), and (4, 5(*)), another ' .
       '(four), interesting (five (6, 7)), text (six($)), here is (seven)';
echo $str . "<br/>";
$PARs = array(); // ◄■ STACK OF [POSITION OF "(", COMMA SEEN?].
for ( $i = 0; $i < strlen( $str ); $i++ )
    switch ( $str[ $i ] )
    {
        case "(" : array_push( $PARs, array( $i, false ) ); // ◄■ POSITION OF "(" (NO COMMA YET).
                   break;
        case ")" : $POS = array_pop( $PARs ); // ◄■ [POSITION OF MATCHING "("][COMMA SEEN?], NULL IF UNMATCHED.
                   if ( $POS && $POS[1] ) // ◄■ IF THERE WAS A COMMA DIRECTLY INSIDE THIS "()"...
                   {
                       $str[ $POS[0] ] = "["; // ◄■ REPLACE "(".
                       $str[ $i ] = "]"; // ◄■ REPLACE ")".
                   }
                   break;
        case "," : if ( ! empty( $PARs ) ) // ◄■ IGNORE COMMAS THAT ARE NOT INSIDE "()".
                       $PARs[ count( $PARs ) - 1 ][1] = true; // ◄■ COMMA FOUND IN THE INNERMOST OPEN "(".
    }
echo $str . // ◄■ RESULT.
     // COMPARE WITH EXPECTED ▼
     '<br/>start [one, two, three(*)], some text [1,2,3], and [4, 5(*)], another ' .
     '(four), interesting (five [6, 7]), text (six($)), here is (seven)';
Edit: fixed a bug spotted by @trincot (thanks).
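For reuse, the stack-based approach described in this answer could be wrapped in a function. The following is only a minimal sketch, not the answerer's code: the name `replaceCommaParens` is hypothetical, and the sketch assumes that an unmatched `)` should simply be left untouched.

```php
<?php
// Hypothetical helper: the stack-based approach as a reusable function.
function replaceCommaParens(string $str): string
{
    $stack = []; // each entry: [position of "(", comma seen directly inside it?]
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        switch ($str[$i]) {
            case '(':
                $stack[] = [$i, false];
                break;
            case ')':
                if (empty($stack)) {
                    break; // assumption: ignore an unmatched ")"
                }
                [$open, $hasComma] = array_pop($stack);
                if ($hasComma) {
                    // Both brackets are single bytes, so in-place replacement is safe.
                    $str[$open] = '[';
                    $str[$i] = ']';
                }
                break;
            case ',':
                if (!empty($stack)) {
                    // Flag only the innermost open parenthesis.
                    $stack[count($stack) - 1][1] = true;
                }
                break;
        }
    }
    return $str;
}

echo replaceCommaParens('interesting (five (6, 7)), text (six($))'), "\n";
```

Because a comma only flags the innermost open parenthesis, the outer pair in `(five (6, 7))` stays round and the call above prints `interesting (five [6, 7]), text (six($))`. Single-quoted strings are used so that `$` needs no escaping.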
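I would tokenize the input, splitting it on commas and parentheses while keeping those delimiters in the result. A recursive algorithm can then detect whether a comma occurs within a given pair of parentheses and make the appropriate replacement.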
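To make the tokenization step concrete, here is a small sketch of what `preg_split` returns with the `PREG_SPLIT_DELIM_CAPTURE` flag. The input string `'a (b, c)'` is just an illustrative assumption.

```php
<?php
// Split on "(", ")" and ",", keeping each delimiter as its own token.
$tokens = preg_split('~([(),])~', 'a (b, c)', 0, PREG_SPLIT_DELIM_CAPTURE);
print_r($tokens);
// Tokens: "a ", "(", "b", ",", " c", ")", ""
```

The trailing empty string appears because the input ends with a delimiter. It is harmless here, since the tokens are only concatenated back together.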
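Here is a function:
// Consumes tokens up to the matching ")" (or the end of the input) and returns
// [rebuilt text, whether a comma occurred directly at this nesting level].
// Declared at the top level: a named function declared inside
// replaceWithBrackets() would trigger a "cannot redeclare" fatal error
// the second time replaceWithBrackets() is called.
function recur(&$tokens) {
    $comma = false;
    $replaced = "";
    while (true) {
        $token = current($tokens);
        next($tokens);
        if ($token === ")" || $token === false) break;
        if ($token === "(") {
            [$substr, $subcomma] = recur($tokens);
            $replaced .= $subcomma ? "[$substr]" : "($substr)";
        } else {
            $comma = $comma || $token === ",";
            $replaced .= $token;
        }
    }
    return [$replaced, $comma];
}

function replaceWithBrackets($s) {
    $tokens = preg_split("~([(),])~", $s, 0, PREG_SPLIT_DELIM_CAPTURE);
    return recur($tokens)[0];
}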