preg_match_all overlapping matches: <a><b><c><d> to produce <a><b>, <b><c>, <c><d>

I want to wrap any word that appears between two opening tags in <span>anyword</span>.

For example, given this:

<li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left"><span>

Between <li class="nav__league"> and <a href="/leagues" class="nav__link nav__dropdown"> there is no word.

Between <a href="/leagues" class="nav__link nav__dropdown"> and <span class="nav__link__helper nav__link__helper--left"> there is one word, "Leagues".

So I want to wrap Leagues in <span>Leagues</span>, so that the fragment becomes <a href="/leagues" class="nav__link nav__dropdown"><span>Leagues</span><span class="nav__link__helper nav__link__helper--left">

I am using the expression

preg_match_all('~<\w+[^>]*>([^><]*)<\w+[^>]*>~',$html,$matches);

where

$html =<<<'EOD'
<li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left"><span>
EOD;

This returns

array(
0   =>  <li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">
1   =>  <span class="nav__link__helper nav__link__helper--left"><span>
)

instead of

array(
0   =>  <li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">
1   =>  <a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left">
2   =>  <span class="nav__link__helper nav__link__helper--left"><span>
)

Please, I need someone to help me.

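For what you are trying to do, you need to use a lookahead.
A lookahead checks that a pattern matches at the current position but does not consume it, so the same text is still available to the next match.
So, tweaking your regex a little: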

(<\w+[^>]*>)([^><]+)(?=<\w+[^>]*>)
^          ^      ^ ^^^          ^ Additions

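Basically, I added a new capturing group around the first tag (for use in the replacement), changed the * to a + so that empty text is skipped, and wrapped the second tag in a lookahead, (?=...).

So, if there is text of the form <tag1>content1<tag2>content2<tag3>...

this will match <tag1>content1, but only where it is followed by <tag2>. The first capturing group gets <tag1> and the second gets the content1 string. Because <tag2> is only checked by the positive lookahead and not consumed, the next match starts at <tag2> rather than at content2.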
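To make that behaviour concrete, here is a minimal sketch that runs the same pattern through preg_match_all. The input string is a made-up placeholder in the <tag1>content1<tag2>... shape described above, not real markup.

```php
<?php
// Hypothetical input in the shape described above.
$text = '<tag1>content1<tag2>content2<tag3>';

// Group 1: the opening tag. Group 2: the text after it.
// The lookahead only checks that another tag follows; it does not consume it.
$pattern = '~(<\w+[^>]*>)([^><]+)(?=<\w+[^>]*>)~';

preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is the full match, $m[1] the tag, $m[2] the text.
    echo $m[0], ' | ', $m[1], ' | ', $m[2], PHP_EOL;
}
// Prints:
// <tag1>content1 | <tag1> | content1
// <tag2>content2 | <tag2> | content2
```

Note that <tag2> shows up as the start of the second match even though it was already "seen" by the first one. <tag3> produces no match because no text follows it.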

Here is an example of what you are trying to achieve:

$html =<<<EOD
<li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left">Leagues2<span>
EOD;

$resp = preg_replace(
    // $1 is the opening tag, $2 the text; the tag in the lookahead is not consumed.
    '~(<\w+[^>]*>)([^><]+)(?=<\w+[^>]*>)~',
    '$1<span>$2</span>',
    $html
);

var_dump($html);
var_dump($resp);

This outputs:

// Original String -
<li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left">Leagues2<span>

// Replaced string 
<li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown"><span>Leagues</span><span class="nav__link__helper nav__link__helper--left"><span>Leagues2</span><span>
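If what you actually need is the overlapping list from the question (every pair of adjacent tags, whether or not there is text between them), one option is to put the whole pair in a capturing group inside the lookahead. This is only a sketch and a variation on the pattern above, not something the replacement needs. The match itself is then zero-width, so the pairs are read from $matches[1].

```php
<?php
// The sample markup from the question.
$html = '<li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left"><span>';

// The whole tag-text-tag pair sits inside the lookahead, so nothing is consumed
// and the next attempt can start at the second tag of the previous pair.
$pattern = '~(?=(<\w+[^>]*>[^><]*<\w+[^>]*>))~';

preg_match_all($pattern, $html, $matches);

// $matches[0] holds only empty strings; the captured pairs are in $matches[1].
foreach ($matches[1] as $i => $pair) {
    echo $i, ' => ', $pair, PHP_EOL;
}
// Prints:
// 0 => <li class="nav__league"><a href="/leagues" class="nav__link nav__dropdown">
// 1 => <a href="/leagues" class="nav__link nav__dropdown">Leagues<span class="nav__link__helper nav__link__helper--left">
// 2 => <span class="nav__link__helper nav__link__helper--left"><span>
```

As with any regex over HTML, this assumes simple markup with no ">" or "<" inside attribute values. For anything more complicated, a real parser such as DOMDocument is the safer choice.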