Regex in preg_match_all() doesn't work as expected

Question

I have the following string:

"<h2>Define Vim is the greatest</h2> word processor, good <h3>Vi</h3>!".

I want to select the h2 and h3 elements with a regex, producing the following structure.

Expected output:

array(
    0   =>  <h2>Define Vim is the greatest</h2>
    1   =>  <h3>Vi</h3>
)

So I implemented my regex as follows:

preg_match_all("/(?:<h2>|<h3>).*vi.*(?:<\/h2>|<\/h3>)/i", $input, $matches);

But instead of the desired result above, it outputs the following.

Current output:

array(
    0 => <h2>Define Vim is the greatest</h2> word processor, good <h3>Vi</h3>
)

How can I change my code/regex so that I get the tags shown in the expected output above?

Answer 1

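Your problem is that, first, you are missing the delimiters for your regex and, second, vi is case-sensitive, so you would have to add the i flag to make the match case-insensitive.

So your code could look something like this (I just removed the vi from the regex; now I simply grab everything between h1-6 tags):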

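<?php

    // The literal quotes and the trailing period are part of the sample string.
    $input = '"<h2>Define Vim is the greatest</h2> word processor, good <h3>Vi</h3>!".';

    // Match each heading h1-h6; the lazy .*? stops at the first closing heading tag.
    preg_match_all("/<h[1-6]>.*?<\/h[1-6]>/", $input, $matches);
    print_r($matches);

?>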

Output:

Array
(
    [0] => Array
        (
            [0] => <h2>Define Vim is the greatest</h2>
            [1] => <h3>Vi</h3>
        )

)
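One possible refinement, which is not part of the code above: the pattern accepts any closing heading tag, so an <h2> could be paired with a </h3>. A minimal sketch, assuming you want the closing tag to have the same level as the opening one, captures the level and reuses it with a backreference. The sample string here drops the surrounding quotes for brevity.

```php
<?php

// Sketch only: capture the heading level (1-6) and require the same
// level in the closing tag via the backreference \1.
$input = '<h2>Define Vim is the greatest</h2> word processor, good <h3>Vi</h3>!';

// Single quotes matter here: in a double-quoted PHP string, "\1" would be
// read as an octal escape instead of reaching the regex engine as \1.
preg_match_all('/<h([1-6])>.*?<\/h\1>/', $input, $matches);

// $matches[0] holds the full matches, $matches[1] the captured levels.
print_r($matches[0]);
print_r($matches[1]);
```

Because the pattern now contains a capturing group, $matches gains a second sub-array with the heading levels; the full matches are still in $matches[0].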

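Edit:

Starting from your updated regex, your problem now is that .* is greedy, meaning it consumes as much as it can. To make it non-greedy, you have to add a ? after it. So just change .* -> .*?.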
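To see the difference, here is a small sketch that runs the greedy pattern from the question and its non-greedy counterpart on the same input. The variable names $greedy and $lazy are only illustrative.

```php
<?php

$input = '"<h2>Define Vim is the greatest</h2> word processor, good <h3>Vi</h3>!".';

// Greedy: each .* takes as much as possible, so the match runs from the
// first opening tag to the last closing tag.
preg_match_all('/(?:<h2>|<h3>).*vi.*(?:<\/h2>|<\/h3>)/i', $input, $greedy);

// Non-greedy: each .*? takes as little as possible, so every match ends
// at the first closing tag that follows a "vi".
preg_match_all('/(?:<h2>|<h3>).*?vi.*?(?:<\/h2>|<\/h3>)/i', $input, $lazy);

print_r($greedy[0]); // one match spanning both headings
print_r($lazy[0]);   // two matches, one per heading
```

One caveat with this pattern: if a heading does not contain "vi" at all, the .*? before vi keeps expanding past its closing tag until it finds a "vi" further along, so the match can still span several headings.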

Tags: php, regex, string, preg-match-all