Matching all <strong> tags in preg_replace

Question

We are new to regular expressions in PHP (preg_replace) and are having some trouble getting them to do exactly what we want.

For example, we have HTML like this:

<h2><strong>Automatic Writing</strong> <strong>– A Conduit For An Entity From Another World?</strong></h2>

We would like to remove all styling tags inside the H2 (and match H3/H4/H5 tags as well).

So far we have built the following code (we are integrating with WordPress):

function removebolding($content)
{
    $content =
        preg_replace('/(<h([1-6])[^>]*>)\s?<strong>(.*)?<\/strong>\s?(<\/h\2>)/', '$1$3$4', $content);
    return $content;
}

add_filter('the_content', 'removebolding');

This does work; however, it only removes the first 'strong' tag, and we are left with:

<h2>Automatic Writing <strong>– A Conduit For An Entity From Another World?</strong></h2>

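How can we match/remove all of the 'strong' tags? Alternatively, could we simply extract the contents of the heading tag, run strip_tags on them, and then replace the contents with the output?

Thanks in advance for any help, suggestions, and code examples.

Many thanks.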

Answer 1

You can use

preg_replace_callback('~<h([1-6])>.*?</h\1>~is', function ($m) {
    // Remove <strong> and </strong> only inside the matched heading.
    return preg_replace('~</?strong>~i', '', $m[0]);
}, $s)

Output: <h2>Automatic Writing – A Conduit For An Entity From Another World?</h2>
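To connect this with the filter from the question, here is a minimal sketch of removebolding() rewritten around the callback approach. Two things are assumptions rather than part of the answer: the `[^>]*` in the opening tag (borrowed from the asker's pattern so that headings with attributes also match) and the `function_exists` guard, which is only there so the snippet can run outside WordPress.

```php
<?php
// Sketch only: the asker's removebolding() filter, rebuilt on preg_replace_callback.
function removebolding($content)
{
    // Match each heading from <hN ...> to its matching </hN>;
    // \1 repeats the heading level captured by ([1-6]).
    // The [^>]* part is an assumption taken from the question's pattern.
    return preg_replace_callback(
        '~<h([1-6])[^>]*>.*?</h\1>~is',
        function ($m) {
            // Strip only <strong> and </strong> inside the matched heading.
            return preg_replace('~</?strong>~i', '', $m[0]);
        },
        $content
    );
}

// add_filter() exists only inside WordPress; the guard lets the sketch run standalone.
if (function_exists('add_filter')) {
    add_filter('the_content', 'removebolding');
}

$html = '<h2><strong>Automatic Writing</strong> <strong>– A Conduit For An Entity From Another World?</strong></h2>'
      . '<p><strong>Body text stays bold.</strong></p>';

echo removebolding($html), PHP_EOL;
```

Run on its own, this should print the heading without any strong tags while leaving the bold paragraph untouched, since the inner replacement only ever sees the text of a matched heading.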

The performance of the regex can probably be improved by unrolling the lazy part, like this:

'~<h([1-6])>[^<]*(?:<(?!/h\1>)[^<]*)*</h\1>~i'
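As a quick sanity check, the sketch below runs the lazy pattern and the unrolled pattern over the same input and compares their full matches. Both are written with the `\1` backreference to the captured heading level. The sample string and the variable names are made up for illustration, and nothing here measures speed; it only shows that the two patterns select the same text on this input.

```php
<?php
// Lazy version: .*? expands one character at a time until </hN> is found.
$lazy = '~<h([1-6])>.*?</h\1>~is';

// Unrolled version: consume runs of non-"<" characters, and step over a "<"
// only when it does not start the matching closing tag.
$unrolled = '~<h([1-6])>[^<]*(?:<(?!/h\1>)[^<]*)*</h\1>~i';

// Illustrative input (an assumption, shortened from the question's heading).
$s = "<h2><strong>Automatic Writing</strong> <strong>– A Conduit?</strong></h2>\n"
   . "<h3>Plain heading</h3>";

preg_match_all($lazy, $s, $a);
preg_match_all($unrolled, $s, $b);

// Both patterns find the same two headings.
var_dump($a[0] === $b[0]); // bool(true)
```

The unrolled form needs no `s` modifier because `[^<]` already matches newlines.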

See the PHP demo.

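~<h([1-6])>.*?</h\1>~s matches any heading tag (h1 to h6) together with whatever text lies between it and its closing tag; the \1 backreference requires the closing tag to have the same level as the opening one.
preg_replace('~</?strong>~i', '', $m[0]) removes all <strong> and </strong> tags, but only within the value matched by the main regex, which is held in $m[0].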
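The question also asks whether the heading contents could simply be passed through strip_tags. A minimal sketch of that variant follows. The helper name `strip_heading_tags` and the extra capture groups are assumptions made for illustration, not part of the answer. Note that this removes every tag inside the heading (links, emphasis, spans), not just strong.

```php
<?php
// Hypothetical helper: strip all inline tags inside h1..h6 headings.
function strip_heading_tags($content)
{
    return preg_replace_callback(
        // Group 1: opening tag, group 2: heading level,
        // group 3: inner HTML, group 4: matching closing tag.
        '~(<h([1-6])[^>]*>)(.*?)(</h\2>)~is',
        function ($m) {
            // Keep the heading tags themselves, clean only what sits between them.
            return $m[1] . strip_tags($m[3]) . $m[4];
        },
        $content
    );
}

echo strip_heading_tags('<h3 class="t"><strong>Bold</strong> and <em>italic</em></h3>'), PHP_EOL;
// prints: <h3 class="t">Bold and italic</h3>
```

If only strong should go, the callback from the answer is the narrower tool; strip_tags is the blunter option when headings should contain plain text only.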

php

regex

wordpress

preg-replace