Replacing heading <hn> tags that have no class attribute with <p>

I want to replace hn tags that do not have a class attribute. The idea is to match anything after hn, except a string that contains class="something" and ends with >.

This is my first attempt:

<?php

$content = <<<HTML
<h1 style="color:black">test1</h1>
<H2 class="green">test2</H2>
<h5 class="red">test</h5>
<h5 class="">test test</h5>
HTML;

$content = preg_replace('#<h([1-6])((?!class).)*?>(.*?)<\/h[1-6]>#si', '<p class="heading-$1" $2>$3</p>', $content);

echo ($content);

The result is:

<p class="heading-1" ">test1</p>
<H2 class="green">test2</H2>
<h5 class="red">test</h5>
<h5 class="">test test</h5>

It should be:

<p class="heading-1" style="color:black">test1</p>
<H2 class="green">test2</H2>
<h5 class="red">test</h5>
<h5 class="">test test</h5>

Any idea why $2 maps to the " value instead of style="color:black"?

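Your capture group needs to go in a slightly different place. In the original pattern the quantifier sits outside the group, so the group is overwritten on every repetition and ends up holding only the last character matched.

Replace ((?!class).)*? with ((?:(?!class).)*?)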
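
The difference between the two forms can be seen by matching just the opening tag. This is a minimal sketch; $tag, $before and $after are illustrative names, and the sample tag is taken from the question's first heading.

```php
<?php
// Sample opening tag from the question (illustrative variable name).
$tag = '<h1 style="color:black">';

// Quantifier outside the capture group: group 2 is overwritten on each
// repetition, so only the last character before ">" survives.
preg_match('#<h([1-6])((?!class).)*?>#si', $tag, $before);

// Quantifier inside the capture group: group 2 spans the whole repetition.
preg_match('#<h([1-6])((?:(?!class).)*?)>#si', $tag, $after);

echo $before[2], "\n"; // "
echo $after[2], "\n";  //  style="color:black"  (with a leading space)
```

The leading space in the second result is what the \s* in the final pattern removes from the capture.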

Use

'#<h([1-6])\s*((?:(?!class).)*?)>(.*?)</h[1-6]>#si'

See the regex proof.
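
As a sketch of how the corrected pattern fits into the script from the question, assuming the replacement string is '<p class="heading-$1" $2>$3</p>' (the $pattern and $result variable names are introduced here for illustration):

```php
<?php
// Sample input from the question.
$content = <<<HTML
<h1 style="color:black">test1</h1>
<H2 class="green">test2</H2>
<h5 class="red">test</h5>
<h5 class="">test test</h5>
HTML;

// Group 1: heading level. Group 2: the attribute string, captured as a
// whole because the quantifier is now inside the group. Group 3: content.
$pattern = '#<h([1-6])\s*((?:(?!class).)*?)>(.*?)</h[1-6]>#si';

$result = preg_replace($pattern, '<p class="heading-$1" $2>$3</p>', $content);

// Only the first heading is rewritten:
// <p class="heading-1" style="color:black">test1</p>
echo $result;
```

Two caveats with this approach: the lookahead rejects any opening tag that contains the text "class" anywhere before the closing >, including inside attribute values, and a heading with no attributes at all comes out with a stray space before the >.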

Explanation

--------------------------------------------------------------------------------
  <h                       '<h'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [1-6]                    any character of: '1' to '6'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the least amount
                             possible)):
--------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
        class                    'class'
--------------------------------------------------------------------------------
      )                        end of look-ahead
--------------------------------------------------------------------------------
      .                        any character, including \n (s flag)
--------------------------------------------------------------------------------
    )*?                      end of grouping
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  >                        '>'
--------------------------------------------------------------------------------
  (                        group and capture to \3:
--------------------------------------------------------------------------------
    .*?                      any character, including \n (s flag) (0
                             or more times (matching the least amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \3
--------------------------------------------------------------------------------
  </h                      '</h'
--------------------------------------------------------------------------------
  [1-6]                    any character of: '1' to '6'
--------------------------------------------------------------------------------
  >                        '>'