RegEx find all anchor tags, including ones with images

I am trying to find all <a> tags and add a div around them. How can I change the RegEx so that it matches <a><img></a> as well as <a>text</a>, i.e. any tags inside the <a> tag? I have:

<?php

    $a_pattern = '@<a\s*.*>.*(<.*>)?.*</a>@i';
    $out = preg_replace_callback($a_pattern,"match_callback",$html);
    function match_callback($matches)
    {
         var_dump($matches);
    }

?>

Here is how to do it with PHP's built-in DOM parser (using some made-up HTML, but you will get the idea):

<?php
$doc = new DOMDocument('1.0', 'UTF-8');
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc->loadHTML('<body>
     <a href="somewere"><img src="www.foo.com/example.gif" class="foo" alt="..."><br></a>
     <a href="somewere again"><img src="www.bar.com/1.jpg" class="bar" alt="..."></a>
     <a href="somewere again and back">Text</a>
     </body>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

foreach ($doc->getElementsByTagName('a') as $a_node) {
   // Create an empty <div>, insert it just before the anchor,
   // then move the anchor inside it.
   $div = $a_node->ownerDocument->createElement('div');
   $node = $a_node->parentNode->insertBefore($div, $a_node);
   $node->appendChild($a_node);
}
echo $doc->saveHTML();

Output of the sample demo:

<body>
<div><a href="somewere"><img src="www.foo.com/example.gif" class="foo" alt="..."><br></a></div>
<div><a href="somewere%20again"><img src="www.bar.com/1.jpg" class="bar" alt="..."></a></div>
<div><a href="somewere%20again%20and%20back">Text</a></div>
</body>

You can also add attributes to the new div with:

$node->setAttribute('class', 'title');
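
Putting the pieces together, a minimal sketch could look like the following. The function name wrap_anchors and its $class parameter are hypothetical and only serve the illustration. It assumes the input has a single root element, and it copies the node list into an array first as a precaution against iterating a live list while the tree changes.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps every <a>
// element in a <div> that carries the given class attribute.
function wrap_anchors(string $html, string $class): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    // Assumption: $html has a single root element, because
    // LIBXML_HTML_NOIMPLIED stops libxml from adding <html>/<body> wrappers.
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // getElementsByTagName() returns a live list, so snapshot it before editing.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $a_node) {
        $div = $doc->createElement('div');
        $div->setAttribute('class', $class);
        // Put the wrapper where the anchor is, then move the anchor into it.
        $a_node->parentNode->insertBefore($div, $a_node);
        $div->appendChild($a_node);
    }

    return $doc->saveHTML();
}

echo wrap_anchors('<body><a href="page.html">Text</a></body>', 'title');
```

Setting the attribute before the anchor is moved keeps all of the wrapper's setup in one place; calling setAttribute() on the value returned by insertBefore(), as above, should behave the same.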