PHP: Remove a hyperlink from an element but retain the text and class
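I need to process a DOM and remove all hyperlinks that point to a particular site while keeping the underlying text, so that a link such as <a href="abc.com">text</a> becomes plain text. Following the hints in this thread, I wrote this: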
$as = $dom->getElementsByTagName('a');
for ($i = 0; $i < $as->length; $i++) {
    $node = $as->item($i);
    $link_href = $node->getAttribute('href');
    if (strpos($link_href, 'offendinglink.com') !== false) {
        $cl = $node->getAttribute('class');
        $text = new DOMText($node->nodeValue);
        $node->parentNode->insertBefore($text, $node);
        $node->parentNode->removeChild($node);
        $i--; // the node list is live, so step back after removing a node
    }
}
This works fine, except that I also need to keep the class assigned to the offending <a> tag and possibly turn the tag into a <div> or <span>. So I need this:
<a href="www.offendinglink.com" target="_blank" class="nice" id="nicer">text</a>
to become this:
<div class="nice">text</div>
How do I access the new element after it has been added (as in my code snippet)?
Tested solution:
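<?php
$str = "<b>Dummy</b> <a href='http://google.com' target='_blank' class='nice' id='nicer'>Google.com</a> <a href='http://yandex.ru' target='_blank' class='nice' id='nicer'>Yandex.ru</a>";
$doc = new DOMDocument();
$doc->loadHTML($str);
$anchors = $doc->getElementsByTagName('a');
$l = $anchors->length;
// Note: this replaces every <a> in the document; add an href check to target one site.
for ($i = 0; $i < $l; $i++) {
    // The node list is live: each replacement shrinks it, so always take item(0).
    $anchor = $anchors->item(0);
    // Build the <div> with a text node so that characters such as & are escaped.
    $link = $doc->createElement('div');
    $link->appendChild($doc->createTextNode($anchor->nodeValue));
    $link->setAttribute('class', $anchor->getAttribute('class'));
    $anchor->parentNode->replaceChild($link, $anchor);
}
// Strip the DOCTYPE/<html>/<body> wrapper that loadHTML() adds around the fragment.
echo preg_replace(['/^\<\!DOCTYPE.*?<html><body>/si', '!</body></html>$!si'], '', $doc->saveHTML());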
Or see the runnable example.
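To address the question of how to get at the new element once it is in the tree: the variable returned by createElement() keeps pointing at the node after replaceChild() has inserted it, so it can simply be kept. The following is a minimal sketch under that assumption, not the answerer's code. The function name replaceLinks and its parameters are hypothetical, and it matches links with the same substring test the question uses.

```php
<?php
// Hypothetical helper: replaces every <a> whose href contains $needle with a
// <$tag> element that keeps only the class and the text of the link.
// Returns the newly created elements so the caller can keep working with them.
function replaceLinks(DOMDocument $doc, string $needle, string $tag = 'div'): array
{
    $created = [];
    // Snapshot the live node list so that replacing nodes cannot skip anchors.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $anchor) {
        if (strpos($anchor->getAttribute('href'), $needle) === false) {
            continue; // leave unrelated links untouched
        }
        $replacement = $doc->createElement($tag);
        // A text node escapes special characters such as & safely.
        $replacement->appendChild($doc->createTextNode($anchor->textContent));
        if ($anchor->hasAttribute('class')) {
            $replacement->setAttribute('class', $anchor->getAttribute('class'));
        }
        $anchor->parentNode->replaceChild($replacement, $anchor);
        $created[] = $replacement; // still a handle to the node now in the tree
    }
    return $created;
}

$doc = new DOMDocument();
$doc->loadHTML('<p><a href="http://www.offendinglink.com" target="_blank" class="nice" id="nicer">text</a> <a href="http://example.org" class="ok">keep</a></p>');
$created = replaceLinks($doc, 'offendinglink.com');
// Passing a node to saveHTML() serializes just that element.
echo $doc->saveHTML($created[0]), PHP_EOL;
```

Each entry in $created can be modified further (for example with setAttribute()) and the change shows up in the document, because it is the same node object that was inserted.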
Quoting "How do I access the new element after it's been added (like in my code snippet)?": I think your element is in $text. In any case, I think this should work if you need to keep the class and the textContent but nothing else:
// Copy the live node list first; removing nodes while iterating it directly skips elements.
foreach (iterator_to_array($dom->getElementsByTagName('a')) as $url) {
    if (parse_url($url->getAttribute("href"), PHP_URL_HOST) !== 'badsite.com') {
        continue;
    }
    $ele = $dom->createElement("div");
    $ele->textContent = $url->textContent;
    $ele->setAttribute("class", $url->getAttribute("class"));
    $url->parentNode->insertBefore($ele, $url);
    $url->parentNode->removeChild($url);
}
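One caveat with the host check above: parse_url() only reports a host when the href has a scheme or starts with //, so a value like the question's www.offendinglink.com yields null and would never match. Below is a minimal sketch of a more forgiving check. The helper names linkHost and isBlockedHost are hypothetical, and treating scheme-less hrefs as host-first is an assumption that will misread relative paths such as page.html as hosts.

```php
<?php
// Hypothetical helper: returns the lower-cased host of an href, or null.
function linkHost(string $href): ?string
{
    $href = trim($href);
    // Assumption: an href without "scheme://" or a leading "//" starts with
    // the host. Prefixing "//" makes parse_url() treat it that way.
    if (!preg_match('#^([a-z][a-z0-9+.\-]*:)?//#i', $href)) {
        $href = '//' . $href;
    }
    $host = parse_url($href, PHP_URL_HOST);
    return is_string($host) ? strtolower($host) : null;
}

// Hypothetical helper: true if $host is $blocked or one of its subdomains.
function isBlockedHost(?string $host, string $blocked): bool
{
    if ($host === null) {
        return false;
    }
    $suffix = '.' . $blocked;
    return $host === $blocked || substr($host, -strlen($suffix)) === $suffix;
}

var_dump(isBlockedHost(linkHost('www.offendinglink.com'), 'offendinglink.com')); // bool(true)
var_dump(isBlockedHost(linkHost('http://yandex.ru'), 'offendinglink.com'));      // bool(false)
```

Comparing hosts rather than searching the whole href avoids false positives where the blocked name only appears in a path or query string.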