Removing every li tag before reaching the first p tag in a string
Suppose I have a string containing some HTML. I want to remove every li tag that appears before the first p tag.
How can I achieve this?
Example string:
$str = "<img src='something.png'/>some_text_here<li>needs_to_be_removed</li>
<li>also_needs_to_be_removed</li>some_other_text<p>finally</p>more_text_here
<li>this_should_not_be_removed</li>";
The first two li tags need to be removed.
You can use PHP's DOMDocument with the traversal function below:
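$doc = new DOMDocument();
$doc->loadHTML($str);
$foundp = false;
showDOMNode($doc);
// $doc now holds the cleaned markup; note that saveHTML() returns a full
// document, including the doctype and html/body wrappers added by loadHTML()
$newstr = $doc->saveHTML();

function showDOMNode(DOMNode $domNode) {
    global $foundp;
    // childNodes is a live list: copy it first, otherwise removing a node
    // while iterating makes the loop skip the following sibling
    foreach (iterator_to_array($domNode->childNodes) as $node) {
        if ($foundp) {
            return;
        }
        if ($node->nodeName == "li") {
            // delete this node
            $domNode->removeChild($node);
        } else if ($node->nodeName == "p") {
            // first p reached: stop here
            $foundp = true;
            return;
        } else if ($node->hasChildNodes()) {
            // recurse into the children
            showDOMNode($node);
        }
    }
}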
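I'd suggest using a PHP parser library instead; it would be better and faster. I personally use this one in my projects: https://github.com/paquettg/php-html-parser. It provides an API such as
$child->nextSibling()
$content->innerHtml
$content->firstChild()
and more that can come in handy.
You can run a foreach loop over all elements and keep track of the "li" tags in it; if, on the third occurrence, you find a "p" tag, you can delete $child->previousSibling();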
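The sibling-walking idea above can be sketched without any extra dependency. The following is a minimal sketch that uses PHP's built-in DOM extension rather than the library mentioned above; the function name removeLiBeforeFirstP is hypothetical, and returning the input unchanged when there is no p is an assumption about the desired behaviour.

```php
<?php
// Hypothetical helper: removes <li> siblings that come before the first <p>.
function removeLiBeforeFirstP(string $html): string
{
    libxml_use_internal_errors(true); // tolerate fragment-style markup
    $dom = new DOMDocument();
    // Wrap the fragment in a <div> so it has a single root element.
    $dom->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
    libxml_clear_errors();

    $firstP = $dom->getElementsByTagName('p')->item(0);
    if ($firstP === null) {
        return $html; // assumption: with no <p>, leave the input untouched
    }

    // Walk backwards from the first <p>, deleting every <li> sibling on the way.
    $node = $firstP->previousSibling;
    while ($node !== null) {
        $previous = $node->previousSibling; // remember it before removing $node
        if ($node->nodeName === 'li') {
            $node->parentNode->removeChild($node);
        }
        $node = $previous;
    }

    // Serialize the children of the wrapper <div>, not the wrapper itself.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo removeLiBeforeFirstP("a<li>x</li>b<p>c</p><li>keep</li>");
```

Only siblings of the first p are considered here, so li elements nested elsewhere in the tree are left alone.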
This is what you need. Simple and effective:
$mystring = "mystringwith<li>toberemovedstring</li><li>againremove</li><p>do not remove me</p>"; // the string you provide
$findme = '<li>';  // start marker to search for in $mystring
$findpee = '<p>';  // end marker: removal stops here
$pos = strpos($mystring, $findme);     // position of the first <li>
$pospee = strpos($mystring, $findpee); // position of the first <p>
// Remove everything from the first <li> up to the first <p>.
// Note: this also drops any text that sits between the li elements.
$result = $mystring;
if ($pos !== false && $pospee !== false && $pos < $pospee) {
    $result = substr_replace($mystring, "", $pos, $pospee - $pos);
}
echo $result;
Edit: you can also check the result in this PHP sandbox:
http://sandbox.onlinephpfunctions.com/code/e534259e2312682a04b64c6e3aae1521422aacd2
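If the text between the li elements has to survive, as with some_other_text in the question's sample, a string-only variant could split at the first p and strip li elements from the leading part only. This is a rough sketch, not a general HTML parser: the function name stripLiBeforeFirstP is hypothetical, and it assumes the li elements are not nested.

```php
<?php
// Hypothetical helper: string-only variant that keeps text between the <li> elements.
function stripLiBeforeFirstP(string $html): string
{
    // Find the first <p> opening tag (with or without attributes).
    if (!preg_match('~<p[\s>/]~i', $html, $m, PREG_OFFSET_CAPTURE)) {
        return $html; // assumption: with no <p>, leave the input untouched
    }
    $pPos = $m[0][1];
    $head = substr($html, 0, $pPos); // everything before the first <p>
    $tail = substr($html, $pPos);    // the first <p> and everything after it

    // Remove each (non-nested) <li>...</li> pair from the head only.
    $head = preg_replace('~<li\b[^>]*>.*?</li>~is', '', $head) ?? $head;
    return $head . $tail;
}

$str = "<img src='something.png'/>some_text_here<li>needs_to_be_removed</li>\n"
     . "<li>also_needs_to_be_removed</li>some_other_text<p>finally</p>more_text_here\n"
     . "<li>this_should_not_be_removed</li>";
echo stripLiBeforeFirstP($str);
```

Regular expressions are fragile on real-world markup, so for anything beyond simple fragments a DOM-based approach is the safer choice.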
Using XPath:
$str = "<img src='something.png'/>some_text_here<li>needs_to_be_removed</li>
<li>also_needs_to_be_removed</li>some_other_text<p>finally</p>more_text_here
<li>this_should_not_be_removed</li>";
libxml_use_internal_errors(true);
$dom = new DOMDocument;
// Wrap the fragment in a <div> so that it has a single root element.
$dom->loadHTML('<div>' . $str . '</div>', LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
$xp = new DOMXPath($dom);
$lis = $xp->query('//p[1]/preceding-sibling::li');
foreach ($lis as $li) {
    $li->parentNode->removeChild($li);
}
$result = '';
// add each child node of the root element to the result
foreach ($dom->getElementsByTagName('div')->item(0)->childNodes as $child) {
    $result .= $dom->saveHTML($child);
}
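The query above matches li elements that are siblings of a p. If the li elements may sit at a different depth than the first p, one possible variant is to select the first p in document order with (//p)[1] and use the preceding axis instead. The sketch below wraps that in a hypothetical function, removeLiBeforeFirstPXPath; whether nested li elements should be removed at all is an assumption about the requirement.

```php
<?php
// Hypothetical helper: removes every <li> that precedes the first <p> in document order.
function removeLiBeforeFirstPXPath(string $html): string
{
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrap the fragment in a <div> so it has a single root element.
    $dom->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
    libxml_clear_errors();

    $xp = new DOMXPath($dom);
    // (//p)[1] is the first <p> in the document; preceding::li is every <li>
    // that ends before it, at any depth (ancestors of the <p> are excluded).
    foreach ($xp->query('(//p)[1]/preceding::li') as $li) {
        $li->parentNode->removeChild($li);
    }

    // Serialize the children of the wrapper <div>, not the wrapper itself.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo removeLiBeforeFirstPXPath("<ul><li>gone</li></ul><div><p>first</p></div><li>kept</li>");
```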