How to append text (in increasing order) using the DOM parser in PHP?
$html='<h1 class="title">Exponents[label*exponents]</h1>
<h2 class="title">Exponents[label*exponents]</h2>
<h2 class="title">Exponents[label*exponents]</h2>
<h1 class="title">Exponents[label*exponents]</h1>
<h2 class="title">Exponents[label*exponents]</h2>
<h2 class="title">Exponents[label*exponents]</h2>
<h2 class="title">Exponents[label*exponents]</h2>
<h3 class="title">Exponents[label*exponents]</h3>
<h1 class="title">Exponents[label*exponents]</h1>
<h1 class="title">Exponents[label*exponents]</h1>
<h2 class="title">Exponents[label*exponents]</h2>';
Expected output:
<h1 class="title">1.1 Exponents[label*exponents]</h1>
<h2 class="title">1.1.1 Exponents[label*exponents]</h2>
<h2 class="title">1.1.2 Exponents[label*exponents]</h2>
<h1 class="title">1.2 Exponents[label*exponents]</h1>
<h2 class="title">1.2.1 Exponents[label*exponents]</h2>
<h2 class="title">1.2.2 Exponents[label*exponents]</h2>
<h2 class="title">1.2.3 Exponents[label*exponents]</h2>
<h3 class="title">1.2.3.1 Exponents[label*exponents]</h3>
<h1 class="title">1.3 Exponents[label*exponents]</h1>
<h1 class="title">1.4 Exponents[label*exponents]</h1>
<h2 class="title">1.4.1 Exponents[label*exponents]</h2>
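For example, if the appended text is 1.1, the first 1 is the chapter number and the second 1 is the first occurrence of h1. If it is 1.2.2, the first number is the chapter number, followed by the second occurrence of h1 and then the second occurrence of h2 under that h1.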
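The numbering rule described above can be sketched without any DOM code. This is a minimal illustration, assuming the heading levels are already known as integers (1 for h1, 2 for h2, and so on); the function name `numberHeadings` and its parameters are hypothetical and not part of any library.

```php
<?php
/**
 * Hypothetical helper: turn a sequence of heading levels (1 = h1, 2 = h2, ...)
 * into numbering prefixes such as "1.2.3".
 */
function numberHeadings(array $levels, int $chapter = 1, int $maxDepth = 4): array
{
    $counters = array_fill(0, $maxDepth, 0);
    $labels = [];
    foreach ($levels as $level) {
        $counters[$level - 1]++;                   // count this occurrence
        for ($d = $level; $d < $maxDepth; $d++) {  // a new heading restarts all deeper levels
            $counters[$d] = 0;
        }
        $labels[] = $chapter . '.' . implode('.', array_slice($counters, 0, $level));
    }
    return $labels;
}

// Levels of the headings in the sample $html, in document order
$labels = numberHeadings([1, 2, 2, 1, 2, 2, 2, 3, 1, 1, 2]);
echo implode("\n", $labels), "\n";
```

For the sample input this prints 1.1, 1.1.1, 1.1.2, 1.2, 1.2.1, 1.2.2, 1.2.3, 1.2.3.1, 1.3, 1.4 and 1.4.1, one per line, which matches the prefixes in the expected output.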
While learning the DOM parser class, I tried something like the following:
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xp = new DOMXPath($dom);
$xp->registerNamespace("php", "http://php.net/xpath");
$className="title";
$hElement = $xp->query("//*[contains(@class, '$className')]");
$chapno=1;
$h1=0;
$h2=0;
$h3=0;
$h4=0;
foreach($hElement as $hNode) {
    if($hElement==="h1"){
        $h1++;
        $append="$chapno.$h1 ";
        $hnode->nodeValue =$append.$hNode->textContent;
        $h2=0;$h3=0;$h24=0;
    }
    if($hElement==="h2"){
        $h2++;
        $append="$chapno.$h1.$h2 ";
        $hnode->nodeValue =$append.$hNode->textContent;
    }
    if($hElement==="h3"){
        $h3++;
        $append="$chapno.$h1.$h2.$h3 ";
        $hnode->nodeValue =$append.$hNode->textContent;
    }
    if($hElement==="h4"){
        $h4++;
        $append="$chapno.$h1.$h2.$h3.$h4 ";
        $hnode->nodeValue =$append.$hNode->textContent;
    }
}
I don't know whether this is the right way to get the expected output, and if it is, I don't know how to proceed from here.
I simplified your code so that I could follow it. I hope everything is clear.
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$xp->registerNamespace("php", "http://php.net/xpath");
$className = "title";
$hElement = $xp->query("//*[contains(@class, '$className')]");
$chapno = 1;
$h = array(0, 0, 0, 0); // one counter per level, h1..h4
foreach ($hElement as $hNode) {
    // Only handle h1..h4; the digit in the tag name is the heading level
    if (preg_match('/^h([1-4])$/', $hNode->nodeName, $m)) {
        $i = (int) $m[1]; // level number
        $h[$i - 1]++; // increase the counter of this level
        $append = $chapno . "." . implode('.', array_slice($h, 0, $i));
        while ($i < count($h)) $h[$i++] = 0; // reset the counters of all deeper levels
        $hNode->nodeValue = $append . " " . $hNode->nodeValue; // prepend the number to the title text
    }
}
echo $dom->saveHTML();
Result:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><h1 class="title">1.1 Exponents[label*exponents]</h1>
<h2 class="title">1.1.1 Exponents[label*exponents]</h2>
<h2 class="title">1.1.2 Exponents[label*exponents]</h2>
<h1 class="title">1.2 Exponents[label*exponents]</h1>
<h2 class="title">1.2.1 Exponents[label*exponents]</h2>
<h2 class="title">1.2.2 Exponents[label*exponents]</h2>
<h2 class="title">1.2.3 Exponents[label*exponents]</h2>
<h3 class="title">1.2.3.1 Exponents[label*exponents]</h3>
<h1 class="title">1.3 Exponents[label*exponents]</h1>
<h1 class="title">1.4 Exponents[label*exponents]</h1>
<h2 class="title">1.4.1 Exponents[label*exponents]</h2></body></html>
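The result above includes the DOCTYPE and the `<html><body>` wrapper that `loadHTML()` adds, while the expected output in the question has neither. Two other details may matter on real pages: assigning `nodeValue` replaces any child elements of a heading with plain text, and `contains(@class, 'title')` also matches classes such as `subtitle`. The following is only a sketch of one possible variant that addresses these points, assuming non-empty ASCII input and headings no deeper than h4; the function name `numberTitleHeadings` is hypothetical.

```php
<?php
// Hypothetical helper (not from the answer above); uses only ext-dom classes.
function numberTitleHeadings(string $html, int $chapter = 1): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom->loadHTML($html);                        // libxml adds <html><body> around the fragment
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    // h1..h4 whose class list contains the whole token "title"
    $nodes = $xp->query(
        "//*[self::h1 or self::h2 or self::h3 or self::h4]"
        . "[contains(concat(' ', normalize-space(@class), ' '), ' title ')]"
    );

    $counters = [0, 0, 0, 0];
    foreach ($nodes as $node) {
        $level = (int) substr($node->nodeName, 1);
        $counters[$level - 1]++;
        for ($d = $level; $d < count($counters); $d++) {
            $counters[$d] = 0; // restart all deeper levels
        }
        $label = $chapter . '.' . implode('.', array_slice($counters, 0, $level));
        // Prepend a text node instead of assigning nodeValue, so child elements survive
        $node->insertBefore($dom->createTextNode($label . ' '), $node->firstChild);
    }

    // Serialise only the children of <body>, dropping the added wrapper
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }
    return $out;
}

echo numberTitleHeadings('<h1 class="title">Intro <em>to</em> exponents</h1><h2 class="title">Rules</h2>');
```

Serialising the children of `<body>` one by one is used here instead of the `LIBXML_HTML_NOIMPLIED` flag from the question, because that flag makes libxml treat the first element as the document root, which can rearrange a fragment that has several top-level elements.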