PHP, DOMXPath: invalid item(x) return

A simple thing, but... we have PHP code like this:

$oPath = new \DOMXPath($this->oHtmlProperty);
$oNode = $oPath->query('//div[@class="product-spec__body"]');

foreach ($oNode as $oNodeProperty) {
    $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);

    // ### VARIANT 1 (error with message 'Trying to get property of non-object')

    // $aPropertyGroup = [
    //     'title' => $oListTitle->item(0)->textContent,
    //     'property' => []
    // ];

    // ### VARIANT 2

    foreach ($oListTitle as $oListTitleItem) {
        $aPropertyGroup = [
            'title' => $oListTitleItem->textContent,
            'property' => []
        ];

        break; // we need only first item
    }

// ....

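The bottom line is that $oListTitle always contains exactly one node, at ->item(0), and nothing more. When we try to read it directly, we get an error with the message 'Trying to get property of non-object', yet the node exists! When we do the same thing by iterating (which returns the same node class as calling ->item(x)), we get what we need.

Can anyone tell me why? XD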

Added:

$oListTitle is:

object(DOMNodeList)#340 (1) { ["length"]=> int(1) } 

Added:

var_dump($oListTitle->item(0)); returns this:

object(DOMElement)#338 (18) {
  ["tagName"]=>
  string(2) "h2"
  ["schemaTypeInfo"]=>
  NULL
  ["nodeName"]=>
  string(2) "h2"
  ["nodeValue"]=>
  string(45) "Основные характеристики"
  ["nodeType"]=>
  int(1)
  ["parentNode"]=>
  string(22) "(object value omitted)"
  ["childNodes"]=>
  string(22) "(object value omitted)"
  ["firstChild"]=>
  string(22) "(object value omitted)"
  ["lastChild"]=>
  string(22) "(object value omitted)"
  ["previousSibling"]=>
  NULL
  ["nextSibling"]=>
  string(22) "(object value omitted)"
  ["attributes"]=>
  string(22) "(object value omitted)"
  ["ownerDocument"]=>
  string(22) "(object value omitted)"
  ["namespaceURI"]=>
  NULL
  ["prefix"]=>
  string(0) ""
  ["localName"]=>
  string(2) "h2"
  ["baseURI"]=>
  NULL
  ["textContent"]=>
  string(45) "Основные характеристики"
}

In other words, it is not empty and it exists.

I cannot reproduce the problem with PHP 5.6.3/win32 and the following code (your code plus some boilerplate):

<?php
$foo = new Foo;
var_export($foo->bar());

class Foo {
    /** @var DOMDocument parsed HTML document under test */
    private $oHtmlProperty;

    public function __construct() {
        $this->oHtmlProperty = new DOMDocument;
        $this->oHtmlProperty->loadHTML('<html><head><title>...</title></head><body>
    <div class="product-spec__body">
        <h2 class="title title_size_22">h2_1</h2>
        <h2 class="title title_size_22">h2_2</h2>
    </div>
    <div></div>
    <div class="product-spec__body">
        <h2 class="title title_size_22">h2_3</h2>
        <h2 class="title title_size_22">h2_4</h2>
    </div>
</body></html>');
    }

    public function bar() {
        $retval = array(); $aPropertyGroup = array();
        $oPath = new \DOMXPath($this->oHtmlProperty);
        $oNode = $oPath->query('//div[@class="product-spec__body"]');

        foreach ($oNode as $oNodeProperty) {
            $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);
            // ### VARIANT 1 (error with message 'Trying to get property of non-object')
            if ( !is_object($oListTitle) ) die('$oListTitle is not an object');
            if ( ! ($oListTitle instanceof DOMNodeList) ) die('$oListTitle is not a DOMNodeList');
            if ( $oListTitle->length < 1 ) die('oListTitle->length < 1');
            $node = $oListTitle->item(0);
            if ( is_null($node) ) die('$node is NULL');
            if ( !is_object($node) ) die('$node is not an object');
            if ( ! ($node instanceof DOMNode) ) die('$node is not a DOMNode');

            $aPropertyGroup = [
                'title' => $oListTitle->item(0)->textContent,
                'property' => []
            ];

            if ( !empty($aPropertyGroup) ) {
                $retval[] = $aPropertyGroup;
                $aPropertyGroup = array();
            }
        } 

        return $retval;
    }
}

The output is:

array (
  0 => 
  array (
    'title' => 'h2_1',
    'property' => 
    array (
    ),
  ),
  1 => 
  array (
    'title' => 'h2_3',
    'property' => 
    array (
    ),
  ),
)

which is as expected.
But maybe libxml_get_last_error() can tell you more...
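For illustration, here is a minimal sketch of how the libxml error buffer could be inspected after loading the markup. The HTML string is a made-up sample, not the page from the question, and whether it produces any error at all depends on the libxml version in use, so the code handles both cases.

```php
<?php
// Buffer libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical sample markup with a mismatched end tag (assumption, not from the question).
$html = '<html><body><div class="product-spec__body"><h2>Title</div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// libxml_get_last_error() returns a LibXMLError object, or false if nothing was buffered.
$last = libxml_get_last_error();
if ($last !== false) {
    printf("last error: line %d: %s\n", $last->line, trim($last->message));
} else {
    echo "no libxml error reported\n";
}

// libxml_get_errors() returns all buffered errors as an array.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("[level %d] line %d: %s\n", $error->level, $error->line, trim($error->message));
}

// Empty the buffer so later parsing starts clean.
libxml_clear_errors();
```

If the real page triggers parser errors here, that would hint at the document being repaired differently from what the XPath expressions expect.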

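You have two expressions, and the first one can match more than one item. The inner match may return different results depending on which outer match it runs against. You only set a single variable, so it gets filled only when the wanted node exists inside the current outer match.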

You did not provide the HTML, so it is not really possible to reproduce the error.

But if you are using DOMNodeList::item(), you should always verify that the return value is a node.
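A minimal sketch of such a check, assuming sample markup in which the second block has no matching h2 (the HTML and the $titles variable are invented for illustration):

```php
<?php
// Hypothetical sample markup: the second block has no matching <h2>.
$html = '<html><body>
  <div class="product-spec__body"><h2 class="title title_size_22">h2_1</h2></div>
  <div class="product-spec__body"></div>
</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$titles = [];
foreach ($xpath->query('//div[@class="product-spec__body"]') as $spec) {
    // query() returns a DOMNodeList, or false if the expression is invalid.
    $list = $xpath->query('h2[@class="title title_size_22"]', $spec);

    // item() returns null when the index is out of range (e.g. an empty list).
    $node = ($list instanceof DOMNodeList) ? $list->item(0) : null;

    if ($node instanceof DOMNode) {
        $titles[] = $node->textContent;
    } else {
        $titles[] = null; // no title found in this block
    }
}

var_dump($titles);
```

Accessing ->textContent on the result of item(0) without this guard is what raises 'Trying to get property of non-object' for a block that has no title.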

Here are two possible optimizations:

  1. Limit the result to the first node:
    h2[@class="title title_size_22"][1]
  2. Fetch the text content of the first node as a string (works only with DOMXPath::evaluate()):
    string(h2[@class="title title_size_22"])

Example:

$html = <<<'HTML'
<html><head><title>...</title></head><body>
    <div class="product-spec__body">
        <h2 class="title title_size_22">h2_1</h2>
        <h2 class="title title_size_22">h2_2</h2>
    </div>
    <div></div>
    <div class="product-spec__body">
    </div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

foreach ($xpath->evaluate('//div[@class="product-spec__body"]') as $index => $spec) {
  echo "Run #", $index, "\n";
  // all h2 with the class
  var_dump($xpath->evaluate('h2[@class="title title_size_22"]', $spec));
  // first h2 with the class
  var_dump($xpath->evaluate('h2[@class="title title_size_22"][1]', $spec));
  // first h2 with the class as string
  var_dump($xpath->evaluate('string(h2[@class="title title_size_22"])', $spec));
  echo "\n\n";
}

Output (compare the results of the two runs):

Run #0
object(DOMNodeList)#9 (1) {
  ["length"]=>
  int(2)
}
object(DOMNodeList)#8 (1) {
  ["length"]=>
  int(1)
}
string(4) "h2_1"


Run #1
object(DOMNodeList)#8 (1) {
  ["length"]=>
  int(0)
}
object(DOMNodeList)#8 (1) {
  ["length"]=>
  int(0)
}
string(0) ""
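Applied to the loop from the question, the string() variant might look like the sketch below. It assumes a block without a title should simply be skipped, which the question does not state; the sample HTML and the $groups variable are made up for illustration.

```php
<?php
// Hypothetical sample markup: only the first block contains a title.
$html = '<html><body>
  <div class="product-spec__body">
    <h2 class="title title_size_22">h2_1</h2>
    <h2 class="title title_size_22">h2_2</h2>
  </div>
  <div class="product-spec__body"></div>
</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$groups = [];
foreach ($xpath->query('//div[@class="product-spec__body"]') as $spec) {
    // string() yields the text of the first matching node, or '' if none matches.
    $title = $xpath->evaluate('string(h2[@class="title title_size_22"])', $spec);

    if ($title === '') {
        continue; // assumption: blocks without a title are skipped
    }

    $groups[] = [
        'title' => $title,
        'property' => [],
    ];
}

var_export($groups);
```

Because the expression always returns a string, there is no node object to dereference and therefore no 'non-object' error to guard against.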