PHP, DOMXPath: invalid item(x) return value
A simple thing, but... we have PHP code like this:
$oPath = new \DOMXPath($this->oHtmlProperty);
$oNode = $oPath->query('//div[@class="product-spec__body"]');
foreach ($oNode as $oNodeProperty) {
    $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);
    // ### VARIANT 1 (error with message 'Trying to get property of non-object')
    // $aPropertyGroup = [
    //     'title' => $oListTitle->item(0)->textContent,
    //     'property' => []
    // ];
    // ### VARIANT 2
    foreach ($oListTitle as $oListTitleItem) {
        $aPropertyGroup = [
            'title' => $oListTitleItem->textContent,
            'property' => []
        ];
        break; // we need only the first item
    }
    // ....
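The bottom line is that $oListTitle always holds exactly one node, at ->item(0), and nothing more. When we try to read it, we get an error with the message 'Trying to get property of non-object'.
But this node exists! When we do the same thing through iteration (which returns the same node class as calling ->item(x)), we get what we need.
Can anyone tell me why? XD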
Added:
$oListTitle is:
object(DOMNodeList)#340 (1) { ["length"]=> int(1) }
Added:
var_dump($oListTitle->item(0));
returns this:
object(DOMElement)#338 (18) { ["tagName"]=> string(2) "h2" ["schemaTypeInfo"]=> NULL ["nodeName"]=> string(2) "h2" ["nodeValue"]=> string(45) "Основные характеристики" ["nodeType"]=> int(1) ["parentNode"]=> string(22) "(object value omitted)" ["childNodes"]=> string(22) "(object value omitted)" ["firstChild"]=> string(22) "(object value omitted)" ["lastChild"]=> string(22) "(object value omitted)" ["previousSibling"]=> NULL ["nextSibling"]=> string(22) "(object value omitted)" ["attributes"]=> string(22) "(object value omitted)" ["ownerDocument"]=> string(22) "(object value omitted)" ["namespaceURI"]=> NULL ["prefix"]=> string(0) "" ["localName"]=> string(2) "h2" ["baseURI"]=> NULL ["textContent"]=> string(45) "Основные характеристики" }
Once again: it is not empty, and it exists.
I can't reproduce the problem with PHP 5.6.3/win32 and the following code (your code plus some boilerplate):
<?php
$foo = new Foo;
var_export($foo->bar());

class Foo {
    public function __construct() {
        $this->oHtmlProperty = new DOMDocument;
        $this->oHtmlProperty->loadHTML('<html><head><title>...</title></head><body>
<div class="product-spec__body">
<h2 class="title title_size_22">h2_1</h2>
<h2 class="title title_size_22">h2_2</h2>
</div>
<div></div>
<div class="product-spec__body">
<h2 class="title title_size_22">h2_3</h2>
<h2 class="title title_size_22">h2_4</h2>
</div>
</body></html>');
    }

    public function bar() {
        $retval = array(); $aPropertyGroup = array();
        $oPath = new \DOMXPath($this->oHtmlProperty);
        $oNode = $oPath->query('//div[@class="product-spec__body"]');
        foreach ($oNode as $oNodeProperty) {
            $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);
            // ### VARIANT 1 (error with message 'Trying to get property of non-object')
            if ( !is_object($oListTitle) ) die('$oListTitle is not an object');
            if ( ! ($oListTitle instanceof DOMNodeList) ) die('$oListTitle is not a DOMNodeList');
            if ( $oListTitle->length < 1 ) die('oListTitle->length < 1');
            $node = $oListTitle->item(0);
            if ( is_null($node) ) die('$node is NULL');
            if ( !is_object($node) ) die('$node is not an object');
            if ( ! ($node instanceof DOMNode) ) die('$node is not a DOMNode');
            $aPropertyGroup = [
                'title' => $oListTitle->item(0)->textContent,
                'property' => []
            ];
            if ( !empty($aPropertyGroup) ) {
                $retval[] = $aPropertyGroup;
                $aPropertyGroup = array();
            }
        }
        return $retval;
    }
}
The output is
array (
  0 =>
  array (
    'title' => 'h2_1',
    'property' =>
    array (
    ),
  ),
  1 =>
  array (
    'title' => 'h2_3',
    'property' =>
    array (
    ),
  ),
)
as expected.
But maybe libxml_get_last_error() can tell you more...
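A minimal sketch of how the libxml error functions could be used to look for parser problems. The markup below is a hypothetical stand-in (the question does not include the real HTML), and whether any errors are reported at all depends on the input and on the installed libxml2 version.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical markup, for illustration only; the real page is not shown in the question.
$html = '<html><body><div class="product-spec__body">'
      . '<h2 class="title title_size_22">h2_1</h2></div><unknowntag></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// libxml_get_errors() returns all collected LibXMLError objects (possibly none);
// libxml_get_last_error() returns only the most recent one, or false.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d, column %d: %s\n", $error->line, $error->column, trim($error->message));
}

// Empty the internal error buffer once it has been inspected.
libxml_clear_errors();
```

If the list is empty, the parser accepted the document without complaint and the cause is more likely in the XPath expressions or in the data itself.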
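You have two expressions, so the first one may match several items, and the inner expression can return a different result for each outer match. You only set a single variable, so it is only filled when the wanted result is inside the current outer match.
You did not provide the HTML, so it is not really possible to reproduce the error.
But if you're using DOMNodeList::item(), you should always validate that the return value is a node.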
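As a rough illustration of that check: DOMNodeList::item() returns null when the index is out of range, so reading ->textContent directly fails for any outer match that has no inner match. The markup and the $titles array below are assumptions made up for this sketch.

```php
<?php
// Hypothetical markup: the second block contains no matching h2.
$html = <<<'HTML'
<html><body>
<div class="product-spec__body"><h2 class="title title_size_22">h2_1</h2></div>
<div class="product-spec__body"></div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$titles = [];
foreach ($xpath->query('//div[@class="product-spec__body"]') as $spec) {
    $node = $xpath->query('h2[@class="title title_size_22"]', $spec)->item(0);
    // item() yields null for an empty list, so verify the type
    // before touching any property of the result.
    $titles[] = $node instanceof DOMNode ? $node->textContent : null;
}

var_dump($titles);
```

Here the first block yields the string 'h2_1' and the second yields null instead of triggering an error.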
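Here are two possible optimizations:
- Limit the result to the first node:
h2[@class="title title_size_22"][1]
- Fetch the text content of the first node as a string (only works with DOMXPath::evaluate()):
string(h2[@class="title title_size_22"])
Example: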
$html = <<<'HTML'
<html><head><title>...</title></head><body>
<div class="product-spec__body">
<h2 class="title title_size_22">h2_1</h2>
<h2 class="title title_size_22">h2_2</h2>
</div>
<div></div>
<div class="product-spec__body">
</div>
</body></html>
HTML;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->evaluate('//div[@class="product-spec__body"]') as $index => $spec) {
    echo "Run #", $index, "\n";
    // all h2 with the class
    var_dump($xpath->evaluate('h2[@class="title title_size_22"]', $spec));
    // first h2 with the class
    var_dump($xpath->evaluate('h2[@class="title title_size_22"][1]', $spec));
    // first h2 with the class as string
    var_dump($xpath->evaluate('string(h2[@class="title title_size_22"])', $spec));
    echo "\n\n";
}
Output (compare the results of the two runs):
Run #0
object(DOMNodeList)#9 (1) {
  ["length"]=>
  int(2)
}
object(DOMNodeList)#8 (1) {
  ["length"]=>
  int(1)
}
string(4) "h2_1"

Run #1
object(DOMNodeList)#8 (1) {
  ["length"]=>
  int(0)
}
object(DOMNodeList)#8 (1) {
  ["length"]=>
  int(0)
}
string(0) ""
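Applied to the structure from the question, the string() variant might look like the sketch below. It reuses the markup from the example; the $groups array and the decision to skip blocks without a title are assumptions, not something the question specifies.

```php
<?php
$html = <<<'HTML'
<html><head><title>...</title></head><body>
<div class="product-spec__body">
<h2 class="title title_size_22">h2_1</h2>
<h2 class="title title_size_22">h2_2</h2>
</div>
<div></div>
<div class="product-spec__body">
</div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$groups = [];
foreach ($xpath->evaluate('//div[@class="product-spec__body"]') as $spec) {
    // string() casts the first matching node to its text content,
    // or to an empty string when nothing matches.
    $title = $xpath->evaluate('string(h2[@class="title title_size_22"])', $spec);
    if ($title === '') {
        continue; // assumption: blocks without a title are skipped
    }
    $groups[] = ['title' => $title, 'property' => []];
}

var_export($groups);
```

Because the expression always returns a string, no node object is ever dereferenced and the 'Trying to get property of non-object' notice cannot occur at this point.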