PHP DomXPath not selecting empty text nodes

Question

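I am trying to select nodes that do not contain any text. This PHP code skips the empty node in the sample XML. However, when I try the same expression in an online tester (e.g. http://freeformatter.com/xpath-tester.html), it works without any problems.

Is this something specific to PHP?

My PHP code: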

    $path = "//RecipeSteps/RecipeStep[not(text())]";
    $stepsQuery = $this->xpath->query($path);
    $numResults = $stepsQuery->length;

My sample XML:

<?xml version="1.0" encoding="utf-8"?>
<Recipes>
    <RecipeSteps>
      <RecipeStep number="1">Dummy content</RecipeStep>
      <RecipeStep number="2">Dummy content</RecipeStep>
      <RecipeStep number="3">Dummy content</RecipeStep>
      <RecipeStep number="4">Dummy content</RecipeStep>
      <RecipeStep number="5">Dummy content</RecipeStep>
      <RecipeStep number="6"></RecipeStep>
      <RecipeStep number="7">Variations</RecipeStep>
      <RecipeStep number="8">Some variation content..</RecipeStep>
    </RecipeSteps>
</Recipes>

Answer 1

If you are looking for an XPath solution, use //RecipeSteps/RecipeStep[string-length() = 0]. For example:

$path = "//RecipeSteps/RecipeStep[string-length() = 0]";
$stepsQuery = $this->xpath->query($path);
$numResults = $stepsQuery->length;

Answer 2

It works fine when selecting the full path:

$xmlString = '<?xml version="1.0" encoding="utf-8"?>
<Recipes>
    <RecipeSteps>
      <RecipeStep number="1">Dummy content</RecipeStep>
      <RecipeStep number="2">Dummy content</RecipeStep>
      <RecipeStep number="3">Dummy content</RecipeStep>
      <RecipeStep number="4">Dummy content</RecipeStep>
      <RecipeStep number="5">Dummy content</RecipeStep>
      <RecipeStep number="6"></RecipeStep>
      <RecipeStep number="7">Variations</RecipeStep>
      <RecipeStep number="8">Some variation content..</RecipeStep>
    </RecipeSteps>
</Recipes>';

$dom = new DOMDocument();
$dom->loadXML($xmlString);
$xpath = new DOMXPath($dom);
// The shorter path //RecipeSteps/RecipeStep[not(text())] works as well.
$query = $xpath->query('//Recipes/RecipeSteps/RecipeStep[not(text())]');
// Prints "RecipeStep number: 6"
print 'RecipeStep number: ' . $query->item(0)->getAttribute('number');

Selecting "//RecipeSteps/RecipeStep[not(text())]" also works fine, so the problem most likely lies elsewhere in your code or in the document you actually load.
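One thing worth checking is whether the "empty" element in the real file contains whitespace. The sketch below is only an assumption about what might differ from the posted sample: it uses a hypothetical variant in which step 6 holds a single space, and compares not(text()) with not(normalize-space()).

```php
<?php
// Hypothetical variant of the sample: step 6 contains only a space.
$xml = '<RecipeSteps>'
     . '<RecipeStep number="5">Dummy content</RecipeStep>'
     . '<RecipeStep number="6"> </RecipeStep>'
     . '</RecipeSteps>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// The single space is a text node, so not(text()) no longer matches step 6.
$strict = $xpath->query('//RecipeSteps/RecipeStep[not(text())]');

// normalize-space() strips the whitespace first, so step 6 matches again.
$lenient = $xpath->query('//RecipeSteps/RecipeStep[not(normalize-space())]');

echo $strict->length, "\n";   // 0
echo $lenient->length, "\n";  // 1
```

If the real document behaves like this variant, the expression is doing what it says and the input simply is not empty in the XPath sense.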

Answer 3

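The path expressions //RecipeStep[not(text())] and //RecipeStep[string-length() = 0] do not mean the same thing, but with the document you show as input they return exactly the same result. In both cases, a single RecipeStep node is selected: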

<RecipeStep number="6"/>

In plain English, //RecipeStep[not(text())] means:

Select element nodes called RecipeStep anywhere in the document, but only if they do not have any immediate child text nodes.

On the other hand, //RecipeStep[string-length() = 0] means:

Select element nodes called RecipeStep anywhere in the document, but only if the length of their string value (the concatenation of all descendant text nodes) is equal to 0.

The difference would only become apparent if step 6 of the recipe looked like this:

<RecipeStep number="6"><child>text</child></RecipeStep>

Then //RecipeStep[not(text())] would still select this node, while //RecipeStep[string-length() = 0] would not return anything.
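The difference can be checked directly in PHP. This is a minimal sketch, assuming a trimmed-down document that contains the modified step 6; the variable names are illustrative only.

```php
<?php
// Trimmed-down sample in which step 6 has a child element but no direct text.
$xml = '<Recipes><RecipeSteps>'
     . '<RecipeStep number="5">Dummy content</RecipeStep>'
     . '<RecipeStep number="6"><child>text</child></RecipeStep>'
     . '</RecipeSteps></Recipes>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Step 6 has no immediate child text node, so it is selected.
$noTextChild = $xpath->query('//RecipeStep[not(text())]');

// The string value of step 6 is "text" (length 4), so nothing is selected.
$emptyString = $xpath->query('//RecipeStep[string-length() = 0]');

echo $noTextChild->length, "\n";  // 1
echo $emptyString->length, "\n";  // 0
```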

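(To be clear: the leading //RecipeSteps that I left out does not change anything.)

So your original XPath expression is correct, and for this document the accepted answer selects exactly the same node as your original expression. XPath is not at fault here.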
