Using XPath, how to select ANY node that contains a certain string

Question

Suppose I have an XML file like this:

```
<books>
  <book>
    <title>John is alive</title>
    <abstract>
        A man is found alive after having disappeared for 10 years.
    </abstract>
    <description>
        <en> John disappeared 10 years ago. Lorem ipsum dolor sit amet ...</en>
        <fr> Il y a 10 ans, John disparaissait. Lorem ipsum dolor sit amet ...</fr>
    </description>
    <notes>First book in the series, where the character is introduced</notes>
  </book>
  <book>
    <title>The disappearance of John</title>
    <abstract>
        A prequel to the book "John is alive".
    </abstract>
    <description>
        <en> He lead an ordinary life, but then ... lorem ipsum dolor sit amet ...</en>
        <fr> Sa vie était tout à fait ordinaire, mais ... lorem ipsum dolor sit amet ...</fr>
    </description>
    <notes>Second book in the "John" series, but first in chronological order</notes>
  </book>
</books>
```

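My question is simple: how can I use XPath to get the set of all nodes that contain the word John?

Obviously, I can specify a list of nodes, and that works fine:

```
(//title | //abstract | //description/* | //notes)[contains(lower-case(text()),"john")]
```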

But if my XML grows (and it will!), with new elements being added at various levels of the structure, I don't want to keep going back and adjusting my XPath.

What I don't understand is why a generic statement like

```
//*[contains(lower-case(text()),"john")]
```

fails with this error message: Required cardinality of first argument of lower-case() is one or zero.

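However, not every statement with an asterisk fails.

For example, this one fails with the error message above:

```
//books/book/*[contains(lower-case(text()),"john")]
```

whereas this one succeeds and retrieves the <en> and <fr> nodes from the first <description> element:

```
//books/book/*/*[contains(lower-case(text()),"john")]
```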

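If it is not possible, fine, I'll list all the elements in my XPath, but I would still like to clearly understand how the * selector behaves in the context of a contains() operation.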

Answer 1

The terms "node" and "contains" are not precise enough on their own, but one of the following XPaths will likely meet your needs:

1. All nodes whose string value contains the substring "John":
```
//node()[contains(.,"John")]
```
2. All such elements:
```
//*[contains(.,"John")]
```
3. All such attributes:
```
//@*[contains(.,"John")]
```
4. All such text nodes:
```
//text()[contains(.,"John")]
```
5. All elements that have a text node child containing the substring "John":
```
//*[text()[contains(.,"John")]]
```

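Note that #1 will include the books element, but #5 will exclude it: the string value of books contains the text of all its descendants, whereas none of its own text node children contain "John".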
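The difference between matching on the string value and matching on text node children can be illustrated outside of XPath. The following is only a rough sketch: it assumes Python's standard xml.etree.ElementTree (which does not support contains() in its XPath subset), a trimmed copy of the sample document, and two hypothetical helpers, string_value and text_node_children, that approximate what `.` and `text()` evaluate to for an element.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question.
SAMPLE = """<books>
  <book>
    <title>John is alive</title>
    <description>
      <en> John disappeared 10 years ago.</en>
    </description>
    <notes>First book in the series</notes>
  </book>
</books>"""


def string_value(elem):
    # Approximates the XPath string value of an element:
    # the text of the element and all its descendants, concatenated.
    return "".join(elem.itertext())


def text_node_children(elem):
    # ElementTree has no text-node objects. An element's direct text is
    # elem.text plus the .tail of each of its child elements.
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p is not None]


root = ET.fromstring(SAMPLE)
elements = list(root.iter())  # root and all descendant elements, in document order

# Roughly //*[contains(.,"John")]
by_string_value = [e.tag for e in elements if "John" in string_value(e)]

# Roughly //*[text()[contains(.,"John")]]
by_text_child = [
    e.tag for e in elements
    if any("John" in t for t in text_node_children(e))
]

print(by_string_value)  # ['books', 'book', 'title', 'description', 'en']
print(by_text_child)    # ['title', 'en']
```

In this sketch the ancestors books, book and description match only through the text of their descendants, which is why the string-value form returns them and the text-node-child form does not.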

If you are using XPath 2.0, you can replace contains(.,"John") in any of the XPaths above with contains(lower-case(.),"john"). See also "Case insensitive XPath contains() possible?"
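This also suggests why the question's lower-case(text()) form fails where lower-case(.) does not. In XPath 2.0, lower-case() accepts at most one item, but text() returns every text node child of the context element, and an element with child elements usually has several whitespace-only text nodes between them. The context item `.` is always a single item. As a rough check, the sketch below counts text node children per element. It assumes Python's standard xml.etree.ElementTree as a stand-in for an XPath 2.0 processor, a trimmed copy of the sample, and a processor that keeps whitespace-only text nodes; count_text_children is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first <book> from the question.
SAMPLE = """<books>
  <book>
    <title>John is alive</title>
    <description>
      <en> John disappeared 10 years ago.</en>
      <fr> Il y a 10 ans, John disparaissait.</fr>
    </description>
  </book>
</books>"""


def count_text_children(elem):
    # Direct text of an element in ElementTree: elem.text plus the .tail
    # of each child element. Whitespace-only strings count as text nodes.
    parts = [elem.text] + [child.tail for child in elem]
    return sum(1 for p in parts if p)


root = ET.fromstring(SAMPLE)
counts = {e.tag: count_text_children(e) for e in root.iter()}

for tag, n in counts.items():
    # More than one text node means text() yields a multi-item sequence.
    print(tag, n, "too many for lower-case()" if n > 1 else "ok")
```

Under these assumptions, //books/book/* reaches description, which has three text node children, so lower-case(text()) raises the cardinality error, while //books/book/*/* only reaches en and fr, which have one each.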
