Using XML namespaces with XmlSlurper in Groovy - how to query a path correctly?

Question

I have the following sample XML:

<root>

<table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

<table xmlns:f="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

</root>

def slurper = new XmlSlurper().parseText(someXMLText)
def hNs = new groovy.xml.Namespace(
                    "http://www.w3.org/TR/html4/", 'h')
def fNs = new groovy.xml.Namespace(
                    "https://www.w3schools.com/furniture", 'h')
println slurper.root[hNs.table].tr.td //not giving any response

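There are two table tags, each declaring a different namespace. How can I use the namespace in a GPath expression to get the value Apples under the first table tag?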

Answer 1

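Your XML document uses namespaces incorrectly. A declaration such as xmlns:h="http://www.w3.org/TR/html4/" only creates a prefix, and that prefix has to be used explicitly. If it is not applied to any node, no element belongs to that namespace, so you cannot query the document with the prefix. You need to put it on at least the table tag to use it: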

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</h:table>
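
To make the point above concrete, here is a minimal sketch that compares the namespace URI XmlSlurper reports for the table element when the prefix is only declared and when it is actually used. The import assumes Groovy 3 or later (on Groovy 2.x, XmlSlurper lives in groovy.util and needs no import), and tableNamespace is a hypothetical helper name, not part of any API.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; drop this import on Groovy 2.x

// Prefix h is declared on table but never used
def declaredOnly = '''<root>
<table xmlns:h="http://www.w3.org/TR/html4/">
  <tr><td>Apples</td></tr>
</table>
</root>'''

// Prefix h is declared and applied to table
def prefixed = '''<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr><td>Apples</td></tr>
</h:table>
</root>'''

// Hypothetical helper: namespace URI of the first child of <root> named "table"
def tableNamespace = { String xml ->
    new XmlSlurper().parseText(xml)
            .children()
            .find { it.name() == 'table' }
            .namespaceURI()
}

def declaredOnlyNs = tableNamespace(declaredOnly)
def prefixedNs = tableNamespace(prefixed)

assert declaredOnlyNs == ''                           // table is in no namespace
assert prefixedNs == 'http://www.w3.org/TR/html4/'    // table is in the h namespace
```

In the first document the table element is in no namespace at all, which is why a query by the h prefix finds nothing.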

However, if you want a default namespace for each table node (and its children), skip the prefix and declare the namespace without it:

<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

Note the subtle difference: in the second example the namespace is declared with the xmlns attribute, not with xmlns:h as in the first one.
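
The practical effect of that difference shows up on the child elements. The sketch below, under the same assumptions (Groovy 3+ import, and firstTdNamespace being a hypothetical helper name), checks which namespace the first td element ends up in for each style of declaration.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; drop this import on Groovy 2.x

// Default namespace: applies to table and to its unprefixed descendants
def withDefaultNs = '''<root>
<table xmlns="http://www.w3.org/TR/html4/">
  <tr><td>Apples</td></tr>
</table>
</root>'''

// Prefixed namespace: applies only to elements that carry the prefix
def withPrefixedTable = '''<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr><td>Apples</td></tr>
</h:table>
</root>'''

// Hypothetical helper: namespace URI of the first <td> found in document order
def firstTdNamespace = { String xml ->
    new XmlSlurper().parseText(xml)
            .depthFirst()
            .find { it.name() == 'td' }
            .namespaceURI()
}

def defaultTdNs = firstTdNamespace(withDefaultNs)
def prefixedTdNs = firstTdNamespace(withPrefixedTable)

assert defaultTdNs == 'http://www.w3.org/TR/html4/'  // td inherits the default namespace
assert prefixedTdNs == ''                            // td has no prefix, so no namespace
```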

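When the document uses default namespaces, you can call the declareNamespace method to map prefixes to them. This lets you write selectors such as h:table, which refers to the table tag in the namespace that the h prefix maps to in the declared namespace map. Consider the following example: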

def source = '''<root>

<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

<table xmlns="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

</root>'''

def root = new XmlSlurper().parseText(source).declareNamespace([
    h: "http://www.w3.org/TR/html4/", 
    f: "https://www.w3schools.com/furniture"
])

assert root."h:table".tr.td.first().text() == "Apples"
assert root."h:table".tr.td.last().text() == "Bananas"
assert root."f:table".width.toInteger() == 80

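In this example the XML document defines two different default namespaces, one on each table tag. With the declareNamespace method we map a prefix to each of these namespaces so that the prefixes can be used in tag selectors.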
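
As an alternative to declareNamespace (a different technique, not the one used in the example above), the tables can also be picked by comparing each node's namespace URI directly. This is a rough sketch assuming Groovy 3+ for the import; htmlTable and furnitureTable are illustrative variable names.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; drop this import on Groovy 2.x

def source = '''<root>
<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
<table xmlns="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>
</root>'''

// parseText returns the <root> element itself, so paths start at its children
def root = new XmlSlurper().parseText(source)

// Pick each table by local name and namespace URI instead of by prefix
def htmlTable = root.children().find {
    it.name() == 'table' && it.namespaceURI() == 'http://www.w3.org/TR/html4/'
}
def furnitureTable = root.children().find {
    it.name() == 'table' && it.namespaceURI() == 'https://www.w3schools.com/furniture'
}

def firstFruit = htmlTable.tr.td[0].text()
def width = furnitureTable.width.toInteger()

assert firstFruit == 'Apples'
assert width == 80
```

This avoids inventing prefixes at the cost of more verbose selectors; declareNamespace is usually shorter when the same namespace is queried many times.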

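If for some reason you need to define the namespaces with prefixes at the table node level, you have to use the prefix at least on that top-level element, that is, on the table tag itself: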

def source = '''<root>

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</h:table>

<f:table xmlns:f="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</f:table>

</root>'''

def root = new XmlSlurper().parseText(source).declareNamespace([
    h: "http://www.w3.org/TR/html4/",
    f: "https://www.w3schools.com/furniture"
])

assert root."h:table".tr.td.first().text() == "Apples"
assert root."h:table".tr.td.last().text() == "Bananas"
assert root."f:table".width.toInteger() == 80

Hope it helps.
