Getting child attributes from an XML document using ElementTree

I have an XML POM file that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
    <groupId>com.amirsys</groupId>
    <artifactId>components-parent</artifactId>
    <version>RELEASE</version>
</parent>
<artifactId>statdxws</artifactId>
<version>6.5.0-16</version>
<packaging>war</packaging>
<dependencies>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>9.4-1200-jdbc41</version>
        <scope>provided</scope>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-simple</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>com.amirsys</groupId>
        <artifactId>referencedb</artifactId>
        <version>5.0.0-1</version>
        <exclusions>
            <exclusion>
                <groupId>com.amirsys</groupId>
                <artifactId>jig</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>

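I am trying to use ElementTree to extract the groupId, artifactId, and version so I can create dependency objects, but it is not finding the dependency tags. This is my code so far: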

from xml.etree import ElementTree
tree = ElementTree.parse('pomFile.xml')
root = tree.getroot()
namespace = '{http://maven.apache.org/POM/4.0.0}'
for dependency in root.iter(namespace+'dependency'):
    groupId = dependency.get('groupId')
    artifactId = dependency.get('artifactId')
    version = dependency.get('version')
    print(groupId, artifactId, version)

This produces no output, and I can't figure out why the code isn't iterating over the dependency tags. Any help would be appreciated.

Your XML has a small error: it is missing the closing </project> tag. You probably just left it out when posting the question.
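As a quick way to check for that kind of truncation, here is a minimal sketch. The helper name is_well_formed and the tiny sample strings are made up for illustration; the only library behavior relied on is that ElementTree.fromstring raises ElementTree.ParseError on malformed input.

```python
from xml.etree import ElementTree


def is_well_formed(xml_text):
    """Return True if xml_text parses, False if ElementTree rejects it."""
    try:
        ElementTree.fromstring(xml_text)
    except ElementTree.ParseError:
        return False
    return True


# Hypothetical sample: a root element that is never closed.
truncated = "<project><artifactId>statdxws</artifactId>"
complete = truncated + "</project>"

print(is_well_formed(truncated))
print(is_well_formed(complete))
```

If the real pomFile.xml were truncated like this, ElementTree.parse would raise an error rather than print nothing, so the file on disk is most likely complete.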

The following works for me:

from xml.etree import ElementTree
tree = ElementTree.parse('pomFile.xml')
root = tree.getroot()
namespace = '{http://maven.apache.org/POM/4.0.0}'
for dependency in root.iter(namespace+'dependency'):
    groupId = dependency.find(namespace+'groupId').text
    artifactId = dependency.find(namespace+'artifactId').text
    version = dependency.find(namespace+'version').text
    print(groupId, artifactId, version)

$ python -i a.py
org.postgresql postgresql 9.4-1200-jdbc41
com.amirsys referencedb 5.0.0-1
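An alternative to concatenating the namespace onto every tag name is to pass a prefix map to findall and findtext. The sketch below is only an illustration under stated assumptions: the prefix "m", the function name read_dependencies, and the trimmed-down sample POM are made up here, and the second dependency deliberately omits version to show what happens when a child is missing.

```python
from xml.etree import ElementTree

# Trimmed-down stand-in for the POM in the question (hypothetical sample data).
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.postgresql</groupId>
      <artifactId>postgresql</artifactId>
      <version>9.4-1200-jdbc41</version>
    </dependency>
    <dependency>
      <groupId>com.amirsys</groupId>
      <artifactId>referencedb</artifactId>
    </dependency>
  </dependencies>
</project>
"""

# "m" is an arbitrary prefix chosen for this example; only the URI matters.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def read_dependencies(xml_text):
    """Return a list of (groupId, artifactId, version) tuples."""
    root = ElementTree.fromstring(xml_text)
    deps = []
    for dep in root.findall(".//m:dependency", NS):
        deps.append((
            dep.findtext("m:groupId", namespaces=NS),
            dep.findtext("m:artifactId", namespaces=NS),
            # findtext returns the default (None) when the child is absent,
            # whereas dep.find(...).text would raise AttributeError.
            dep.findtext("m:version", default=None, namespaces=NS),
        ))
    return deps


for group_id, artifact_id, version in read_dependencies(POM):
    print(group_id, artifact_id, version)
```

Because findtext with a plain tag path only looks at direct children, the groupId and artifactId inside an exclusion block would not be picked up by mistake.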

Your use of .get() is wrong. Take a look at how .get() works. Suppose your XML is:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

And you write Python code like this:

import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
for country in root.findall('country'):
    rank = country.find('rank').text  # text of the <rank> child element
    name = country.get('name')        # value of the name="..." attribute
    print(name, rank)

This will print:

Liechtenstein 1
Singapore 4
Panama 68

As you can see, .get() gives you the value of an attribute, not the text of a child element. The docs are very clear about this.
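To make the difference concrete, here is a small side-by-side sketch. The one-element sample is made up for illustration; it shows that calling .get() with the name of a child element returns None, which is why the code in the question never found any values.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "name" is an attribute, "rank" is a child element.
country = ET.fromstring('<country name="Liechtenstein"><rank>1</rank></country>')

attr_value = country.get("name")        # attribute on the start tag
wrong_lookup = country.get("rank")      # no such attribute, so this is None
child_text = country.find("rank").text  # text inside the <rank> child

print(attr_value, wrong_lookup, child_text)
```

In the POM, groupId, artifactId, and version are all child elements of dependency, so they need the .find(...).text form.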