Problems getting element.tagName when parsing XML with Python and xml.dom.minidom

Question

I am parsing XML with Python (xml.dom.minidom), but I cannot get the tagName of the nodes.

The interpreter returns:

AttributeError: Text instance has no attribute 'tagName'

This happens when I try to extract, for example, the string 'format' from the node:

<format>DVD</format>

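I found several very similar posts on Stack Overflow, but I still could not find a solution.

I know there may be alternative modules for this, but my goal is to understand why it fails.

Thanks in advance and best regards,

Here is my code: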

from xml.dom.minidom import parse
import xml.dom.minidom

# Open XML document
xml = xml.dom.minidom.parse("movies.xml")

# collection Node
collection_node = xml.firstChild

# movie Nodes
movie_nodes = collection_node.childNodes

for m in movie_nodes:

    if len(m.childNodes) > 0:
        print '\nMovie:', m.getAttribute('title')

        for tag in m.childNodes:
            print tag.tagName  # AttributeError: Text instance has no attribute 'tagName'
            for text in tag.childNodes:
                print text.data

Here is the XML:

<collection shelf="New Arrivals">
<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
   <year>2003</year>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Talk about a US-Japan war</description>
</movie>
<movie title="Transformers">
   <type>Anime, Science Fiction</type>
   <format>DVD</format>
   <year>1989</year>
   <rating>R</rating>
   <stars>8</stars>
   <description>A scientific fiction</description>
</movie>
</collection>

Similar posts:

Get node name with minidom

Element.tagName for python not working

Answer 1

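The error occurs because the line breaks (and indentation) between element nodes are parsed as nodes of type TEXT_NODE (see Node.nodeType), and a TEXT_NODE has no tagName attribute. Iterating over childNodes therefore yields these whitespace text nodes alongside the elements.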
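To make this visible, here is a minimal sketch that lists the node type of every child of a movie element. It is written for Python 3 (the question uses Python 2), and the SAMPLE string is a shortened, hypothetical stand-in for movies.xml, parsed with parseString instead of reading a file.

```python
from xml.dom.minidom import parseString

# Hypothetical, shortened stand-in for the movies.xml file in the question.
SAMPLE = """<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
</movie>"""

movie = parseString(SAMPLE).documentElement

# Record (node type, nodeName) for each direct child of <movie>.
kinds = []
for child in movie.childNodes:
    kind = "TEXT_NODE" if child.nodeType == child.TEXT_NODE else "ELEMENT_NODE"
    kinds.append((kind, child.nodeName))

for kind, name in kinds:
    print(kind, name)
```

The whitespace before, between, and after the two elements shows up as text nodes (with nodeName '#text'), which is exactly where the loop in the question hits tag.tagName and fails.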

You can add a node type check so that tagName is never read from a text node:

if tag.nodeType != tag.TEXT_NODE:
    print tag.tagName
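For reference, a complete version of this idea might look like the sketch below. It assumes Python 3 and an inline SAMPLE string instead of movies.xml; the helper name movie_fields is hypothetical. It also checks for ELEMENT_NODE rather than excluding TEXT_NODE, which is a slightly stricter variant of the check above because it skips comments and other non-element nodes as well.

```python
from xml.dom.minidom import parseString

# Hypothetical, shortened stand-in for the movies.xml file in the question.
SAMPLE = """<collection shelf="New Arrivals">
<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
</movie>
</collection>"""


def movie_fields(xml_text):
    """Return {movie title: [(tagName, text), ...]} (hypothetical helper)."""
    collection = parseString(xml_text).documentElement
    result = {}
    for movie in collection.childNodes:
        # Skip the whitespace text nodes between <movie> elements.
        if movie.nodeType != movie.ELEMENT_NODE:
            continue
        fields = []
        for tag in movie.childNodes:
            # Only element nodes have a tagName.
            if tag.nodeType != tag.ELEMENT_NODE:
                continue
            # Join the text children of the element, e.g. 'DVD' for <format>.
            text = "".join(
                t.data for t in tag.childNodes if t.nodeType == t.TEXT_NODE
            )
            fields.append((tag.tagName, text))
        result[movie.getAttribute("title")] = fields
    return result


fields = movie_fields(SAMPLE)
for title, pairs in fields.items():
    print("Movie:", title)
    for name, text in pairs:
        print(name, ":", text)
```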

Answer 2

Here is the code after applying the change suggested above by user har07:

for tag in m.childNodes:
    # Skip the whitespace text nodes between elements
    if tag.nodeType != tag.TEXT_NODE:
        for text in tag.childNodes:
            print tag.tagName, ':', text.data

It now works like a charm.

python

xml

parsing

tagname