Parsing XML with ElementTree with self closing tag error

Question

I have some XML that I am parsing with ElementTree. I don't believe the structure/content is relevant beyond the XML line I have provided, so I have omitted it.

I am parsing it as: Rwy.find('Special').text

When the XML line is: <Special> </Special>

everything parses as expected, but when the XML line is changed to:

<Special/>

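it produces the error: TypeError: must be str, not NoneType, which leads me to believe that the self-closing tag behaves differently from the earlier example with an explicit closing tag.

How do I correctly parse an element with a self-closing tag?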

Answer 1

Whitespace is significant in XML.

Compare:

>>> from xml.etree import ElementTree as et
>>> s = '<test><Special>   </Special></test>'
>>> tree = et.fromstring(s)
>>> tree.find('Special')
<Element 'Special' at 0x000001A7E9B154F8>
>>> tree.find('Special').text
'   '

Versus:

>>> s = '<test><Special/></test>'
>>> tree = et.fromstring(s)
>>> tree.find('Special')
<Element 'Special' at 0x000001A7E9B1F638>
>>> tree.find('Special').text
>>>

The first returns a str. The second returns None, because a self-closing tag has no .text content.

Check the return value of .text before using it.
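One way to do that check is sketched below. The helper name element_text and the Rwy root element are assumptions for illustration (the root is borrowed from the variable name in the question), not part of the original code. The sketch also shows findtext, which is part of ElementTree and returns an empty string when the element exists but has no text.

```python
from xml.etree import ElementTree as ET


def element_text(parent, tag, default=""):
    """Hypothetical helper: return the text of the first child named `tag`,
    or `default` when the child is missing or has no text (e.g. <Special/>)."""
    child = parent.find(tag)
    if child is None or child.text is None:
        return default
    return child.text


# Illustrative documents; the real structure was omitted in the question.
self_closed = ET.fromstring("<Rwy><Special/></Rwy>")
empty_pair = ET.fromstring("<Rwy><Special></Special></Rwy>")
with_space = ET.fromstring("<Rwy><Special> </Special></Rwy>")

print(repr(element_text(self_closed, "Special")))  # no text, so the default
print(repr(element_text(empty_pair, "Special")))   # an empty pair also has no text
print(repr(element_text(with_space, "Special")))   # the single space is preserved

# Built-in alternative: findtext() gives '' for an element without text,
# and the `default` argument only when no element matches.
print(repr(self_closed.findtext("Special")))
print(repr(self_closed.findtext("Missing", default="n/a")))
```

If the value is only going to be concatenated into a string, `Rwy.find('Special').text or ''` is a shorter form of the same guard, provided the Special element is always present.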
