Python attribute parsing returns None for xml:id

Question

I am trying to extract some information from a TEI file with the following code:

import xml.etree.ElementTree as ET

tree = ET.parse(path)
root = tree.getroot()
body = root.find("{http://www.tei-c.org/ns/1.0}text/{http://www.tei-c.org/ns/1.0}body")  
for s in body.iter("{http://www.tei-c.org/ns/1.0}s"):
    for w in s.iter("{http://www.tei-c.org/ns/1.0}w"):
        wordpart = w.find("{http://www.tei-c.org/ns/1.0}seg")
        word = ''.join(wordpart.itertext())
        type = w.get('type')
        xml = w.get('xml:id') 
        print(type)             
        print(xml)

The output for type is correct: it prints, for example, "noun". For xml:id, however, I only get None. Here is an excerpt of the XML file I need to parse:

<w type="noun" xml:id="w.4940"><seg type="orth">sloterheighe</seg>...

Answer 1

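To get the value of the xml:id attribute, you need to specify the namespace URI. The xml prefix is always bound to http://www.w3.org/XML/1998/namespace, and ElementTree stores namespaced attributes under keys of the form {uri}name, so the lookup looks like this: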

xml = w.attrib['{http://www.w3.org/XML/1998/namespace}id']

or

xml = w.get('{http://www.w3.org/XML/1998/namespace}id')

Also, note that type is a built-in name in Python, so avoid using it as a variable name.
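As a minimal, self-contained sketch of the lookup described in this answer: the sample document below is hypothetical (only the <w> element comes from the question, the surrounding TEI wrapper is assumed), and word_ids is an illustrative helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# TEI namespace as used in the question; the XML namespace URI is fixed
# by the XML Namespaces specification for the reserved "xml" prefix.
TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical sample: only the <w> element is taken from the question.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><s>'
    '<w type="noun" xml:id="w.4940"><seg type="orth">sloterheighe</seg></w>'
    '</s></body></text></TEI>'
)


def word_ids(root):
    """Return (word_type, xml_id, text) for every <w> element under root."""
    results = []
    for w in root.iter(f"{{{TEI_NS}}}w"):
        seg = w.find(f"{{{TEI_NS}}}seg")
        text = "".join(seg.itertext()) if seg is not None else ""
        # The attribute key is "{namespace-uri}id", not the prefixed "xml:id".
        results.append((w.get("type"), w.get(f"{{{XML_NS}}}id"), text))
    return results


root = ET.fromstring(SAMPLE)
w = root.find(f".//{{{TEI_NS}}}w")
print(w.attrib)           # shows the expanded "{uri}id" key stored by the parser
print(w.get("xml:id"))    # the prefixed form is not a key, so this is None
print(word_ids(root))
```

Printing w.attrib is a quick way to see which keys the parser actually stored whenever an attribute lookup unexpectedly returns None.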

python

attributes

elementtree

xml-parsing

python-3.x