XML: Print previous Element of the findall() function

I'm working with an XML corpus that looks like this:

<corpus>
  <dialogue speaker="A">
    <sentence tag1="attribute1" tag2="attribute2"> Hello </sentence>
  </dialogue>
  <dialogue speaker="B">
    <sentence tag1="different_attribute1" tag2= "different_attribute2"> How are you </sentence>
  </dialogue>
</corpus>

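I'm using root.findall() to search for all instances of "different_attribute2", but I want to print not only the parent element that contains the attribute, but also the element before it: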

{'speaker': 'A'}
Hello
{'speaker':'B'}
How are you

I'm fairly new to coding, so I tried a bunch of for loops and if statements without success. I started with:

for words in root.findall('.//sentence[@tag2="different_attribute2"]'):
    for speaker in root.findall('.//sentence[@tag2="different_attribute2"]...'):
        print(speaker.attrib)
        print(words.text)

But I have no idea how to retrieve speaker A. Can anyone help me?

Using lxml, you can find all the elements with a single XPath expression:

>>> from lxml import etree
>>> tree = etree.parse('/home/lmc/tmp/test.xml')
>>> for e in tree.xpath('//sentence[@tag2="different_attribute2"]/parent::dialogue/@speaker | //sentence[@tag2="different_attribute2"]/text() | //dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/sentence/text() | //dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/@speaker'):
...      print(e)
... 
A
 Hello 
B
 How are you 

XPath details

Find speaker B:
//sentence[@tag2="different_attribute2"]/parent::dialogue/@speaker

Find B's sentence:
//sentence[@tag2="different_attribute2"]/text()

Given B, find A's sentence:
//dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/sentence/text()

Given B, find speaker A:
//dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/@speaker
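
If lxml is not available, the same result can probably be reached with the standard library's xml.etree.ElementTree by walking the dialogue elements in order and remembering the one just seen. The following is only a sketch under some assumptions: the XML is inlined as a string rather than read from a file, and match_with_previous is a hypothetical helper name, not part of any library. Note one difference from the XPath above: following-sibling:: selects every dialogue that comes before a match, whereas this sketch reports only the immediately preceding one.

```python
import xml.etree.ElementTree as ET

# Sample corpus from the question, inlined here for illustration.
XML = """<corpus>
  <dialogue speaker="A">
    <sentence tag1="attribute1" tag2="attribute2"> Hello </sentence>
  </dialogue>
  <dialogue speaker="B">
    <sentence tag1="different_attribute1" tag2="different_attribute2"> How are you </sentence>
  </dialogue>
</corpus>"""


def match_with_previous(root, tag2_value):
    """Collect (speaker attributes, sentence text) pairs for every dialogue
    holding a sentence whose tag2 equals tag2_value, preceded by the pairs
    of the dialogue that comes directly before it (if there is one)."""
    results = []
    previous = None  # the dialogue visited in the previous iteration
    for dialogue in root.findall("dialogue"):
        hits = dialogue.findall(f'sentence[@tag2="{tag2_value}"]')
        if hits:
            if previous is not None:
                for sentence in previous.findall("sentence"):
                    results.append((previous.attrib, (sentence.text or "").strip()))
            for sentence in hits:
                results.append((dialogue.attrib, (sentence.text or "").strip()))
        previous = dialogue
    return results


root = ET.fromstring(XML)
for attrib, text in match_with_previous(root, "different_attribute2"):
    print(attrib)
    print(text)
```

For the sample corpus this prints the speaker A attributes and "Hello", followed by the speaker B attributes and "How are you". To read from a file instead, ET.parse(path).getroot() would replace ET.fromstring(XML).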