How do I look for a tag in an XML file and find its grandparent?

I have an XML file that I want to parse and search for certain keywords. The XML file looks like this:

...
...
<OBJECT data="file://localhost//var/tmp/autoclean/derive/TheGeometry//Descartes-TheGeometry.djvu" height="3143" type="image/x.djvu" usemap="Descartes-TheGeometry_0269.djvu" width="2077">
    <PARAM name="PAGE" value="Descartes-TheGeometry_0269.djvu"/>
    <PARAM name="DPI" value="400"/>
    <HIDDENTEXT>
        <PAGECOLUMN>
            <REGION>
                <PARAGRAPH>
                    <LINE>
                        <WORD coords="653,237,937,202,236">CATALOGUE</WORD>
                        <WORD coords="962,238,1022,205,237">OF</WORD>
                        <WORD coords="1045,240,1208,205,238">DOVER</WORD>
                        <WORD coords="1231,239,1389,205,238">BOOKS</WORD>
                    </LINE>
                    ...
                </PARAGRAPH>
                ...
                ...
    </HIDDENTEXT>
</OBJECT>
...
...

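Now I want to search for a keyword in the <WORD> tags and read the value attribute of the first <PARAM> tag that belongs to the enclosing <OBJECT>. For example, if I search for the keyword BOOKS, I want to get the value from this tag: <PARAM name="PAGE" value="Descartes-TheGeometry_0269.djvu"/>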

Try something like this:

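import lxml.html as lh

# Paste the XML document here.
books = """[your XML]"""
doc = lh.fromstring(books)
# The HTML parser lowercases tag names, but text content keeps its case,
# so the keyword must be written exactly as it appears in the file ("BOOKS").
vals = doc.xpath('//object/param[following-sibling::hiddentext//word="BOOKS"][1]/@value')
for val in vals:
    print(val)

Output:

Descartes-TheGeometry_0269.djvu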
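
If the file is well-formed XML, the same lookup can also be sketched with the standard library's xml.etree.ElementTree, which keeps tag names case-sensitive. This is only a minimal sketch under assumptions: the trimmed sample document, the <DjVuXML> root wrapper, and the helper name pages_containing are made up for illustration and are not part of the original file. Instead of walking up from each <WORD>, it walks down from each <OBJECT> and checks whether any descendant <WORD> matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample; the <DjVuXML> root wrapper is an assumption.
SAMPLE = """
<DjVuXML>
  <OBJECT type="image/x.djvu">
    <PARAM name="PAGE" value="Descartes-TheGeometry_0269.djvu"/>
    <PARAM name="DPI" value="400"/>
    <HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH><LINE>
      <WORD>CATALOGUE</WORD>
      <WORD>OF</WORD>
      <WORD>DOVER</WORD>
      <WORD>BOOKS</WORD>
    </LINE></PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT>
  </OBJECT>
</DjVuXML>
"""


def pages_containing(xml_text, keyword):
    """Return the first PARAM value of every OBJECT that contains the keyword."""
    root = ET.fromstring(xml_text)
    values = []
    for obj in root.iter("OBJECT"):
        # Exact, case-sensitive match against the text of each descendant WORD.
        if any((w.text or "").strip() == keyword for w in obj.iter("WORD")):
            first_param = obj.find("PARAM")  # first direct PARAM child
            if first_param is not None:
                values.append(first_param.get("value"))
    return values


print(pages_containing(SAMPLE, "BOOKS"))
```

Going from <OBJECT> downward avoids the need for parent pointers, which ElementTree elements do not carry.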