XML parsing with ElementTree

I would like to know whether I can use the existing text in one tag to get the text of the next tag in the XML tree, given the following XML file:

...
<link>
   <description>document</description>
   <url>https://www.../doc/file.pdf</url>
</link>
<link>
   <description>document1</description>
   <url>https://www.../doc1/file1.pdf</url>
</link>
<link>
   <description>document2</description>
   <url>https://www.../doc2/file2.pdf</url>
</link>
...                     
    
    for item in tree.findall('.//subChapter//document//link//'):
        if item.tag == 'description':
            if item.text == 'document':
                # THEN GET THE TEXT ON THE NEXT TAG <url>...</url>
                # e.g. https://www.../doc/file.pdf
                print(NEXT TAG)
            elif item.text == 'document1':
                # THEN GET THE TEXT ON THE NEXT TAG <url>...</url>
                # e.g. https://www.../doc1/file1.pdf
                print(NEXT TAG)
            elif item.text == 'document2':
                # THEN GET THE TEXT ON THE NEXT TAG <url>...</url>
                # e.g. https://www.../doc2/file2.pdf
                print(NEXT TAG)

Thanks!

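With the lxml parser, this can be done by calling the getnext() method on the element. With ElementTree, which has no such method, it can be done by changing the loop so that it iterates over the link elements and looks at their children: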

# iterate over link elements
for link in tree.findall('.//subChapter//document/link'):
    # keep reference to link child elements
    children = list(link)
    for item in children:
        if item.tag == 'description':
            if item.text == 'document':
                # access the needed link child (the <url> after <description>) by index
                next_tag = children[1]
                print(next_tag.text)
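
For reference, here is a minimal self-contained sketch of the same idea that can be run as-is. The `<root>`, `<subChapter>` and `<document>` wrappers are assumptions, since the question elides everything around the `<link>` elements, and the helper names `url_for` and `next_sibling` are hypothetical. It shows two options: looking up `<url>` by name with `findtext()`, which does not depend on child order, and a generic "next sibling" helper that mimics what lxml's `getnext()` does.

```python
import xml.etree.ElementTree as ET

# Assumed wrapper structure; only the <link> elements come from the question.
XML = """
<root>
  <subChapter>
    <document>
      <link>
        <description>document</description>
        <url>https://www.../doc/file.pdf</url>
      </link>
      <link>
        <description>document1</description>
        <url>https://www.../doc1/file1.pdf</url>
      </link>
      <link>
        <description>document2</description>
        <url>https://www.../doc2/file2.pdf</url>
      </link>
    </document>
  </subChapter>
</root>
"""

root = ET.fromstring(XML)


def url_for(root, wanted):
    """Return the <url> text of the first <link> whose <description> is `wanted`."""
    for link in root.iterfind('.//subChapter//document/link'):
        if link.findtext('description') == wanted:
            # Look the child up by tag name instead of by position.
            return link.findtext('url')
    return None


def next_sibling(parent, child):
    """Return the element following `child` under `parent`, or None."""
    siblings = list(parent)
    idx = siblings.index(child)  # Elements compare by identity
    return siblings[idx + 1] if idx + 1 < len(siblings) else None


for name in ('document', 'document1', 'document2'):
    print(name, url_for(root, name))
```

ElementTree elements do not keep a reference to their parent, which is why `next_sibling` needs the parent passed in explicitly. If every `<link>` always contains exactly one `<description>` and one `<url>`, the `findtext()` version is probably the simpler choice.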