How to retrieve all values of the same tag from an XML response using Python?

I am using DBpedia's Lookup API, which returns a response in XML format like this:

<ArrayOfResults>
    <Result>
        <Label>China</Label>
        <URI>http://dbpedia.org/resource/China</URI>
        <Description>China .... administrative regions of Hong Kong and Macau.</Description>
        <Classes>
            <Class>
                <Label>Place</Label>
                <URI>http://dbpedia.org/ontology/Place</URI>
            </Class>
            <Class>
                <Label>Country</Label>
                <URI>http://dbpedia.org/ontology/Country</URI>
            </Class>
        </Classes>
        <Categories>
            <Category>
                <URI>http://dbpedia.org/resource/Category:Member_states_of_the_United_Nations</URI>
            </Category>
            <Category>
                <URI>http://dbpedia.org/resource/Category:Republics</URI>
            </Category>
        </Categories>
        <Refcount>12789</Refcount>
    </Result>
    <Result>
        <Label>Theatre of China</Label>
        <URI>http://dbpedia.org/resource/Theatre_of_China</URI>
        <Description>Theatre of China ... the 20th century.</Description>
        <Classes/>
        <Categories>
            <Category>
                <URI>http://dbpedia.org/resource/Category:Asian_drama</URI>
            </Category>
            <Category>
                <URI>http://dbpedia.org/resource/Category:Chinese_performing_arts</URI>
            </Category>
        </Categories>
        <Refcount>23</Refcount>
    </Result>
</ArrayOfResults>

I have shortened it. The full response can be found in this link.

Now I need to retrieve all the values under the <Label> and <URI> tags.

This is what I have done so far:

import requests
import xml.etree.ElementTree as ET

response = requests.get('https://lookup.dbpedia.org/api/search?query=China')
response_body = response.content

response_xml = ET.fromstring(response_body)

root = ET.fromstring(response_body)
for child in root:
    print(child.tag)
    for grandchild in child:
        print(f"\t {grandchild.tag}")
        label = grandchild.find('Label')
        uri = grandchild.find('URI')
        print(f"\t required label = {label}")
        print(f"\t required uri = {uri}")

But the values of label and uri are None in every case. How can I fix this so that I get all the values under the <Result><Label> tags (like China, Theatre of China, etc.) and under the <URI> tags?

You are actually nesting one level too deep. You need to call find on child (which is a <Result> element), not on grandchild:

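for child in root:
    # Each child of the root is a <Result>; <Label> and <URI> are its direct children.
    label = child.find('Label').text
    uri = child.find('URI').text
    print(label, uri)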
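
To try this without calling the API, here is a minimal sketch that runs the same idea against a shortened copy of the XML from the question. The SAMPLE string and the result_pairs helper are assumptions made up for illustration; findtext is used instead of find(...).text so that a missing tag yields None rather than raising an AttributeError.

```python
import xml.etree.ElementTree as ET

# Shortened sample modeled on the response in the question (not live API data).
SAMPLE = """<ArrayOfResults>
    <Result>
        <Label>China</Label>
        <URI>http://dbpedia.org/resource/China</URI>
        <Classes>
            <Class>
                <Label>Place</Label>
                <URI>http://dbpedia.org/ontology/Place</URI>
            </Class>
        </Classes>
    </Result>
    <Result>
        <Label>Theatre of China</Label>
        <URI>http://dbpedia.org/resource/Theatre_of_China</URI>
        <Classes/>
    </Result>
</ArrayOfResults>"""


def result_pairs(xml_text):
    """Return a (label, uri) tuple for each top-level <Result> (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    pairs = []
    for result in root.findall('Result'):
        # A plain tag name only matches direct children, so the nested
        # <Class> labels are skipped. findtext returns None if the tag is absent.
        pairs.append((result.findtext('Label'), result.findtext('URI')))
    return pairs


pairs = result_pairs(SAMPLE)
for label, uri in pairs:
    print(f"{label}: {uri}")
```

Keeping the label and URI together in one tuple preserves which URI belongs to which label, which two separate flat lists would not guarantee.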

Hi, I don't know whether you need to know which URLs belong to which labels, but here is a very simple way to get all the URLs:

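import requests
from bs4 import BeautifulSoup  # the 'xml' parser used below also needs lxml installed

url = 'https://lookup.dbpedia.org/api/search?query=China'

# Note: .find('Result') limits the search to the first <Result> element
soup = BeautifulSoup(requests.get(url).text, 'xml').find('Result')

# find_all searches recursively, so nested <Class>/<Category> entries are included
labels = [label.text for label in soup.find_all('Label')]

uris = [uri.text for uri in soup.find_all('URI')]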
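
Because a recursive search also picks up the <Label> elements nested inside <Class>, it may help to compare a search at any depth with one restricted to the direct children of <Result>. The sketch below does this with the standard library's ElementTree rather than BeautifulSoup (a swapped-in technique, not the approach above), and the SAMPLE string is an assumed, shortened copy of the response from the question.

```python
import xml.etree.ElementTree as ET

# Shortened sample modeled on the response in the question (not live API data).
SAMPLE = """<ArrayOfResults>
    <Result>
        <Label>China</Label>
        <Classes>
            <Class>
                <Label>Place</Label>
            </Class>
        </Classes>
    </Result>
    <Result>
        <Label>Theatre of China</Label>
        <Classes/>
    </Result>
</ArrayOfResults>"""

root = ET.fromstring(SAMPLE)

# './/Label' matches <Label> at any depth, including those inside <Class>.
all_labels = [el.text for el in root.findall('.//Label')]

# './Result/Label' matches only <Label> elements directly under a <Result>.
result_labels = [el.text for el in root.findall('./Result/Label')]

print(all_labels)     # ['China', 'Place', 'Theatre of China']
print(result_labels)  # ['China', 'Theatre of China']
```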