How to get an href attribute value in XML content (Atom feed)?
I saved the content of a GET request (Atom feed / XML content) as content = response.text, and it looks like this:
<feed xmlns="http://www.w3.org/2005/Atom">
<title type="text">title-a</title>
<subtitle type="text">content: application/abc</subtitle>
<updated>2021-08-05T16:29:20.202Z</updated>
<id>tag:tag-a,2021-08:27445852</id>
<generator uri="uri-a" version="v-5.1.0.3846329218047">abc</generator>
<author>
<name>name-a</name>
<email>email-a</email>
</author>
<link href="url-a" rel="self"/>
<link href="url-b" rel="next"/>
<link href="url-c" rel="previous"/>
</feed>
How can I get the value "url-b" of the href attribute of the link with rel="next"?
I tried the ElementTree module, for example:
import requests
from xml.etree import ElementTree
response = requests.get("myurl", headers={"Authorization": f"Bearer {my_access_token}"})
content = response.text
tree = ElementTree.fromstring(content)
tree.find('.//link[@rel="next"]')
# or
tree.find('./link').attrib['href']
But this did not work.
Any help is much appreciated, thanks in advance.
If there is a simpler solution (maybe feedparser), I would welcome that too.
You can use this XPath 1.0 expression:
./*[local-name()="feed"]/*[local-name()="link" and @rel="next"]/@href
This should return "url-b".
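Note that the expression above relies on local-name() and on selecting an attribute with /@href, neither of which is supported by the limited XPath subset in xml.etree.ElementTree, so it needs a full XPath 1.0 processor. As a rough standard-library equivalent, the sketch below uses the {*} namespace wildcard, which ElementTree accepts from Python 3.8 onward. The shortened feed and the names ATOM_XML, next_link and next_href are only assumptions for illustration.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the feed from the question (illustrative sample only).
ATOM_XML = '''<feed xmlns="http://www.w3.org/2005/Atom">
<link href="url-a" rel="self"/>
<link href="url-b" rel="next"/>
<link href="url-c" rel="previous"/>
</feed>'''

root = ET.fromstring(ATOM_XML)

# "{*}link" matches a <link> child in any namespace (requires Python 3.8+),
# which plays the role of local-name()="link" in the XPath expression.
next_link = root.find('{*}link[@rel="next"]')

# ElementTree cannot select attributes in the path, so read href afterwards.
next_href = next_link.get("href") if next_link is not None else None
print(next_href)
```

The wildcard is convenient when the namespace URI is unknown, but it will also match link elements from other namespaces, so the explicit Atom namespace is the stricter choice.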
How can I get the value "url-b" of the href attribute with rel="next"?
See below:
from xml.etree import ElementTree as ET
xml = '''<feed xmlns="http://www.w3.org/2005/Atom">
<title type="text">title-a</title>
<subtitle type="text">content: application/abc</subtitle>
<updated>2021-08-05T16:29:20.202Z</updated>
<id>tag:tag-a,2021-08:27445852</id>
<generator uri="uri-a" version="v-5.1.0.3846329218047">abc</generator>
<author>
<name>name-a</name>
<email>email-a</email>
</author>
<link href="url-a" rel="self"/>
<link href="url-b" rel="next"/>
<link href="url-c" rel="previous"/>
</feed>'''
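root = ET.fromstring(xml)
# The feed declares a default namespace, so the tag must be qualified
# with the Atom namespace URI in braces for the path to match.
links = root.findall('.//{http://www.w3.org/2005/Atom}link[@rel="next"]')
for link in links:
    print(link.attrib["href"])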
Output:
url-b
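The same lookup can also be written with a prefix-to-URI mapping passed to find(), which keeps the path shorter than spelling out the namespace in braces. The following is a minimal sketch under that assumption; the helper name find_link_href, the atom prefix and the shortened sample feed are hypothetical, not part of the original answer. It returns None when no matching link exists, which may be handy on the last page of a feed.

```python
from xml.etree import ElementTree as ET

# Prefix chosen for this sketch; only the URI has to match the feed.
NS = {"atom": "http://www.w3.org/2005/Atom"}


def find_link_href(content, rel):
    """Return the href of the first <link> with the given rel, or None."""
    root = ET.fromstring(content)
    # "atom:link" is resolved through NS to the namespaced Atom tag.
    link = root.find(f'atom:link[@rel="{rel}"]', NS)
    return link.get("href") if link is not None else None


# Shortened sample feed (illustrative only).
sample = '''<feed xmlns="http://www.w3.org/2005/Atom">
<link href="url-a" rel="self"/>
<link href="url-b" rel="next"/>
</feed>'''

print(find_link_href(sample, "next"))
print(find_link_href(sample, "previous"))
```

With the code from the question, this could be called as find_link_href(content, "next") after content = response.text.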