How to bypass Element Tree mismatched tag error?

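For a bit of context: I'm currently using ElementTree to scrape several crypto news feeds for their latest article headlines. The code below works for most sites, but for some feeds I get an error like this: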

xml.etree.ElementTree.ParseError: mismatched tag: line 134, column 2

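I'm guessing this is because the site's XML is malformed. I'm looking for a way to bypass this error and pull the latest headline anyway. Any help would be appreciated :) The code is as follows: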

import xml.etree.ElementTree as ET
import requests

r = requests.get('https://cointelegraph.com/feed')
root = ET.fromstring(r.text)

headline = root.find('channel/item/title').text


print(headline)

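You are probably getting a Cloudflare CAPTCHA page instead of the feed. That page is HTML, not well-formed XML, which would explain the mismatched tag error. Try specifying a User-Agent in the HTTP headers:
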
import xml.etree.ElementTree as ET
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0"
}
r = requests.get("https://cointelegraph.com/feed", headers=headers)
root = ET.fromstring(r.text)
headline = root.find("channel/item/title").text
print(headline)

Prints:

Why is XRP seeing a monster rally when Ripple is worth just B on the secondary market?
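
If you want to confirm what the server actually sent back before blaming the feed, a small wrapper around the parse step can help. This is only a sketch: `first_headline`, `sample_feed` and `sample_html` are names made up for illustration, and the two sample strings are stand-ins, not real responses from any site. It catches the ParseError and shows the start of the body, so an HTML challenge page is easy to spot:

```python
import xml.etree.ElementTree as ET


def first_headline(body: str) -> str:
    """Return the text of the first channel/item/title in an RSS document.

    Raises ValueError with a short preview of the body when the body is not
    well-formed XML or the expected element is missing.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        # Show the start of the response so an HTML page is recognisable.
        preview = body.lstrip()[:200]
        raise ValueError(
            f"response is not well-formed XML ({exc}); it starts with: {preview!r}"
        ) from exc

    title = root.find("channel/item/title")
    if title is None or title.text is None:
        raise ValueError("no channel/item/title element found in the feed")
    return title.text


# Hypothetical inputs for illustration only.
sample_feed = "<rss><channel><item><title>Hello</title></item></channel></rss>"
# An unclosed <meta> tag is fine in HTML but makes the XML parser fail.
sample_html = (
    "<html><head><meta charset='utf-8'></head>"
    "<body>Checking your browser</body></html>"
)

print(first_headline(sample_feed))
try:
    first_headline(sample_html)
except ValueError as err:
    print(err)
```

With the code above you would call `first_headline(r.text)`. If the preview in the error message starts with an HTML doctype or `<html>`, the problem is the response you were served rather than the feed's XML. Printing `r.status_code` alongside it is a quick extra check.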