Python lxml.html xpath doesn't return any element

Question

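I use requests together with lxml to fetch some content from my website, but sometimes it doesn't return the elements it should. I just tried it on a Wikipedia page, and about 20% of the time it doesn't work. Here is the code that reproduces the "bug":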

import requests
import lxml.html

url = "https://en.wikipedia.org/w/index.php?title=Web_crawler&action=edit&section=2"
resp = requests.get(url)
print(resp.text[:500])  # the first 500 characters include the <title> tag
tree = lxml.html.fromstring(resp.text)
title = tree.xpath('//title')  # returns an empty list []
print(title)

As you can see here, when I print the HTML returned by the requests library, I get the following:

<!DOCTYPE html>
<html class="client-nojs" lang="en" dir="ltr">
<head>
<meta charset="UTF-8"/>
<title>Editing Web crawler (section) - Wikipedia</title>
<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!0,"wgSeparatorTransformTable":["",""],"
...

You can clearly see the <title> tag, but lxml doesn't seem to pick it up with the XPath //title: when I print title, I get an empty list []. The same code works for some other URLs, such as https://en.wikipedia.org/wiki/Web_crawler. Any ideas?
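One way to narrow this down is to compare the raw text with the tree lxml actually built from it. The following is only a diagnostic sketch: describe_parse and the inline sample string are hypothetical names made up for illustration (the sample mimics the start of the page shown above), and in practice you would pass resp.text instead.

```python
import lxml.html


def describe_parse(html_text):
    """Hypothetical helper: summarize what lxml built from html_text.

    Returns the root tag, the number of <title> elements found by XPath,
    and the beginning of the re-serialized tree for visual comparison.
    """
    tree = lxml.html.fromstring(html_text)
    titles = tree.xpath('//title')
    serialized = lxml.html.tostring(tree, encoding='unicode')
    return tree.tag, len(titles), serialized[:300]


# Stand-in for resp.text, shortened from the output in the question.
sample = (
    '<!DOCTYPE html>\n'
    '<html class="client-nojs" lang="en" dir="ltr">\n'
    '<head>\n'
    '<meta charset="UTF-8"/>\n'
    '<title>Editing Web crawler (section) - Wikipedia</title>\n'
    '</head>\n'
    '<body><p>content</p></body>\n'
    '</html>\n'
)

root_tag, title_count, preview = describe_parse(sample)
print(root_tag, title_count)
print(preview)
```

If the title count is 0 while the raw text clearly contains a <title> tag, the problem lies in the parsing step rather than in the download, which points at the lxml installation itself.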

Answer 1

Thanks to @jackFeeting's comment, I updated lxml and my code now works fine. Running pip3 install --upgrade lxml took it from version 4.4.1 to 4.6.2.
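To check which lxml version is installed before and after upgrading, something like the following may help. It is a minimal sketch: lxml_is_at_least is a hypothetical helper name, and the (4, 6, 2) threshold is simply the version reported as working in this answer, not a documented minimum.

```python
from lxml import etree


def lxml_is_at_least(minimum=(4, 6, 2)):
    """Hypothetical helper: compare the installed lxml version to a tuple.

    etree.LXML_VERSION is a tuple of integers such as (4, 6, 2, 0).
    """
    return etree.LXML_VERSION[:3] >= tuple(minimum)


print("lxml version:", etree.LXML_VERSION)
print("libxml2 version (linked):", etree.LIBXML_VERSION)
print("at least 4.6.2:", lxml_is_at_least())
```

Printing the libxml2 version as well can be useful, since lxml delegates the actual HTML parsing to that library.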

python

xpath

lxml

python-requests