xpath on authenticated page in Python

Question

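I am using the code below to extract content from a page. Now I would like to use it on a page that sits behind authentication. Is there a way to do this from within Python?

Below is the sample code I am using.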

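from lxml import html
import requests

# Fetch the page and parse the returned HTML into an element tree
page = requests.get('http://www.thesiteurl.com/')
tree = html.fromstring(page.text)
# Select the src attribute of the logo image
logo = tree.xpath('//*[@id="wraper"]/div[3]/header/div[1]/div[2]/div[1]/a/img/@src')
print(logo)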

Answer 1

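I assume you mean that you want to use requests to get an authenticated page (since once you have the HTML you can do whatever you like with it)?

If so, it depends on how the page is authenticated. The requests documentation discusses the various ways of doing this here: link. The simplest scheme (username and password, i.e. HTTP Basic authentication) is supported with fairly simple syntax: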

>>> requests.get('https://api.github.com/user', auth=('user', 'pass'))
<Response [200]>
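To tie this back to the question, here is a minimal sketch that combines an authenticated fetch with the lxml parsing step. It assumes the site uses either HTTP Basic authentication or a plain HTML login form. The function names, the login URL and the form field names are hypothetical placeholders, not anything taken from the real site, and a real login form may also need a CSRF token or other hidden fields.

```python
import requests
from lxml import html


def extract_with_xpath(page_bytes, xpath_expr):
    """Parse an HTML document and return whatever the XPath expression matches."""
    tree = html.fromstring(page_bytes)
    return tree.xpath(xpath_expr)


def fetch_with_basic_auth(url, username, password):
    """GET a page protected by HTTP Basic auth and return the body as bytes."""
    response = requests.get(url, auth=(username, password), timeout=10)
    # Raise on 401/403 so we never parse an error page by mistake
    response.raise_for_status()
    return response.content


def fetch_with_form_login(login_url, page_url, form_fields):
    """POST a login form, then GET the target page with the same session cookies.

    form_fields is a dict of the form's input names and values; the names
    depend entirely on the site and are an assumption here.
    """
    with requests.Session() as session:
        login = session.post(login_url, data=form_fields, timeout=10)
        login.raise_for_status()
        response = session.get(page_url, timeout=10)
        response.raise_for_status()
        return response.content


if __name__ == "__main__":
    # Placeholder URL and credentials; replace with real values
    body = fetch_with_basic_auth("http://www.thesiteurl.com/", "user", "pass")
    print(extract_with_xpath(body, "//header//img/@src"))
```

The parsing step is kept in its own function so that it does not care how the HTML was obtained: whichever authentication scheme the site uses, the XPath query stays the same once the response body is available.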

python

authentication

xpath

lxml

web-scraping