Python LXML: Getting Data from a Steam Bundle Page - "list index out of range" Error
I'm writing a Python program that takes the ID of a Steam bundle and returns its current price.
The program uses requests and lxml.
There are two XPath paths to the final price:
- /html/body/div[1]/div[7]/div[4]/div[1]/div[2]/div/div[2]/div[10]/div[3]/div
- //*[@id="game_area_purchase"]/div/div/div/div[1]/div/div/div[2]
Example bundle: https://store.steampowered.com/bundle/16140
Here is the code:
import requests
import lxml.html
# example URL for a Steam bundle
URL = "https://store.steampowered.com/bundle/16140"
html = requests.get(URL)
doc = lxml.html.fromstring(html.content)
# XPath to the price location
price = doc.xpath('/html/body/div[1]/div[7]/div[4]/div[1]/div[2]/div/div[2]/div[10]/div[3]/div/text()')
print(price)
The program returns this:
[]
or, when I index the result with [0], this:
Traceback (most recent call last):
File <path-to-program>, line 9, in <module>
price = doc.xpath('/html/body/div[1]/div[7]/div[4]/div[1]/div[2]/div/div[2]/div[10]/div[3]/div/text()')[0]
IndexError: list index out of range
Both variants fail.
What should I do to fix this?
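To get the page HTML you need, send the request with a birthtime cookie, which "tells" the server that you are old enough to view pages with sexual/nudity content. Without that cookie the server presumably returns an age-check page instead of the bundle page, so the XPath matches nothing and indexing the empty list with [0] raises IndexError: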
import requests
import lxml.html
URL = "https://store.steampowered.com/bundle/16140"
session = requests.Session()
r1 = session.get(URL)
r1.cookies['birthtime'] = '439423201'  # birth date in seconds since the Unix epoch (January 1, 1970, UTC)
r2 = session.get(URL, cookies=r1.cookies)
doc = lxml.html.fromstring(r2.content)
print(doc.xpath('//div[contains(@class, "discount_final_price")]/text()')[0])
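The cookie value 439423201 used above corresponds to 4 December 1983 (UTC). As a minimal sketch, the two helpers below show one way to build such a value from a calendar date and to read the first XPath match without risking an IndexError. The names birthtime_cookie and first_or_default are hypothetical and not part of requests or lxml, and the sketch assumes the server only checks that the date is far enough in the past.

```python
from datetime import datetime, timezone


def birthtime_cookie(year, month, day):
    """Return a birth date as a string of seconds since the Unix epoch (UTC).

    Hypothetical helper: the string is in the same format as the
    'birthtime' cookie value used in the answer above.
    """
    born = datetime(year, month, day, tzinfo=timezone.utc)
    return str(int(born.timestamp()))


def first_or_default(matches, default=None):
    """Return the first XPath text match, stripped, or `default` if there is none.

    Hypothetical helper: it avoids `IndexError: list index out of range`
    when the XPath expression matches nothing.
    """
    return matches[0].strip() if matches else default
```

With the session and doc objects from the code above, these could be used as r1.cookies['birthtime'] = birthtime_cookie(1983, 12, 4) before the second request, and price = first_or_default(doc.xpath('//div[contains(@class, "discount_final_price")]/text()')) afterwards. A price of None then signals that the expected element was not in the returned page, which is easier to diagnose than a traceback.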