How to fix a 302 redirect in Scrapy?
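I'm trying to scrape https://howlongtobeat.com, but I keep getting a 302 redirect. Using the browser's network monitor, I found that the site loads its search results via AJAX.
My code: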
import scrapy


class HltbSpider(scrapy.Spider):
    name = 'hltb'

    def start_requests(self):
        # Only page 1 is requested here; widen the range to fetch more pages.
        for i in range(1, 2):
            url = f'https://howlongtobeat.com/search_results?page={i}'
            payload = "queryString=&t=games&sorthead=popular&sortd=0&plat=&length_type=main&length_min=&length_max=&v=&f=&g=&detail=&randomize=0"
            headers = {
                "content-type": "application/x-www-form-urlencoded",
                "user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Mobile Safari/537.36"
            }
            # dont_redirect + handle_httpstatus_list pass the 302 response to parse()
            yield scrapy.Request(
                url,
                meta={'dont_redirect': True, 'handle_httpstatus_list': [302]},
                method='POST',
                body=payload,
                headers=headers,
                callback=self.parse,
            )

    def parse(self, response):
        cards = response.css('div[class="search_list_details"]')
        for card in cards:
            game_name = card.css('a[class=text_white]::attr(title)').get()
            game_dict = {"Game_name": game_name}
            yield game_dict
This used to work and then suddenly stopped; now I keep getting a 302 redirect. What could be the problem?
Try setting the Referer in your request headers: "referer": "https://howlongtobeat.com/"
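As a minimal sketch of that suggestion, the request arguments from the question could be assembled with the extra header as shown here. The helper name `build_search_request` is hypothetical and not part of Scrapy; the URL, payload, and user agent are copied from the question, and whether the Referer alone is enough to stop the 302 depends on how the site currently behaves.

```python
# Hypothetical helper (not a Scrapy API): collects the keyword arguments
# for the POST request from the question, plus the suggested Referer header.

SEARCH_URL = "https://howlongtobeat.com/search_results?page={page}"

# Form body copied unchanged from the question.
PAYLOAD = (
    "queryString=&t=games&sorthead=popular&sortd=0&plat=&length_type=main"
    "&length_min=&length_max=&v=&f=&g=&detail=&randomize=0"
)


def build_search_request(page):
    """Return url/method/body/headers for one page of search results."""
    headers = {
        "content-type": "application/x-www-form-urlencoded",
        "user-agent": (
            "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/101.0.4951.67 Mobile Safari/537.36"
        ),
        # The header suggested in the answer.
        "referer": "https://howlongtobeat.com/",
    }
    return {
        "url": SEARCH_URL.format(page=page),
        "method": "POST",
        "body": PAYLOAD,
        "headers": headers,
    }


if __name__ == "__main__":
    request_kwargs = build_search_request(1)
    print(request_kwargs["url"])
    print(request_kwargs["headers"]["referer"])
```

Inside `start_requests`, the result could then be passed along as `scrapy.Request(**build_search_request(i), callback=self.parse)`, keeping or dropping the `meta` redirect settings from the question as needed.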