How to fix a 302 redirect in Scrapy?

Question

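I am trying to scrape https://howlongtobeat.com, but I keep getting a 302 redirect. Using the browser's network monitor, I found that the site loads its search results via AJAX.

My code: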

import scrapy


class HltbSpider(scrapy.Spider):
    name = 'hltb'

    def start_requests(self):
        # The search results are requested with a POST, one page at a time.
        for i in range(1, 2):
            url = f'https://howlongtobeat.com/search_results?page={i}'
            payload = "queryString=&t=games&sorthead=popular&sortd=0&plat=&length_type=main&length_min=&length_max=&v=&f=&g=&detail=&randomize=0"
            headers = {
                "content-type": "application/x-www-form-urlencoded",
                "user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Mobile Safari/537.36"
            }

            # dont_redirect and handle_httpstatus_list hand the 302 response
            # to the callback instead of following the redirect.
            yield scrapy.Request(
                url,
                method='POST',
                body=payload,
                headers=headers,
                meta={'dont_redirect': True, 'handle_httpstatus_list': [302]},
                callback=self.parse,
            )

    def parse(self, response):
        cards = response.css('div[class="search_list_details"]')

        for card in cards:
            game_name = card.css('a[class=text_white]::attr(title)').get()
            yield {"Game_name": game_name}

It used to work and then suddenly stopped; now I keep getting a 302 redirect. What could be the problem?

Answer 1

Try setting the Referer in your request headers: "referer": "https://howlongtobeat.com/"
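As a minimal sketch of that suggestion, the helper below builds the arguments for the search request with the referer added to the headers from the question. The name `build_search_request_kwargs` is hypothetical and used only for illustration. Whether the site accepts the request with this header is an assumption, since its behaviour may have changed.

```python
BASE_URL = "https://howlongtobeat.com"

# Form body copied unchanged from the question.
PAYLOAD = (
    "queryString=&t=games&sorthead=popular&sortd=0&plat=&length_type=main"
    "&length_min=&length_max=&v=&f=&g=&detail=&randomize=0"
)


def build_search_request_kwargs(page):
    """Hypothetical helper: keyword arguments for scrapy.Request."""
    headers = {
        "content-type": "application/x-www-form-urlencoded",
        "user-agent": (
            "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/101.0.4951.67 Mobile Safari/537.36"
        ),
        # The only change relative to the question: send a referer
        # that points at the site itself.
        "referer": f"{BASE_URL}/",
    }
    return {
        "url": f"{BASE_URL}/search_results?page={page}",
        "method": "POST",
        "body": PAYLOAD,
        "headers": headers,
    }
```

Inside `start_requests`, this could be used as `yield scrapy.Request(callback=self.parse, **build_search_request_kwargs(i))`. The `meta` entry from the question is left out here for brevity and can be passed alongside in the same way.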

python

scrapy