twisted CRITICAL: Unhandled error in Deferred:
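I am using scrapy-splash to scrape this website, and the spider reports "[twisted] CRITICAL: Unhandled error in Deferred:".
I have tried everything I could find on Stack Overflow and other sites.
My spider code: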
class DarazspidySpider(scrapy.Spider):
    name = 'darazspidy'

    def start_requests(self):
        url = 'https://www.daraz.pk/smartphones/'
        SplashRequest(url=url, callback=self.parse,
                      endpoint='render.html', args={'wait': 0.5})

    def parse(self, response):
        for phone in response.xpath('//div[@class="c5TXIP"]'):
            yield {
                'Name',
                phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c16H9d", " " ))]//a').extract(),
                'price',
                phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c13VH6", " " ))]').extract(),
            }
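You are yielding a set, not a dictionary. Can you try yielding a dictionary instead?
Building the set will fail anyway, because a list is unhashable and cannot be added to a set.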
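As a minimal sketch of the difference, using plain Python with no Scrapy involved: braces around comma-separated values build a set, while `key: value` pairs build a dict. The `names` and `prices` lists here are hypothetical stand-ins for what `.extract()` returns.

```python
# Hypothetical stand-ins for the lists returned by .extract()
names = ["Phone A"]
prices = ["Rs. 10,000"]


def build_set():
    # Comma-separated values in braces form a set literal.
    # Set members must be hashable, and lists are not.
    return {'Name', names, 'price', prices}


def build_dict():
    # key: value pairs form a dict; the values may be lists.
    return {'Name': names, 'price': prices}


try:
    build_set()
    set_error = None
except TypeError as exc:
    set_error = str(exc)

item = build_dict()

print("set literal failed with:", set_error)
print("dict literal produced:", item)
```

The set version raises a `TypeError` as soon as the literal is evaluated, so the item never reaches Scrapy.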
Try something like this:
def parse(self, response):
    for phone in response.xpath('//div'):
        yield {
            'Name': phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c16H9d", " " ))]//a').extract(),
            'price': phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c13VH6", " " ))]').extract(),
        }
You may also need to yield the start request:
yield SplashRequest(url=url, callback=self.parse,
                    endpoint='render.html', args={'wait': 0.5})
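To illustrate why the missing `yield` matters, here is a small sketch in plain Python. `FakeRequest` is a hypothetical stand-in for `SplashRequest` so the snippet runs without Scrapy installed, and the two functions are module-level simplifications of the spider's `start_requests` method. The assumption is that Scrapy iterates over whatever `start_requests` returns, so a function that builds a request without yielding it hands back `None` instead of an iterable.

```python
class FakeRequest:
    """Hypothetical stand-in for SplashRequest, for illustration only."""

    def __init__(self, url):
        self.url = url


def start_requests_without_yield():
    url = 'https://www.daraz.pk/smartphones/'
    FakeRequest(url)  # the request is created and immediately discarded


def start_requests_with_yield():
    url = 'https://www.daraz.pk/smartphones/'
    yield FakeRequest(url)  # the function is now a generator of requests


missing = start_requests_without_yield()
requests = list(start_requests_with_yield())

print("without yield:", missing)
print("with yield:", [r.url for r in requests])
```

Without `yield` the caller gets nothing to iterate over, which is one plausible source of an unhandled error at startup.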