How to change scrapy CLOSESPIDER_ITEMCOUNT while parsing
I am new to scrapy. Is it possible to change CLOSESPIDER_ITEMCOUNT while the spider is running?
import scrapy

class TestSpider(scrapy.Spider):
    name = 'tester'
    custom_settings = {'CLOSESPIDER_ITEMCOUNT': 100}

    def start_requests(self):
        urls = ['https://google.com', 'https://amazon.com']
        for url in urls:
            yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        if response.xpath('//*[@id="content"]') or True:  # only for testing
            # set CLOSESPIDER_ITEMCOUNT to 300
            # rest of code
            pass
I want to be able to change that value from inside the if condition in the parse method.
You can access the crawler's settings object, unfreeze it, change the value, and then freeze the settings object again. Note that, since this is not covered by the documentation, it may have unintended effects.
import scrapy

class TestSpider(scrapy.Spider):
    name = 'tester'
    custom_settings = {'CLOSESPIDER_ITEMCOUNT': 100}

    def start_requests(self):
        urls = ['https://google.com', 'https://amazon.com']
        for url in urls:
            yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        if response.xpath('//*[@id="content"]') or True:  # only for testing
            # temporarily unfreeze the settings, update the value, then
            # freeze them again so the rest of the crawl sees the new limit
            self.crawler.settings.frozen = False
            self.crawler.settings.set("CLOSESPIDER_ITEMCOUNT", 300)
            self.crawler.settings.frozen = True
            # add the rest of the code
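To confirm the new value was actually stored, you can read it back through the documented settings API. A minimal sketch (the log message is only illustrative), placed right after the three lines above:

    new_limit = self.crawler.settings.getint("CLOSESPIDER_ITEMCOUNT")
    self.logger.info("CLOSESPIDER_ITEMCOUNT is now %d", new_limit)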
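Be aware that the CloseSpider extension may read CLOSESPIDER_ITEMCOUNT only once, when the crawl starts, in which case the runtime change above would not move the actual shutdown threshold. A documented alternative is to compare the item count against your own, freely mutable limit and raise scrapy.exceptions.CloseSpider when it is exceeded. A minimal sketch under that approach; the spider name, the item_limit attribute, and the yielded item fields are made up for illustration:

import scrapy
from scrapy.exceptions import CloseSpider

class CountingSpider(scrapy.Spider):
    # hypothetical name and threshold, for illustration only
    name = 'counting_tester'
    item_limit = 100  # plain attribute: can be reassigned from any callback

    def start_requests(self):
        urls = ['https://google.com', 'https://amazon.com']
        for url in urls:
            yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        if response.xpath('//*[@id="content"]'):
            # raise the limit once the marker element is found
            self.item_limit = 300

        for href in response.css('a::attr(href)').getall():
            # 'item_scraped_count' is a standard stat kept by Scrapy's
            # core stats extension; compare it against our own limit
            scraped = self.crawler.stats.get_value('item_scraped_count', 0)
            if scraped >= self.item_limit:
                raise CloseSpider('item_limit_reached')
            yield {'url': response.urljoin(href)}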