ReactorNotRestartable - Twisted and scrapy

Question

Before you link me to other answers related to this, please note that I have read them and am still a bit confused. Alright, here goes.

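I am building a web application in Django, and I am importing the latest scrapy library to crawl a website. I am not using celery (I know very little about it, but I have seen it mentioned in other threads on this topic).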

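One of our site's URLs, /crawl/, is meant to start the crawler running. It is the only URL on our site that needs scrapy. This is the function that is called when the URL is visited: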

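from django.shortcuts import render
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet import reactor
# ReviewSpider is the project's own spider class (its import is not shown in the question).

def crawl(request):
    configure_logging({'LOG_FORMAT': '%(levelname)s: %(message)s'})
    runner = CrawlerRunner()

    d = runner.crawl(ReviewSpider)
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # the script will block here until the crawling is finished

    return render(request, 'index.html')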

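You will notice this is an adaptation of the scrapy tutorial on their website. The first time this URL is visited after the server starts, everything works as expected. On the second and every later visit, a ReactorNotRestartable exception is thrown. I understand that this exception occurs when a reactor that has already been stopped is told to start again, which is not possible.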

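Looking at the sample code, I assumed the line "runner = CrawlerRunner()" would return a ~new~ reactor for use each time this URL is visited. But I suspect my understanding of Twisted reactors is not quite right.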

How would I go about getting and running a new reactor each time this URL is visited?

Thank you very much

Answer 1

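In general, you can't have a new reactor. There is one global one. This is evidently a mistake, and maybe it will be corrected in the future, but that is the current state of affairs.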

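You may be able to use Crochet to manage a single reactor running in a separate thread (for the lifetime of your whole process, not repeatedly started and stopped).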

Consider the example from the Crochet docs:

#!/usr/bin/python
"""
Do a DNS lookup using Twisted's APIs.
"""
from __future__ import print_function

# The Twisted code we'll be using:
from twisted.names import client

from crochet import setup, wait_for
setup()


# Crochet layer, wrapping Twisted's DNS library in a blocking call.
@wait_for(timeout=5.0)
def gethostbyname(name):
    """Lookup the IP of a given hostname.

    Unlike socket.gethostbyname() which can take an arbitrary amount of time
    to finish, this function will raise crochet.TimeoutError if more than 5
    seconds elapse without an answer being received.
    """
    d = client.lookupAddress(name)
    d.addCallback(lambda result: result[0][0].payload.dottedQuad())
    return d


if __name__ == '__main__':
    # Application code using the public API - notice it works in a normal
    # blocking manner, with no event loop visible:
    import sys
    name = sys.argv[1]
    ip = gethostbyname(name)
    print(name, "->", ip)

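This gives you a blocking gethostbyname function that is implemented using Twisted APIs. The implementation uses twisted.names.client, which relies only on being able to import the global reactor.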

Note that there are no reactor.run or reactor.stop calls, only the Crochet setup call.
