Geckodriver in WSL stuck in readyState of 'interactive'

I recently switched computers, and a website-scraping script that I am trying to port over no longer works. I am running:

Mozilla Firefox 93.0
geckodriver 0.30.0 (d372710b98a6 2021-09-16 10:29 +0300)
Python 3.8.10
Windows Subsystem for Linux

The geckodriver.log is as follows:

1635724994219   geckodriver INFO    Listening on 127.0.0.1:34993
1635724994224   mozrunner::runner   INFO    Running command: "/usr/bin/firefox" "--marionette" "--headless" "--remote-debugging-port" "45483" "-no-remote" "-profile" "/tmp/rust_mozprofileKcEU8P"
*** You are running in headless mode.
1635724994420   Marionette  INFO    Marionette enabled
[GFX1-]: RenderCompositorSWGL failed mapping default framebuffer, no dt
console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/rust_mozprofileKcEU8P/search.json.mozlz4", (void 0)))
DevTools listening on ws://localhost:45483/devtools/browser/ef680f3f-d655-4d3d-86be-7287f5731e16
1635724995327   Marionette  INFO    Listening on port 38833
JavaScript error: resource://services-settings/Attachments.jsm, line 391: TypeError: / is not a valid URL.
1635724995445   RemoteAgent WARN    TLS certificate errors will be ignored for this session
1635724995448   RemoteAgent INFO    Proxy settings initialised: {"proxyType":"manual","httpProxy":"127.0.0.1:46485","sslProxy":"127.0.0.1:46485"}
1635724996122   Marionette  WARN    Ignoring event 'pageshow' because document has an invalid readyState of 'interactive'.
1635725002780   Marionette  WARN    Ignoring event 'pageshow' because document has an invalid readyState of 'interactive'.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
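To see how long the page sat in the 'interactive' state, the structured lines of the log can be pulled apart and the timestamps compared. This is only a rough sketch: it assumes each structured line has the layout "epoch milliseconds, source, level, message", which matches the lines above but is not something I have checked against the geckodriver documentation. The names LOG_SAMPLE, LINE_RE and parse_log are made up for illustration.

```python
import re

# A few lines copied from the geckodriver.log above.
LOG_SAMPLE = """\
1635724994219   geckodriver INFO    Listening on 127.0.0.1:34993
*** You are running in headless mode.
1635724996122   Marionette  WARN    Ignoring event 'pageshow' because document has an invalid readyState of 'interactive'.
1635725002780   Marionette  WARN    Ignoring event 'pageshow' because document has an invalid readyState of 'interactive'.
[GFX1-]: Receive IPC close with reason=AbnormalShutdown
"""

# Assumed layout of a structured line: <epoch ms> <source> <LEVEL> <message>
LINE_RE = re.compile(r"^(\d{13})\s+(\S+)\s+([A-Z]+)\s+(.*)$")


def parse_log(text):
    """Return (timestamp_ms, source, level, message) for each structured line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # unstructured lines (e.g. "[GFX1-]: ...") are skipped
            ts, source, level, message = match.groups()
            entries.append((int(ts), source, level, message))
    return entries


entries = parse_log(LOG_SAMPLE)
warnings = [entry for entry in entries if entry[2] == "WARN"]
gap_ms = warnings[-1][0] - warnings[0][0]
print(f"{len(warnings)} warnings, {gap_ms} ms apart")
```

On the sample above, the two 'pageshow' warnings are 6658 ms apart, so the browser was waiting for several seconds before the channel shut down.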

Meanwhile, the program repeatedly throws the following error on the command line:

refresh_site() 482 https://www.xkcd.com/
Message: Reached error page: about:neterror?e=nssFailure2&u=https%3A//www.xkcd.com/&c=UTF-8&d=The%20connection%20to%20the%20server%20was%20reset%20while%20the%20page%20was%20loading.
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:181:5
UnknownError@chrome://remote/content/shared/webdriver/Errors.jsm:488:5
checkReadyState@chrome://remote/content/marionette/navigate.js:64:24
onNavigation@chrome://remote/content/marionette/navigate.js:312:39
emit@resource://gre/modules/EventEmitter.jsm:160:20
receiveMessage@chrome://remote/content/marionette/actors/MarionetteEventsParent.jsm:42:25
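The about:neterror URL in the message is percent-encoded, which makes it hard to read. A small sketch using only the standard library decodes its query parameters; ERROR_URL is copied from the message above, and decode_neterror is a hypothetical helper name. My reading of the parameters (e as an error code, u as the requested URL, d as the description) is an assumption based on their values.

```python
from urllib.parse import parse_qs

# The error-page URL copied from the message above.
ERROR_URL = (
    "about:neterror?e=nssFailure2&u=https%3A//www.xkcd.com/&c=UTF-8"
    "&d=The%20connection%20to%20the%20server%20was%20reset"
    "%20while%20the%20page%20was%20loading."
)


def decode_neterror(url):
    """Split off the query string and percent-decode each parameter."""
    _, _, query = url.partition("?")
    return {key: values[0] for key, values in parse_qs(query).items()}


details = decode_neterror(ERROR_URL)
for key, value in details.items():
    print(f"{key}: {value}")
```

The decoded description says the connection was reset while the page was loading, and the nssFailure2 code hints that the failure happened at the TLS layer rather than in the page itself.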

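I'm familiar with quite a few geckodriver errors from past experience, and I can usually fix them by reinstalling matching versions of Firefox and geckodriver, but this one is new to me and I don't know what to do next. Thoughts?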

Edit:

For the record, I can initialize the webdriver without any errors, but when I uncomment the line self.driver.get(self.user_site), the error is thrown every time.

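Edit 2:

I suspect it has something to do with the command used to launch Firefox. On the computer where the script works, the log shows the command as "/usr/bin/firefox" "--marionette" "--headless" "-foreground" "-no-remote" "-profile" "/tmp/rust_mozprofiledAb1T0", which differs from what my new computer is doing, but I don't know enough about Selenium to fix this.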
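To pin down exactly which arguments differ, the two logged command lines can be compared as sets of flags. This is a minimal sketch: both strings are copied from the logs quoted in this post, and the flags helper is a made-up name that keeps only the dash-prefixed arguments, so the temporary profile paths and the port number are ignored.

```python
import shlex

# Command lines copied from the two geckodriver logs quoted in this post.
NEW_MACHINE = (
    '"/usr/bin/firefox" "--marionette" "--headless" '
    '"--remote-debugging-port" "45483" "-no-remote" '
    '"-profile" "/tmp/rust_mozprofileKcEU8P"'
)
OLD_MACHINE = (
    '"/usr/bin/firefox" "--marionette" "--headless" "-foreground" '
    '"-no-remote" "-profile" "/tmp/rust_mozprofiledAb1T0"'
)


def flags(command):
    """Return the set of dash-prefixed arguments in a logged command line."""
    return {arg for arg in shlex.split(command) if arg.startswith("-")}


only_new = flags(NEW_MACHINE) - flags(OLD_MACHINE)
only_old = flags(OLD_MACHINE) - flags(NEW_MACHINE)
print("only on the new machine:", sorted(only_new))
print("only on the old machine:", sorted(only_old))
```

The comparison shows that the new machine passes --remote-debugging-port while the old one passes -foreground; everything else is the same apart from the temporary profile directory.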

Edit 3:

I think this is a security certificate problem. I ran the following as a Python script and it worked fine.

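from selenium import webdriver
from selenium.webdriver.common.by import By

# Launch Firefox through geckodriver and load a test page.
driver = webdriver.Firefox()
driver.get("https://dev.to")

# find_element(By.ID, ...) replaces find_element_by_id, which was removed in Selenium 4.3.
driver.find_element(By.ID, "nav-search").send_keys("Selenium")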

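It still works when I swap in the URL I actually care about, but when I try to open that URL in non-headless mode using my production code, I get a security error.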

Edit 4:

This code reproduces the problem, and shows that it can be fixed by changing seleniumwire to selenium:

from seleniumwire import webdriver

class Foo:

    def __init__(self):
        self.web_options = webdriver.FirefoxOptions()
        self.driver = webdriver.Firefox(options=self.web_options)
    
    def bar(self):
        self.driver.get("https://xkcd.com")
        print(self.driver.current_url)

Foo().bar()

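The actual error that stops seleniumwire is AttributeError: module 'lib' has no attribute 'SSL_CTX_get0_param', which is caused by cryptography (https://pypi.org/project/cryptography/#history) being installed at version 2.8, which is two years out of date. This looks like the final answer, and I am looking into it now.

It turned out that my cryptography package was out of date. Running pip install pyopenssl fixed the problem.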
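For anyone hitting the same AttributeError, a quick way to check whether the installed packages are stale is to print their versions before and after the pip command. This is a rough sketch using importlib.metadata (available from Python 3.8); installed_version is a hypothetical helper name, and the distribution names pyOpenSSL and selenium-wire are the PyPI names as far as I know.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("cryptography", "pyOpenSSL", "selenium-wire"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If cryptography still reports something like 2.8 afterwards, the package was probably not upgraded in the environment the script actually runs in.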