Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: getaddrinfo() thread failed to start

I keep running into errors when using H2O's h2o.automl function. I am trying to run this model repeatedly, and it seems to fail completely after 5 or 10 runs.

Error in .h2o.__checkConnectionHealth() : 
  H2O connection has been severed. Cannot connect to instance at http://localhost:54321/
getaddrinfo() thread failed to start

In addition: There were 13 warnings (use warnings() to see them)
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix,  : 
  Unexpected CURL error: getaddrinfo() thread failed to start

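I have updated Java in response to https://h2o-release.s3.amazonaws.com/h2o/rel-wolpert/4/docs-website/h2o-docs/faq/r.html (even though I am using a Linux virtual machine). I have added h2o.removeAll() and gc() in response to "R h2o server CURL error, kind of repeatable". I have not tried any changes to memory, because my cluster has 16+ GB and the highest reading I have seen in RStudio is 1.6 GiB.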

H2O is running in R/RStudio Server on an Ubuntu 20.04 virtual machine. Could the VirtualBox software be blocking something?

The details of my H2O cluster are as follows:

openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)

Starting H2O JVM and connecting: ... Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         1 seconds 896 milliseconds 
    H2O cluster timezone:       America/Chicago 
    H2O data parsing timezone:  UTC 
    H2O cluster version:        3.35.0.2 
    H2O cluster version age:    19 hours and 24 minutes  
    H2O cluster name:           H2O_started_from_R_jholderieath_glq667 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   19.84 GB 
    H2O cluster total cores:    12 
    H2O cluster allowed cores:  12 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
    R Version:                  R version 4.1.1 (2021-08-10) 

I think I have run into this problem too, although on macOS 12.1. I tried to debug it and found that I sometimes get a different error as well:

Unexpected CURL error: Failed to connect to 127.0.0.1 port 54321: Connection reset by peer

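I found that the problem only occurs when RCurl is compiled against curl 7.68.0 or later.

Downgrading to curl 7.67.0 fixed the problem for me, but I then ran into issues with RStudio (segmentation faults), so I investigated further.

I also found that compiling the latest version of curl with --disable-socketpair fixes the problem for me.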

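I monitored open files and sockets (with lsof), and it looks to me as if the R process runs out of sockets it can create, after which RCurl fails with one of the socket errors. Running gc() in R often helps (I call it after every request), but the minimum number of open sockets after gc() still increases slowly and monotonically, which leads me to believe there might be a leak. I have reported this to the RCurl maintainers as a possible bug.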
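
The lsof check described above can be sketched roughly as follows. This is only an illustration, not necessarily the exact commands used here: count_sockets is a hypothetical helper name, and it assumes the default lsof column layout, in which the fifth column (TYPE) reads IPv4, IPv6, unix or sock for socket descriptors.

```shell
# Hypothetical helper: count socket descriptors in `lsof -p PID` output
# read from stdin. Assumes the default lsof columns
# (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME).
count_sockets() {
  awk 'NR > 1 && ($5 == "IPv4" || $5 == "IPv6" || $5 == "unix" || $5 == "sock") { n++ }
       END { print n + 0 }'
}

# Example (not run here): lsof -p "$R_PID" | count_sockets
# where R_PID is the PID of the R session, e.g. from Sys.getpid().
```

Running this after each gc() and comparing the numbers would show whether the count keeps creeping upward.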

For anyone using macOS and Homebrew, this can be done by running the following:

$ brew edit curl # add --disable-socketpair to args list
$ brew install --build-from-source curl # using reinstall might be needed instead of install

$ export RCURL_PATH="/usr/local/opt/curl@7.81.0" # can be found using `brew info curl`
$ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
$ export LDFLAGS="-L$RCURL_PATH/lib"
$ export CPPFLAGS="-I$RCURL_PATH/include"
$ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"

$ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
$ R -e 'RCurl::curlVersion()$version' # check if RCurl is using the proper version of curl (single quotes keep the shell from expanding $version)
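
Before reinstalling RCurl it may be worth confirming that the curl found on PATH was really built with the flag. A minimal sketch, assuming curl-config is the one from the rebuilt curl; has_disable_socketpair is a hypothetical helper name, and it simply searches the configure arguments that `curl-config --configure` prints:

```shell
# Hypothetical helper: succeed if the configure arguments read from stdin
# contain --disable-socketpair.
has_disable_socketpair() {
  grep -q -e '--disable-socketpair'
}

# Example (not run here):
#   curl-config --configure | has_disable_socketpair && echo "built without socketpair"
```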

Ubuntu 20.04 ships curl 7.68.0 (according to https://packages.ubuntu.com/focal/curl), so I think you won't be able to use the following, as --disable-socketpair was only added in curl 7.73.0. Since you are using a virtual machine, it might be easier to just use Ubuntu 18.04, which is still supported and uses an old enough version of curl (7.58.0).
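
The version thresholds mentioned in this answer can be put into a small check. This is a sketch under the assumptions stated here (problem observed from 7.68.0 on, --disable-socketpair available from 7.73.0), not an official compatibility table; version_ge and classify_curl are hypothetical helper names, and `sort -V` is assumed to be available.

```shell
# True if version $1 >= version $2 (relies on `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Classify a curl version using the thresholds reported in this thread.
classify_curl() {
  if ! version_ge "$1" 7.68.0; then
    echo "unaffected"        # older than 7.68.0: problem not observed
  elif ! version_ge "$1" 7.73.0; then
    echo "affected-no-flag"  # --disable-socketpair not yet available
  else
    echo "affected-rebuild"  # rebuild with --disable-socketpair
  fi
}
```

For example, `classify_curl "$(curl-config --version | awk '{print $2}')"` would classify the curl that curl-config belongs to.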

I haven't used Ubuntu in a while, but I can at least offer some pseudo-code that should do the same thing:

$ sudo apt install devscripts
$ # make sure source repositories are enabled (uncommented in /etc/apt/s
$ apt-get source curl
$ sudo apt-get build-dep curl
$ cd curl-*/ # apt-get source unpacks into a versioned directory
$ nano debian/rules # add the --disable-socketpair configure option
$ dch -i # bump the version
$ debuild -us -uc -b # build the package
$ sudo dpkg -i ../curl-some_version.deb # placeholder name; install the .deb files that were built

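$ # RCURL_PATH is not set by the steps above; point it at the prefix where the rebuilt curl is installed
$ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
$ export LDFLAGS="-L$RCURL_PATH/lib"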
$ export CPPFLAGS="-I$RCURL_PATH/include"
$ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"

$ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
$ R -e 'RCurl::curlVersion()$version' # check if RCurl is using the proper version of curl (single quotes keep the shell from expanding $version)