Setting up a multinode h2o cluster via R: Trouble when joining with the second node

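I am trying to set up a local cluster with just 2 nodes via h2o. I set up the computers from the terminal as specified at https://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/deployment/multinode.html. Although my second node briefly reports that it has joined, it exits shortly afterwards with "Attempting to join an H2O cloud that is no longer accepting new H2O nodes from".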

Any help is much appreciated, as I am fairly new to this.

I am running macOS 10.15.4 and R version 4.00 (Spark version not stated). Please see the terminal output:

Cannot load library from path lib/osx_64/libxgboost4j_gpu.dylib
Cannot load library from path lib/libxgboost4j_gpu.dylib
Failed to load library from both native path and jar!
Cannot load library from path lib/osx_64/libxgboost4j_omp.dylib
Cannot load library from path lib/libxgboost4j_omp.dylib
Failed to load library from both native path and jar!
05-21 22:40:23.640 192.168.1.168:54321   2272         main  INFO water.default: ----- H2O started  -----
05-21 22:40:23.641 192.168.1.168:54321   2272         main  INFO water.default: Build git branch: master
05-21 22:40:23.641 192.168.1.168:54321   2272         main  INFO water.default: Build git hash: d3d24b7c6059f15c6b6333a84ccb70e70bc5d3dc
05-21 22:40:23.641 192.168.1.168:54321   2272         main  INFO water.default: Build git describe: jenkins-master-5076-39-gd3d24b7
05-21 22:40:23.642 192.168.1.168:54321   2272         main  INFO water.default: Build project version: 3.31.0.5077
05-21 22:40:23.642 192.168.1.168:54321   2272         main  INFO water.default: Build age: 16 hours and 8 minutes
05-21 22:40:23.642 192.168.1.168:54321   2272         main  INFO water.default: Built by: 'jenkins'
05-21 22:40:23.642 192.168.1.168:54321   2272         main  INFO water.default: Built on: '2020-05-21 06:31:36'
05-21 22:40:23.642 192.168.1.168:54321   2272         main  INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone]
05-21 22:40:23.642 192.168.1.168:54321   2272         main  INFO water.default: Processed H2O arguments: [-flatfile, flatfile.txt, -port, 54321]
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: Java availableProcessors: 8
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: Java heap totalMemory: 123,0 MB
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: Java heap maxMemory: 17,78 GB
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: Java version: Java 1.8.0_251 (from Oracle Corporation)
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: JVM launch parameters: [-Xmx20g]
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: JVM process id: 2272@Mihais-iMac-2.local
05-21 22:40:23.643 192.168.1.168:54321   2272         main  INFO water.default: OS version: Mac OS X 10.15.4 (x86_64)
05-21 22:40:23.644 192.168.1.168:54321   2272         main  INFO water.default: Machine physical memory: 8,00 GB
05-21 22:40:23.644 192.168.1.168:54321   2272         main  INFO water.default: Machine locale: de_DE
05-21 22:40:23.644 192.168.1.168:54321   2272         main  INFO water.default: X-h2o-cluster-id: 1590093617567
05-21 22:40:23.644 192.168.1.168:54321   2272         main  INFO water.default: User name: 'Max'
05-21 22:40:23.644 192.168.1.168:54321   2272         main  INFO water.default: IPv6 stack selected: false
05-21 22:40:23.645 192.168.1.168:54321   2272         main  INFO water.default: Network address/interface is not reachable in 150ms: /fe80:0:0:0:8045:3098:f44:71b2%utun1/name:utun1 (utun1)
05-21 22:40:23.645 192.168.1.168:54321   2272         main  INFO water.default: Network address/interface is not reachable in 150ms: /fe80:0:0:0:2118:84f2:fc7a:7ea1%utun0/name:utun0 (utun0)
05-21 22:40:23.645 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: llw0 (llw0), fe80:0:0:0:6084:87ff:fe44:8adf%llw0
05-21 22:40:23.645 192.168.1.168:54321   2272         main  INFO water.default: Network address/interface is not reachable in 150ms: /fe80:0:0:0:6084:87ff:fe44:8adf%awdl0/name:awdl0 (awdl0)
05-21 22:40:23.645 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: en1 (en1), fe80:0:0:0:95:3412:ed28:553f%en1
05-21 22:40:23.645 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: en1 (en1), 192.168.1.87
05-21 22:40:23.646 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: en0 (en0), fe80:0:0:0:8ca:d39f:9e33:5b4d%en0
05-21 22:40:23.646 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: en0 (en0), 192.168.1.168
05-21 22:40:23.646 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: lo0 (lo0), fe80:0:0:0:0:0:0:1%lo0
05-21 22:40:23.646 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: lo0 (lo0), 0:0:0:0:0:0:0:1%lo0
05-21 22:40:23.646 192.168.1.168:54321   2272         main  INFO water.default: Possible IP Address: lo0 (lo0), 127.0.0.1
05-21 22:40:23.646 192.168.1.168:54321   2272         main  WARN water.default: Multiple local IPs detected:
05-21 22:40:23.647 192.168.1.168:54321   2272         main  WARN water.default:   /192.168.1.87  /192.168.1.168
05-21 22:40:23.647 192.168.1.168:54321   2272         main  WARN water.default: Attempting to determine correct address...
05-21 22:40:23.647 192.168.1.168:54321   2272         main  WARN water.default: Using /192.168.1.168
05-21 22:40:23.647 192.168.1.168:54321   2272         main  INFO water.default: H2O node running in unencrypted mode.
05-21 22:40:23.648 192.168.1.168:54321   2272         main  INFO water.default: Internal communication uses port: 54322
05-21 22:40:23.649 192.168.1.168:54321   2272         main  INFO water.default: Listening for HTTP and REST traffic on http://192.168.1.168:54321/
05-21 22:40:23.653 192.168.1.168:54321   2272         main  WARN water.default: -flatfile specified but not found: flatfile.txt
05-21 22:40:23.653 192.168.1.168:54321   2272         main  INFO water.default: H2O cloud name: 'Max' on /192.168.1.168:54321, static configuration based on -flatfile flatfile.txt
05-21 22:40:23.654 192.168.1.168:54321   2272         main  INFO water.default: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
05-21 22:40:23.654 192.168.1.168:54321   2272         main  INFO water.default:   1. Open a terminal and run 'ssh -L 55555:localhost:54321 Max@192.168.1.168'
05-21 22:40:23.654 192.168.1.168:54321   2272         main  INFO water.default:   2. Point your browser to http://localhost:55555
05-21 22:40:24.102 192.168.1.168:54321   2272         main  INFO water.default: Log dir: '/tmp/h2o-Max/h2ologs'
05-21 22:40:24.102 192.168.1.168:54321   2272         main  INFO water.default: Cur dir: '/Users/Max/Downloads/h2o-3.31.0.5077'
05-21 22:40:24.108 192.168.1.168:54321   2272         main  INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized
05-21 22:40:24.108 192.168.1.168:54321   2272         main  INFO water.default: HDFS subsystem successfully initialized
05-21 22:40:24.111 192.168.1.168:54321   2272         main  INFO water.default: S3 subsystem successfully initialized
05-21 22:40:24.122 192.168.1.168:54321   2272         main  INFO water.default: GCS subsystem successfully initialized
05-21 22:40:24.123 192.168.1.168:54321   2272         main  INFO water.default: Flow dir: '/Users/Max/h2oflows'
05-21 22:40:24.134 192.168.1.168:54321   2272         main  INFO water.default: Cloud of size 1 formed [/192.168.1.168:54321]
05-21 22:40:24.141 192.168.1.168:54321   2272         main  INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
05-21 22:40:24.141 192.168.1.168:54321   2272         main  INFO water.default: XGBoost extension initialized
05-21 22:40:24.142 192.168.1.168:54321   2272         main  INFO water.default: KrbStandalone extension initialized
05-21 22:40:24.142 192.168.1.168:54321   2272         main  INFO water.default: Registered 2 core extensions in: 300ms
05-21 22:40:24.143 192.168.1.168:54321   2272         main  INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone]
05-21 22:40:24.272 192.168.1.168:54321   2272         main  INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_minimal
05-21 22:40:24.272 192.168.1.168:54321   2272         main  WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)!
05-21 22:40:24.350 192.168.1.168:54321   2272         main  INFO water.default: Registered: 214 REST APIs in: 207ms
05-21 22:40:24.351 192.168.1.168:54321   2272         main  INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4]
05-21 22:40:24.474 192.168.1.168:54321   2272         main  INFO water.default: Registered: 284 schemas in 123ms
05-21 22:40:24.474 192.168.1.168:54321   2272         main  INFO water.default: H2O started in 6901ms
05-21 22:40:24.474 192.168.1.168:54321   2272         main  INFO water.default: 
05-21 22:40:24.474 192.168.1.168:54321   2272         main  INFO water.default: Open H2O Flow in your web browser: http://192.168.1.168:54321
05-21 22:40:24.474 192.168.1.168:54321   2272         main  INFO water.default: 
05-21 22:40:40.217 192.168.1.168:54321   2272   1.91:54321 ERROR water.default: Attempting to join an H2O cloud that is no longer accepting new H2O nodes from /192.168.1.91:54321
05-21 22:40:40.220 192.168.1.168:54321   2272   1.91:54321 FATAL water.default: Exiting.

The documentation you are following is for version 2.6 (the copyright at the bottom is 2013!), but you are running version 3.31.

I usually start my H2O-related searches here: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/index.html

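To set up a multinode cluster, I have used the flatfile approach and then started each node from the command line. Make sure all the nodes are up and have found each other (you can see this in the logs) before you try to connect to any node in the cluster; otherwise you will see the message you received.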
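
As a rough sketch of the flatfile approach, and not a definitive recipe: the log in the question contains the warning "-flatfile specified but not found: flatfile.txt", so the snippet below first writes that file, using the two node addresses that appear in the log. The launch flags (`-Xmx20g`, `-flatfile`, `-port`) are the ones the log reports. The jar name `h2o.jar`, the variable `FLATFILE`, and the helper `latest_cloud_size` are my own assumptions for illustration. The launch command is only printed, since it has to be run by hand on each machine.

```shell
# Sketch only: file names and the helper below are assumptions for illustration.

FLATFILE="flatfile.txt"

# One ip:port per line, listing every node that should be in the cluster.
# The same file must be present on each machine, in the directory where H2O is started.
# These two addresses are the ones that appear in the question's log.
printf '%s\n' "192.168.1.168:54321" "192.168.1.91:54321" > "$FLATFILE"

# Command to run on EACH machine (printed here, not executed).
# Heap size is taken from the question's log; adjust it to the machine.
echo "java -Xmx20g -jar h2o.jar -flatfile $FLATFILE -port 54321"

# Read H2O log text on stdin and print N from the most recent
# "Cloud of size N formed" line, or 0 if no such line exists.
latest_cloud_size() {
  sed -n 's/.*Cloud of size \([0-9][0-9]*\) formed.*/\1/p' | tail -n 1 | grep . || echo 0
}
```

If you save a node's terminal output to a file (say `node.log`, a hypothetical name), `latest_cloud_size < node.log` should print 2 for a two-node cluster. I would only connect from R once that is the case on both machines.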