Installing Datalab/Jupyter on Dataproc cluster

I am trying to install Jupyter notebook/Datalab on my Dataproc cluster, without success.

I am following this tutorial: https://cloud.google.com/dataproc/docs/tutorials/dataproc-datalab

Step by step:

  1. I created a new GS bucket named datalab-init-bucket-001 and uploaded the datalab.sh script from GitHub: https://github.com/GoogleCloudPlatform/dataproc-initialization-actions/blob/master/datalab/datalab.sh
  2. I then started Dataproc with the gcloud command and --initialization-actions 'gs://datalab-init-bucket-001/datalab.sh'. The full command looks like this:

    gcloud dataproc clusters create cluster-test --subnet default --zone "" --master-machine-type n1-standard-4 --master-boot-disk-size 10 --num-workers 2 --worker-machine-type n1-standard-2 --worker-boot-disk-size 10 --initialization-action-timeout "10h" --initialization-actions 'gs://datalab-init-bucket-001/datalab.sh'

This is where the first problem appears.

Looking at the logs:

    OK > Downloading script [gs://datalab-init-bucket-001/datalab.sh] to [/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0]
    OK > Running script [/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0] and saving output in [/var/log/dataproc-initialization-script-0.log]
    OK > DIR* completeFile: /user/spark/eventlog/.cc2b1d00-4968-4008-87d7-eac090b09e56 is closed by DFSClient_NONMAPREDUCE_1150019196_1
    ERROR > AgentRunner startup failed: com.google.cloud.hadoop.services.agent.AgentException: Initialization action failed to start (error=2, No such file or directory). Failed action 'gs://datalab-init-bucket-001/datalab.sh' (TASK_FAILED)
        at com.google.cloud.hadoop.services.agent.AgentException$Builder.build(AgentException.java:83)
        at com.google.cloud.hadoop.services.agent.AgentException$Builder.buildAndThrow(AgentException.java:79)
        at com.google.cloud.hadoop.services.agent.BootstrapActionRunner.throwInitActionFailureException(BootstrapActionRunner.java:236)
        at com.google.cloud.hadoop.services.agent.BootstrapActionRunner.runSingleCustomInitializationScriptWithTimeout(BootstrapActionRunner.java:146)
        at com.google.cloud.hadoop.services.agent.BootstrapActionRunner.runCustomInitializationActions(BootstrapActionRunner.java:126)
        at com.google.cloud.hadoop.services.agent.AbstractAgentRunner.runCustomInitializationActionsIfFirstRun(AbstractAgentRunner.java:150)
        at com.google.cloud.hadoop.services.agent.MasterAgentRunner.initialize(MasterAgentRunner.java:165)
        at com.google.cloud.hadoop.services.agent.AbstractAgentRunner.start(AbstractAgentRunner.java:68)
        at com.google.cloud.hadoop.services.agent.MasterAgentRunner.start(MasterAgentRunner.java:36)
        at com.google.cloud.hadoop.services.agent.AgentMain.lambda$boot$0(AgentMain.java:63)
        at com.google.cloud.hadoop.services.agent.AgentStatusReporter.runWith(AgentStatusReporter.java:52)
        at com.google.cloud.hadoop.services.agent.AgentMain.boot(AgentMain.java:59)
        at com.google.cloud.hadoop.services.agent.AgentMain.main(AgentMain.java:46)
    Caused by: java.io.IOException: Cannot run program "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at com.google.cloud.hadoop.services.agent.util.NativeAsyncProcessWrapperFactory.startAndWrap(NativeAsyncProcessWrapperFactory.java:33)
        at com.google.cloud.hadoop.services.agent.util.NativeAsyncProcessWrapperFactory.startAndWrap(NativeAsyncProcessWrapperFactory.java:27)
        at com.google.cloud.hadoop.services.agent.BootstrapActionRunner.createRunner(BootstrapActionRunner.java:349)
        at com.google.cloud.hadoop.services.agent.BootstrapActionRunner.runScriptAndPipeOutputToGcs(BootstrapActionRunner.java:301)
        at com.google.cloud.hadoop.services.agent.BootstrapActionRunner.runSingleCustomInitializationScriptWithTimeout(BootstrapActionRunner.java:142)
        ... 9 more
        Suppressed: java.io.IOException: Cannot run program "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0": error=2, No such file or directory
            ... 15 more
        Caused by: java.io.IOException: error=2, No such file or directory
            at java.lang.UNIXProcess.forkAndExec(Native Method)
            at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
            at java.lang.ProcessImpl.start(ProcessImpl.java:134)
            at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
            ... 14 more
    Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
        at java.lang.ProcessImpl.start(ProcessImpl.java:134)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 14 more

  3. A "manual" installation on the master node VM fails as well:

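I somehow managed to start Datalab on a single-node cluster, but I cannot start a (py)Spark session there.

I am running the latest Dataproc image version (1.2), but 1.1, for example, does not work either. I am on a free-credits account, but I assume that should not cause any problems.

Any idea how to update the datalab.sh script to make this work?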

It looks like the failure was caused by the disks being too small. I increased the disk size on each node from 10 GB to 50 GB, and suddenly it worked.
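
As a rough sketch of that change, here is the command from the question with only the two boot disk sizes raised from 10 to 50, written in the `gcloud dataproc clusters create` form. The `DISK_GB`, `ZONE`, and `DRY_RUN` variables and the `create_cluster` wrapper are illustrative names, not part of the original post. The zone was left empty in the question, so it has to be supplied. By default the script only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: same flags as the command in the question, with the boot
# disk sizes changed from 10 GB to 50 GB.
# DISK_GB, ZONE and DRY_RUN are illustrative variables (assumptions).
DISK_GB=50
ZONE="${ZONE:-}"          # the question left the zone empty; set your own
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, anything else = run it

create_cluster() {
  # In dry-run mode the command is echoed rather than executed.
  if [ "$DRY_RUN" = "1" ]; then runner=echo; else runner=; fi
  $runner gcloud dataproc clusters create cluster-test \
    --subnet default \
    --zone "$ZONE" \
    --master-machine-type n1-standard-4 \
    --master-boot-disk-size "$DISK_GB" \
    --num-workers 2 \
    --worker-machine-type n1-standard-2 \
    --worker-boot-disk-size "$DISK_GB" \
    --initialization-action-timeout "10h" \
    --initialization-actions 'gs://datalab-init-bucket-001/datalab.sh'
}

create_cluster
```

To actually create the cluster, run it with something like `ZONE=<your-zone> DRY_RUN=0 sh create_cluster.sh` (the file name is hypothetical).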