Dataproc provisioning timeout due to network unreachable to googleapis.com

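I am trying to create a basic Dataproc cluster (using the default values) in a GCP project. The VMs are created, but the cluster stays in the provisioning state until it times out.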

In all of these cases I get the following error (found in /var/log/google-dataproc-agent.0.log after SSHing into the master):

Network is unreachable: dataproccontrol-europe-west1.googleapis.com/2a00:1450:400c:c04:0:0:0:5f:443

Full error trace:

Jul 24, 2021 11:02:53 AM com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation nextSleep INFO: Transient exception caught. Sleeping for 1120, then retrying.
com.google.cloud.hadoop.services.repackaged.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 29.974635818s. [buffered_nanos=30006131805, waiting_for_connection]
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
        at com.google.cloud.dataproc.control.v1.AgentServiceGrpc$AgentServiceBlockingStub.createAgent(AgentServiceGrpc.java:735)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.call(AgentApiAsyncUpdater.java:238)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.call(AgentApiAsyncUpdater.java:235)
        at com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:67)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.executeWithBackoff(AgentApiAsyncUpdater.java:345)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.createAgent(AgentApiAsyncUpdater.java:234)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.getOrCreateAgent(AgentApiAsyncUpdater.java:203)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.run(AgentApiAsyncUpdater.java:183)
        at com.google.cloud.hadoop.services.repackaged.com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator$NeverSuccessfulListenableFutureTask.run(MoreExecutors.java:679)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access1(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Jul 24, 2021 11:03:23 AM com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation nextSleep INFO: Transient exception caught. Sleeping for 1958, then retrying.
com.google.cloud.hadoop.services.repackaged.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142)
        at com.google.cloud.dataproc.control.v1.AgentServiceGrpc$AgentServiceBlockingStub.createAgent(AgentServiceGrpc.java:735)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.call(AgentApiAsyncUpdater.java:238)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.call(AgentApiAsyncUpdater.java:235)
        at com.google.cloud.hadoop.services.repackaged.com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:67)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.executeWithBackoff(AgentApiAsyncUpdater.java:345)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.createAgent(AgentApiAsyncUpdater.java:234)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.getOrCreateAgent(AgentApiAsyncUpdater.java:203)
        at com.google.cloud.hadoop.services.agent.protocol.AgentApiAsyncUpdater.run(AgentApiAsyncUpdater.java:183)
        at com.google.cloud.hadoop.services.repackaged.com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator$NeverSuccessfulListenableFutureTask.run(MoreExecutors.java:679)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access1(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannel$AnnotatedSocketException: Network is unreachable: dataproccontrol-europe-west1.googleapis.com/2a00:1450:400c:c04:0:0:0:5f:443
Caused by: java.net.SocketException: Network is unreachable
        at sun.nio.ch.Net.connect0(Native Method)
        at sun.nio.ch.Net.connect(Net.java:482)
        at sun.nio.ch.Net.connect(Net.java:474)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.SocketUtils.run(SocketUtils.java:91)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.SocketUtils.run(SocketUtils.java:88)
        at java.security.AccessController.doPrivileged(Native Method)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:315)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:548)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:533)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
        at com.google.cloud.hadoop.services.repackaged.io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:150)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:548)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.access00(AbstractChannelHandlerContext.java:61)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.AbstractChannelHandlerContext.run(AbstractChannelHandlerContext.java:538)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
        at com.google.cloud.hadoop.services.repackaged.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEventExecutor.java:989)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.internal.ThreadExecutorMap.run(ThreadExecutorMap.java:74)
        at com.google.cloud.hadoop.services.repackaged.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

Please help!

Thanks in advance.

Edit: my firewall and VPC

Cluster configuration:

Based on the error message Network is unreachable: dataproccontrol-europe-west1.googleapis.com/2a00:1450:400c:c04:0:0:0:5f:443 and your network settings, it looks like you are missing a route to the internet.

You can fix this by adding a route to 0.0.0.0/0 for IPv4 and a route to ::/0 for IPv6 with --next-hop-gateway=default-internet-gateway; see this doc for more details. The route should have been created automatically for a new VPC network, but I guess you deleted it; see this doc.
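As a rough sketch of what those two routes could look like, the snippet below only builds and prints `gcloud compute routes create` command lines; it does not run anything. The route names (`default-internet-ipv4`, `default-internet-ipv6`) and the network name `default` are placeholders I made up, so substitute your own. Whether the `::/0` route is accepted is an assumption that likely depends on how IPv6 is configured on your network.

```python
import shlex


def route_create_command(name, network, destination_range):
    """Build the argv for a `gcloud compute routes create` call (nothing is executed)."""
    return [
        "gcloud", "compute", "routes", "create", name,
        f"--network={network}",
        f"--destination-range={destination_range}",
        "--next-hop-gateway=default-internet-gateway",
    ]


# Route names and the network name are hypothetical placeholders.
commands = [
    route_create_command("default-internet-ipv4", "default", "0.0.0.0/0"),
    route_create_command("default-internet-ipv6", "default", "::/0"),
]

# Print the commands so they can be reviewed before being run by hand.
for argv in commands:
    print(shlex.join(argv))
```

Review the printed commands and run them yourself once the names match your project.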

The reason the route is needed is that the Dataproc agent on the VMs has to reach the Dataproc control API to fetch jobs and report status. The API domain name dataproccontrol-<region>.googleapis.com resolves to external IPs, so the VMs need a route to the internet (or to the relevant IP ranges). When Private Google Access is enabled, however, the traffic won't leave Google data centers. The recommendation is to always have a route to the internet and to use firewall rules for more granular access control. Also note that VMs without external IPs cannot access the internet by default, even if routes and firewall rules allow it; see this doc if you want a solution. By the way, you can use the Connectivity Tests tool for troubleshooting.
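If you want a quick check from the master VM itself, something like the following sketch may help. It is not part of any Dataproc tooling; the helper names `is_external_address` and `probe` are mine. The first helper classifies an address offline, and the second tries a plain TCP connect on port 443 to every address the hostname resolves to, which is roughly what the agent's gRPC channel has to do before anything else.

```python
import ipaddress
import socket


def is_external_address(ip):
    """True if ip is a public (globally routable) address rather than an internal one."""
    return ipaddress.ip_address(ip).is_global


def probe(host, port=443, timeout=5.0):
    """Try a TCP connect to each address of host; return (ip, error-or-None) pairs."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [(host, f"DNS lookup failed: {exc}")]
    results = []
    for family, socktype, proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((ip, None))
        except OSError as exc:
            # "Network is unreachable" here usually points at a missing route.
            results.append((ip, str(exc)))
    return results


# Offline check on the address taken from the agent log above.
print(is_external_address("2a00:1450:400c:c04:0:0:0:5f"))
```

On the master you could then call `probe("dataproccontrol-europe-west1.googleapis.com")` and look at which addresses fail and with what error. A connect that succeeds only shows that the route and firewall allow the TCP handshake; it does not prove the agent can authenticate.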