GKE Connect starts successfully, but the cluster is not shown in the GCP console

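Good morning!

I have been playing with GKE Connect lately, trying to register my "remote" kops-generated clusters, which run on GCP and AWS VMs, so that I can monitor them from the GCP console.

If you haven't read about GKE Connect yet, you can find the official documentation here

The problem is that, after following several tutorials and trying everything I could think of, the GKE Connect agents appear to run correctly on my k8s clusters, but the clusters never show up as remote clusters in my GCP console. You can find a walkthrough of the steps I took in this repository

Basically, the steps I took were: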

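  1. Enable the required GCP APIs
  2. Create a service account for the target cluster
  3. Assign the gkehub.connect role to the created SA
  4. Generate a private key for the SA
  5. Start the agent with the following command: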
gcloud alpha container hub register-cluster ${CLUSTER_NAME} \
  --context=${CLUSTER_NAME} \
  --service-account-key-file=/var/lib/jenkins/gke-connect/${SERVICE_ACC}-gke-connect-creds.json \
  --project=${CLOUD_PROJECT}
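
For readers who want to reproduce the setup, steps 1 to 4 could be sketched with gcloud roughly as follows. This is a minimal sketch under assumptions, not the exact commands used in the question: the list of APIs to enable is a guess based on the endpoints the agent talks to, and `run` and `gke_connect_setup` are hypothetical helper names. By default the script only prints the commands instead of executing them.

```shell
# Sketch only: with DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# gke_connect_setup PROJECT SERVICE_ACC KEY_FILE (hypothetical helper)
gke_connect_setup() {
  project="$1"    # corresponds to ${CLOUD_PROJECT} in the question
  sa="$2"         # corresponds to ${SERVICE_ACC} in the question
  key_file="$3"   # path where the JSON key will be written
  sa_email="${sa}@${project}.iam.gserviceaccount.com"

  # 1. Enable the APIs (assumed list; verify against the current docs)
  run gcloud services enable gkehub.googleapis.com gkeconnect.googleapis.com --project="$project"

  # 2. Create a service account for the target cluster
  run gcloud iam service-accounts create "$sa" --project="$project"

  # 3. Grant the gkehub.connect role to the service account
  run gcloud projects add-iam-policy-binding "$project" --member="serviceAccount:${sa_email}" --role="roles/gkehub.connect"

  # 4. Generate a private key for the service account
  run gcloud iam service-accounts keys create "$key_file" --iam-account="$sa_email" --project="$project"
}
```

Calling `gke_connect_setup "${CLOUD_PROJECT}" "${SERVICE_ACC}" /path/to/key.json` prints the four commands for review; setting `DRY_RUN=0` first would execute them.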

The agent gets deployed on my cluster, and the container logs show the following:

2019/12/13 08:57:03.403373 dialer.go:261: dialer: dial: connecting to gkeconnect.googleapis.com:443...
2019/12/13 08:57:03.515452 dialer.go:272: dialer: dial: connected to gkeconnect.googleapis.com:443
2019/12/13 08:57:03.515483 tunnel.go:314: serve: opening egress stream...
2019/12/13 08:57:03.515545 tunnel.go:322: serve: registering project_number="681949624886", connection_id="db3fb4d9-1d7f-11ea-927b-0218619c9f84" connection_class="DEFAULT" agent_version="20191206-03-00" ...
2019/12/13 08:57:03.515592 dialer.go:222: Dial successful, current connections: 3
2019/12/13 08:57:08.515779 tunnel.go:374: serve: serving requests...

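As a side note, API requests seem to take a very long time: GCP's API console shows an average response time of 8 minutes. Have you experienced anything similar?

Thanks!

Edit 1: Adding more information

I am not sure whether this is how it is supposed to work, since it is not documented anywhere, but the GKE Connect agent seems to maintain 3 connections, each of which is closed after 5 to 8 minutes, with the following trace pattern: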

2019/12/13 11:04:30.519779 dialer.go:277: dialer: dial: connection to gkeconnect.googleapis.com:443 closed after 8m1.174074486s
2019/12/13 11:04:30.519831 dialer.go:204: dialer: connection done: <nil>
2019/12/13 11:04:30.519839 dialer.go:305: dialer: backoff: reset
2019/12/13 11:04:30.519847 dialer.go:236: dialer: dial interval was 5m0.950672921s
2019/12/13 11:04:30.519859 dialer.go:180: dialer: waiting for next event, outstanding connections=2

Edit 2: Connectivity

From inside the container deployed on my cluster, connectivity to the required endpoints also seems fine:

/usr/src/app # ping oauth2.googleapis.com
PING oauth2.googleapis.com (172.217.21.234): 56 data bytes
64 bytes from 172.217.21.234: seq=0 ttl=48 time=1.169 ms
64 bytes from 172.217.21.234: seq=1 ttl=48 time=1.165 ms

--- oauth2.googleapis.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.165/1.167/1.169 ms

/usr/src/app # ping gkeconnect.googleapis.com
PING gkeconnect.googleapis.com (172.217.22.42): 56 data bytes
64 bytes from 172.217.22.42: seq=0 ttl=48 time=1.115 ms
64 bytes from 172.217.22.42: seq=1 ttl=48 time=1.201 ms

--- gkeconnect.googleapis.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.115/1.158/1.201 ms

/usr/src/app # ping gkehub.googleapis.com
PING gkehub.googleapis.com (216.58.206.10): 56 data bytes
64 bytes from 216.58.206.10: seq=0 ttl=48 time=1.374 ms
64 bytes from 216.58.206.10: seq=1 ttl=48 time=1.428 ms

--- gkehub.googleapis.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.374/1.401/1.428 ms

/usr/src/app # ping www.googleapis.com
PING www.googleapis.com (172.217.16.202): 56 data bytes
64 bytes from 172.217.16.202: seq=0 ttl=48 time=1.357 ms
64 bytes from 172.217.16.202: seq=1 ttl=48 time=1.382 ms

--- www.googleapis.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.357/1.369/1.382 ms

/usr/src/app # ping accounts.google.com
PING accounts.google.com (172.217.23.141): 56 data bytes
64 bytes from 172.217.23.141: seq=0 ttl=48 time=1.447 ms
64 bytes from 172.217.23.141: seq=1 ttl=48 time=1.400 ms

--- accounts.google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.400/1.423/1.447 ms

/usr/src/app # ping gcr.io
PING gcr.io (173.194.76.82): 56 data bytes
64 bytes from 173.194.76.82: seq=0 ttl=32 time=10.311 ms
64 bytes from 173.194.76.82: seq=1 ttl=32 time=10.386 ms

--- gcr.io ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 10.311/10.348/10.386 ms
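
Note that ping only exercises ICMP, while the agent log shows it connecting to gkeconnect.googleapis.com on TCP port 443. A check closer to what the agent does would be an HTTPS connection to each host. The sketch below assumes `curl` is available in the container, which may not be the case for a minimal image; `probe` and `check_endpoints` are hypothetical helper names, and the host list is copied from the ping tests above.

```shell
# Hosts copied from the ping tests above.
HOSTS="oauth2.googleapis.com gkeconnect.googleapis.com gkehub.googleapis.com www.googleapis.com accounts.google.com gcr.io"

# probe HOST: succeeds if an HTTPS request to HOST on port 443 completes.
# Any HTTP status counts as success; only connection or TLS failures fail.
probe() {
  curl --silent --output /dev/null --connect-timeout 5 --max-time 10 "https://$1/"
}

# check_endpoints HOST...: print OK/FAIL per host, return 1 if any host failed.
check_endpoints() {
  failed=0
  for host in "$@"; do
    if probe "$host"; then
      echo "OK   $host:443"
    else
      echo "FAIL $host:443"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_endpoints $HOSTS` from the agent container would then report, per host, whether an outbound connection on port 443 can be established.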

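Edit 3: Further testing

Thanks to Armando's comment, I took another look at the official Anthos workshop. I also found these codelabs, which basically tell the same story.

They seem to claim that cluster registration requires a whitelisted service account, but they never actually explain what the "whitelisting" process looks like.

Looking at the GKE Connect scripts, this one does pretty much what I am already doing myself: it creates a service account, grants the required permissions, registers my cluster, and generates a KSA whose key I can use to access the cluster from the GCP console.

So there are now hints of a whitelisting process, which may be the key to solving this problem, but to my surprise I cannot find any reference that describes that process.

Anthos by Google Cloud requires a paid subscription. The documentation you are reviewing applies to existing subscriptions. To try or purchase Anthos, you need to contact sales. The links for that are on the Anthos home page: https://cloud.google.com/anthos/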