IBM Cloud Private 2.1.0.3 unable to access port 8443 after installation

I am currently hitting this problem on ICP 2.1.0.3. After installation, all pods are up and running, but port 8443 is not listening, and the platform-ui container appears unable to connect to 10.0.0.25, which is the ClusterIP of the icp-management-ingress service. ICP was installed on a fresh VM with both iptables and ufw inactive.
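One way to narrow down where the connection fails is to check the service, its endpoints, and whether anything is listening on the port. This is a hedged sketch, not an ICP-documented procedure; the service name comes from the error above, and the kube-system namespace is assumed:

```shell
# Verify the ClusterIP service exists and has endpoints (assumes kube-system namespace)
kubectl -n kube-system get svc icp-management-ingress
kubectl -n kube-system get endpoints icp-management-ingress

# On the master node, check whether anything is actually listening on 8443
ss -tlnp | grep ':8443'
```

If the endpoints list is empty, the backing pod is not ready, which points at the pod rather than the network.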

Here are the logs after the container restarted.

root@icpmaster:/opt/ibm-cloud-private-ce-2.1.0.3/cluster# docker logs cbaaebabd4c9
[2019-03-06T04:12:38.429] [INFO] [platform-ui] [server] [pid 1] [env production] started.
[HPM] Proxy created: /  ->  https://icp-management-ingress:8443
[HPM] Proxy rewrite rule created: "^/catalog/api/proxy" ~> ""
[2019-03-06T04:13:33.455] [INFO] [platform-ui] [server] Starting express server.
[2019-03-06T04:13:33.653] [INFO] [platform-ui] [server] Platform UI listening on http port 3000.
[2019-03-06T04:13:46.373] [ERROR] [platform-ui] [service-watcher] Error making request: Error: connect ECONNREFUSED 10.0.0.25:8443
GET https://icp-management-ingress:8443/kubernetes/api/v1/services?labelSelector=inmenu%3Dtrue HTTP/1.1
Accept: application/json
Authorization: Bearer ***

Error: connect ECONNREFUSED 10.0.0.25:8443

[Edit] So I restarted kubelet and docker. I then found these cgroups "no such file or directory" errors in the kubelet logs. I wonder whether this is related to Docker, but my Docker version, 17.12.1-ce, is within the supported range.
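For reference, the restart sequence above corresponds to something like the following, assuming docker and kubelet run as systemd units (the default for an ICP install):

```shell
# Restart Docker first, then kubelet, so kubelet reconnects to a fresh Docker daemon
sudo systemctl restart docker
sudo systemctl restart kubelet

# Follow the kubelet journal to watch for new errors
sudo journalctl -u kubelet -f
```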

Kubelet logs (trimmed):

-- Logs begin at Tue 2019-03-05 22:52:15 +08, end at Thu 2019-03-07 21:36:31 +08. --
Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/var-lib-docker-overlay2- xxx -merged.mount: no such file or directory
Error while processing event ("/sys/fs/cgroup/blkio/system.slice/var-lib-docker-overlay2- xxx -merged.mount": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/system.slice/var-lib-docker-overlay2- xxx -merged.mount: no such file or directory
Error while processing event ("/sys/fs/cgroup/memory/system.slice/var-lib-docker-overlay2- xxx -merged.mount": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/var-lib-docker-overlay2- xxx -merged.mount: no such file or directory
Error while processing event ("/sys/fs/cgroup/devices/system.slice/var-lib-docker-overlay2- xxx -merged.mount: no such file or directory
Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/var-lib-docker-containers- xxx -shm.mount: no such file or directory
Error while processing event ("/sys/fs/cgroup/blkio/system.slice/var-lib-docker-containers- xxx -shm.mount: no such file or directory
Error while processing event ("/sys/fs/cgroup/memory/system.slice/var-lib-docker-containers- xxx -shm.mount": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/var-lib-docker-containers- xxx -shm.mount: no such file or directory

Status of the other pods:

root@icpmaster:/opt# kubectl --kubeconfig=/var/lib/kubelet/kubelet-config get pods -n kube-system -o wide

NAME                                                  READY     STATUS    RESTARTS   AGE       IP            NODE
auth-apikeys-tj57w                                    1/1       Running   2          1d        10.1.9.5      10.113.64.6
auth-idp-c84cb                                        2/3       Running   15         1d        10.1.9.8      10.113.64.6
auth-pap-28tnl                                        1/1       Running   2          1d        10.1.9.32     10.113.64.6
auth-pdp-bwn9k                                        1/1       Running   2          1d        10.1.9.30     10.113.64.6
calico-kube-controllers-759f7fc556-bfnn8              1/1       Running   0          57m       10.113.64.6   10.113.64.6
calico-node-bdnrc                                     2/2       Running   46         1d        10.113.64.6   10.113.64.6
calico-node-h8jnd                                     2/2       Running   4          1d        10.113.64.8   10.113.64.8
catalog-ui-7ctqv                                      1/1       Running   2          1d        10.1.9.14     10.113.64.6
default-backend-7c6d6df9d5-j4pl9                      1/1       Running   0          57m       10.1.9.22     10.113.64.6
heapster-5649f84695-vfjjw                             2/2       Running   0          1h        10.1.9.4      10.113.64.6
helm-api-76c8d8bc7-8qjxf                              2/2       Running   3          57m       10.1.9.2      10.113.64.6
helm-repo-7455d96-bg2td                               1/1       Running   0          58m       10.1.9.19     10.113.64.6
icp-management-ingress-xgp95                          1/1       Running   3          1d        10.1.9.62     10.113.64.6
icp-mongodb-0                                         1/1       Running   23         1d        10.1.9.10     10.113.64.6
image-manager-0                                       2/2       Running   6          1d        10.113.64.6   10.113.64.6
k8s-etcd-10.113.64.6                                  1/1       Running   2          1d        10.113.64.6   10.113.64.6
k8s-master-10.113.64.6                                3/3       Running   6          1d        10.113.64.6   10.113.64.6
k8s-proxy-10.113.64.6                                 1/1       Running   2          1d        10.113.64.6   10.113.64.6
k8s-proxy-10.113.64.8                                 1/1       Running   3          1d        10.113.64.8   10.113.64.8
kube-dns-ltdb4                                        3/3       Running   37         1d        10.1.9.34     10.113.64.6
logging-elk-client-65745dcd68-b69wb                   2/2       Running   0          1h        10.1.9.44     10.113.64.6
logging-elk-data-0                                    1/1       Running   0          56m       10.1.9.16     10.113.64.6
logging-elk-filebeat-ds-7cb78                         1/1       Running   2          1d        10.1.214.67   10.113.64.8
logging-elk-filebeat-ds-vmfbk                         1/1       Running   2          1d        10.1.9.36     10.113.64.6
logging-elk-logstash-76c548744b-n24c5                 1/1       Running   0          1h        10.1.9.17     10.113.64.6
logging-elk-master-686fbdd984-kpt7s                   1/1       Running   0          1h        10.1.9.56     10.113.64.6
mariadb-0                                             1/1       Running   8          1d        10.113.64.6   10.113.64.6
metrics-server-7f4fdb695f-7rsd5                       1/1       Running   7          1h        10.1.9.20     10.113.64.6
nginx-ingress-controller-gjnnb                        1/1       Running   9          1d        10.1.9.9      10.113.64.6
platform-api-dq4p8                                    1/1       Running   2          1d        10.1.9.7      10.113.64.6
platform-deploy-6kzds                                 1/1       Running   2          1d        10.1.9.11     10.113.64.6
platform-ui-6kzzn                                     0/1       Running   46         1d        10.1.9.35     10.113.64.6
rescheduler-g85d5                                     1/1       Running   2          1d        10.113.64.6   10.113.64.6
service-catalog-apiserver-rlfj5                       1/1       Running   6          1d        10.1.9.28     10.113.64.6
service-catalog-controller-manager-5b654dc8b8-jfj64   1/1       Running   5          57m       10.1.9.15     10.113.64.6
tiller-deploy-c59888d97-l7rhk                         1/1       Running   3          57m       10.113.64.6   10.113.64.6
unified-router-zmxwh                                  1/1       Running   2          1d        10.1.9.24     10.113.64.6
update-secrets-cpp6j                                  1/1       Running   10         57m       10.1.9.18     10.113.64.6

The auth-idp-c84cb pod is not fully running (2/3 rather than 3/3), and that is what causes the platform-ui container's problem connecting to 10.0.0.25:8443. It is also why the platform-ui pod is not fully running (platform-ui-6kzzn 0/1).
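To see which of the three auth-idp containers is failing, you can list the pod's container statuses and then pull logs from the one that is not ready. This is a general Kubernetes diagnostic sketch; the container names vary by release, so list them first:

```shell
# List the containers in the auth-idp pod and their readiness (name<TAB>ready)
kubectl --kubeconfig=/var/lib/kubelet/kubelet-config -n kube-system \
  get pod auth-idp-c84cb \
  -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.ready}{"\n"}{end}'

# Then pull logs from whichever container reports "false"
kubectl --kubeconfig=/var/lib/kubelet/kubelet-config -n kube-system \
  logs auth-idp-c84cb -c <container-name>
```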

The usual reason the auth-idp pods do not fully start is that the ICP cluster/environment does not have enough resources. Please re-install your ICP cluster after increasing its resources to meet these hardware requirements: https://www.ibm.com/support/knowledgecenter/SSBS6K_2.1.0.3/supported_system_config/hardware_reqs.html

Pay particular attention to the amount of CPU, RAM, and disk space.
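A quick way to gather those numbers on the node for comparison against the linked requirements page (the exact thresholds are on that page, so check there rather than here):

```shell
# Report CPU count, memory, and disk space on the node
nproc            # number of CPU cores
free -h          # total and available RAM
df -h /var/lib   # disk space on the filesystem holding Docker/kubelet data
```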