openshift_control_plane : Report control plane errors

I am trying to install OpenShift Origin using Ansible. Running deploy_cluster.yml fails with the following error:

TASK [openshift_control_plane : Report control plane errors] ***********************************************************************************************************
fatal: [masterserver.srv.com]: FAILED! => {"changed": false, "msg": "Control plane pods didn't come up"}

NO MORE HOSTS LEFT *****************************************************************************************************************************************************

PLAY RECAP *************************************************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0    skipped=5    rescued=0    ignored=0
masterserver.srv.com       : ok=295  changed=44   unreachable=0    failed=1    skipped=233  rescued=0    ignored=4
nodeserver.srv.com         : ok=103  changed=16   unreachable=0    failed=0    skipped=88   rescued=0    ignored=0


INSTALLER STATUS *******************************************************************************************************************************************************
Initialization              : Complete (0:02:49)
Health Check                : Complete (0:00:36)
Node Bootstrap Preparation  : Complete (0:09:55)
etcd Install                : Complete (0:02:05)
Master Install              : In Progress (0:42:42)
        This phase can be restarted by running: playbooks/openshift-master/config.yml


Failure summary:


  1. Hosts:    masterserver.srv.com
     Play:     Configure masters
     Task:     Report control plane errors
     Message:  Control plane pods didn't come up

Description of my environment:

Steps I performed:

  1. ansible-playbook openshift-ansible/playbooks/prerequisites.yml (successful)
  2. ansible-playbook openshift-ansible/playbooks/deploy_cluster.yml

Additional information:

[root@masterserver ~]# cat /etc/ansible/hosts
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=origin
ansible_become=true
openshift_deployment_type=origin
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
openshift_master_default_subdomain=apps-masterserver.srv.com
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability
openshift_master_api_port=8443
openshift_master_console_port=8443
osm_etcd_image=registry.access.redhat.com/rhel7/etcd:3.2.22

[masters]
masterserver.srv.com

[etcd]
masterserver.srv.com

[nodes]
masterserver.srv.com openshift_node_group_name='node-config-master-infra'
nodeserver.srv.com openshift_node_group_name='node-config-compute'

[root@masterserver ~]# hostname
masterserver.srv.com

[root@masterserver ~]# oc get nodes
The connection to the server masterserver.srv.com:8443 was refused - did you specify the right host or port?

[root@masterserver ~]# netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8444            0.0.0.0:*               LISTEN      1700/openshift
tcp        0      0 127.0.0.1:44642         0.0.0.0:*               LISTEN      1407/hyperkube
tcp        0      0 192.168.43.50:2379      0.0.0.0:*               LISTEN      1647/etcd
tcp        0      0 192.168.43.50:2380      0.0.0.0:*               LISTEN      1647/etcd
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd
tcp        0      0 172.17.0.1:53           0.0.0.0:*               LISTEN      1024/dnsmasq
tcp        0      0 192.168.43.50:53        0.0.0.0:*               LISTEN      1024/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1029/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1166/master
tcp6       0      0 :::10250                :::*                    LISTEN      1407/hyperkube
tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd
tcp6       0      0 fe80::a00:27ff:fee8::53 :::*                    LISTEN      1024/dnsmasq
tcp6       0      0 :::22                   :::*                    LISTEN      1029/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      1166/master
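Note that the listing above has a listener on 8444 but none on 8443, the port set as openshift_master_api_port in the inventory, which is consistent with the refused `oc get nodes` connection. A minimal sketch for checking this from `netstat -ntlp` style output follows; the function name `port_listening` is hypothetical and not part of any OpenShift tooling.

```shell
# Hypothetical helper: reads `netstat -ntlp` style output on stdin and
# prints "yes" if any socket in LISTEN state is bound to the given port,
# otherwise "no".
port_listening() {
  awk -v p="$1" '
    $6 == "LISTEN" {
      # The local address is host:port (IPv6 hosts contain colons too),
      # so the port is always the last colon-separated field.
      n = split($4, parts, ":")
      if (parts[n] == p) found = 1
    }
    END { print (found ? "yes" : "no") }
  '
}
```

On the master it could be used as `netstat -ntlp | port_listening 8443`. A "no" only says the API server is not listening; it does not say why.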

[root@masterserver ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.43.51   nodeserver.srv.com
192.168.43.50   masterserver.srv.com

The control plane pods did not come up, so the installation is stuck. This is one of the errors shown while Ansible was running:

[WARNING]: Module invocation had junk after the JSON data: Error in atexit._run_exitfuncs: Traceback (most recent call last):   File "/usr/lib64/python2.7/atexit.py",
line 24, in _run_exitfuncs     func(*targs, **kargs)   File "/tmp/ansible_oc_obj_payload_h6RqDy/ansible_oc_obj_payload.zip/ansible/modules/oc_obj.py", line 1257, in
cleanup AttributeError: 'NoneType' object has no attribute 'path' Error in sys.exitfunc: Traceback (most recent call last):   File "/usr/lib64/python2.7/atexit.py",
line 24, in _run_exitfuncs     func(*targs, **kargs)   File "/tmp/ansible_oc_obj_payload_h6RqDy/ansible_oc_obj_payload.zip/ansible/modules/oc_obj.py", line 1257, in
cleanup AttributeError: 'NoneType' object has no attribute 'path'

Can anyone help me solve this? Thanks.

Solved! I moved my environment to higher specs. I saw some log entries in /var/log/messages about the resources I was using before, 1 vCPU and 2 GB RAM (Master + Infra 1, Compute 1): Recording NodeHasSufficientResources.

I now use 2 vCPUs and 8 GB RAM (Master + Infra 1, Compute 1) and it runs fine!
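Since the inventory disables the memory_availability check, an undersized host is not caught before the install starts. A rough pre-flight sketch is shown below. The function name `meets_spec` is hypothetical, and its default thresholds (2 vCPUs, 8 GB) are just the values that worked here, not official minimums.

```shell
# Hypothetical helper: compare a CPU count and a memory size (in kB, as in
# /proc/meminfo) against minimum values. Prints "ok" or "insufficient".
# Usage: meets_spec CPUS MEM_KB [MIN_CPUS] [MIN_MEM_GB]
# The body runs in a subshell so its variables do not leak to the caller.
meets_spec() (
  cpus=$1
  mem_kb=$2
  min_cpus=${3:-2}      # assumption: value taken from this post
  min_mem_gb=${4:-8}    # assumption: value taken from this post
  # MemTotal is a little below the nominal RAM size, so allow 10% slack.
  min_mem_kb=$(( min_mem_gb * 1024 * 1024 * 9 / 10 ))
  if [ "$cpus" -ge "$min_cpus" ] && [ "$mem_kb" -ge "$min_mem_kb" ]; then
    echo ok
  else
    echo insufficient
  fi
)
```

On each host this could be run as `meets_spec "$(nproc)" "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"` before starting deploy_cluster.yml.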