Running OpenShift OKD 3.10 with Vagrant and Ansible - Connection refused

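For a few days now I have been trying to get OpenShift running on a single VM that is set up with Vagrant and installed with Ansible. I went through a lot of GitHub issues on openshift-ansible, but in the end still had no luck. This is how far I got. I started with my Vagrantfile and used centos/7 as the box. They seem to have switched the filesystem away from xfs, and that caused the first error I ran into, because docker would not work. So I looked at the changelog of the CentOS Vagrant box and downgraded to v1804.02. This is the Vagrantfile I have now: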

Vagrantfile

$lab_openshift = <<SCRIPT
yum -y update
yum install -y epel-release git docker httpd-tools java-1.8.0-openjdk-headless
yum install -y ansible python-passlib
systemctl start docker
systemctl enable docker
git clone -b release-3.10 https://github.com/openshift/openshift-ansible /root/openshift-ansible
ssh-keygen -f /root/.ssh/id_rsa -N ''
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
ssh-keyscan 172.24.0.11 >> .ssh/known_hosts
cp .ssh/known_hosts /root/.ssh/known_hosts
ssh-copy-id -f -i /root/.ssh/id_rsa root@172.24.0.11
cp /home/vagrant/etc.ansible.hosts /etc/ansible/hosts
cp /home/vagrant/etc.selinux.config /etc/selinux/config

reboot
SCRIPT

Vagrant.configure(2) do |config|
 config.vm.define "openshift" do |conf|
    # conf.vm.box = "peru/my_centos-7-x86_64"
    # conf.vm.box_version = "20181211.01"
    conf.vm.box = "centos/7"
    conf.vm.box_version = "1804.02"
    conf.vm.hostname = 'openshift.example.com'
    conf.vm.network "private_network", ip: "172.24.0.11"
    conf.vm.provision "file", source: "./etc.ansible.hosts", destination: "~/etc.ansible.hosts"
    conf.vm.provision "file", source: "./etc.selinux.config", destination: "~/etc.selinux.config"
    conf.vm.provider "virtualbox" do |v|
        v.memory = 6144
        v.cpus = 2
    end
    conf.vm.provision "shell", inline: $lab_openshift
 end
end

Since you run into errors with SELinux set to either enforcing or disabled, this is the SELinux config:

SELinux config

SELINUX=permissive
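
To double-check which mode a config file will apply after the reboot, the SELINUX= line can be read back. This is only a minimal sketch: the helper name selinux_mode_from_config is made up for illustration, and it assumes the file follows the usual /etc/selinux/config layout.

```shell
# Hypothetical helper: print the value of the SELINUX= line in a config file.
# Comment lines and the SELINUXTYPE= line do not match the pattern.
selinux_mode_from_config() {
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$1" | tail -n 1
}

# Example: inspect the file that the Vagrantfile copies into the VM, if present.
if [ -f ./etc.selinux.config ]; then
    selinux_mode_from_config ./etc.selinux.config
fi
```

On the running VM, getenforce reports the mode that is actually active.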

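Next I started with the hosts.localhost inventory that ships with openshift-ansible. The first error I got came from the docker_image_availability check. On GitHub someone said you should disable it, so that was the first change. But you still run into problems during the install (error message: control plane pods didn't come up). So the next change, based on some GitHub issues, was to set osm_etcd_image=registry.access.redhat.com/rhel7/etcd, which leaves me with my current ansible hosts file:

hosts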

#bare minimum hostfile

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]

osm_etcd_image=registry.access.redhat.com/rhel7/etcd
openshift_deployment_type=origin
openshift_release=v3.10
openshift_portal_net=172.30.0.0/16
openshift_disable_check=disk_availability,memory_availability,docker_image_availability

openshift_node_groups=[{'name': 'node-config-all-in-one', 'labels': ['node-role.kubernetes.io/master=true', 'node-role.kubernetes.io/infra=true', 'node-role.kubernetes.io/compute=true']}]


[masters]
172.24.0.11 ansible_connection=local

[etcd]
172.24.0.11 ansible_connection=local

[nodes]
# openshift_node_group_name should refer to a dictionary with matching key of name in list openshift_node_groups.
172.24.0.11 ansible_connection=local openshift_node_group_name="node-config-all-in-one"

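Unfortunately I still can't get the cluster up and running, because the installer retries this command over and over and fails:

/bin/oc get pod master-etcd-openshift.example.com -o json -n kube-system"

The connection seems to be refused, which leaves me with this error message: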

The connection to the server openshift.example.com:8443 was refused - did you specify the right host or port?\n"

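So far I have only found some open GitHub issues about this error, and I'm finally stuck. Maybe someone knows what I'm doing wrong.

[edit]

Oh, and my /etc/hosts is extended with 172.24.0.11 openshift.example.com, and pinging both 172.24.0.11 and openshift.example.com succeeds.

One more thing worth mentioning: docker container list -a also shows me a container that keeps trying to restart, without success: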

ab4651c81600 96f98d080ffd "/bin/bash -c '#!/..." About a minute ago Exited (255) 35 seconds ago k8s_api_master-api-openshift.example.com_kube-system_fabe879b27fee405485858817f14f32f_9

So this is pretty much what is described in this issue: https://github.com/openshift/openshift-ansible/issues/9894 but I can't figure out what is wrong with my config.

[/edit]

[edit2]

The log of the k8s_api_master container doesn't really help me either:

I1224 11:46:42.874204       1 plugins.go:84] Registered admission plugin "NamespaceLifecycle"
I1224 11:46:42.874390       1 plugins.go:84] Registered admission plugin "Initializers"
I1224 11:46:42.874400       1 plugins.go:84] Registered admission plugin "ValidatingAdmissionWebhook"
I1224 11:46:42.874408       1 plugins.go:84] Registered admission plugin "MutatingAdmissionWebhook"
I1224 11:46:42.874420       1 plugins.go:84] Registered admission plugin "AlwaysAdmit"
I1224 11:46:42.874425       1 plugins.go:84] Registered admission plugin "AlwaysPullImages"
I1224 11:46:42.874432       1 plugins.go:84] Registered admission plugin "LimitPodHardAntiAffinityTopology"
I1224 11:46:42.874440       1 plugins.go:84] Registered admission plugin "DefaultTolerationSeconds"
I1224 11:46:42.874446       1 plugins.go:84] Registered admission plugin "AlwaysDeny"
I1224 11:46:42.874458       1 plugins.go:84] Registered admission plugin "EventRateLimit"
I1224 11:46:42.874465       1 plugins.go:84] Registered admission plugin "DenyEscalatingExec"
I1224 11:46:42.874470       1 plugins.go:84] Registered admission plugin "DenyExecOnPrivileged"
I1224 11:46:42.874477       1 plugins.go:84] Registered admission plugin "ExtendedResourceToleration"
I1224 11:46:42.874483       1 plugins.go:84] Registered admission plugin "OwnerReferencesPermissionEnforcement"
I1224 11:46:42.874495       1 plugins.go:84] Registered admission plugin "ImagePolicyWebhook"
I1224 11:46:42.874503       1 plugins.go:84] Registered admission plugin "InitialResources"
I1224 11:46:42.874509       1 plugins.go:84] Registered admission plugin "LimitRanger"
I1224 11:46:42.874517       1 plugins.go:84] Registered admission plugin "NamespaceAutoProvision"
I1224 11:46:42.874524       1 plugins.go:84] Registered admission plugin "NamespaceExists"
I1224 11:46:42.874530       1 plugins.go:84] Registered admission plugin "NodeRestriction"
I1224 11:46:42.874538       1 plugins.go:84] Registered admission plugin "PersistentVolumeLabel"
I1224 11:46:42.874544       1 plugins.go:84] Registered admission plugin "PodNodeSelector"
I1224 11:46:42.874552       1 plugins.go:84] Registered admission plugin "PodPreset"
I1224 11:46:42.874559       1 plugins.go:84] Registered admission plugin "PodTolerationRestriction"
I1224 11:46:42.874566       1 plugins.go:84] Registered admission plugin "ResourceQuota"
I1224 11:46:42.874573       1 plugins.go:84] Registered admission plugin "PodSecurityPolicy"
I1224 11:46:42.874579       1 plugins.go:84] Registered admission plugin "Priority"
I1224 11:46:42.874590       1 plugins.go:84] Registered admission plugin "SecurityContextDeny"
I1224 11:46:42.874598       1 plugins.go:84] Registered admission plugin "ServiceAccount"
I1224 11:46:42.874604       1 plugins.go:84] Registered admission plugin "DefaultStorageClass"
I1224 11:46:42.874611       1 plugins.go:84] Registered admission plugin "PersistentVolumeClaimResize"
I1224 11:46:42.874619       1 plugins.go:84] Registered admission plugin "StorageObjectInUseProtection"
F1224 11:47:12.886869       1 start_api.go:68] dial tcp 127.0.0.1:2379: connect: connection refused

[/edit2]
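
The log above is glog-formatted, so the leading letter of each entry is its severity; the only fatal (F) entry is the last one, where the API server cannot reach etcd on 127.0.0.1:2379. A small filter makes that line easier to spot in a long log. This is a sketch only, and the helper name fatal_lines is hypothetical.

```shell
# Hypothetical helper: keep only fatal (F-prefixed) glog entries.
# Reads the named files, or stdin when called without arguments.
fatal_lines() {
    grep -E '^F[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.' "$@"
}

# Usage on the VM (container ID taken from the listing above):
#   docker logs ab4651c81600 2>&1 | fatal_lines
```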

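OK, after a lot of effort I finally found the problem. Basically, localhost has to point to the IP you defined in the Vagrantfile; otherwise it won't work. This is controlled in /etc/hosts. I'm also using Hawkular metrics. If you don't want those, you don't need to install java-1.8.0-openjdk-headless either.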

Vagrantfile

$lab_openshift = <<SCRIPT
yum -y update

yum install -y epel-release
echo "==================Installing PYTHON=================="
yum install -y python-pip python-devel python python-passlib

echo "==================Installing GIT=================="
yum install -y git

echo "==================Installing ANSIBLE=================="
yum install -y ansible

echo "==================Installing java-1.8.0-openjdk-headless================="
yum install -y java-1.8.0-openjdk-headless

cp /home/vagrant/etc.ansible.hosts /etc/ansible/hosts
cp /home/vagrant/etc.selinux.config /etc/selinux/config
cp /home/vagrant/etc.hosts /etc/hosts

git clone -b release-3.10 https://github.com/openshift/openshift-ansible /root/openshift-ansible


reboot
SCRIPT


Vagrant.configure(2) do |config|
 config.vm.define "openshift" do |conf|
    conf.vm.box = "centos/7"
    conf.vm.box_version = "1804.02"
    conf.vm.hostname = 'openshift.example.com'
    conf.vm.network "private_network", ip: "172.24.0.11"
    conf.vm.provision "file", source: "./etc.ansible.hosts", destination: "~/etc.ansible.hosts"
    conf.vm.provision "file", source: "./etc.selinux.config", destination: "~/etc.selinux.config"
    conf.vm.provision "file", source: "./etc.hosts", destination: "~/etc.hosts"
    conf.vm.provider "virtualbox" do |v|
        v.memory = 6144
        v.cpus = 2
    end
    conf.vm.provision "shell", inline: $lab_openshift
 end
end

SELinux config (etc.selinux.config)

SELINUX=permissive

/etc/ansible/hosts (etc.ansible.hosts)

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]

openshift_ip=172.24.0.11

openshift_deployment_type=origin

ansible_service_broker_install=false
openshift_master_cluster_hostname=172.24.0.11
openshift_master_cluster_public_hostname=openshift.example.com
openshift_hostname=172.24.0.11
openshift_public_hostname=openshift.example.com
openshift_metrics_install_metrics=true
openshift_metrics_image_version=v3.10
openshift_master_default_subdomain=openshift.example.com

openshift_disable_check=disk_availability,memory_availability,docker_image_availability

openshift_node_groups=[{'name': 'node-config-all-in-one', 'labels': ['node-role.kubernetes.io/master=true', 'node-role.kubernetes.io/infra=true', 'node-role.kubernetes.io/compute=true']}]


[masters]
172.24.0.11 ansible_connection=local

[etcd]
172.24.0.11 ansible_connection=local

[nodes]
172.24.0.11 ansible_connection=local openshift_node_group_name="node-config-all-in-one"
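
The post does not show how the playbooks were started. As a sketch, assuming the usual openshift-ansible 3.10 order (prerequisites.yml first, then deploy_cluster.yml) and the paths from the provisioning script above, the commands would look roughly like this. The function name install_commands is made up, and it only prints the commands so they can be reviewed first.

```shell
# Paths taken from the provisioning script above.
INVENTORY=/etc/ansible/hosts
PLAYBOOK_DIR=/root/openshift-ansible/playbooks

# Hypothetical helper: print the two install commands in order.
# Remove the "echo" to actually run them (as root, inside the VM).
install_commands() {
    for playbook in prerequisites.yml deploy_cluster.yml; do
        echo ansible-playbook -i "$INVENTORY" "$PLAYBOOK_DIR/$playbook"
    done
}

install_commands
```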

/etc/hosts (etc.hosts)

172.24.0.11    localhost   openshift.example.com   openshift
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
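
To confirm that a hosts file really maps localhost to the VM's IP before copying it into place, a lookup like the following can help. It is a minimal sketch: the helper name hosts_file_ip is hypothetical, and it only reads the file given to it rather than asking the system resolver.

```shell
# Hypothetical helper: print the first IP that a hosts-format file maps NAME to.
# Usage: hosts_file_ip NAME FILE
hosts_file_ip() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # drop comments, including trailing ones
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$2"
}

# Example: inspect the file that the Vagrantfile copies into the VM, if present.
if [ -f ./etc.hosts ]; then
    hosts_file_ip localhost ./etc.hosts
fi
```

Inside the VM, getent hosts localhost shows what the resolver itself returns.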