Ansible playbook [setup] gather facts - SSH UNREACHABLE Connection timed out during banner

Question

I'm on a Mac.

$ which ansible
/Library/Frameworks/Python.framework/Versions/3.5/bin/ansible

Or, I guess, ansible can live in a more generic location such as /usr/bin/ansible (for example, on CentOS/Ubuntu).

$ ansible --version
ansible 2.2.0.0

Running the following playbook works fine from my other Vagrant / Ubuntu box.

The playbook file looks like:

- hosts: all
  become: true
  gather_facts: true

  roles:
    - a_role_which_just_say_hello_world_debug_msg

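From my local machine, I can successfully ssh to the target servers listed below (without any password, as I have already added the .pem key file using ssh-add). These are the servers where the playbook run fails in the [setup] (gather facts) step.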

On the Mac, I get this error intermittently, not on every run: Failed to connect to the host via ssh: Connection timed out during banner exchange.

$ ansible-playbook -i inventory -l tag_cluster_mycluster myplabook.yml

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [myclusterSomeServer01_i_07f318688f6339971]
fatal: [myclusterSomeServer02_i_03df6f1f988e665d9]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Connection timed out during banner exchange\r\n", "unreachable": true}

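OK, I tried it a couple of times and saw the same behavior: out of the 15 servers in my mycluster cluster, the [setup] gather facts step fails on some of them, and on the next run it works fine.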

Retrying:

$ ansible-playbook -i inventory -l tag_cluster_mycluster myplabook.yml

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [myclusterSomeServer01_i_07f318688f6339971]
ok: [myclusterSomeServer02_i_03df6f1f988e665d9]
ok: [myclusterSomeServer03_i_057dfr56u88e665d9]
...
.....more...this time it worked for all servers.

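As you can see above, this time the step worked fine. The same issue (SSH connection timeout) also happens during some tasks/actions, for example when I'm installing something with the Ansible yum module. If I try again, it works for the server that failed last time, but another server that succeeded last time may fail. So the behavior is random.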

My /etc/ansible/ansible.cfg file has:

[ssh_connection]
scp_if_ssh = True

Answer 1

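Adding the following timeout setting to the /etc/ansible/ansible.cfg config file fixed it once I increased the value to 25. With 10 or 15, I still saw the "Connection timed out during banner exchange" error on some servers.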

[defaults]
timeout = 25

[ssh_connection]
scp_if_ssh = True

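In addition to the above, I had to use serial: N or serial: N% (where N is a number) so that the playbook runs on N servers, or on that percentage of the servers, at a time. After that it worked fine.

i.e.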

- hosts: all
  become: true
  gather_facts: true
  serial: 2
  #serial: "10%"
  #serial: "{{ serialNumber }}"
  #serial: "{{ serialNumber }}%"

  vars:
    serialNumber: 5

  roles:
    - a_role_which_just_say_hello_world_debug_msg
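
A related sketch, not part of the fix described above and not tested against this cluster: ansible.cfg also accepts forks under [defaults], which caps how many hosts Ansible talks to in parallel, and retries under [ssh_connection], which asks the ssh connection plugin to retry a failed connection attempt. The values for forks and retries below are illustrative assumptions only; timeout and scp_if_ssh are the settings already shown above.

```ini
[defaults]
# From the answer above: seconds to wait for the SSH connection.
timeout = 25
# Assumption: limit parallel connections, mirroring serial: 2 in the play.
# Ansible's default is 5.
forks = 2

[ssh_connection]
scp_if_ssh = True
# Assumption: retry a failed SSH connection attempt up to 3 times.
retries = 3
```

Whether retries would mask the banner timeout here is a guess. The combination that was actually confirmed to work is timeout = 25 together with serial in the playbook.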

ssh

amazon-ec2

connection-timeout