Jenkins Agent JNLP connectivity issues "failed to synchronize IO streams on the channel" and "Protocol stack cannot write data anymore"

Does anyone know how to deal with connectivity problems on linux-to-linux Jenkins agents connected over JNLP?

I see these roughly every 20 days. I'm using CentOS 7 agent hosts, Jenkins 2.223 and remoting 4.2.

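Our jobs run many of their steps in ephemeral docker containers via the docker plugin, and we currently use devicemapper as our docker storage driver. Docker does seem to be under some load when this happens, but I don't have detailed stats yet to back up that theory.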

docker.image.inside {
...
}

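When this happens, the master reports the agent host as offline. Both are still in the same AWS VPC with the same security group, so connectivity between them should still be possible (I didn't check it last time because we had other fires to put out). Also, the agent java process is still running.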
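
One way to test the assumption that the agent can still reach the master is to try opening a TCP connection to the JNLP port from the agent host. The following is a minimal sketch in Python; the function name `can_connect` is hypothetical, and the host name and port 50000 are the ones that appear in the agent log. A successful connect only shows that the port is reachable, not that the remoting channel itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Run on the agent host, `can_connect("jenkins.edgewise.devops", 50000)` returning `False` while the agent is reported offline would point at the network path rather than at the agent process.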

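I've heard this may be related to plugins and complex pipeline code. When I try to match this up with the Jenkins master's logs, I don't see much in the master's messages, and I don't see a pattern either.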

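I'm also wondering whether switching to the ssh agent plugin would make the problem less severe. I may try enabling more logging on the host in the hope of capturing more detail.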

If you've seen this or have any suggestions, please let me know how you dealt with it.

01:51:49 agent.host java: INFO: Failed to synchronize IO streams on the channel hudson.remoting.Channel@...:JNLP4-connect connection to jenkins.edgewise.devops/#.#.#.#:50000
01:51:49 agent.host java: java.lang.InterruptedException
01:51:49 agent.host java: at java.lang.Object.wait(Native Method)
01:51:49 agent.host java: at hudson.remoting.Request.call(Request.java:177)
01:51:49 agent.host java: at hudson.remoting.Channel.call(Channel.java:997)
01:51:49 agent.host java: at hudson.remoting.Channel.syncIO(Channel.java:1730)
01:51:49 agent.host java: at hudson.Launcher$RemoteLaunchCallable.join(Launcher.java:1328)
01:51:49 agent.host java: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
01:51:49 agent.host java: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
01:51:49 agent.host java: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
01:51:49 agent.host java: at java.lang.reflect.Method.invoke(Method.java:498)
01:51:49 agent.host java: at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:931)
01:51:49 agent.host java: at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:905)
01:51:49 agent.host java: at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:857)
01:51:49 agent.host java: at hudson.remoting.UserRequest.perform(UserRequest.java:211)
01:51:49 agent.host java: at hudson.remoting.UserRequest.perform(UserRequest.java:54)
01:51:49 agent.host java: at hudson.remoting.Request.run(Request.java:369)
01:51:49 agent.host java: at hudson.remoting.InterceptingExecutorService.call(InterceptingExecutorService.java:72)
01:51:49 agent.host java: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
01:51:49 agent.host java: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
01:51:49 agent.host java: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
01:51:49 agent.host java: at hudson.remoting.Engine.lambda$newThread$0(Engine.java:117)
01:51:49 agent.host java: at java.lang.Thread.run(Thread.java:748)
02:01:01 agent.host systemd: Created slice User Slice of root.
02:01:01 agent.host systemd: Started Session 784 of user root.

02:15:05 agent.host dhclient[1092]: DHCPREQUEST on eth0 to #.#.#.# port 67 (xid=0x6...)
02:15:05 agent.host dhclient[1092]: DHCPACK from #.#.#.# (xid=0x6...)
02:15:07 agent.host dhclient[1092]: bound to #.#.#.# -- renewal in 1587 seconds.

02:32:00 agent.host java: Oct 23, 2020 2:32:00 AM hudson.remoting.Request run
02:32:00 agent.host java: INFO: Failed to send back a reply to the request hudson.remoting.Request@...: hudson.remoting.ChannelClosedException: Channel "unknown": Protocol stack cannot write data anymore. It is not open for write

Given that:

  • docker was not using the recommended, efficient storage driver (doc)

devicemapper is supported, but requires direct-lvm for production environments, because loopback-lvm, while zero-configuration, has very poor performance. devicemapper was the recommended storage driver for CentOS and RHEL, as their kernel version did not support overlay2. However, current versions of CentOS and RHEL now have support for overlay2, which is now the recommended driver.

  • Jenkins Docker plugin issue #640 describes how an InterruptedException can occur when docker is under high load

Switching the storage driver to overlay2 on the agent host appears to be the right action to reduce the risk of this issue.
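
To confirm which driver an agent host is actually using before and after the change, something like the following Python sketch could help. It assumes the `docker` CLI is on the PATH and relies on the `Driver` field of `docker info` and the `storage-driver` key of `daemon.json`; the function names are hypothetical.

```python
import json
import subprocess


def storage_driver_from_info(info_json: str) -> str:
    """Extract the storage driver name from `docker info --format '{{json .}}'` output."""
    return json.loads(info_json)["Driver"]


def current_storage_driver() -> str:
    """Ask the local docker daemon which storage driver it is using."""
    out = subprocess.run(
        ["docker", "info", "--format", "{{json .}}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return storage_driver_from_info(out)


def with_overlay2(daemon_config: dict) -> dict:
    """Return a copy of a daemon.json mapping with the storage driver set to overlay2."""
    updated = dict(daemon_config)  # leave the caller's mapping untouched
    updated["storage-driver"] = "overlay2"
    return updated
```

The result of `with_overlay2` would typically be written to `/etc/docker/daemon.json`, followed by a docker restart. Note that images and containers created under devicemapper are not visible to the new driver, so this is best done on a drained agent.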