使用 KUBE_PING 缩小后,Wildfly 中重复出现警告消息

Warning message repeated in Wildfly after scale-down using KUBE_PING

我正在尝试使用 KUBE_PING JGroups 协议在 Kubernetes 上以 HA Full 模式 运行 Wildfly。一切都很好地启动,我可以扩展集群,节点相互识别没有任何问题。

当我尝试缩小集群时出现问题。 ActiveMQ Artemis 不断抱怨它无法连接到断开连接的节点,即使 JGroups 已经确认旧节点已经离开集群。

我想知道我的 JGroups 配置可能做错了什么。我附上了一些日志消息,以及 KUBE_PING.

的 JGroups 配置

为了确保我提供了尽可能多的信息,我运行正在查看最新的 Wildfly 官方 docker 图片,15.0。1.Final,运行 在 JDK 11.

在此先感谢您的帮助!

编辑: 修正拼写错误

JGroups确认节点断开

wildfly-kube 12:48:36,514 INFO  [org.apache.activemq.artemis.core.server] (Thread-22 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl@10f88645)) AMQ221027: Bridge ClusterConnectionBridge@379d51e3 [name=$.artemis.internal.sf.my-cluster.7ee91868-337b-11e9-9849-ce422226aad5, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.7ee91868-337b-11e9-9849-ce422226aad5, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=314721ae-337b-11e9-9cfa-0e8a9828b1cb], temp=false]@195607a8 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@379d51e3 [name=$.artemis.internal.sf.my-cluster.7ee91868-337b-11e9-9849-ce422226aad5, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.7ee91868-337b-11e9-9849-ce422226aad5, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=314721ae-337b-11e9-9cfa-0e8a9828b1cb], temp=false]@195607a8 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-116-0-4], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1699294977[nodeUUID=314721ae-337b-11e9-9cfa-0e8a9828b1cb, connector=TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-122-0-6, address=jms, server=ActiveMQServerImpl::serverUUID=314721ae-337b-11e9-9cfa-0e8a9828b1cb])) [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-116-0-4], discoveryGroupConfiguration=null]] is connected
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:38,905 WARN  [org.apache.activemq.artemis.core.server] (Thread-5 (ActiveMQ-client-global-threads)) AMQ222095: Connection failed with failedOver=false
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:43,758 ERROR [org.jgroups.protocols.TCP] (TQ-Bundler-7,ejb,wildfly-kube-b6f69fb9-b2hd5) JGRP000034: wildfly-kube-b6f69fb9-b2hd5: failure sending message to wildfly-kube-b6f69fb9-nshvn: java.net.SocketTimeoutException: connect timed out
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,759 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN000094: Received new cluster view for channel ejb: [wildfly-kube-b6f69fb9-b2hd5|2] (1) [wildfly-kube-b6f69fb9-b2hd5]
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,772 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN100001: Node wildfly-kube-b6f69fb9-nshvn left the cluster
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,777 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN000094: Received new cluster view for channel ejb: [wildfly-kube-b6f69fb9-b2hd5|2] (1) [wildfly-kube-b6f69fb9-b2hd5]
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,779 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN100001: Node wildfly-kube-b6f69fb9-nshvn left the cluster
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,787 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN000094: Received new cluster view for channel ejb: [wildfly-kube-b6f69fb9-b2hd5|2] (1) [wildfly-kube-b6f69fb9-b2hd5]
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,788 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN100001: Node wildfly-kube-b6f69fb9-nshvn left the cluster
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,791 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN000094: Received new cluster view for channel ejb: [wildfly-kube-b6f69fb9-b2hd5|2] (1) [wildfly-kube-b6f69fb9-b2hd5]
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 12:48:44,792 INFO  [org.infinispan.CLUSTER] (VERIFY_SUSPECT.TimerThread-13,ejb,wildfly-kube-b6f69fb9-b2hd5) ISPN100001: Node wildfly-kube-b6f69fb9-nshvn left the cluster

重复的 ActiveMQ Artemis 警告(每 3 秒)

wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 13:02:11,825 WARN  [org.apache.activemq.artemis.core.server] (Thread-55 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl@866e807)) AMQ224091: Bridge ClusterConnectionBridge@39836857 [name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=7ee91868-337b-11e9-9849-ce422226aad5], temp=false]@39425add targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@39836857 [name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=7ee91868-337b-11e9-9849-ce422226aad5], temp=false]@39425add targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-122-0-6], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1432944139[nodeUUID=7ee91868-337b-11e9-9849-ce422226aad5, connector=TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-116-0-4, address=jms, server=ActiveMQServerImpl::serverUUID=7ee91868-337b-11e9-9849-ce422226aad5])) [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-122-0-6], discoveryGroupConfiguration=null]] is unable to connect to destination. Retrying
wildfly-kube-b6f69fb9-b2hd5 wildfly-kube 13:02:14,897 WARN  [org.apache.activemq.artemis.core.server] (Thread-68 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl@866e807)) AMQ224091: Bridge ClusterConnectionBridge@39836857 [name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=7ee91868-337b-11e9-9849-ce422226aad5], temp=false]@39425add targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@39836857 [name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.314721ae-337b-11e9-9cfa-0e8a9828b1cb, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=7ee91868-337b-11e9-9849-ce422226aad5], temp=false]@39425add targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-122-0-6], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1432944139[nodeUUID=7ee91868-337b-11e9-9849-ce422226aad5, connector=TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-116-0-4, address=jms, server=ActiveMQServerImpl::serverUUID=7ee91868-337b-11e9-9849-ce422226aad5])) [initialConnectors=[TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=http-acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=8080&host=100-122-0-6], discoveryGroupConfiguration=null]] is unable to connect to destination. Retrying

JGroups 配置

<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
<channels default="ee">
    <channel name="ee" stack="tcp" cluster="ejb"/>
</channels>
<stacks>
    <stack name="tcp">
        <transport type="TCP" socket-binding="jgroups-tcp">
            <property name="logical_addr_cache_expiration">360000</property>
        </transport>
        <protocol type="kubernetes.KUBE_PING">
            <property name="namespace">${KUBERNETES_CLUSTER_NAMESPACE:default}</property>
            <property name="labels">${KUBERNETES_CLUSTER_LABEL:cluster=nyc}</property>
            <property name="port_range">0</property>
        </protocol>
        <protocol type="MERGE3"/>
        <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
        <protocol type="FD_ALL"/>
        <protocol type="VERIFY_SUSPECT"/>
        <protocol type="pbcast.NAKACK2">
            <property name="use_mcast_xmit">false</property>
        </protocol>
        <protocol type="UNICAST3"/>
        <protocol type="pbcast.STABLE"/>
        <protocol type="pbcast.GMS">
            <property name="join_timeout">30000</property>
            <property name="print_local_addr">true</property>
            <property name="print_physical_addrs">true</property>
        </protocol>
        <protocol type="MFC"/>
        <protocol type="FRAG3"/>
    </stack>
</stacks>

ActiveMQ Artemis 配置

<subsystem xmlns="urn:jboss:domain:messaging-activemq:5.0">
<server name="default">
    <cluster user="my_admin" password="my_password"/>
    <security-setting name="#">
        <role name="guest" send="true" consume="true" create-non-durable-queue="true" delete-non-durable-queue="true"/>
    </security-setting>
    <address-setting name="#" dead-letter-address="jms.queue.DLQ" expiry-address="jms.queue.ExpiryQueue" max-size-bytes="10485760" page-size-bytes="2097152" message-counter-history-day-limit="10" redistribution-delay="1000"/>
    <http-connector name="http-connector" socket-binding="http" endpoint="http-acceptor"/>
    <http-connector name="http-connector-throughput" socket-binding="http" endpoint="http-acceptor-throughput">
        <param name="batch-delay" value="50"/>
    </http-connector>
    <in-vm-connector name="in-vm" server-id="0">
        <param name="buffer-pooling" value="false"/>
    </in-vm-connector>
    <http-acceptor name="http-acceptor" http-listener="default"/>
    <http-acceptor name="http-acceptor-throughput" http-listener="default">
        <param name="batch-delay" value="50"/>
        <param name="direct-deliver" value="false"/>
    </http-acceptor>
    <in-vm-acceptor name="in-vm" server-id="0">
        <param name="buffer-pooling" value="false"/>
    </in-vm-acceptor>
    <broadcast-group name="bg-group1" jgroups-cluster="activemq-cluster" connectors="http-connector"/>
    <discovery-group name="dg-group1" jgroups-cluster="activemq-cluster"/>
    <cluster-connection name="my-cluster" address="jms" connector-name="http-connector" discovery-group="dg-group1"/>
    <jms-queue name="ExpiryQueue" entries="java:/jms/queue/ExpiryQueue"/>
    <jms-queue name="DLQ" entries="java:/jms/queue/DLQ"/>
    <connection-factory name="InVmConnectionFactory" entries="java:/ConnectionFactory" connectors="in-vm"/>
    <connection-factory name="RemoteConnectionFactory" entries="java:jboss/exported/jms/RemoteConnectionFactory" connectors="http-connector" ha="true" block-on-acknowledge="true" reconnect-attempts="-1"/>
    <pooled-connection-factory name="activemq-ra" entries="java:/JmsXA java:jboss/DefaultJMSConnectionFactory" connectors="in-vm" transaction="xa"/>
</server>

更新: 我要补充的一件事是,如果容器正常关闭,Artemis 似乎可以正确处理断开连接。在我的 Kubernetes 部署中的容器定义中添加 preStop 命令以在容器终止之前关闭 Wildfly,这有助于从容地将容器从集群中取出。

ActiveMQ Artemis 仅使用 JGroups(或任何其他发现机制)发现 其他代理,目的是将它们聚集在一起。一旦发现另一个 broker,它们就会在它们之间建立 TCP 连接,之后 JGroups 不再发挥任何作用,这意味着 JGroups 看到 broker 离开集群是无关紧要的。

集群桥接失败的事实足以告诉 ActiveMQ Artemis broker 已经离开了集群。此时的问题是代理应该如何响应死节点。默认情况下,它将尝试无限期地重新连接,因为它希望节点在某个时候返回。这在传统用例中是一个合理的期望,但在云中却并非如此。此行为由 cluster-connection 上的 reconnect-attempts 属性 控制。将 reconnect-attempts 设置为您认为合理的值(例如 10),您将看到网桥重新连接放弃并停止记录。