ActiveMQ Classic to ActiveMQ Artemis failover does not work

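I'm trying to migrate from ActiveMQ "Classic" to ActiveMQ Artemis.

We have a cluster of 2 active nodes that we are trying to migrate without impacting the consumers and producers. To do that, we stop the first node, migrate it, and start it, then do the same on the second node once the first is back up.

We observe that the consumers/producers are not able to reconnect: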

o.a.a.t.f.FailoverTransport | | Failed to connect to [tcp://172.17.233.92:63616?soTimeout=30000&soWriteTimeout=30000&keepAlive=true, tcp://172.17.233.93:63616?soTimeout=30000&soWriteTimeout=30000&keepAlive=true] after: 30 attempt(s) continuing to retry.

The consumers/producers are able to connect once we restart them. Is this normal behavior?

Here is the ActiveMQ Artemis broker configuration:

  <connectors>
       <connector name="netty-connector">tcp://172.17.233.92:63616</connector>
       <connector name="server_0">tcp://172.17.233.93:63616</connector>
  </connectors>
  <acceptors>
       <acceptor name="netty-acceptor">tcp://172.17.233.92:63616?protocols=OPENWIRE</acceptor>
       <acceptor name="invm">vm://0</acceptor>
  </acceptors>
  <cluster-connections>
     <cluster-connection name="cluster">
        <connector-ref>netty-connector</connector-ref>
        <retry-interval>500</retry-interval>
        <use-duplicate-detection>true</use-duplicate-detection>
        <message-load-balancing>ON_DEMAND</message-load-balancing>
        <max-hops>1</max-hops>
        <static-connectors>
           <connector-ref>server_0</connector-ref>
        </static-connectors>
     </cluster-connection>
  </cluster-connections>

Here is the ActiveMQ "Classic" configuration:

     <!-- Transport protocol -->
    <transportConnectors>
        <transportConnector name="openwire"
                            uri="nio://172.17.233.92:63616?transport.soTimeout=15000&amp;transport.threadName&amp;keepAlive=true&amp;transport.soWriteTimeout=15000&amp;wireFormat.maxInactivityDuration=0"
                            enableStatusMonitor="true" rebalanceClusterClients="true" updateClusterClients="true" updateClusterClientsOnRemove="true" />
    </transportConnectors>

    <!-- Network of brokers setup -->
    <networkConnectors>
        <!-- we need conduit subscriptions for topics , but not for queue -->
        <networkConnector name="NC_topic" duplex="false" conduitSubscriptions="true" networkTTL="1" uri="static:(tcp://172.17.233.92:63616,tcp://172.17.233.93:63616)" decreaseNetworkConsumerPriority="true" suppressDuplicateTopicSubscriptions="true" dynamicOnly="true">
            <excludedDestinations>
                <queue physicalName=">" />
            </excludedDestinations>
        </networkConnector>
        <!-- we need conduit subscriptions for topics , but not for queue -->
        <networkConnector name="NC_queue" duplex="false" conduitSubscriptions="false" networkTTL="1" uri="static:(tcp://172.17.233.92:63616,tcp://172.17.233.93:63616)" decreaseNetworkConsumerPriority="true" suppressDuplicateQueueSubscriptions="true" dynamicOnly="true">
            <excludedDestinations>
                <topic physicalName=">" />
            </excludedDestinations>
        </networkConnector>
    </networkConnectors>

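This issue is most likely caused by updateClusterClientsOnRemove: when it is true, the broker updates its clients whenever a broker is removed from the network (see the broker-side options for failover).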

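When the first node is stopped, the clients remove it from their list of brokers and never add it back, because the second node, still running ActiveMQ Classic, is unable to connect to the first node once it runs ActiveMQ Artemis.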
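As an illustration of the setting involved, here is a minimal sketch of the Classic transport connector from the question with the cluster-client update flags switched off. This is an assumption about what could be tried during a rolling migration, not something we tested: the attribute names are the ones already used in the configuration above, and only their values are changed.

```xml
<!-- Sketch only (untested assumption): same connector as in the question,
     but the broker no longer pushes broker-list updates to its clients,
     so a stopped node should stay in the clients' failover list. -->
<transportConnectors>
    <transportConnector name="openwire"
                        uri="nio://172.17.233.92:63616?transport.soTimeout=15000&amp;transport.threadName&amp;keepAlive=true&amp;transport.soWriteTimeout=15000&amp;wireFormat.maxInactivityDuration=0"
                        enableStatusMonitor="true"
                        rebalanceClusterClients="false"
                        updateClusterClients="false"
                        updateClusterClientsOnRemove="false" />
</transportConnectors>
```

On the client side, the failover transport also has an updateURIsSupported option that controls whether the client accepts broker-list updates at all, for example failover:(tcp://172.17.233.92:63616,tcp://172.17.233.93:63616)?updateURIsSupported=false. Whether either change is enough to keep clients reconnecting across a Classic-to-Artemis switch was not verified here.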

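In the end, we decided to stop both nodes before upgrading and restarting them. From the consumers'/producers' point of view this means an outage, but all subscriptions were re-established correctly after the restart.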