Flink 1.2 does not start in HA Cluster mode

I have installed Flink 1.2 locally in HA cluster mode (2 JobManagers, 1 TaskManager), and it keeps refusing to actually start in that mode: it prints "Starting cluster." instead of "Starting HA cluster with 2 masters and 1 peers in ZooKeeper quorum."

Apparently, bin/config.sh reads the configuration as follows:

# High availability
if [ -z "${HIGH_AVAILABILITY}" ]; then
     HIGH_AVAILABILITY=$(readFromConfig ${KEY_HIGH_AVAILABILITY} "" "${YAML_CONF}")
     if [ -z "${HIGH_AVAILABILITY}" ]; then
        # Try deprecated value
        DEPRECATED_HA=$(readFromConfig "recovery.mode" "" "${YAML_CONF}")
        if [ -z "${DEPRECATED_HA}" ]; then
            HIGH_AVAILABILITY="none"
        elif [ ${DEPRECATED_HA} == "standalone" ]; then
            # Standalone is now 'none'
            HIGH_AVAILABILITY="none"
        else
            HIGH_AVAILABILITY=${DEPRECATED_HA}
        fi
     else
         HIGH_AVAILABILITY="none"
     fi
fi

This means that regardless of what is configured for the "high-availability" key in the config file (in my case the value is "zookeeper"), it will be set to "none", so in bin/start-cluster.sh

if [[ $HIGH_AVAILABILITY == "zookeeper" ]]; then
    # HA Mode
    readMasters

    echo "Starting HA cluster with ${#MASTERS[@]} masters."

    for ((i=0;i<${#MASTERS[@]};++i)); do
        master=${MASTERS[i]}
        webuiport=${WEBUIPORTS[i]}
        ssh -n $FLINK_SSH_OPTS $master -- "nohup /bin/bash -l \"${FLINK_BIN_DIR}/jobmanager.sh\" start cluster ${master} ${webuiport} &"
    done

else
    echo "Starting cluster."

    # Start single JobManager on this machine
    "$FLINK_BIN_DIR"/jobmanager.sh start cluster
fi

the first if branch is never entered.
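To make the behavior easy to check outside a Flink installation, here is a minimal sketch that mirrors the branch structure of the config.sh excerpt. `readFromConfigStub`, `resolveHaAsShipped`, and the two `CONF_*` variables are hypothetical stand-ins for the real `readFromConfig` helper and the config file, and the outer check on an already-exported `HIGH_AVAILABILITY` is left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for values that readFromConfig would return.
CONF_HIGH_AVAILABILITY="zookeeper"   # pretend: high-availability: zookeeper
CONF_RECOVERY_MODE=""                # pretend: recovery.mode is not set

# Hypothetical stub replacing the real readFromConfig helper.
readFromConfigStub() {
    case "$1" in
        high-availability) echo "${CONF_HIGH_AVAILABILITY}" ;;
        recovery.mode)     echo "${CONF_RECOVERY_MODE}" ;;
    esac
}

# Same if/else structure as the config.sh excerpt.
resolveHaAsShipped() {
    local ha deprecated
    ha=$(readFromConfigStub "high-availability")
    if [ -z "${ha}" ]; then
        # Key not set: fall back to the deprecated recovery.mode key.
        deprecated=$(readFromConfigStub "recovery.mode")
        if [ -z "${deprecated}" ]; then
            ha="none"
        elif [ "${deprecated}" = "standalone" ]; then
            ha="none"
        else
            ha="${deprecated}"
        fi
    else
        # Runs whenever the key IS set, and discards the configured value.
        ha="none"
    fi
    echo "${ha}"
}

RESULT=$(resolveHaAsShipped)
echo "high-availability resolved to: ${RESULT}"
```

With the stubbed value "zookeeper", the script prints `high-availability resolved to: none`, which is why start-cluster.sh falls through to the non-HA branch.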

Has anyone else run into this problem?

Yes, I think this is a bug: issues.apache.org/jira/browse/FLINK-6000.

It already has a pending PR.
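As a rough illustration of what a fix could look like, assuming the stray else branch in the config.sh excerpt from the question is the culprit, the sketch below only applies the "none" default when the key is empty. This is not necessarily what the pending PR does; `resolveHa` is a hypothetical helper that takes the two config values as arguments instead of reading them from the YAML file.

```shell
#!/usr/bin/env bash
# Hypothetical helper:
#   $1 = value of the "high-availability" key (may be empty)
#   $2 = value of the deprecated "recovery.mode" key (may be empty)
resolveHa() {
    local ha="$1" deprecated="$2"
    if [ -z "${ha}" ]; then
        # Key not set: try the deprecated value; "standalone" is now "none".
        if [ -z "${deprecated}" ] || [ "${deprecated}" = "standalone" ]; then
            ha="none"
        else
            ha="${deprecated}"
        fi
    fi
    # No else branch: a value that is set is left untouched.
    echo "${ha}"
}

resolveHa "zookeeper" ""      # prints: zookeeper
resolveHa "" "zookeeper"      # prints: zookeeper
resolveHa "" "standalone"     # prints: none
resolveHa "" ""               # prints: none
```

Until a fixed release is available, one possible workaround (an assumption based on the outer `-z "${HIGH_AVAILABILITY}"` check in the excerpt, not something I have verified) would be to export `HIGH_AVAILABILITY=zookeeper` in the environment before running start-cluster.sh, since the whole block is skipped when that variable is already set.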