MariaDB Galera cluster 10.2 second node "Failed to open channel"

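I am trying to run mariadb:10.2.14 as a Galera cluster on my local Windows machine using Docker Compose. Running the initial boot node works fine, but the second node fails to join the cluster with this error: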

node_1 | 2018-05-04 3:13:46 140187778701184 [Note] WSREP: view((empty))

node_1 | 2018-05-04 3:13:46 140187778701184 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out) at gcomm/src/pc.cpp:connect():158

node_1 | 2018-05-04 3:13:46 140187778701184 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)

node_1 | 2018-05-04 3:13:46 140187778701184 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'galera' at 'gcomm://boot': -110 (Connection timed out)

node_1 | 2018-05-04 3:13:46 140187778701184 [ERROR] WSREP: gcs connect failed: Connection timed out

node_1 | 2018-05-04 3:13:46 140187778701184 [ERROR] WSREP: wsrep::connect(gcomm://boot) failed: 7

node_1 | 2018-05-04 3:13:46 140187778701184 [ERROR] Aborting

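I verified that the hostname resolves correctly by running ping boot inside the container, so I don't know why it can't connect. I tried to base my configuration on various Docker files I have seen for mariadb:10.1, for example https://gist.github.com/lucidfrontier45/497341c4b848dfbd6dfb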

My docker compose file:

# Docker compose file for running a local MySQL server
version: '2.2'
services:
  boot:
    image: mariadb:10.2.14
    command: mysqld --user=mysql --wsrep_new_cluster
    environment:
      MYSQL_DATABASE: "db"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      # Needed because galera doesn't support MyISAM, which tzinfo uses
      MYSQL_INITDB_SKIP_TZINFO: "yes"
    ports:
      - ${SQL_PORT}:3306
      - 4444:4444
      - 4567:4567
      - 4568:4568
    networks:
      - sql
    volumes:
      - ./kubernetes/mariadb.conf.d:/etc/mysql/mariadb.conf.d
      - /var/lib/mysql
  node:
    image: mariadb:10.2.14
    command: mysqld --user=mysql --wsrep_cluster_address=gcomm://boot
    environment:
      MYSQL_DATABASE: "db"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      # Needed because galera doesn't support MyISAM, which tzinfo uses
      MYSQL_INITDB_SKIP_TZINFO: "yes"
    networks:
      - sql
    volumes:
      - ./kubernetes/mariadb.conf.d:/etc/mysql/mariadb.conf.d
      - /var/lib/mysql
networks:
  sql:

My configuration file in mariadb.conf.d:

# This will be passed to all mysql clients
[client]
default-character-set=utf8mb4

[mysql]
default-character-set=utf8mb4

# The MySQL server
[mysqld]
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
default_storage_engine=innodb
binlog_format=row
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0

# Allow server to accept connections on all interfaces.
bind-address=0.0.0.0

#
# * Galera-related settings
#
# https://mariadb.com/kb/en/mariadb/galera-cluster-system-variables/
#
[galera]
wsrep_on=ON
wsrep_log_conflicts=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
# TODO: is rsync the best option?
wsrep_sst_method=rsync

wsrep_cluster_name=galera
#wsrep_slave_threads=1

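The wsrep_cluster_address parameter should be defined from the moment the Galera cluster is created.

That is why you need to do the following:

1. Add wsrep_cluster_address to the configuration file in mariadb.conf.d, so that every node gets it: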

wsrep_cluster_address="gcomm://boot,node"

2. Remove the --wsrep_cluster_address flag from the mysqld command that starts mysql in the node container, because the address is already set in the configuration:

version: '2.2'
services:
  boot:
    image: mariadb:10.2.14
    command: mysqld --user=mysql --wsrep_new_cluster
    environment:
      MYSQL_DATABASE: "db"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      # Needed because galera doesn't support MyISAM, which tzinfo uses
      MYSQL_INITDB_SKIP_TZINFO: "yes"
    ports:
      - ${SQL_PORT}:3306
      - 4444:4444
      - 4567:4567
      - 4568:4568
    networks:
      - sql
    volumes:
      - ./kubernetes/mariadb.conf.d:/etc/mysql/mariadb.conf.d
      - /var/lib/mysql
  node:
    image: mariadb:10.2.14
    command: mysqld --user=mysql
    environment:
      MYSQL_DATABASE: "db"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      # Needed because galera doesn't support MyISAM, which tzinfo uses
      MYSQL_INITDB_SKIP_TZINFO: "yes"
    networks:
      - sql
    volumes:
      - ./kubernetes/mariadb.conf.d:/etc/mysql/mariadb.conf.d
      - /var/lib/mysql
networks:
  sql:

Thanks @Nickolay, your answer mostly worked, but I ran into another problem. It did put me on the right track to a solution, though.

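So it looks like the main problem is that --wsrep_new_cluster alone is not enough to bootstrap the node; you also need to set the wsrep_cluster_address variable. Setting it with --wsrep_cluster_address=gcomm:// worked for me.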

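In addition, I ran into what appears to be a race condition: the first boot node failed to initialize with an error saying it was not the last node. I worked around this with a short sleep in the second node's command.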

My final docker compose file:

# Docker compose file for running a local mariadb galera cluster
version: '3.6'
services:
  boot:
    image: mariadb:10.2.14
    command: mysqld --user=mysql --wsrep_cluster_address=gcomm://
    environment:
      MYSQL_DATABASE: "db"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      # Needed because galera doesn't support MyISAM, which tzinfo uses
      MYSQL_INITDB_SKIP_TZINFO: "yes"
    ports:
      - ${SQL_PORT}:3306
      - 4444:4444
      - 4567:4567
      - 4568:4568
    volumes:
      - ./kubernetes/mariadb.conf.d:/etc/mysql/mariadb.conf.d
      - /var/lib/mysql
  node:
    image: mariadb:10.2.14
    command: bash -c "sleep 10; mysqld --user=mysql --wsrep_cluster_address=gcomm://boot"
    environment:
      MYSQL_DATABASE: "db"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
      # Needed because galera doesn't support MyISAM, which tzinfo uses
      MYSQL_INITDB_SKIP_TZINFO: "yes"
    volumes:
      - ./kubernetes/mariadb.conf.d:/etc/mysql/mariadb.conf.d
      - /var/lib/mysql
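
As a possible alternative to the fixed sleep, the second node could wait until the boot node reports that Galera is ready. The following is only an untested sketch, not part of the original solution. It assumes the compose file stays on the 2.x format used in the question, because the condition form of depends_on is not available in version 3 files. It also assumes that the root user can connect over TCP with an empty password, as MYSQL_ALLOW_EMPTY_PASSWORD implies. The interval, timeout and retries values are arbitrary placeholders. The environment, ports and volumes entries are unchanged from the file above and are left out for brevity.

```yaml
# Sketch: start "node" only after "boot" reports wsrep_ready = ON
version: '2.2'
services:
  boot:
    image: mariadb:10.2.14
    command: mysqld --user=mysql --wsrep_cluster_address=gcomm://
    healthcheck:
      # Connect over TCP (127.0.0.1), so the check fails until the server
      # is accepting network connections; healthy once wsrep_ready is ON.
      test: ["CMD-SHELL", "mysql -h 127.0.0.1 -uroot -N -e \"SHOW STATUS LIKE 'wsrep_ready'\" | grep -q ON"]
      interval: 5s    # placeholder value
      timeout: 3s     # placeholder value
      retries: 20     # placeholder value
  node:
    image: mariadb:10.2.14
    command: mysqld --user=mysql --wsrep_cluster_address=gcomm://boot
    depends_on:
      boot:
        # Supported in compose file format 2.1 and later 2.x versions
        condition: service_healthy
```

Compared with the sleep, this ties the second node's start to the actual state of the boot node rather than to a guessed delay, at the cost of staying on the 2.x file format.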