DRBD & Corosync - My DRBD works and reports UpToDate, but it is not
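I have a two-node high-availability cluster with a DRBD resource, a virtual IP, and the MariaDB files shared on the DRBD partition.
Everything seems to work, but DRBD is not syncing the latest files I created, even though the DRBD status tells me the disk is up to date.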
sudo drbdadm status
iba role:Primary
disk:UpToDate
pcs does not show any errors either:
sudo pcs status
Cluster name: cluster_iba
Cluster Summary:
* Stack: corosync
* Current DC: iba2-ip192 (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Tue Feb 22 18:16:20 2022
* Last change: Mon Feb 21 16:19:38 2022 by root via cibadmin on iba1-ip192
* 2 nodes configured
* 6 resource instances configured
Node List:
* Online: [ iba1-ip192 iba2-ip192 ]
Full List of Resources:
* virtual_ip (ocf::heartbeat:IPaddr2): Started iba2-ip192
* Clone Set: DrbdData-clone [DrbdData] (promotable):
* Masters: [ iba2-ip192 ]
* Slaves: [ iba1-ip192 ]
* DrbdFS (ocf::heartbeat:Filesystem): Started iba2-ip192
* WebServer (ocf::heartbeat:apache): Started iba2-ip192
* Maria (ocf::heartbeat:mysql): Started iba2-ip192
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
All constraints:
sudo pcs constraint list --full
Location Constraints:
Ordering Constraints:
promote DrbdData-clone then start DrbdFS (kind:Mandatory) (id:order-DrbdData-clone-DrbdFS-mandatory)
start DrbdFS then start virtual_ip (kind:Mandatory) (id:order-DrbdFS-virtual_ip-mandatory)
start virtual_ip then start WebServer (kind:Mandatory) (id:order-virtual_ip-WebServer-mandatory)
start DrbdFS then start Maria (kind:Mandatory) (id:order-DrbdFS-Maria-mandatory)
Colocation Constraints:
DrbdFS with DrbdData-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-DrbdFS-DrbdData-clone-INFINITY)
virtual_ip with DrbdFS (score:INFINITY) (id:colocation-virtual_ip-DrbdFS-INFINITY)
WebServer with virtual_ip (score:INFINITY) (id:colocation-WebServer-virtual_ip-INFINITY)
Maria with DrbdFS (score:INFINITY) (id:colocation-Maria-DrbdFS-INFINITY)
Ticket Constraints:
Files in /mnt/datosDRBD on node iba2-ip192 (when it is primary):
/mnt/datosDRBD$ ls -l
total 80
-rw-r--r-- 1 root root 5801 feb 21 12:16 drbd_cfg
-rw-r--r-- 1 root root 10494 feb 21 12:18 fs_cfg
drwx------ 2 root root 16384 feb 21 10:12 lost+found
drwxr-xr-x 4 mysql mysql 4096 feb 22 18:00 mariaDB
-rw-r--r-- 1 root root 17942 feb 21 12:39 MariaDB_cfg
-rw-r--r-- 1 root root 5 feb 21 10:13 testMParicio.txt
-rw-r--r-- 1 root root 13578 feb 21 12:21 WebServer_cfg
And files in /mnt/datosDRBD on node iba1-ip192 (when it is primary):
ls -l
total 92
-rw-r--r-- 1 root root 5801 feb 21 12:16 drbd_cfg
drwxrwxrwx 5 www-data www-data 4096 feb 22 13:41 FilesSGITV
-rw-r--r-- 1 root root 10494 feb 21 12:18 fs_cfg
drwx------ 2 root root 16384 feb 21 10:12 lost+found
drwxr-xr-x 7 mysql mysql 4096 feb 22 17:55 mariaDB
-rw-r--r-- 1 root root 17942 feb 21 12:39 MariaDB_cfg
-rw-r--r-- 1 root root 5 feb 22 17:58 testMParicio2.txt
-rw-r--r-- 1 www-data www-data 9 feb 22 17:58 testMParicio3.txt
-rw-r--r-- 1 root root 5 feb 21 10:13 testMParicio.txt
-rw-r--r-- 1 root root 13578 feb 21 12:21 WebServer_cfg
All the new files (testMParicio2.txt, testMParicio3.txt, and the folder FilesSGITV) are missing on iba2-ip192.
I don't know what to do. I'm lost.
Any help is appreciated, thanks.
(Edit)
My DRBD configuration, on both nodes:
cat /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
And my *.res configuration, also on both nodes:
resource iba {
    device /dev/drbd0;
    disk /dev/md3;
    meta-disk internal;
    on iba1 {
        address 10.0.0.248:7789;
    }
    on iba2 {
        address 10.0.0.249:7789;
    }
}
drbdadm uses iba1 and iba2, with IPs 10.0.0.248 and 10.0.0.249.
Corosync uses iba1-ip192 and iba2-ip192, with IPs 192.168.1.248 and 192.168.1.249.
cat /etc/hosts
127.0.0.1 localhost
#127.0.1.1 iba1
10.0.0.248 iba1
10.0.0.249 iba2
192.168.1.248 iba1-ip192
192.168.1.249 iba2-ip192
cat /etc/drbd.d/global_common.conf
global {
    usage-count yes;
    udev-always-use-vnr; # treat implicit the same as explicit volumes
}
common {
    handlers {
    }
    startup {
    }
    options {
    }
    disk {
    }
    net {
        protocol C;
    }
}
(Edit 2)
I found a problem in /proc/drbd.
On the primary node:
cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: FC3433D849E3B88C1E7B55C
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:2284 dr:11625 al:6 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:42364728
On the secondary node:
cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: FC3433D849E3B88C1E7B55C
0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:36538580
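In both outputs above, cs:StandAlone means the node is not connected to its peer, and ds:UpToDate/DUnknown means only the local disk is considered up to date while the peer's state is unknown. A minimal sketch of a check for this follows, assuming the DRBD 8.4 /proc/drbd layout shown above; the function name drbd_check_connected is hypothetical, and it also flags resync states, since it reports anything other than Connected.

```shell
# Read DRBD 8.4 style /proc/drbd text on stdin.
# Print one line per device whose connection state is not "Connected"
# and return non-zero if any such device is found.
# (Hypothetical helper name; it is not part of drbd-utils.)
drbd_check_connected() {
    awk '
        BEGIN { bad = 0 }
        /^ *[0-9]+: cs:/ {
            minor = $1; sub(/:$/, "", minor)    # "0:" -> "0"
            cs = $2;    sub(/^cs:/, "", cs)     # connection state
            ds = $4;    sub(/^ds:/, "", ds)     # local/peer disk state
            if (cs != "Connected") {
                printf "drbd%s: cs=%s ds=%s (peer not connected)\n", minor, cs, ds
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Typical use on a node (not run here):
#   drbd_check_connected < /proc/drbd
```

Run against either output above, this would report drbd0 as StandAlone even though the local disk shows UpToDate.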
The secondary node did not remember the ssh key; I fixed that with:
ssh-keygen -R 10.0.0.248
ssh-copy-id iba@iba1
But DRBD is still in the StandAlone state.
I don't know how to continue.
I found a split-brain that does not show up in the pcs status:
sudo journalctl | grep Split-Brain
feb 21 13:00:10 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
feb 21 13:21:40 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
feb 21 13:27:54 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
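The "detected but unresolved" wording in the kernel log indicates that DRBD found the split-brain and no automatic recovery policy applied, so it dropped the connection and left both nodes StandAlone. As a hedged illustration only, DRBD 8.4 accepts after-sb-* policies in the net section and a split-brain notification handler. The values below are common examples from the DRBD documentation, not settings taken from this cluster, and the handler script path and the "root" recipient are assumptions that depend on how the distribution packages drbd-utils. Automatic policies discard one node's changes, so they need care.

```
resource iba {
    # device, disk, meta-disk and the "on" sections stay as they are
    handlers {
        # assumed path; sends a notification to the given recipient
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    }
    net {
        protocol C;
        # neither node was primary: keep the side that has changes,
        # if the other side made none
        after-sb-0pri discard-zero-changes;
        # one node was primary: discard the secondary's changes
        after-sb-1pri discard-secondary;
        # both nodes were primary: do not auto-resolve, just disconnect
        after-sb-2pri disconnect;
    }
}
```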
I stopped the cluster, forced the primary with --force, and then:
On the split-brain victim (assuming the DRBD resource is iba):
drbdadm disconnect iba
drbdadm secondary iba
drbdadm connect --discard-my-data iba
On the split-brain survivor:
drbdadm primary iba
drbdadm connect iba
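After the reconnect, the victim should resynchronize from the survivor. A minimal sketch of a check that the resync has finished follows, assuming the DRBD 8.4 /proc/drbd format shown earlier; the function name drbd_in_sync is hypothetical. It succeeds only when every device is Connected, UpToDate on both sides, and reports oos:0.

```shell
# Read DRBD 8.4 style /proc/drbd text on stdin.
# Succeed only if at least one device is listed and every device is
# Connected, UpToDate/UpToDate, and has no out-of-sync blocks (oos:0).
# (Hypothetical helper name; it is not part of drbd-utils.)
drbd_in_sync() {
    awk '
        BEGIN { devices = 0; bad = 0 }
        /^ *[0-9]+: cs:/ {
            devices++
            if ($2 != "cs:Connected" || $4 != "ds:UpToDate/UpToDate")
                bad = 1
        }
        /oos:[0-9]+/ {
            # the counters line carries the out-of-sync amount
            for (i = 1; i <= NF; i++)
                if ($i ~ /^oos:/ && $i != "oos:0")
                    bad = 1
        }
        END {
            if (devices == 0 || bad) exit 1
            exit 0
        }
    '
}

# Example polling loop on a real node (not run here):
#   until drbd_in_sync < /proc/drbd; do sleep 5; done
```

Only once this succeeds on both nodes would it make sense to start the cluster again and compare the contents of /mnt/datosDRBD.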