Heketi doesn't see all glusterfs bricks

There is a problem with glusterfs in my PODs.

It seems that not all bricks are in use.

I'm using Kubernetes v1.11, Heketi v9.0.0 and glusterfs 4.1.5.

Previously I ran into strange problems on one of the worker nodes and had to restart it. After that I had problems with the heketi pod: it looked like it could not mount the heketi DB. I resolved that and the pod was able to start, but when I checked the gluster mounted share in my PODs, I noticed that only the most recent data was there.
First, I checked the state of all peers with gluster peer status:

Number of Peers: 2

Hostname: 192.168.2.148
Uuid: a5b95e50-5fba-41a4-ad4d-e0c0f32686e9
State: Peer in Cluster (Connected)

Hostname: 192.168.2.70
Uuid: 0d91679f-bd49-4cd2-b003-383d8208f81b
State: Peer in Cluster (Connected)

Then I compared all the bricks inside the glusterfs volumes against the heketi topology using

gluster volume info

Everything matched for all gluster nodes and volumes against the heketi topology.
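For reference, that comparison boils down to extracting the brick mount paths from both outputs and diffing them. A minimal sketch (the here-docs stand in for saved `gluster volume info` and `heketi-cli topology info` output; all paths and file names are illustrative):

```shell
# Compare the brick paths gluster serves with the ones heketi knows about.
# The two here-docs are stand-ins for real saved command output.
cat > /tmp/gluster-info.txt <<'EOF'
Brick1: 192.168.2.96:/var/lib/heketi/mounts/vg_aaa/brick_111/brick
Brick2: 192.168.2.148:/var/lib/heketi/mounts/vg_bbb/brick_222/brick
EOF
cat > /tmp/heketi-topo.txt <<'EOF'
Path: /var/lib/heketi/mounts/vg_aaa/brick_111/brick
Path: /var/lib/heketi/mounts/vg_bbb/brick_222/brick
EOF

# Extract and sort the brick mount paths from each view:
grep -oE '/var/lib/heketi/mounts/[^ ]+' /tmp/gluster-info.txt | sort -u > /tmp/gluster-bricks
grep -oE '/var/lib/heketi/mounts/[^ ]+' /tmp/heketi-topo.txt  | sort -u > /tmp/heketi-bricks

# An empty diff means both sides agree on the brick set:
diff /tmp/gluster-bricks /tmp/heketi-bricks && echo "brick paths match"
```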
Using lvdisplay on my glusterfs nodes I checked the paths of all bricks, and even mounted some of them on the host node to make sure the data was still in place - and found the old data there.

Some output from the glusterfs instance:

# gluster volume list
heketidbstorage
vol_3e49f0d33f5610cae6808cc77e028698
vol_d592d9bed635ad3f18b32fad15b30e5e

# gluster volume info vol_3e49f0d33f5610cae6808cc77e028698

Volume Name: vol_3e49f0d33f5610cae6808cc77e028698
Type: Distributed-Replicate
Volume ID: ca82ebcd-5e8a-4969-b7a5-c17b7e9b7b9e
Status: Started
Snapshot Count: 0
Number of Bricks: 5 x 3 = 15
Transport-type: tcp
Bricks:
Brick1: 192.168.2.96:/var/lib/heketi/mounts/vg_0c8b46493bec60ea4531d7efbc5160b3/brick_e1879c9d08c2da4691ae0c3e85b3d090/brick
Brick2: 192.168.2.148:/var/lib/heketi/mounts/vg_0e3d99f43455efbaad08a9835e5829b5/brick_c0074cec63bd11946ee22d981346d76a/brick
Brick3: 192.168.2.70:/var/lib/heketi/mounts/vg_c974ac16bc0c783c55a29f83eeb71962/brick_b7a72f169526b8c154784ebe0611f4c0/brick
Brick4: 192.168.2.70:/var/lib/heketi/mounts/vg_c974ac16bc0c783c55a29f83eeb71962/brick_5044931511b6f0c9bc28e3305a12de34/brick
Brick5: 192.168.2.148:/var/lib/heketi/mounts/vg_0e3d99f43455efbaad08a9835e5829b5/brick_8fbb59b27b61117eb5b87873e7371d56/brick
Brick6: 192.168.2.96:/var/lib/heketi/mounts/vg_0c8b46493bec60ea4531d7efbc5160b3/brick_f89c254d5b8905380e3c3d1cc5ca22ca/brick
Brick7: 192.168.2.148:/var/lib/heketi/mounts/vg_0e3d99f43455efbaad08a9835e5829b5/brick_d5e3a284457d35d2245b8a93f4a700aa/brick
Brick8: 192.168.2.70:/var/lib/heketi/mounts/vg_c974ac16bc0c783c55a29f83eeb71962/brick_397fc0cc8465f9517ea859af25f928db/brick
Brick9: 192.168.2.96:/var/lib/heketi/mounts/vg_0c8b46493bec60ea4531d7efbc5160b3/brick_a69906bc8f4662288f7578c6770660fc/brick
Brick10: 192.168.2.96:/var/lib/heketi/mounts/vg_0c8b46493bec60ea4531d7efbc5160b3/brick_03e6de08eeb075dde0943c7e0191ca3e/brick
Brick11: 192.168.2.148:/var/lib/heketi/mounts/vg_0e3d99f43455efbaad08a9835e5829b5/brick_d762e5bf6013e3bd31e88e02ce9f06c0/brick
Brick12: 192.168.2.70:/var/lib/heketi/mounts/vg_c974ac16bc0c783c55a29f83eeb71962/brick_aca4a7b0d51ca95376ec7f29515d290f/brick
Brick13: 192.168.2.148:/var/lib/heketi/mounts/vg_0e3d99f43455efbaad08a9835e5829b5/brick_231f645020d2a0c691173cee432ace9e/brick
Brick14: 192.168.2.96:/var/lib/heketi/mounts/vg_0c8b46493bec60ea4531d7efbc5160b3/brick_807002a90ae315d40138a6031096d812/brick
Brick15: 192.168.2.70:/var/lib/heketi/mounts/vg_c974ac16bc0c783c55a29f83eeb71962/brick_f516b9811f1581d5fc453692e2a15183/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

# gluster volume status vol_3e49f0d33f5610cae6808cc77e028698 detail

Status of volume: vol_3e49f0d33f5610cae6808cc77e028698
------------------------------------------------------------------------------
Brick                : Brick 192.168.2.96:/var/lib/heketi/mounts/vg_0c8b46493bec60ea4531d7efbc5160b3/brick_e1879c9d08c2da4691ae0c3e85b3d090/brick
TCP Port             : 49153
RDMA Port            : 0
Online               : Y
Pid                  : 187
File System          : xfs
Device               : /dev/mapper/vg_0c8b46493bec60ea4531d7efbc5160b3-brick_e1879c9d08c2da4691ae0c3e85b3d090
Mount Options        : rw,noatime,nouuid,attr2,inode64,logbsize=128k,sunit=256,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 566.3GB
Total Disk Space     : 1.4TB
Inode Count          : 155713088
Free Inodes          : 155419488

When I check the gluster mounted share with df -h inside a POD, I see:

192.168.2.148:vol_3e49f0d33f5610cae6808cc77e028698  486G  304G  183G  63% /var/lib/kubelet/pods/6804de60-97e8-11e9-be43-12eea8244508/volumes/kubernetes.io~glusterfs/pvc-d9222fa3-b649-11e8-9583-12eea8244508

But it should be several terabytes in size.
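As a rough sanity check on those numbers (a sketch; the ~1.4 TB brick size comes from the volume status output above): a 5 x 3 distributed-replicate volume should show roughly five times one brick's size on the client, whereas a single brick divided by a mis-set shared-brick-count of 3 lands close to the 486G that df actually reports:

```shell
# Expected client-visible size of the 5 x 3 distributed-replicate volume,
# versus what one brick divided by shared-brick-count=3 would give.
BRICK_TB=1.4    # size of one brick, from `gluster volume status ... detail`
DISTRIBUTE=5    # distribute subvolumes (5 x 3 = 15 bricks)

awk -v b="$BRICK_TB" -v d="$DISTRIBUTE" \
    'BEGIN { printf "expected : ~%.1f TB\n", b * d }'
awk -v b="$BRICK_TB" \
    'BEGIN { printf "with bug : ~%.0f GB\n", b * 1024 / 3 }'
```

The first line prints about 7 TB, the second about 478 GB - in the ballpark of the wrong 486G shown above (the gap is just rounding of the 1.4 TB figure).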

I finally found the fix. It is related to a known shared-brick-count issue in glusterfs v4: Bug 1517260 - Volume wrong size.
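For anyone hitting the same thing: the workaround usually suggested for that bug is to reset shared-brick-count to 1 in the per-brick volfiles and restart glusterd. A hedged sketch, assuming the default glusterd working directory (run it on every gluster node, and back the files up first):

```shell
# Bug 1517260: glusterd can write a wrong "option shared-brick-count N"
# into the per-brick volfiles, making df report brick-size / N.
VOLDIR=${VOLDIR:-/var/lib/glusterd/vols}   # default glusterd workdir

# Inspect the current values first:
grep -R "shared-brick-count" "$VOLDIR" 2>/dev/null

# Reset every occurrence to 1:
for f in "$VOLDIR"/*/bricks/*; do
    [ -f "$f" ] || continue
    sed -i 's/option shared-brick-count [0-9]*/option shared-brick-count 1/' "$f"
done

# Then restart glusterd on the node to pick up the corrected volfiles:
# systemctl restart glusterd
```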