How to delay Vertica node shutdown when the k-safety assessment fails?

We run a 3-node Vertica cluster. The network connection between the nodes sometimes fails for a short period (for example, 10 seconds).

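When this happens, every node shuts down shortly after it detects that the other nodes are unreachable (because k-safety can no longer be satisfied). For example, node 0003 recorded the following sequence in the Vertica log: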

00:04:30.633 node v_feedback_node0001 left the cluster
...
00:04:30.670 Node left cluster, reassessing k-safety...
...
00:04:32.389 node v_feedback_node0002 left the cluster
...
00:04:32.414 Changing node v_feedback_node0003 startup state from UP to UNSAFE
...
00:04:33.425 Shutting down this node
...
00:04:38.547 node v_feedback_node0003 left the cluster
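
To make the timing in this excerpt concrete, here is a minimal Python sketch that parses the log lines quoted above and measures how quickly node 0003 went from the first "left the cluster" event to shutting down. The helper names (`parse_events`, `seconds_between`) are made up for illustration, and the parsing assumes every relevant line starts with an `HH:MM:SS.mmm` timestamp as in the excerpt.

```python
from datetime import datetime

# Log lines copied from the excerpt above (elided "..." lines left out).
LOG = """\
00:04:30.633 node v_feedback_node0001 left the cluster
00:04:30.670 Node left cluster, reassessing k-safety...
00:04:32.389 node v_feedback_node0002 left the cluster
00:04:32.414 Changing node v_feedback_node0003 startup state from UP to UNSAFE
00:04:33.425 Shutting down this node
00:04:38.547 node v_feedback_node0003 left the cluster
"""


def parse_events(text):
    """Return a list of (timestamp, message) pairs, skipping unparsable lines."""
    events = []
    for line in text.splitlines():
        stamp, _, message = line.partition(" ")
        if not message:
            continue  # e.g. a bare "..." line
        try:
            when = datetime.strptime(stamp, "%H:%M:%S.%f")
        except ValueError:
            continue  # line does not start with a timestamp
        events.append((when, message))
    return events


def seconds_between(events, start_text, end_text):
    """Seconds from the first message containing start_text to the first containing end_text."""
    start = next(t for t, m in events if start_text in m)
    end = next(t for t, m in events if end_text in m)
    return (end - start).total_seconds()


events = parse_events(LOG)
first_loss_to_shutdown = seconds_between(events, "node0001 left", "Shutting down")
unsafe_to_shutdown = seconds_between(events, "UP to UNSAFE", "Shutting down")

print(f"first node lost -> shutdown: {first_loss_to_shutdown:.3f} s")
print(f"marked UNSAFE   -> shutdown: {unsafe_to_shutdown:.3f} s")
```

On this excerpt, node 0003 begins shutting down about 2.8 seconds after the first peer is reported lost, and about 1 second after it is marked UNSAFE.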

Is it possible to configure a delay during which each node would try to reconnect to the other nodes before giving up and shutting down?

A Vertica employee answered this on the Vertica forum:

This [reconnection delay] time is hard coded to 8 seconds.

I think time is better spent making the network more reliable. 30 sec of network failure is a lot (I mean really, really large; typically network RTT is in the microseconds). Even if you kept Vertica up by delaying the k-safe assessment, nothing could really connect to the database, or most likely all DB connections would reset.
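
If the advice is to fix the network first, it helps to know how long the interruptions actually last. The following is a rough Python sketch, not a Vertica tool: `probe` is a hypothetical helper that simply tries a TCP connection to a peer (port 5433 is assumed here as the client port; adjust it for your setup), and `longest_outage` finds the longest run of failed probes in a series of samples. The 8-second value is only the figure quoted in the answer, used as a comparison threshold; the answer does not say exactly how Vertica applies it internally.

```python
import socket

QUOTED_DELAY_SECONDS = 8.0  # figure quoted in the forum answer, used only for comparison


def probe(host, port=5433, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Longest unreachable stretch, in seconds.

    samples is a time-ordered iterable of (timestamp_seconds, reachable) pairs.
    An outage runs from the first failed probe to the next successful one
    (or to the last sample if the peer never came back).
    """
    longest = 0.0
    outage_start = None
    last_ts = None
    for ts, ok in samples:
        last_ts = ts
        if not ok and outage_start is None:
            outage_start = ts
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)
            outage_start = None
    if outage_start is not None:
        longest = max(longest, last_ts - outage_start)
    return longest


# Synthetic samples: one probe per second, peer unreachable from t=3 through t=12.
samples = [(float(t), not (3 <= t <= 12)) for t in range(20)]
worst = longest_outage(samples)
print(f"longest outage: {worst:.1f} s, exceeds quoted delay: {worst > QUOTED_DELAY_SECONDS}")
```

In practice the samples would come from calling `probe` against each peer on a fixed interval and recording `(time.monotonic(), result)`; the synthetic data above just mimics a 10-second interruption like the one described in the question.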