Cassandra repair fails

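Cassandra repair is failing to run, with the error below on node 1. Earlier I mistakenly started multiple repair sessions in parallel. I found a bug, https://issues.apache.org/jira/browse/CASSANDRA-11824, that was resolved for the same scenario, but I am already on Cassandra 3.9. Is running nodetool scrub the only workaround? Are there any considerations to keep in mind before running scrub, given that I need to run it directly on Prod?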

com.google.common.util.concurrent.UncheckedExecutionException: org.apache.cassandra.exceptions.RepairException: [repair #6546ce10-3a70-11ec-9336-394ae1cd743d on test/test_config, [(-1879129450237588992,-1867793788349541955], (-1228457230064908637,-1228389616821781301], (583169750278890460,583583127041100026]]] Validation failed in /10.11.22.123
        at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1525) ~[guava-18.0.jar:na]

On node 2 (10.11.22.123):

ERROR 17:33:12 Cannot start multiple repair sessions over the same sstables
ERROR 17:33:12 Failed creating a merkle tree for [repair #6546ce10-3a70-11ec-9336-394ae1cd743d on test/test_config, [(-1879129450237588992,-1867793788349541955], (-1228457230064908637,-1228389616821781301], (583169750278890460,583583127041100026]]], /10.11.22.789(node 1) (see log for details)
ERROR 17:33:12 Exception in thread Thread[ValidationExecutor:10,1,main]
java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
        at org.apache.cassandra.service.ActiveRepairService$ParentRepairSession.markSSTablesRepairing(ActiveRepairService.java:526) ~[apache-cassandra-3.9.jar:3.9]
        at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1318) ~[apache-cassandra-3.9.jar:3.9]

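nodetool tpstats showed that there were indeed active repair jobs, but they were not actually running, and nodetool compactionstats did not show any running jobs. So I restarted only the nodes where the repair was stuck. That cleared the stuck repair jobs, and afterwards I was able to run a fresh repair.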

nodetool tpstats    
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0      323161614         0                 0
ViewMutationStage                 0         0              0         0                 0
ReadStage                         0         0      339671804         0                 0
RequestResponseStage              0         0      440712393         0                 0
ReadRepairStage                   0         0       13751257         0                 0
CounterMutationStage              0         0              0         0                 0
Repair#3                          1      3525              3         0                 0
.....
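
For anyone who wants to spot this state without reading the table by eye, here is a rough Python sketch that scans `nodetool tpstats` output for Repair pools that still have outstanding tasks. The function name `stuck_repair_pools` is made up for illustration, and the parsing assumes the six-column layout shown above. Non-zero counts only mean there is outstanding work, so they need to be compared against `nodetool compactionstats` before concluding that the repair is really stuck.

```python
import re


def stuck_repair_pools(tpstats_output):
    """Return (pool, active, pending) for each Repair#N pool with outstanding work."""
    pools = []
    for line in tpstats_output.splitlines():
        fields = line.split()
        # Pool rows have six columns: name, active, pending, completed,
        # blocked, all-time blocked. The header row and anything else is skipped.
        if len(fields) != 6 or not re.fullmatch(r"Repair#\d+", fields[0]):
            continue
        active, pending = int(fields[1]), int(fields[2])
        if active > 0 or pending > 0:
            pools.append((fields[0], active, pending))
    return pools


# Sample rows copied from the output above.
sample = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0      323161614         0                 0
Repair#3                          1      3525              3         0                 0
"""

print(stuck_repair_pools(sample))  # [('Repair#3', 1, 3525)]
```

In practice the text would come from running `nodetool tpstats` on each node and passing its captured output to the function.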