Why can't a CyclicBarrier be acquired right after the barrier action executes?

Consider the following code:

public static void main(String[] args) throws InterruptedException {
    CyclicBarrier cb = new CyclicBarrier(3, () -> {
        logger.info("Barrier action starting");
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        logger.info("Barrier action finishing");
    });
    for (int i = 0; i < 6; i++) {
        int counter = i;
        Thread.sleep(100);
        new Thread(() -> {
            try {
                logger.info("Try to acquire barrier for {}", counter);
                cb.await();
                logger.info("barrier acquired for {}", counter);

            } catch (Exception e) {
                e.printStackTrace();
            }

        }).start();
    }
}

I created a barrier with 3 parties and a barrier action that takes 5 seconds.

I see the following output:

2019-10-27 15:23:09.393  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : Try to acquire barrier for 0
2019-10-27 15:23:09.492  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : Try to acquire barrier for 1
2019-10-27 15:23:09.593  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Try to acquire barrier for 2
2019-10-27 15:23:09.594  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 15:23:09.693  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : Try to acquire barrier for 3
2019-10-27 15:23:09.794  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : Try to acquire barrier for 4
2019-10-27 15:23:09.897  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Try to acquire barrier for 5
2019-10-27 15:23:14.594  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 15:23:14.595  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : barrier acquired for 2
2019-10-27 15:23:14.595  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 15:23:19.596  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 15:23:19.597  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : barrier acquired for 0
2019-10-27 15:23:19.597  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : barrier acquired for 4
2019-10-27 15:23:19.597  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : barrier acquired for 3
2019-10-27 15:23:19.597  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : barrier acquired for 1
2019-10-27 15:23:19.597  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : barrier acquired for 5

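So we can see that:

  1. The first barrier action runs from 15:23:09 to 15:23:14
  2. The second barrier action runs from 15:23:14 to 15:23:19

But only one thread was able to log right after the first barrier action finished: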

2019-10-27 15:23:14.595  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : barrier acquired for 2

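I expected 3 threads to get past the barrier at approximately 15:23:14, because the CyclicBarrier has 3 parties.

Could you explain this behaviour?

P.S.

I ran this code many times and the result is always similar.

P.S.2.

I tried changing the timings a bit: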

public static void main(String[] args) throws InterruptedException {
    CyclicBarrier cb = new CyclicBarrier(3, () -> {
        logger.info("Barrier action starting");
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        logger.info("Barrier action finishing");
    });
    for (int i = 0; i < 6; i++) {
        int counter = i;
        Thread.sleep(1000);
        new Thread(() -> {
            try {
                logger.info("Try to acquire barrier for {}", counter);
                cb.await();
                logger.info("barrier acquired for {}", counter);

            } catch (Exception e) {
                e.printStackTrace();
            }

        }).start();
    }
}

And I see the expected result:

2019-10-27 23:22:14.497  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : Try to acquire barrier for 0
2019-10-27 23:22:15.495  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : Try to acquire barrier for 1
2019-10-27 23:22:16.495  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Try to acquire barrier for 2
2019-10-27 23:22:16.496  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 23:22:16.998  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 23:22:16.998  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : barrier acquired for 0
2019-10-27 23:22:16.998  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : barrier acquired for 2
2019-10-27 23:22:16.998  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : barrier acquired for 1
2019-10-27 23:22:17.495  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : Try to acquire barrier for 3
2019-10-27 23:22:18.495  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : Try to acquire barrier for 4
2019-10-27 23:22:19.496  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Try to acquire barrier for 5
2019-10-27 23:22:19.499  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 23:22:20.002  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 23:22:20.003  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : barrier acquired for 5
2019-10-27 23:22:20.003  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : barrier acquired for 3
2019-10-27 23:22:20.003  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : barrier acquired for 4

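An interesting question. It's not super trivial, but I'll try to explain it as succinctly as I can.

Although multithreading doesn't guarantee any ordering of execution, for this answer let's assume that two things happen first:

  1. All threads start at the same time
  2. All threads call barrier.await() at the same time

In this case, you would see output similar to:
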
Try to acquire barrier for 0
Try to acquire barrier for 1
Try to acquire barrier for 2
Try to acquire barrier for 3
Try to acquire barrier for 4
Try to acquire barrier for 5

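The current state of your 6 threads is as follows:

  1. Threads 0 & 1 await on a shared Condition, since three threads have yet to reach the barrier

  2. Thread 2 is still runnable, as the "tripping" thread of the barrier

  3. Threads 3, 4 & 5 are waiting in the barrier's lock.lock() call (the same Lock instance that the Condition was created from)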
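
To make the two kinds of waiting concrete, here is a minimal sketch that uses a plain ReentrantLock and Condition rather than CyclicBarrier itself (the barrier's internal lock is private, so it can't be inspected directly). The class name ParkedStatesDemo, the method observe() and the 200 ms settling pause are my own choices, purely for illustration. One thread awaits on the Condition like threads 0 & 1, another is stuck inside lock.lock() like threads 3, 4 & 5, and the calling thread plays thread 2 by holding the lock.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class ParkedStatesDemo {

    /** Returns { state of the Condition waiter, state of the thread stuck in lock() }. */
    static Thread.State[] observe() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Condition trip = lock.newCondition();
        boolean[] released = {false};            // guarded by lock

        // Plays threads 0 & 1: takes the lock, then awaits on the Condition.
        Thread onCondition = new Thread(() -> {
            lock.lock();
            try {
                while (!released[0]) {
                    trip.awaitUninterruptibly();
                }
            } finally {
                lock.unlock();
            }
        });
        onCondition.start();

        // Plays thread 2: grab the lock once the waiter is registered, and keep it.
        while (true) {
            lock.lock();
            if (lock.hasWaiters(trip)) {
                break;
            }
            lock.unlock();
            Thread.sleep(10);
        }

        // Plays threads 3, 4 & 5: cannot get past lock.lock() while we hold the lock.
        Thread onLock = new Thread(() -> {
            lock.lock();
            lock.unlock();
        });
        onLock.start();
        while (!lock.hasQueuedThread(onLock)) {
            Thread.sleep(10);
        }
        Thread.sleep(200);                        // assumption: enough time for both to park

        Thread.State[] states = {onCondition.getState(), onLock.getState()};

        // Clean up: wake the waiter and let both threads finish.
        released[0] = true;
        trip.signalAll();
        lock.unlock();
        onCondition.join();
        onLock.join();
        return states;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread.State[] s = observe();
        System.out.println("on Condition: " + s[0] + ", in lock(): " + s[1]);
    }
}
```

On a HotSpot JVM I would expect both threads to be reported as WAITING, since java.util.concurrent locks park threads instead of blocking them on a monitor.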

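When the barrier sees thread 2, it has already recorded 0 & 1 reaching the barrier, so it knows this cycle is complete and will release 0 & 1. But before it releases those two threads it needs to run the barrierAction, which you defined to sleep for 5 seconds, so it does that.

You will then see the output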

Barrier action starting
Barrier action finishing

Thread 2 still holds the lock, is RUNNABLE, and is ready to exit the barrier, so it does, and you see this output:

barrier acquired for 2

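But before thread 2 exits, it signals all other threads awaiting the current barrier. Here is where it gets tricky: the await by 0 & 1 is done on a shared Condition, and that Condition is shared by all barrier "generations". So even though the first two threads were awaiting before the last three threads tried to lock, when signalAll is called the first-generation threads still have to wait their turn to wake up.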
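
Roughly, the barrier's await logic has the following shape. This is a heavily simplified sketch written for this answer, not the JDK source: SketchBarrier, lockHeldByCaller() and actionRunsUnderLock() are made-up names, and interruption, timeouts and broken-barrier handling are left out. It keeps the two properties that matter here, as far as I can tell from reading CyclicBarrier: the barrier action runs while the tripping thread still holds the lock, and a single Condition is reused for every generation.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class SketchBarrier {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition trip = lock.newCondition(); // one Condition for all generations
    private final int parties;
    private final Runnable action;
    private int count;          // threads still missing in the current generation
    private long generation;    // bumped every time the barrier trips

    SketchBarrier(int parties, Runnable action) {
        this.parties = parties;
        this.count = parties;
        this.action = action;
    }

    void await() {
        lock.lock();
        try {
            long arrivedIn = generation;
            if (--count == 0) {
                // Tripping thread: the action runs while the lock is still held.
                action.run();
                // Start the next generation and signal everyone on the shared Condition.
                count = parties;
                generation++;
                trip.signalAll();
                return;
            }
            while (arrivedIn == generation) {
                // Gives up the lock while parked; must re-acquire it before returning.
                trip.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    boolean lockHeldByCaller() {
        return lock.isHeldByCurrentThread();
    }

    /** Trips a 2-party barrier and reports whether the action saw the lock held. */
    static boolean actionRunsUnderLock() throws InterruptedException {
        boolean[] held = {false};
        SketchBarrier[] ref = new SketchBarrier[1];
        ref[0] = new SketchBarrier(2, () -> held[0] = ref[0].lockHeldByCaller());
        Thread other = new Thread(() -> ref[0].await());
        other.start();
        ref[0].await();
        other.join();
        return held[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("action ran with lock held: " + actionRunsUnderLock());
    }
}
```

In this sketch actionRunsUnderLock() returns true. With a 5-second action, that means any thread that wants to enter await(), or to return from it after being signalled, has to wait for the lock for those 5 seconds.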

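At this point we have 5 threads that are either blocked on the Lock (3, 4 & 5) or waiting on the Condition (0, 1). In this example, the order in which they block/wait on the Lock matters. If they all arrive in order, the queue for the critical section would be: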

Thread-0 -> Thread-1 -> Thread-5 -> Thread-4 -> Thread-3 
   |                                               |
  TAIL                                            HEAD

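So the next thread to be released would be Thread-3, then 4, then 5. The queue looks like this because all threads reached the lock at the same time and queued up. Threads 0 & 1 obviously got there first, so they entered the barrier's critical section and then awaited until thread 2 came in to wake them up. But after the wake-up, 0 & 1 are put at the end of the queue, and 3, 4 & 5 will go next.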
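
The re-queueing can be reproduced with a bare ReentrantLock and Condition. The sketch below is an illustration under my own naming (SignalOrderDemo, "early", "late"), not code from CyclicBarrier: an "early" thread awaits on the Condition, a "late" thread then queues up in lock.lock(), and only after that does the lock holder call signalAll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class SignalOrderDemo {

    /** Returns the order in which the two helper threads obtained the lock. */
    static List<String> run() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Condition trip = lock.newCondition();
        List<String> order = new ArrayList<>();   // guarded by lock
        boolean[] tripped = {false};              // guarded by lock

        // Like threads 0 & 1: arrives first and awaits on the Condition.
        Thread early = new Thread(() -> {
            lock.lock();
            try {
                while (!tripped[0]) {
                    trip.awaitUninterruptibly();
                }
                order.add("early");
            } finally {
                lock.unlock();
            }
        });
        early.start();

        // Like thread 2: take the lock once "early" is waiting, and keep holding it.
        while (true) {
            lock.lock();
            if (lock.hasWaiters(trip)) {
                break;
            }
            lock.unlock();
            Thread.sleep(10);
        }

        // Like threads 3, 4 & 5: arrives later and queues up on the lock itself.
        Thread late = new Thread(() -> {
            lock.lock();
            try {
                order.add("late");
            } finally {
                lock.unlock();
            }
        });
        late.start();
        while (!lock.hasQueuedThread(late)) {
            Thread.sleep(10);
        }

        // signalAll moves "early" from the Condition to the back of the lock queue.
        tripped[0] = true;
        trip.signalAll();
        lock.unlock();

        early.join();
        late.join();
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

With the ReentrantLock implementation in OpenJDK, run() returns [late, early]: the signalled waiter ends up behind the thread that was already queued on the lock, even though it reached the lock first.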

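When thread 2 calls signalAll and leaves the barrier, threads 3 & 4 will run and, since they are part of the second generation, will suspend until thread 5 comes through and triggers the barrier action. This will then print out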

Barrier action starting
Barrier action finishing

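Finally, thread 5 will call signalAll again and the rest of the threads will finish waking up.

In this case, you will see thread 5 finish first, followed by the rest: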

barrier acquired for 5
barrier acquired for 0
barrier acquired for 1
barrier acquired for 3
barrier acquired for 4