ScheduledExecutorService 任务比预期晚 运行

ScheduledExecutorService tasks are running later than expected

我定期 运行ning 任务并为间隔提供灵活性,在每个任务结束时计算下一次超时,从 Instant.now() 转换为毫秒,并安排使用 ScheduledExecutorService#schedule.

这段代码通常工作正常(左边的蓝色曲线),但其他日子不太好。

在我看来,启动时有时会出现问题(机器每晚都会重新启动),尽管程序 应该并且确实会 自我纠正 ScheduledExecutorService#schedule不恢复并且计划任务运行一直迟到。看来完全重启 JVM 是唯一的解决方案。

我最初的想法是这是一个错误,根据机器启动的时间,事情可能会出错。但以下日志输出表明问题与我对 ScheduledExecutorService#schedule 的使用有关:

// Log time in GMT+2, other times are in GMT
// The following lines are written following system startup (all times are correct)
08 juin 00:08:49.993 [main] WARN  com.pgscada.webdyn.Webdyn - Scheduling next webdyn service time. Currently 2018-06-07T22:08:49.993Z, last connection null
08 juin 00:08:50.586 [main] INFO  com.pgscada.webdyn.Webdyn - The next data sample at 2018-06-07T22:10:00Z and the next FTP connection at 2018-06-07T22:30:00Z
08 juin 00:08:50.586 [main] WARN  com.pgscada.webdyn.Webdyn - Completed webdyn schedule in 9ms, next execution at 2018-06-07T22:10:00Z (in 69414 ms) will run as data-sample
// So we are expecting the next execution to occur at 00:10:00 (or in 69.4 seconds)
// Except that it runs at 00:11:21
08 juin 00:11:21.206 [pool-1-thread-4] INFO  com.pgscada.webdyn.Webdyn - Executing Webdyn service, isDataSample=true, isFtpConnection=false, nextTimeout=2018-06-07T22:10:00Z, lastFtpConnection=null
// But thats OK because it should correct itself
08 juin 00:13:04.151 [pool-1-thread-4] WARN  com.pgscada.webdyn.Webdyn - Scheduling next webdyn service time. Currently 2018-06-07T22:10:00Z, last connection null
08 juin 00:13:04.167 [pool-1-thread-4] INFO  com.pgscada.webdyn.Webdyn - The next data sample at 2018-06-07T22:20:00Z and the next FTP connection at 2018-06-07T22:30:00Z
08 juin 00:13:04.167 [pool-1-thread-4] WARN  com.pgscada.webdyn.Webdyn - Completed webdyn schedule in 0ms, next execution at 2018-06-07T22:20:00Z (in 415833 ms) will run as data-sample
// So now we are expecting the next execution to occur at 00:20:00 (or in 415.8 seconds)
// But it runs at 00:28:06
08 juin 00:28:06.145 [pool-1-thread-4] INFO  com.pgscada.webdyn.Webdyn - Executing Webdyn service, isDataSample=true, isFtpConnection=false, nextTimeout=2018-06-07T22:20:00Z, lastFtpConnection=null

下面是调度函数的实际生产代码

ScheduledExecutorService EXECUTORS = Executors.newScheduledThreadPool(10);


private void scheduleNextTimeout(Instant currentTime, Instant lastFtpConnection) {

    try {

        log.info("Scheduling next webdyn service time. Currently {}, last connection {}", currentTime, lastFtpConnection);

        // Parse config files first
        getConfigIni().parse();

        long time = System.nanoTime();
        final Instant earliestPossibleTimeout = Instant.now().plusSeconds(5); 

        Instant nextDataSample = nextTimeout(currentTime);

        if (nextDataSample.isBefore(earliestPossibleTimeout)) {
            final Instant oldTime = nextDataSample;
            nextDataSample = nextTimeout(earliestPossibleTimeout);
            log.warn("Next data sample was calculated to a time in the past '{}', resetting to a future time: {}", oldTime, nextDataSample);
        }

        Instant nextFtp = nextFtpConnection(currentTime, lastFtpConnection);

        if (nextFtp.isBefore(earliestPossibleTimeout)) {
            final Instant oldTime = nextFtp;
            nextFtp = nextFtpConnection(earliestPossibleTimeout, lastFtpConnection);
            log.warn("Next FTP connection was calculated to a time in the past '{}', resetting to a future time: {}", oldTime, nextFtp);
        }

        final boolean isFtpConnection = !nextDataSample.isBefore(nextFtp);
        final boolean isDataSample = !isFtpConnection || nextDataSample.equals(nextFtp);
        log.info("The next data sample at {} and the next FTP connection at {}", nextDataSample, nextFtp);

        final Instant nextTimeout = nextDataSample.isBefore(nextFtp) ? nextDataSample : nextFtp;
        final long millis = Duration.between(Instant.now(), nextTimeout).toMillis();
        EXECUTORS.schedule(() -> {
            log.info("Executing Webdyn service, isDataSample={}, isFtpConnection={}, nextTimeout={}, lastFtpConnection={}",
                    isDataSample, isFtpConnection, nextTimeout, lastFtpConnection);

            long tme = System.nanoTime();
            try {
                connect(isDataSample, isFtpConnection, nextTimeout, lastFtpConnection);
                log.warn("Completed webdyn service in {}s", (System.nanoTime() - tme) / 1000000);
            } catch (final Throwable ex) {
                log.error("Failed webdyn service after {}ms : {}", (System.nanoTime() - tme) / 1000000, ex.getMessage(), ex);
            } finally {
                scheduleNextTimeout(nextTimeout, isFtpConnection ? nextTimeout : lastFtpConnection);
            }
        }, millis, TimeUnit.MILLISECONDS);

        log.warn("Completed webdyn schedule in {}ms, next execution at {} (in {} ms) will run as {}",
                (System.nanoTime() - time) / 1000000, nextTimeout, millis, isFtpConnection ? "ftp-connection" : "data-sample");

    } catch (final Throwable ex) {
        log.error("Fatal error in webdyn schedule : {}", ex.getMessage(), ex);
    }
}

正如我在问题下方的评论中所述,这里的问题是有一个共享的、可变的和非线程安全的资源(EXECUTORS 属性)被多个线程更改。 它在启动时由主线程更改,并且无论哪个线程从池中用于任务执行。

需要注意的是,甚至当你只有一个线程访问一个共享资源时时间(仅仅是因为一次只有一个任务运行ning),你仍然需要考虑并发.这是因为没有同步 Java 内存模型不能保证一个线程所做的更改对其他线程永远可见,无论它们晚了多久 运行。

因此解决方案是使方法 scheduleNextTimeout 同步,从而保证更改不会保留在执行线程的本地并写入主内存。

您也可以围绕该部分制作一个同步块(在 "this" 上同步),这可以访问共享资源,但由于系统似乎不是重型系统和其余代码好像用不了多久,没那个必要...

这是我第一次遇到此类问题时从中学到的一篇简短的文章中的要点:) https://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#jsr133

很高兴能帮上忙。