ScheduledExecutorService tasks are running later than expected
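I run a task periodically. To keep the interval flexible, I compute the next timeout at the end of each task, convert it to milliseconds relative to Instant.now(), and schedule it with ScheduledExecutorService#schedule.
This code usually works fine (the blue curve on the left), but on other days it does not behave so well.
It seems to me that things sometimes go wrong at startup (the machines are restarted every night). Although the program should, and does, correct itself, ScheduledExecutorService#schedule does not recover and the scheduled tasks keep running late. A full restart of the JVM appears to be the only solution.
My initial thought was that this is a bug and that, depending on when the machine starts up, things can go wrong. But the following log output suggests that the problem is related to my use of ScheduledExecutorService#schedule: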
// Log time in GMT+2, other times are in GMT
// The following lines are written following system startup (all times are correct)
08 juin 00:08:49.993 [main] WARN com.pgscada.webdyn.Webdyn - Scheduling next webdyn service time. Currently 2018-06-07T22:08:49.993Z, last connection null
08 juin 00:08:50.586 [main] INFO com.pgscada.webdyn.Webdyn - The next data sample at 2018-06-07T22:10:00Z and the next FTP connection at 2018-06-07T22:30:00Z
08 juin 00:08:50.586 [main] WARN com.pgscada.webdyn.Webdyn - Completed webdyn schedule in 9ms, next execution at 2018-06-07T22:10:00Z (in 69414 ms) will run as data-sample
// So we are expecting the next execution to occur at 00:10:00 (or in 69.4 seconds)
// Except that it runs at 00:11:21
08 juin 00:11:21.206 [pool-1-thread-4] INFO com.pgscada.webdyn.Webdyn - Executing Webdyn service, isDataSample=true, isFtpConnection=false, nextTimeout=2018-06-07T22:10:00Z, lastFtpConnection=null
// But that's OK because it should correct itself
08 juin 00:13:04.151 [pool-1-thread-4] WARN com.pgscada.webdyn.Webdyn - Scheduling next webdyn service time. Currently 2018-06-07T22:10:00Z, last connection null
08 juin 00:13:04.167 [pool-1-thread-4] INFO com.pgscada.webdyn.Webdyn - The next data sample at 2018-06-07T22:20:00Z and the next FTP connection at 2018-06-07T22:30:00Z
08 juin 00:13:04.167 [pool-1-thread-4] WARN com.pgscada.webdyn.Webdyn - Completed webdyn schedule in 0ms, next execution at 2018-06-07T22:20:00Z (in 415833 ms) will run as data-sample
// So now we are expecting the next execution to occur at 00:20:00 (or in 415.8 seconds)
// But it runs at 00:28:06
08 juin 00:28:06.145 [pool-1-thread-4] INFO com.pgscada.webdyn.Webdyn - Executing Webdyn service, isDataSample=true, isFtpConnection=false, nextTimeout=2018-06-07T22:20:00Z, lastFtpConnection=null
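
To make the size of the delay concrete, the arithmetic behind the log comments can be reproduced with java.time. This is only a sketch: the class and method names are hypothetical, and the timestamps are copied from the log lines above, with the GMT+2 log times converted to UTC.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of the production code.
class LatenessCheck {

    // Delay that was requested from the executor when the task was scheduled.
    static long requestedDelayMillis(Instant scheduledAt, Instant target) {
        return Duration.between(scheduledAt, target).toMillis();
    }

    // How far past its target time the task actually started.
    static long latenessMillis(Instant target, Instant actualStart) {
        return Duration.between(target, actualStart).toMillis();
    }

    public static void main(String[] args) {
        // First run: scheduled at 00:08:50.586, due at 00:10:00, started at 00:11:21.206 (GMT+2).
        Instant scheduled1 = Instant.parse("2018-06-07T22:08:50.586Z");
        Instant target1 = Instant.parse("2018-06-07T22:10:00Z");
        Instant actual1 = Instant.parse("2018-06-07T22:11:21.206Z");
        System.out.println(requestedDelayMillis(scheduled1, target1)); // 69414
        System.out.println(latenessMillis(target1, actual1));          // 81206

        // Second run: scheduled at 00:13:04.167, due at 00:20:00, started at 00:28:06.145 (GMT+2).
        Instant scheduled2 = Instant.parse("2018-06-07T22:13:04.167Z");
        Instant target2 = Instant.parse("2018-06-07T22:20:00Z");
        Instant actual2 = Instant.parse("2018-06-07T22:28:06.145Z");
        System.out.println(requestedDelayMillis(scheduled2, target2)); // 415833
        System.out.println(latenessMillis(target2, actual2));          // 486145
    }
}
```

The requested delays match the values in the log (69414 ms and 415833 ms), so the delay computation itself looks correct; the tasks simply start about 81 s and about 486 s after their target times.
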
Below is the actual production code of the scheduling function:
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

ScheduledExecutorService EXECUTORS = Executors.newScheduledThreadPool(10);

private void scheduleNextTimeout(Instant currentTime, Instant lastFtpConnection) {
    try {
        log.info("Scheduling next webdyn service time. Currently {}, last connection {}", currentTime, lastFtpConnection);

        // Parse config files first
        getConfigIni().parse();

        long time = System.nanoTime();
        final Instant earliestPossibleTimeout = Instant.now().plusSeconds(5);

        // Next data sample; if it is already (almost) in the past, recompute from "now + 5 s"
        Instant nextDataSample = nextTimeout(currentTime);
        if (nextDataSample.isBefore(earliestPossibleTimeout)) {
            final Instant oldTime = nextDataSample;
            nextDataSample = nextTimeout(earliestPossibleTimeout);
            log.warn("Next data sample was calculated to a time in the past '{}', resetting to a future time: {}", oldTime, nextDataSample);
        }

        // Next FTP connection, with the same guard against times in the past
        Instant nextFtp = nextFtpConnection(currentTime, lastFtpConnection);
        if (nextFtp.isBefore(earliestPossibleTimeout)) {
            final Instant oldTime = nextFtp;
            nextFtp = nextFtpConnection(earliestPossibleTimeout, lastFtpConnection);
            log.warn("Next FTP connection was calculated to a time in the past '{}', resetting to a future time: {}", oldTime, nextFtp);
        }

        // Whichever comes first decides the kind of run; both flags are true if the times coincide
        final boolean isFtpConnection = !nextDataSample.isBefore(nextFtp);
        final boolean isDataSample = !isFtpConnection || nextDataSample.equals(nextFtp);
        log.info("The next data sample at {} and the next FTP connection at {}", nextDataSample, nextFtp);

        final Instant nextTimeout = nextDataSample.isBefore(nextFtp) ? nextDataSample : nextFtp;
        final long millis = Duration.between(Instant.now(), nextTimeout).toMillis();

        EXECUTORS.schedule(() -> {
            log.info("Executing Webdyn service, isDataSample={}, isFtpConnection={}, nextTimeout={}, lastFtpConnection={}",
                    isDataSample, isFtpConnection, nextTimeout, lastFtpConnection);
            long tme = System.nanoTime();
            try {
                connect(isDataSample, isFtpConnection, nextTimeout, lastFtpConnection);
                // nanoseconds / 1000000 = milliseconds
                log.warn("Completed webdyn service in {}ms", (System.nanoTime() - tme) / 1000000);
            } catch (final Throwable ex) {
                log.error("Failed webdyn service after {}ms : {}", (System.nanoTime() - tme) / 1000000, ex.getMessage(), ex);
            } finally {
                // Always schedule the following run, even if this one failed
                scheduleNextTimeout(nextTimeout, isFtpConnection ? nextTimeout : lastFtpConnection);
            }
        }, millis, TimeUnit.MILLISECONDS);

        log.warn("Completed webdyn schedule in {}ms, next execution at {} (in {} ms) will run as {}",
                (System.nanoTime() - time) / 1000000, nextTimeout, millis, isFtpConnection ? "ftp-connection" : "data-sample");
    } catch (final Throwable ex) {
        log.error("Fatal error in webdyn schedule : {}", ex.getMessage(), ex);
    }
}
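
The helper nextTimeout is not shown in the question. Judging only from the log (a run scheduled at 22:08:49.993Z targets 22:10:00Z, and a run with current time 22:10:00Z targets 22:20:00Z), it may round up to the next 10-minute boundary. The following is a hypothetical sketch of such a helper, assuming a fixed interval aligned to the epoch; the class name, method name and interval parameter are illustrative and not taken from the production code.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical stand-in for the unseen nextTimeout(Instant) helper.
class NextBoundary {

    // Returns the first instant strictly after 'from' that is a whole multiple
    // of 'interval' counted from the epoch (1970-01-01T00:00:00Z).
    static Instant nextBoundary(Instant from, Duration interval) {
        long period = interval.toMillis();
        if (period <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        long fromMillis = from.toEpochMilli();
        // floorDiv keeps the rounding correct for instants before the epoch as well
        long next = (Math.floorDiv(fromMillis, period) + 1) * period;
        return Instant.ofEpochMilli(next);
    }

    public static void main(String[] args) {
        Duration tenMinutes = Duration.ofMinutes(10);
        // prints 2018-06-07T22:10:00Z
        System.out.println(nextBoundary(Instant.parse("2018-06-07T22:08:49.993Z"), tenMinutes));
        // prints 2018-06-07T22:20:00Z (an instant exactly on a boundary moves to the next one)
        System.out.println(nextBoundary(Instant.parse("2018-06-07T22:10:00Z"), tenMinutes));
    }
}
```

With a helper like this, a late run does not accumulate drift by itself: the next target is derived from the previous target time, not from the time the task actually started.
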
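As I stated in the comment under the question, the problem here is a shared, mutable and non-thread-safe resource (the EXECUTORS attribute) being changed by more than one thread.
It is changed by the main thread at startup, and by whichever thread from the pool is used for the task execution.
It is important to note that even when only a single thread accesses a shared resource at a time (simply because only one task runs at a time), you still need to think about concurrency. That is because, without synchronization, the Java Memory Model does not guarantee that changes made by one thread will ever be visible to other threads, no matter how much later they run.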
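
The visibility rule can be illustrated in isolation. In this minimal sketch (all names are hypothetical), a writer thread publishes a value through a synchronized setter and the calling thread polls a synchronized getter. Because both methods lock the same monitor, the reader is guaranteed to observe the write eventually; with plain unsynchronized field access the Java Memory Model would give no such guarantee.

```java
// Hypothetical holder whose field is only ever touched while holding its monitor.
class SharedHolder {
    private int value; // guarded by the monitor of this object

    synchronized void set(int newValue) {
        value = newValue;
    }

    synchronized int get() {
        return value;
    }
}

class VisibilityDemo {

    // Starts a writer thread that publishes 42, then polls until the value is seen.
    static int awaitPublishedValue() {
        final SharedHolder holder = new SharedHolder();
        Thread writer = new Thread(() -> holder.set(42));
        writer.start();

        int seen;
        // Releasing the monitor in set() happens-before a later acquire in get(),
        // so this loop terminates once the writer has run.
        while ((seen = holder.get()) == 0) {
            Thread.yield();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(awaitPublishedValue()); // prints 42
    }
}
```
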
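So the solution is to make the method scheduleNextTimeout synchronized, which guarantees that the changes are not kept local to the executing thread and are written to main memory.
You could also put a synchronized block (synchronizing on "this") around just the part that accesses the shared resource, but since the system does not seem to be a heavy-duty one and the rest of the code does not seem to take long, there is no need for that...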
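
A minimal sketch of what the suggested change could look like, reduced to the locking pattern. The class SynchronizedRescheduler, its fields and the fixed interval are assumptions made for illustration; the real method also parses configuration and chooses between data samples and FTP connections.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the Webdyn class; only the synchronization matters here.
class SynchronizedRescheduler {
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
    private final AtomicInteger executions = new AtomicInteger();
    private final CountDownLatch done;
    private final Duration interval;

    SynchronizedRescheduler(int maxRuns, Duration interval) {
        this.done = new CountDownLatch(maxRuns);
        this.interval = interval;
    }

    // synchronized: the main thread and every pool thread take the same monitor
    // before computing the next timeout and handing a task to the executor.
    // The alternative from the answer would wrap only the executor.schedule(...)
    // call in a synchronized (this) { ... } block.
    synchronized void scheduleNext(Instant currentTime) {
        final Instant nextTimeout = currentTime.plus(interval);
        final long millis = Math.max(0, Duration.between(Instant.now(), nextTimeout).toMillis());
        executor.schedule(() -> {
            try {
                executions.incrementAndGet(); // placeholder for the real work
            } finally {
                done.countDown();
                if (done.getCount() > 0) {
                    // Reschedule from the previous target time, as in the original code
                    scheduleNext(nextTimeout);
                }
            }
        }, millis, TimeUnit.MILLISECONDS);
    }

    // Runs the chain until the latch reaches zero, then stops the pool.
    int runToCompletion() {
        scheduleNext(Instant.now());
        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        } finally {
            executor.shutdownNow();
        }
        return executions.get();
    }

    public static void main(String[] args) {
        int runs = new SynchronizedRescheduler(3, Duration.ofMillis(50)).runToCompletion();
        System.out.println("executions: " + runs); // prints executions: 3
    }
}
```

Note that the lock is held only while the next run is computed and submitted. The scheduled task itself runs later on a pool thread without the lock and re-acquires it when it calls scheduleNext.
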
Here is a good, short article from which I learned the gist when I first ran into this kind of problem :)
https://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#jsr133
Glad I could help.