WebJob aborts with an exception, apparently due to idling too long

I am extracting a .zip archive in an Azure WebJob.

This worked fine for a while.

Now the WebJob has suddenly started failing:

[12/11/2017 16:59:57 > bf607f: ERR ] Command 'cmd /c ""run.cmd""' was aborted due to no output nor CPU activity for 121 seconds. You can increase the SCM_COMMAND_IDLE_TIMEOUT app setting (or WEBJOBS_IDLE_TIMEOUT if this is a WebJob) if needed.
cmd /c ""run.cmd""
[12/11/2017 16:59:57 > bf607f: ERR ] replace D:\home\site\store\extracted/documents/6465465465466015.pdf? [y]es, [n]o, [A]ll, [N]one, [r]ename: 
[12/11/2017 16:59:57 > bf607f: SYS INFO] Status changed to Failed
[12/11/2017 16:59:57 > bf607f: SYS ERR ] System.AggregateException: One or more errors occurred. ---> Kudu.Core.Infrastructure.CommandLineException: Command 'cmd /c ""run.cmd""' was aborted due to no output nor CPU activity for 121 seconds. You can increase the SCM_COMMAND_IDLE_TIMEOUT app setting (or WEBJOBS_IDLE_TIMEOUT if this is a WebJob) if needed.
cmd /c ""run.cmd""
   at Kudu.Core.Infrastructure.IdleManager.WaitForExit(IProcess process)
   at Kudu.Core.Infrastructure.ProcessExtensions.<Start>d__12.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Kudu.Core.Infrastructure.Executable.<ExecuteAsync>d__31.MoveNext()
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at System.Threading.Tasks.Task`1.get_Result()
   at Kudu.Core.Infrastructure.Executable.ExecuteInternal(ITracer tracer, Func`2 onWriteOutput, Func`2 onWriteError, Encoding encoding, String arguments, Object[] args)
   at Kudu.Core.Infrastructure.Executable.ExecuteReturnExitCode(ITracer tracer, Action`1 onWriteOutput, Action`1 onWriteError, String arguments, Object[] args)
   at Kudu.Core.Jobs.BaseJobRunner.RunJobInstance(JobBase job, IJobLogger logger, String runId, String trigger, ITracer tracer, Int32 port)
---> (Inner Exception #0) ExitCode: -1, Output: Command 'cmd /c ""run.cmd""' was aborted due to no output nor CPU activity for 121 seconds. You can increase the SCM_COMMAND_IDLE_TIMEOUT app setting (or WEBJOBS_IDLE_TIMEOUT if this is a WebJob) if needed., Error: Command 'cmd /c ""run.cmd""' was aborted due to no output nor CPU activity for 121 seconds. You can increase the SCM_COMMAND_IDLE_TIMEOUT app setting (or WEBJOBS_IDLE_TIMEOUT if this is a WebJob) if needed., Kudu.Core.Infrastructure.CommandLineException: Command 'cmd /c ""run.cmd""' was aborted due to no output nor CPU activity for 121 seconds. You can increase the SCM_COMMAND_IDLE_TIMEOUT app setting (or WEBJOBS_IDLE_TIMEOUT if this is a WebJob) if needed.
cmd /c ""run.cmd""
   at Kudu.Core.Infrastructure.IdleManager.WaitForExit(IProcess process)
   at Kudu.Core.Infrastructure.ProcessExtensions.<Start>d__12.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Kudu.Core.Infrastructure.Executable.<ExecuteAsync>d__31.MoveNext()<---

The WebJob seems to have been quite busy when the exception occurred, so why do I get an idle timeout exception?

I suggest adding the SCM_COMMAND_IDLE_TIMEOUT and WEBJOBS_IDLE_TIMEOUT settings to the "Application settings" configuration of your web app, with values of your choice.

For example:

SCM_COMMAND_IDLE_TIMEOUT = 3600

WEBJOBS_IDLE_TIMEOUT = 3600
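To check which value the job actually sees, it can log the setting at startup. The following is a minimal sketch, assuming the app setting reaches the WebJob process as an environment variable of the same name. The helper name `idle_timeout_seconds` is hypothetical, and the fallback of 120 seconds is an assumption inferred from the "121 seconds" in the log above, not a documented guarantee.

```python
import os


def idle_timeout_seconds(env=None, default=120):
    """Return WEBJOBS_IDLE_TIMEOUT in seconds, or `default` if unset or invalid.

    `default=120` is an assumption based on the 121-second abort in the log.
    """
    env = os.environ if env is None else env
    raw = env.get("WEBJOBS_IDLE_TIMEOUT", "")
    try:
        value = int(raw)
    except ValueError:
        # Missing or non-numeric setting: fall back to the assumed default.
        return default
    return value if value > 0 else default


if __name__ == "__main__":
    # flush=True so the line reaches the WebJob log immediately.
    print(f"Effective idle timeout: {idle_timeout_seconds()} s", flush=True)
```

If the printed value is still the fallback after you change the setting, the setting was probably saved somewhere other than the web app's application settings.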

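If it is not already enabled, you can also turn on the "Always On" feature and see whether that helps.

By default, web apps are unloaded after they have been idle for some period of time. This lets the system conserve resources. In Basic or Standard mode, you can enable "Always On" to keep the app loaded at all times. If your app runs continuous WebJobs, you should enable "Always On", or the WebJobs may not run reliably. To enable it, go to Web App -> Settings -> Application settings -> enable "Always On".

Also, see the diagnostic log stream for more details about this issue.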

I mean it seems when the exception happened the webjob was pretty busy, so why do I get that idle timeout exception?

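Root cause:

The job wrote no output to the console for a long time.

Solution:

As Ashok mentioned, you can increase the WEBJOBS_IDLE_TIMEOUT value. It must be set in the web app's configuration settings, not in the WebJob's App.config, and the value is in seconds.

Alternatively, you can write output to the console every minute. For more details, see this blog.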

Another solution is to add output to the Console, which is especially useful for jobs that are doing long-running asynchronous tasks or polling external services. For these cases, adding a heartbeat-style Console write every minute is better than increasing the idle timeout to huge numbers.
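The heartbeat idea can be sketched as below. This is only an illustration, assuming the job is a script whose standard output ends up in the WebJob log. `run_with_heartbeat` and its parameters are hypothetical names, not part of any WebJobs SDK. A background thread prints one line per interval while the real work runs, so the console never stays silent for longer than `interval` seconds.

```python
import sys
import threading
import time


def run_with_heartbeat(work, interval=60.0, out=sys.stdout):
    """Run work() while a background thread prints a heartbeat every `interval` seconds.

    Returns whatever work() returns; the heartbeat stops when work() finishes or raises.
    """
    done = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once `done` is set.
        while not done.wait(interval):
            stamp = time.strftime("%H:%M:%S")
            # flush=True so each line is written out immediately.
            print(f"[{stamp}] heartbeat: job still running", file=out, flush=True)

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        return work()
    finally:
        done.set()
        thread.join()
```

A long-running step would then be wrapped as `run_with_heartbeat(lambda: do_long_task())`, where `do_long_task` stands in for your own function. Keep `interval` comfortably below the configured idle timeout.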