Azure continuous WebJob fails in some cases
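I have a continuous WebJob running in Azure. After a larger deployment 8 hours ago, the status stays at "Never Finished" in some cases, while in other cases the job completes.
I have enabled all the logging I could find and have spent hours trying to work out what is wrong.
The only error information I can find in the logs comes from job_log, which says: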
[11/15/2017 14:46:23 > e553e5: ERR ] Unhandled Exception: Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. ---> System.Net.WebException: The remote server returned an error: (404) Not Found.
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException[T](HttpStatusCode expectedStatusCode, HttpStatusCode actualStatusCode, T retVal, StorageCommandBase`1 cmd, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Shared\Protocol\HttpResponseParsers.Common.cs:line 50
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.<DeleteBlobImpl>b__33(RESTCommand`1 cmd, HttpWebResponse resp, Exception ex, OperationContext ctx) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlob.cs:line 3349
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
[11/15/2017 14:46:23 > e553e5: ERR ] --- End of inner exception stack trace ---
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.EndDelete(IAsyncResult asyncResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlob.cs:line 1729
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass4.b__3(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 114
[11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.Azure.WebJobs.Host.Protocols.PersistentQueueWriter`1.<DeleteAsync>d__6.MoveNext()
[11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.Azure.WebJobs.Host.Loggers.CompositeFunctionInstanceLogger.<DeleteLogFunctionStartedAsync>d__e.MoveNext()
[11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.<TryExecuteAsync>d__1.MoveNext()
[11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.Azure.WebJobs.Host.Executors.TriggeredFunctionExecutor`1.d__0.MoveNext()
[11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
[11/15/2017 14:46:23 > e553e5: ERR ] at Microsoft.Azure.WebJobs.Host.Timers.BackgroundExceptionDispatcher.<>c__DisplayClass1.b__0()
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
[11/15/2017 14:46:23 > e553e5: ERR ] at System.Threading.ThreadHelper.ThreadStart()
Can anyone give me some ideas on how to debug this? I have run out of ideas.
The Main method of my WebJob looks like this:
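    static void Main()
    {
        // Build the configuration first so the queue settings apply to the host.
        var config = new JobHostConfiguration();
        config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(30);
        config.Queues.MaxDequeueCount = 3;

        // Pass the configuration in; a host created with new JobHost() ignores it.
        var host = new JobHost(config);
        // The following code ensures that the WebJob will be running continuously
        host.RunAndBlock();
    }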
The function that processes queue messages looks like this:
    public static void ProcessQueueMessage([QueueTrigger("importqueue")] string msg)
    {
        try
        {
            WorkerWebJobCore wwjc = new WorkerWebJobCore();
            wwjc.RunCore(msg, TableStorageAccessResources.ImportQueue,
                TableStorageAccessResources.TableStorageDataOneId,
                TableStorageAccessResources.TableStorageDataOnePassword);
        }
        catch (Exception e)
        {
            // Include the exception details; otherwise the cause of the failure is lost.
            CommunicatorLog.Log.LogError("WebJobWorker", "WebJobWorker",
                "Error in processing queue message: " + e, "ERRWJWF01");
        }
    }
So I catch everything, which is why I don't understand how it can still fail.
Thanks in advance.
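Apparently, running a version of Microsoft.Azure.WebJobs below 2.0.0 makes it impossible to get a useful error message.
When I finally got around to installing that version, it pointed me to the problem with a useful error message.
The problem was a wrong DLL version used by the core work of the WebJob.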
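A DLL version mismatch like this can sometimes be spotted by printing the versions of the assemblies the process has actually loaded. The following is only a minimal sketch using standard .NET reflection; the class name `AssemblyVersionReport` and its `Describe` method are hypothetical helpers made up for this illustration and are not part of the WebJobs SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper (not part of any SDK): reports loaded assembly versions.
public static class AssemblyVersionReport
{
    // Returns one "Name, Version=x.y.z.w" line for every assembly that is
    // currently loaded in this AppDomain and whose name starts with namePrefix.
    public static IReadOnlyList<string> Describe(string namePrefix)
    {
        return AppDomain.CurrentDomain.GetAssemblies()
            .Select(assembly => assembly.GetName())
            .Where(name => name.Name != null &&
                           name.Name.StartsWith(namePrefix, StringComparison.OrdinalIgnoreCase))
            .OrderBy(name => name.Name, StringComparer.OrdinalIgnoreCase)
            .Select(name => $"{name.Name}, Version={name.Version}")
            .ToList();
    }
}
```

Calling something like `foreach (var line in AssemblyVersionReport.Describe("Microsoft.")) Console.WriteLine(line);` from the WebJob would write the versions to the job log. Assemblies are loaded lazily, so only those already loaded at the time of the call are listed.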
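My guess is that something is going wrong with the files in your queue or in the storage itself.
It seems to be trying to delete a file that no longer exists. Or perhaps the "larger" one is being deleted.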
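The stack trace in the question suggests why the try/catch in `ProcessQueueMessage` does not help: the failing delete is issued by the host's own logging code (`CompositeFunctionInstanceLogger.DeleteLogFunctionStartedAsync`), not by the function body. The sketch below only simulates that situation with plain .NET tasks. Every name in it (`EscapedExceptionDemo`, `UserFunction`, `HostCleanupAsync`, `RunAsync`) is hypothetical, and it does not call the WebJobs or Storage SDKs.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical simulation: shows that a catch block inside the function
// cannot see a failure that happens in the caller's bookkeeping afterwards.
public static class EscapedExceptionDemo
{
    // Stands in for the queue function: it catches everything it throws itself.
    public static void UserFunction()
    {
        try
        {
            // The real work would happen here.
        }
        catch (Exception)
        {
            // Only exceptions raised inside the try block end up here.
        }
    }

    // Stands in for host-side cleanup that runs after the function returns.
    public static Task HostCleanupAsync()
    {
        return Task.FromException(
            new InvalidOperationException("404 Not Found (simulated)"));
    }

    // Stands in for the host: runs the function, then its own cleanup.
    public static async Task<string> RunAsync()
    {
        UserFunction();
        try
        {
            await HostCleanupAsync();
            return "completed";
        }
        catch (InvalidOperationException ex)
        {
            return "host failed: " + ex.Message;
        }
    }
}
```

Here `EscapedExceptionDemo.RunAsync()` yields `"host failed: 404 Not Found (simulated)"` even though `UserFunction` catches every exception. In the real job nothing catches the host-side exception, so it surfaces as the unhandled exception seen in the log.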
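After digging deeper, there may also be a problem with the way your WebJob is deployed. Could it sometimes be deployed differently? Take a look at these: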
https://github.com/Azure/azure-webjobs-sdk/issues/922
Azure WebJob QueueTrigger message is not deleted from queue