Correct Implementation of Transient Fault Handling (Azure)

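For the past day or so I have been trying to implement transient fault handling against an Azure SQL database. Although I have a working connection to the database, I am not convinced that it handles transient faults as expected.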

My approach so far involves:

public static void SetRetryStratPol()
{
    const string defaultRetryStrategyName = "default";

    var strategy = new Incremental(defaultRetryStrategyName, 3, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(2));
    var strategies = new List<RetryStrategy> { strategy };
    var manager = new RetryManager(strategies, defaultRetryStrategyName);
    RetryManager.SetDefault(manager);
    retryPolicy = new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(strategy);
    retryPolicy.Retrying += (obj, eventArgs) =>
                            {
                                var msg = String.Format("Retrying, CurrentRetryCount = {0} , Delay = {1}, Exception = {2}", eventArgs.CurrentRetryCount, eventArgs.Delay, eventArgs.LastException.Message);
                                System.Diagnostics.Debug.WriteLine(msg);
                            };
}

I call this method from Application_Start() in Global.asax. [retryPolicy is a global static variable on a static class, which also contains the next method.]

Then I have a method:

public static ReliableSqlConnection GetReliableConnection()
{
    var conn = new ReliableSqlConnection("Server=...,1433;Database=...;User ID=...;Password=...;Trusted_Connection=False;Encrypt=True;Connection Timeout=30;", retryPolicy);

    conn.Open();

    return conn;
}

Then I use that method:

using (var conn = GetReliableConnection())
using (var cmd = conn.CreateCommand())
{
    cmd.CommandText = "SELECT COUNT(*) FROM ReliabilityTest";

    result = (int) cmd.ExecuteScalarWithRetry();

    return View(result);
}

So far, this works. Then, to test the retry policy, I tried using a wrong username (as suggested here).

But when I step through the code, the cursor jumps immediately to my catch statement, with this message:

Login failed for user '[my username]'.

I would have expected this exception to be caught only after a few seconds of retrying, but there was no delay at all.

I also tried it with Entity Framework, following this post exactly, but got the same result.

What am I missing? Is there a configuration step I skipped, or am I inducing the transient fault incorrectly?

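The Transient Fault Handling Application Block is meant for handling transient errors. A login failure caused by an incorrect username/password is certainly not one of them, so the policy does not retry it and the exception surfaces immediately. From this page: http://msdn.microsoft.com/en-us/library/dn440719%28v=pandp.60%29.aspx: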

What Are Transient Faults?

When an application uses a service, errors can occur because of temporary conditions such as intermittent service, infrastructure-level faults, network issues, or explicit throttling by the service; these types of error occur more frequently with cloud-based services, but can also occur in on-premises solutions. If you retry the operation a short time later (maybe only a few milliseconds later) the operation may succeed. These types of error conditions are referred to as transient faults. Transient faults typically occur very infrequently, and in most cases, only a few retries are necessary for the operation to succeed.

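You may want to check the source code of this application block (http://topaz.codeplex.com/) and see which error codes returned from SQL Database are considered transient and are therefore retried.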

You can always extend the functionality and treat a failed login as one of the transient errors in order to test your code.

Update

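Please take a look at the source code here: http://topaz.codeplex.com/SourceControl/latest#source/Source/TransientFaultHandling.Data/SqlDatabaseTransientErrorDetectionStrategy.cs. This is where the retry magic happens. What you can do is create a class (let's call it CustomSqlDatabaseTransientErrorDetectionStrategy) and copy the entire code from the link into this class. Then, for testing purposes, you can add the login failed scenario as one of the transient errors and use this class in your application instead of SqlDatabaseTransientErrorDetectionStrategy.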
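As a rough sketch of that idea, the class below wraps the built-in strategy instead of copying its source, which is a slightly different route to the same result. It assumes that the login failure surfaces as a SqlException carrying SQL Server error number 18456 ("Login failed for user"), and that the block's ITransientErrorDetectionStrategy interface with its IsTransient(Exception) method is what RetryPolicy<T> consults. The class body and the constant name are illustrative, not taken from the block itself, and this is intended for testing only.

```csharp
using System;
using System.Data.SqlClient;
using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling;

// Test-only strategy: treats "login failed" as transient so that the retry
// path can be observed. Do not use this in production code.
public class CustomSqlDatabaseTransientErrorDetectionStrategy : ITransientErrorDetectionStrategy
{
    // Assumption: 18456 is the SQL Server error number for "Login failed for user".
    private const int LoginFailedErrorNumber = 18456;

    // Delegate every other decision to the block's own detection logic.
    private readonly ITransientErrorDetectionStrategy inner =
        new SqlDatabaseTransientErrorDetectionStrategy();

    public bool IsTransient(Exception ex)
    {
        var sqlException = ex as SqlException;
        if (sqlException != null)
        {
            foreach (SqlError error in sqlException.Errors)
            {
                if (error.Number == LoginFailedErrorNumber)
                {
                    return true; // retried only because we are testing
                }
            }
        }

        return inner.IsTransient(ex);
    }
}
```

With this in place, the policy from the question would be built as new RetryPolicy<CustomSqlDatabaseTransientErrorDetectionStrategy>(strategy). With the Incremental strategy shown there (3 retries, starting at 1 second and growing by 2 seconds), the Retrying handler should then fire before the login exception finally reaches the catch block.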