How do you load balance Azure Cloud Services WebRoles when loading a cache?

Given: we are on .NET 4.5.2 and OS family "4" (Windows Server 2012).

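When the web application starts, we want to load a cache (from blob storage) that takes roughly 10 minutes. (We have looked into moving it, but currently can't.)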

Then, when the IIS application pool recycles, we want the site to stay up.

Currently the default IIS settings for Cloud Services are:

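Because we run 2 WebHosts by default, we want to recycle the application pools at different times. If one of the web hosts is loading the cache, we would ideally like existing connections to be redirected away from that host.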

So far we have a startup task script that reconfigures the IIS AppPools:

appcmd set config -section:system.applicationHost/applicationPools 

  /applicationPoolDefaults.autoStart:"True"
  /applicationPoolDefaults.startMode:"AlwaysRunning"
  /applicationPoolDefaults.processModel.idleTimeout:"00:00:00" 
  /applicationPoolDefaults.recycling.logEventOnRecycle:"Time,Requests,Schedule,Memory,IsapiUnhealthy,OnDemand,ConfigChange,PrivateMemory"
  /applicationPoolDefaults.recycling.periodicRestart.time:"00:00:00" 
  /~"applicationPoolDefaults.recycling.periodicRestart.schedule" 
  /+"applicationPoolDefaults.recycling.periodicRestart.schedule.[value='06:00:00']" 
  /applicationPoolDefaults.failure.loadBalancerCapabilities:"TcpLevel" 

For example:

%windir%\system32\inetsrv\appcmd set config -section:applicationPools /applicationPoolDefaults.autoStart:"True" /commit:apphost

As for code, we have looked at reporting a Busy flag until the cache has loaded. This does not appear to re-route traffic:

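RoleEnvironment.StatusCheck += WebRoleEnvironment_StatusCheck;

void WebRoleEnvironment_StatusCheck(object sender, RoleInstanceStatusCheckEventArgs e)
{
    // Report Busy to the load balancer until the cache has loaded.
    if (Busy)
    {
        e.SetBusy();
    }
}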

The drawback is that the cache load is done in Application_Start, because it needs the container. I think moving LoadCache() into the RoleEntryPoint OnStart() would be too difficult.

Note: we also have "Keep-alive" turned on by default.

Questions:

  1. How do we take a WebHost offline while it is loading the cache?
  2. Should we change the IIS settings? https://azure.microsoft.com/en-gb/blog/iis-reset-on-windows-azure-web-role/
  3. Should we use IIS 8.0 Application Initialization? http://fabriccontroller.net/iis-8-0-application-initialization-module-in-a-windows-azure-web-role/
  4. What should loadBalancerCapabilities be set to? https://docs.microsoft.com/en-us/iis/configuration/system.applicationhost/applicationpools/add/failure
  5. Should we try to stagger the recycles? What about when we scale out (add more instances)? Does Azure prevent role instances from being recycled at the same time?

See https://blogs.msdn.microsoft.com/kwill/2012/09/19/role-instance-restarts-due-to-os-upgrades/, in particular Common Issue #5:

If your website takes several minutes to warmup (either standard IIS/ASP.NET warmup of precompilation and module loading, or warming up a cache or other app specific tasks) then your clients may experience an outage or random timeouts. After a role instance restarts and your OnStart code completes then your role instance will be put back in the load balancer rotation and will begin receiving incoming requests. If your website is still warming up then all of those incoming requests will queue up and time out. If you only have 2 instances of your web role then IN_0, which is still warming up, will be taking 100% of the incoming requests while IN_1 is being restarted for the Guest OS update. This can lead to a complete outage of your service until your website is finished warming up on both instances. It is recommended to keep your instance in OnStart, which will keep it in the Busy state where it won't receive incoming requests from the load balancer, until your warmup is complete. You can use the following code to accomplish this:

 public class WebRole : RoleEntryPoint {  
   public override bool OnStart () {  
     // For information on handling configuration changes  
     // see the MSDN topic at http://go.microsoft.com/fwlink/?LinkId=166357.  
     IPHostEntry ipEntry = Dns.GetHostEntry (Dns.GetHostName ());  
     string ip = null;  
     foreach (IPAddress ipaddress in ipEntry.AddressList) {  
       if (ipaddress.AddressFamily.ToString () == "InterNetwork") {  
         ip = ipaddress.ToString ();  
       }  
     }  
     string urlToPing = "http://" + ip;  
     HttpWebRequest req = HttpWebRequest.Create (urlToPing) as HttpWebRequest;  
     WebResponse resp = req.GetResponse ();  
     return base.OnStart ();  
   }  
 }  

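Based on your description, and on my own understanding and experience, I think it is almost impossible to meet all of your requirements in the current setup; an architectural change is needed.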

Here are my thoughts.

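  1. I guess the cache blob file is so large that loading it from blob storage takes a long time. To cut that time, I think the options are: use usage statistics to split the cache blob into many smaller files and load them concurrently; or use table storage instead of blob storage as an L2 cache, querying only the cache data you need from table storage and keeping it in memory as an L1 cache with an expiry time. You could even store the cache data in Azure Redis Cache, which is faster than table storage.
  2. Make sure there is a retry mechanism for keep-alive connections. Then, when a role instance is stopped or restarted, existing connections will be redirected to another role instance.
  3. To restart a role instance on demand, you can use the Reboot Role Instance REST API.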
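
The "split and load concurrently" idea in point 1 could look roughly like the sketch below. This is only an illustration under assumptions, not the asker's code: `ParallelCacheLoader`, `LoadAllAsync`, and the `loadChunkAsync` delegate are hypothetical names, and the delegate stands in for whatever actually downloads and deserializes one chunk from blob storage. A concurrency cap is assumed so that a cold start does not open an unbounded number of connections.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelCacheLoader
{
    // Loads every chunk concurrently (at most maxParallel at a time) and
    // returns the results in the same order as chunkNames.
    public static async Task<IReadOnlyList<T>> LoadAllAsync<T>(
        IReadOnlyList<string> chunkNames,
        Func<string, Task<T>> loadChunkAsync, // hypothetical: downloads one chunk
        int maxParallel)
    {
        if (maxParallel < 1) throw new ArgumentOutOfRangeException(nameof(maxParallel));

        using (var gate = new SemaphoreSlim(maxParallel))
        {
            // ToList() forces the query so all tasks are started immediately;
            // the semaphore then limits how many downloads run at once.
            var tasks = chunkNames.Select(async name =>
            {
                await gate.WaitAsync().ConfigureAwait(false);
                try { return await loadChunkAsync(name).ConfigureAwait(false); }
                finally { gate.Release(); }
            }).ToList();

            // WhenAll completes only after every task has finished.
            return await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

Whether this helps depends on where the 10 minutes actually go: if the time is dominated by deserialization on a single core rather than by download latency, parallel downloads alone will not shorten it much.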

Hope it helps.

This is what we ended up with:

EDIT: Changed to HttpWebRequest so that redirects are supported.

a) After a VM deployment or OS patching, we poll the httpsIn endpoint in OnStart():

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        // Note: the Web Requests all run in IIS, not from this process.
        // So, we aren't disabling certs globally, just for checks against our own endpoint.
        ServicePointManager.ServerCertificateValidationCallback += (o, certificate, chain, errors) => true;

        var address = GetAddress("httpIn");

        var request = (HttpWebRequest)WebRequest.Create(address);
        request.MaximumAutomaticRedirections = 1;
        request.AllowAutoRedirect = false;
        var response = request.GetResponse() as HttpWebResponse;
        //_logger.WriteEventLog($"Response: '{response?.StatusCode}'");
        return base.OnStart();
    }

    static Uri GetAddress(string endpointName)
    {
        var endpoint = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints[endpointName];
        var address = $"{endpoint.Protocol}://{endpoint.IPEndpoint.Address}:{endpoint.IPEndpoint.Port}";
        return new Uri(address);
    }
}
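
One thing to watch with this approach: `HttpWebRequest.Timeout` defaults to 100 seconds, so a single `GetResponse()` call throws a `WebException` if the first request takes longer than that to come back. A possible variation, sketched here under assumptions, retries the probe until a deadline instead. `WarmupProbe`, `WaitUntilReady`, and the `probe` delegate are hypothetical names and not part of the Azure SDK; in `OnStart()` the delegate would wrap the `WebRequest.Create(...)` and `GetResponse()` calls.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class WarmupProbe
{
    // Calls probe() repeatedly until it returns true or the deadline passes.
    // Returns true if the probe succeeded, false if the deadline was reached.
    public static bool WaitUntilReady(Func<bool> probe, TimeSpan retryDelay, TimeSpan deadline)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                if (probe()) return true;
            }
            catch (Exception)
            {
                // Sketch-level handling: any failure (for example a WebException
                // caused by a timeout) is treated as "still warming up".
            }

            if (clock.Elapsed >= deadline) return false;
            Thread.Sleep(retryDelay);
        }
    }
}
```

What to do when this returns false is a design choice: returning `base.OnStart()` anyway puts a cold instance into rotation, while throwing causes the role to restart.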

b) For AppPool recycles, we report Busy from Global.asax:

public class RoleEnvironmentReadyCheck
{
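    // Written by Application_Start and read by the StatusCheck callback,
    // which may run on a different thread.
    volatile bool _isBusy = true;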

    public RoleEnvironmentReadyCheck()
    {
        RoleEnvironment.StatusCheck += RoleEnvironment_StatusCheck;
    }

    void RoleEnvironment_StatusCheck(object sender, RoleInstanceStatusCheckEventArgs e)
    {
        if (_isBusy)
        {
            e.SetBusy();
        }
    }

    public void SetReady()
    {
        _isBusy = false;
    }
}

public class WebApiApplication : HttpApplication
{
    protected void Application_Start()
    {
        var roleStatusCheck = new RoleEnvironmentReadyCheck();
        //SuperLoadCache()
        roleStatusCheck.SetReady();
    }
}

c) For AppPool recycles, we pick a time of day (03:00 AM), stagger the role instances by 30 minutes each, and disable the idle timeout, all in the PowerShell script ConfigureIIS.ps1:

$InstanceId = $env:INSTANCEID
$role = ($InstanceId -split '_')[-1]
$roleId = [int]$role
$gapInMinutes = 30
$startTime = New-TimeSpan -Hours 3
$offset = New-TimeSpan -Minutes ($gapInMinutes * $roleId)
$time = $startTime + $offset
$timeInDay = "{0:hh\:mm\:ss}" -f $time

Write-Host "ConfigureIIS with role: $role to $timeInDay"

& $env:windir\system32\inetsrv\appcmd set config -section:system.applicationHost/applicationPools /applicationPoolDefaults.processModel.idleTimeout:"00:00:00" /commit:apphost
& $env:windir\system32\inetsrv\appcmd set config -section:system.applicationHost/applicationPools /applicationPoolDefaults.recycling.logEventOnRecycle:"Time,Requests,Schedule,Memory,IsapiUnhealthy,OnDemand,ConfigChange,PrivateMemory" /commit:apphost
& $env:windir\system32\inetsrv\appcmd set config -section:system.applicationHost/applicationPools /applicationPoolDefaults.recycling.periodicRestart.time:"00:00:00" /commit:apphost
& $env:windir\system32\inetsrv\appcmd set config -section:system.applicationHost/applicationPools /~"applicationPoolDefaults.recycling.periodicRestart.schedule" /commit:apphost
& $env:windir\system32\inetsrv\appcmd set config -section:system.applicationHost/applicationPools /+"applicationPoolDefaults.recycling.periodicRestart.schedule.[value='$timeInDay']" /commit:apphost
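
To make the staggering arithmetic concrete, here is the same calculation as a small C# sketch. `RecycleSchedule` and `TimeFor` are hypothetical names, and the instance id shape (`<RoleName>_IN_<n>`) is an assumption based on the script taking the part after the last `_`. Note that the `hh` format specifier prints only the hours component, so offsets of 24 hours or more wrap around: with a 03:00 start and a 30-minute gap, instance 42 lands on 00:00:00 and instance 48 would collide with instance 0.

```csharp
using System;
using System.Globalization;

public static class RecycleSchedule
{
    // Mirrors the PowerShell above: start time + (gap * instance index).
    public static string TimeFor(string instanceId, TimeSpan start, int gapInMinutes)
    {
        // Assumed id shape: "<RoleName>_IN_<n>"; take the part after the last '_'.
        string suffix = instanceId.Substring(instanceId.LastIndexOf('_') + 1);
        int index = int.Parse(suffix, CultureInfo.InvariantCulture);

        TimeSpan time = start + TimeSpan.FromMinutes((double)gapInMinutes * index);

        // "hh" is the hours component only (0-23), so values of 24h or more wrap.
        return time.ToString(@"hh\:mm\:ss", CultureInfo.InvariantCulture);
    }
}
```

For example, `RecycleSchedule.TimeFor("WebRole_IN_1", TimeSpan.FromHours(3), 30)` returns `"03:30:00"`.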

And pass the RoleId through to ConfigureIIS.cmd:

PowerShell -ExecutionPolicy Unrestricted .\ConfigureIIS.ps1 >> "%TEMP%\StartupLog.txt" 2>&1

EXIT /B 0

This is wired up in ServiceDefinition.csdef:

  <Task commandLine="ConfigureIIS.cmd" executionContext="elevated" taskType="simple">
    <Environment>
      <Variable name="INSTANCEID">
        <RoleInstanceValue xpath="/RoleEnvironment/CurrentInstance/@id"/>
      </Variable>
    </Environment>
  </Task>