Azure Application Insights sampling changed sampling rate

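One of our Azure Functions v3 apps went from about 200 MB of Application Insights ingestion to roughly 18 GB. We did not add any logging statements, change any SDKs, or trigger any additional function executions. We do not reference the App Insights SDK in our project, so it uses whatever version Azure has installed. Running the query below, which Microsoft recommends for showing the sampling percentage, makes it clear that adaptive sampling changed.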

union requests,dependencies,pageViews,browserTimings,exceptions,traces
| where timestamp > ago(50d)
| summarize RetainedPercentage = 100/avg(itemCount) by bin(timestamp, 1h), itemType
| order by timestamp, itemType
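
For anyone reading the query: as I understand it, each stored telemetry record carries an itemCount saying how many original items it stands for, so 100/avg(itemCount) approximates the percentage that was retained. Here is a minimal Python sketch of the same arithmetic. The function name and the itemCount values are made up for illustration and are not data from our app.

```python
def retained_percentage(item_counts):
    """Mirror of the KQL expression 100/avg(itemCount)."""
    if not item_counts:
        raise ValueError("need at least one record")
    return 100 / (sum(item_counts) / len(item_counts))


# Hypothetical values: every stored record stands for 4 original items.
heavily_sampled = [4, 4, 4, 4]
# Hypothetical values: nothing was sampled out, each record stands for itself.
unsampled = [1, 1, 1, 1]

print(retained_percentage(heavily_sampled))  # 25.0
print(retained_percentage(unsampled))        # 100.0
```

So a RetainedPercentage that climbs toward 100 means sampling is dropping less, which by itself would increase ingestion even with unchanged load.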

This is before the spike:

This is after the spike:

This is the host.json:

{
  "version": "2.0",
  "logging": {
    "logLevel": {
      "default": "Information",
      "Host.Triggers.DurableTask": "Warning",
      "DurableTask.AzureStorage": "Warning",
      "DurableTask.Core": "Warning"
    },
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensions": {
    "eventHubs": {
      "batchCheckpointFrequency": 1,
      "eventProcessorOptions": {
        "maxBatchSize": 64,
        "prefetchCount": 128
      }
    },
    "durableTask": {
      "hubName": "FooDevicesTaskHub",
      "storageProvider": {
        "connectionStringName": "AzureWebJobsStorageDurable"
      },
      "tracing": {
        "traceInputsAndOutputs": false,
        "traceReplayEvents": false
      }
    },
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}

These are the package references:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <AzureFunctionsVersion>v3</AzureFunctionsVersion>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="AutoMapper" Version="8.1.1" />
    <PackageReference Include="Azure.Storage.Blobs" Version="12.8.0" />
    <PackageReference Include="Azure.Storage.Files.DataLake" Version="12.2.2" />
    <PackageReference Include="Microsoft.Azure.Devices" Version="1.18.1" />
    <PackageReference Include="Microsoft.Azure.EventGrid" Version="3.2.0" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.CosmosDB" Version="3.0.7" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.DurableTask" Version="2.5.1" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.EventGrid" Version="2.1.0" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.EventHubs" Version="4.1.1" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.ServiceBus" Version="4.3.0" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.Storage" Version="4.0.3" />
    <PackageReference Include="Microsoft.Extensions.Http" Version="3.1.7" />
    <PackageReference Include="Microsoft.NET.Sdk.Functions" Version="3.0.13" />
    <PackageReference Include="Microsoft.Azure.Functions.Extensions" Version="1.0.0" />
    <PackageReference Include="Polly" Version="7.2.1" />
    <PackageReference Include="Polly.Contrib.WaitAndRetry" Version="1.1.1" />
    <PackageReference Include="SendGrid" Version="9.24.2" />
    <PackageReference Include="System.Net.Http.Json" Version="5.0.0" />
  </ItemGroup>
</Project>

Adding more query results based on the comments:

traces
| summarize sum(itemCount), count(), dcount(strcat(cloud_RoleName, "/")) by bin(timestamp, 30sec)
| render timechart

Before:

After:

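Any ideas on what could be causing this, or what to look for? We have a ticket open with MS, but they have been investigating for a couple of weeks.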

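Adaptive sampling works per app instance. So if the load per node went down (either the overall load decreased, or you re-architected your application {switched to a different plan, etc.} and now run on smaller instances, etc.), that could explain these numbers: each instance sees less traffic, samples less aggressively, and retains a larger share of its telemetry.

To check whether this is the case, you can output the following columns: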

sum(itemCount), count(), dcount(strcat(cloud_RoleName, "/", cloud_RoleInstance), 4)
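
To illustrate why per-instance sampling can produce this effect, here is a deliberately simplified Python sketch. It assumes each instance independently keeps at most a fixed number of items per second and that load is spread evenly across instances. The function name and the traffic numbers are hypothetical, and the default cap of 20 is my assumption based on the `maxTelemetryItemsPerSecond` sampling setting in the Functions host.json. Real adaptive sampling adjusts its percentage over evaluation intervals rather than applying a hard cap, so treat this as a rough model only.

```python
def stored_items_per_second(total_items_per_second, instance_count, per_instance_cap=20):
    """Toy model of per-instance sampling.

    Each instance is assumed to keep at most per_instance_cap items per
    second, and the total load is assumed to be split evenly.
    """
    load_per_instance = total_items_per_second / instance_count
    kept_per_instance = min(load_per_instance, per_instance_cap)
    return kept_per_instance * instance_count


# Hypothetical numbers: the same 400 items/s overall, spread over more instances.
for instances in (1, 4, 20):
    kept = stored_items_per_second(400, instances)
    print(f"{instances} instance(s): {100 * kept / 400:.0f}% retained")
```

In this toy model the same overall load goes from 5% retained on one instance to 100% retained on twenty instances, with no change in what the app logs. If the dcount column in the query above rises at the time of the spike, that points in this direction.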