TPL DataFlow is being idle with no reason

Consider the following DataFlow pipeline on a 16-core machine:

Extract - TransformBlock. BoundedCapacity: 32, DOP: 16
Download - TransformBlock. BoundedCapacity: 1,024, DOP: 64
Process - TransformBlock. BoundedCapacity: 50,000, DOP: 16
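For readers who want something concrete, here is a minimal sketch of how three blocks with these settings might be declared and linked in the order Extract, Download, Process. The message types (string, Uri, byte[], int) and the delegate bodies are hypothetical placeholders, since the question does not show the real ones; only the BoundedCapacity and MaxDegreeOfParallelism values come from the description above.

```csharp
using System;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

// Hypothetical types and delegates; the real work done by each block is not shown in the question.
var extract = new TransformBlock<string, Uri>(
    path => new Uri(path), // placeholder for the real extraction
    new ExecutionDataflowBlockOptions { BoundedCapacity = 32, MaxDegreeOfParallelism = 16 });

var download = new TransformBlock<Uri, byte[]>(
    uri => Array.Empty<byte>(), // placeholder for the real (synchronous) download
    new ExecutionDataflowBlockOptions { BoundedCapacity = 1024, MaxDegreeOfParallelism = 64 });

var process = new TransformBlock<byte[], int>(
    bytes => bytes.Length, // placeholder for the real processing
    new ExecutionDataflowBlockOptions { BoundedCapacity = 50_000, MaxDegreeOfParallelism = 16 });

// Link the blocks so that messages and completion flow downstream.
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
extract.LinkTo(download, linkOptions);
download.LinkTo(process, linkOptions);
```

With bounded blocks, a full downstream block stops accepting messages, so its upstream neighbor holds its output until space frees up.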

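This is the execution order of the pipeline: Extract --> Download --> Process. We have observed that our pipeline sometimes gets "stuck" and does not consume all of the messages. We added some traces to check what is going on inside, and verified that this is indeed the case. For example: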

Timestamp                       BlockName   Input   Output  DOP Total
2022-02-24 17:16:21.0160704     Extract     0       32      0   32
2022-02-24 17:16:21.0160704     Download    0       921     1   922
2022-02-24 17:16:21.0160704     Process     0       0       1   1
2022-02-24 17:16:21.0785734     Extract     0       32      0   32
2022-02-24 17:16:21.0785734     Download    0       921     1   922
2022-02-24 17:16:21.0785734     Process     0       0       1   1
2022-02-24 17:16:21.1254470     Extract     0       32      0   32
2022-02-24 17:16:21.1254470     Download    0       1024    0   1024
2022-02-24 17:16:21.1254470     Process     0       0       1   1
2022-02-24 17:16:21.1723229     Extract     0       32      0   32
2022-02-24 17:16:21.1723229     Download    0       1024    0   1024
2022-02-24 17:16:21.1723229     Process     0       0       1   1
2022-02-24 17:16:21.2191997     Extract     0       32      0   32
2022-02-24 17:16:21.2191997     Download    0       1024    0   1024
2022-02-24 17:16:21.2191997     Process     0       0       1   1
2022-02-24 17:16:21.2660764     Extract     0       32      0   32
2022-02-24 17:16:21.2660764     Download    0       1024    0   1024
2022-02-24 17:16:21.2660764     Process     0       0       1   1
2022-02-24 17:16:21.3285760     Extract     0       32      0   32
2022-02-24 17:16:21.3285760     Download    0       1024    0   1024
2022-02-24 17:16:21.3285760     Process     0       0       1   1
2022-02-24 17:16:21.3754516     Extract     0       32      0   32
2022-02-24 17:16:21.3754516     Download    0       1024    0   1024
2022-02-24 17:16:21.3754516     Process     0       0       1   1
2022-02-24 17:16:21.4223896     Extract     0       29      0   29
2022-02-24 17:16:21.4223896     Download    0       992     15  1007
2022-02-24 17:16:21.4223896     Process     0       7       1   8

As you can see here, Download only started moving its messages at 2022-02-24 17:16:21.4223896, although it was already "full" at 2022-02-24 17:16:21.1254470.

My question is: what happened during these 297 milliseconds? Looking at the thread count at that specific time, it was not high at all...

Judging by your other recent question, my guess is that the behavior of your pipeline is dominated by the behavior of the heavily saturated ThreadPool. The blocks are competing with each other for the few, slowly increasing available ThreadPool threads, which makes the MaxDegreeOfParallelism configuration of the blocks mostly irrelevant for approximately the first couple of minutes after the application starts. Eventually the ThreadPool will grow enough to satisfy the demand, but with an injection rate of only about one new thread every second, this will take a while. Since your application makes such heavy use of the ThreadPool, it might be a good idea to call the ThreadPool.SetMinThreads method at the start of the program, and configure the minimum number of threads that the ThreadPool will create immediately on demand before switching to the slow algorithm.
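As a rough sketch of that suggestion, the call could look like the following at the start of the program. The figure of 128 worker threads is an illustrative assumption and not a value from this answer; it is simply chosen to exceed the combined DOP of the three blocks (16 + 64 + 16 = 96), and the right number depends on the workload.

```csharp
using System;
using System.Threading;

// Read the current minimums so the completion-port value can be left unchanged.
ThreadPool.GetMinThreads(out int minWorkers, out int minCompletionPort);

// Assumed value for illustration only; tune it for the actual workload.
const int desiredMinWorkers = 128;

// SetMinThreads returns false if the request is rejected,
// for example when it exceeds the pool's maximum thread count.
bool applied = ThreadPool.SetMinThreads(
    Math.Max(minWorkers, desiredMinWorkers), minCompletionPort);

Console.WriteLine(applied
    ? "Minimum worker threads raised."
    : "The ThreadPool rejected the requested minimum.");
```

Raising the minimum only changes how quickly the pool creates threads on demand; it does not reduce the number of threads the blocking work occupies.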

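Alternatively, you could consider converting the synchronous work to asynchronous, if that is possible (that is, if an asynchronous API is available for whatever you are doing), in order to minimize the number of threads needed.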
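To illustrate, here is a sketch of what an asynchronous Download block might look like. It assumes the download is an HTTP request made through HttpClient, which the question does not actually state; the Uri and byte[] message types are likewise hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

// Assumption: the download is an HTTP GET. One HttpClient instance is shared by all requests.
var http = new HttpClient();

// The async lambda selects the Func<TInput, Task<TOutput>> constructor overload,
// so no ThreadPool thread is blocked while a request is in flight.
var download = new TransformBlock<Uri, byte[]>(
    async uri => await http.GetByteArrayAsync(uri),
    new ExecutionDataflowBlockOptions { BoundedCapacity = 1024, MaxDegreeOfParallelism = 64 });
```

With an asynchronous delegate, MaxDegreeOfParallelism limits the number of concurrently running operations rather than the number of threads in use, so 64 concurrent downloads no longer need 64 threads.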