TPL Dataflow duplicate message to all consumers

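I am currently writing an application using WPF and TPL Dataflow that should do the following:

  1. Load all files in a directory
  2. As soon as it starts processing a file, log something to the UI and process that file
  3. Once it is finished, log something to the UI

The problem is that the logging to the UI has to happen on the UI thread, and it should only log right before processing actually starts.

The only way I have found to do this so far is to call the dispatcher manually from inside a TPL transform block and update the UI: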

Application.Current.Dispatcher.Invoke(new Action(() =>
{
    ProcessedFiles.Add(optimizedFileResult);
}));

I would like to do this through a Dataflow block instead. I can get a block running on the UI thread by using:

ExecutionDataflowBlockOptions.TaskScheduler = TaskScheduler.FromCurrentSynchronizationContext();

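However, if I set this on the block that does the optimizing, the optimizing will also run single-threaded.

On the other hand, if I create a new block in front of the processing block and log from there, it starts saying "processing" well before processing actually begins.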
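The single-threading effect can be reproduced outside a UI. The following is a minimal sketch, not the asker's code: it assumes the exclusive scheduler of a ConcurrentExclusiveSchedulerPair as a console-friendly stand-in for the UI scheduler, and the names SchedulerSketch and MaxConcurrencyAsync are made up for illustration. It counts how many delegates of one block run at the same time.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SchedulerSketch
{
    // Hypothetical helper: returns the highest number of block delegates
    // that were observed running at the same time on the given scheduler.
    public static async Task<int> MaxConcurrencyAsync(TaskScheduler scheduler)
    {
        int running = 0, max = 0;

        var block = new ActionBlock<int>(_ =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new maximum if this delegate raised the count.
            int seen;
            while (now > (seen = Volatile.Read(ref max)))
                Interlocked.CompareExchange(ref max, now, seen);

            Thread.Sleep(50); // stand-in for synchronous "optimizing" work
            Interlocked.Decrement(ref running);
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 4, // the block may use four workers...
            TaskScheduler = scheduler   // ...but the scheduler decides how many run
        });

        for (int i = 0; i < 8; i++) block.Post(i);
        block.Complete();
        await block.Completion;
        return max;
    }
}
```

Called with `new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler`, the helper should report 1 despite MaxDegreeOfParallelism = 4, because that scheduler runs one task at a time, much like a scheduler bound to a UI thread. Called with `TaskScheduler.Default`, it can report more, depending on the thread pool.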

Sample code

I have created some sample code to reproduce this issue:

public class TplLoggingToUiIssue
    {
        public TplLoggingToUiIssue()
        {

        }

        public IEnumerable<string> RecurseFiles()
        {
            for (int i = 0; i < 20; i++)
            {
                yield return i.ToString();
            }
        }

        public async Task Go()
        {
            var block1 = new TransformBlock<string, string>(input =>
            {
                Console.WriteLine($"1: {input}");
                return input;
            }, new ExecutionDataflowBlockOptions()
            {
                MaxDegreeOfParallelism = 4,
                BoundedCapacity = 10,
                EnsureOrdered = false
            });

            var block2 = new TransformBlock<string, string>(input =>
            {
                Console.WriteLine($"2: {input}\t\t\tStarting {input} now (ui logging)");
                return input;
            }, new ExecutionDataflowBlockOptions()
            {
                //TaskScheduler = TaskScheduler.FromCurrentSynchronizationContext(), (Doesn't work in Console app, but you get the idea)
                MaxDegreeOfParallelism = 1,
                BoundedCapacity = 1,
                EnsureOrdered = false
            });


            var block3 = new TransformBlock<string, string>(async input =>
            {
                Console.WriteLine($"3 start: {input}");
                await Task.Delay(5000);
                Console.WriteLine($"3 end: {input}");
                return input;
            }, new ExecutionDataflowBlockOptions()
            {
                MaxDegreeOfParallelism = 2,
                BoundedCapacity = 10,
                EnsureOrdered = false
            });

            var block4 = new ActionBlock<string>(input =>
            {
                Console.WriteLine($"4: {input}");
            }, new ExecutionDataflowBlockOptions()
            {
                MaxDegreeOfParallelism = 1,
                BoundedCapacity = 1,
                EnsureOrdered = false
            });


            block1.LinkTo(block2, new DataflowLinkOptions() { PropagateCompletion = true });
            block2.LinkTo(block3, new DataflowLinkOptions() { PropagateCompletion = true });
            block3.LinkTo(block4, new DataflowLinkOptions() { PropagateCompletion = true });


            var files = RecurseFiles();
            await Task.Run(async () =>
            {
                foreach (var file in files)
                {
                    Console.WriteLine($"Posting: {file}");
                    var result = await block1.SendAsync(file);

                    if (!result)
                    {
                        Console.WriteLine("Result is false!!!");
                    }
                }
            });

            Console.WriteLine("Completing");
            block1.Complete();
            await block4.Completion;
            Console.WriteLine("Done");
        }
    }

If you run this sample (with only 6 'files'), you get the following output:

Posting: 0
Posting: 1
Posting: 2
Posting: 3
Posting: 4
Posting: 5
1: 2
1: 1
1: 3
1: 0
1: 4
1: 5
2: 2                    Starting 2 now (ui logging)
Completing
3 start: 2
2: 0                    Starting 0 now (ui logging)
3 start: 0
2: 3                    Starting 3 now (ui logging)
2: 1                    Starting 1 now (ui logging)
2: 4                    Starting 4 now (ui logging)
2: 5                    Starting 5 now (ui logging)
3 end: 2
3 end: 0
3 start: 3
3 start: 1
4: 2
4: 0
3 end: 3
3 end: 1
4: 3
3 start: 4
3 start: 5
4: 1
3 end: 5
3 end: 4
4: 5
4: 4
Done

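As you can see from this output, the logging starts too early. I also tried using a BroadcastBlock, but that overwrites values, so they get lost.

The ideal situation would be to somehow make the logging block wait until the processing block has capacity, and then push one item.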

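Here is a somewhat contrived method that takes the asynchronous lambda passed as an argument to an ActionBlock and enhances it with started and finished events:
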
public static Func<TInput, Task> Enhance<TInput>(
    Func<TInput, Task> action,
    Action<TInput> onActionStarted = null,
    Action<TInput> onActionFinished = null,
    ISynchronizeInvoke synchronizingObject = null)
{
    return async (item) =>
    {
        RaiseEvent(onActionStarted, item, synchronizingObject);
        await action(item).ConfigureAwait(false);
        RaiseEvent(onActionFinished, item, synchronizingObject);
    };
}

private static void RaiseEvent<T>(Action<T> onEvent, T arg1,
    ISynchronizeInvoke synchronizingObject)
{
    if (onEvent == null) return;
    if (synchronizingObject != null && synchronizingObject.InvokeRequired)
    {
        synchronizingObject.Invoke(onEvent, new object[] { arg1 });
    }
    else
    {
        onEvent(arg1);
    }
}

Usage example:

private void Form_Load(object sender, EventArgs e)
{
    var block = new ActionBlock<string>(Enhance<string>(async item =>
    {
        await Task.Delay(5000); // Simulate some lengthy asynchronous job
    }, onActionStarted: item =>
    {
        this.Text = $"{item} started";
    }, onActionFinished: item =>
    {
        ListBoxCompleted.Items.Add(item);
    }, synchronizingObject: this), new ExecutionDataflowBlockOptions()
    {
        MaxDegreeOfParallelism = 2,
        BoundedCapacity = 10,
        EnsureOrdered = false
    });
}

The onActionStarted and onActionFinished callbacks will each be invoked on the UI thread once for every processed item.
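The question targets WPF, where the window does not implement ISynchronizeInvoke (that interface is implemented by Windows Forms controls). As a hedged sketch, the same idea can be expressed with a SynchronizationContext captured on the UI thread. The names EnhanceSketch and EnhanceWithContext are hypothetical, and the use of Send to block until the callback has run is an assumption chosen to mirror the Invoke call above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class EnhanceSketch
{
    // Hypothetical variant of Enhance: marshals the callbacks through a
    // SynchronizationContext instead of an ISynchronizeInvoke.
    public static Func<TInput, Task> EnhanceWithContext<TInput>(
        Func<TInput, Task> action,
        Action<TInput> onActionStarted = null,
        Action<TInput> onActionFinished = null,
        SynchronizationContext context = null)
    {
        return async item =>
        {
            Raise(onActionStarted, item, context);
            await action(item).ConfigureAwait(false);
            Raise(onActionFinished, item, context);
        };
    }

    private static void Raise<T>(Action<T> onEvent, T arg,
        SynchronizationContext context)
    {
        if (onEvent == null) return;
        if (context != null)
        {
            // Send blocks the caller until the callback has run on the context.
            context.Send(_ => onEvent(arg), null);
        }
        else
        {
            onEvent(arg); // no context supplied: run on the calling thread
        }
    }
}
```

In a WPF window this would be used like the example above, passing `context: SynchronizationContext.Current` captured on the UI thread in place of `synchronizingObject: this`. Because Send waits for the UI thread, a busy UI will slow the block down; Post would avoid the wait at the cost of the callback possibly running slightly after the work has started.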

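As the other answers show, there are several ways to solve this. I would like to point out an alternative: use Progress<T> for this. Although it is designed primarily for use with Tasks, it also works with Dataflow, like this: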

        private void Form1_Load(object sender, EventArgs e)
        {
            var progressReporter = new Progress<string>();
            progressReporter.ProgressChanged += (reporter, message) => label1.Text = message;

            var b1 = new ActionBlock<string>((input) =>
            {
                ((IProgress<string>)progressReporter).Report(input);
            }, new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = 10
            }); 

            b1.Post("a");
            b1.Post("b");
            b1.Post("c");
            b1.Post("d");
        }

Overall, this looks like a clean alternative that does not require adding plumbing to the individual blocks.

More information can be found in this excellent blog post.
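To connect this back to the question, here is a minimal sketch of a processing block that reports both the start and the end of each item through IProgress<string>. The names ProgressPipelineSketch and RunAsync, the message texts, and the 100 ms delay are assumptions made for illustration; they are not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ProgressPipelineSketch
{
    // Hypothetical helper: processes the items in parallel and reports
    // "started" / "finished" messages through the supplied IProgress<string>.
    public static async Task RunAsync(
        IEnumerable<string> files, IProgress<string> progress)
    {
        var process = new ActionBlock<string>(async file =>
        {
            // Reported from inside the delegate, so only when work really begins.
            progress.Report($"{file} started");
            await Task.Delay(100); // stand-in for the real processing
            progress.Report($"{file} finished");
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 2,
            BoundedCapacity = 10
        });

        foreach (var file in files)
            await process.SendAsync(file); // waits while the block is full

        process.Complete();
        await process.Completion;
    }
}
```

On the UI thread this could be called as `await ProgressPipelineSketch.RunAsync(files, new Progress<string>(msg => label1.Text = msg));`. Progress<T> captures the SynchronizationContext that is current when it is constructed and posts its callbacks to it asynchronously, so the UI update can lag slightly behind the actual start of the work.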