How to do batch processing using multi-threading in C#

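I am using a Parallel.ForEach to add values to a BlockingCollection. Once the collection holds 10 values, I need to process them, clear the collection, and then start adding values again.

There are two problems here:

  1. While I am processing the batch, other threads keep adding values to the blocking collection. I can put a lock around the list, but by the time a thread reaches the lock the count has already increased.

  2. A lock completely defeats the purpose of parallel programming, because I want objects to keep being added to the list while those 10 messages are being processed. I could copy the list's contents and then empty the list, but that has the same problem: I cannot copy exactly 10 items because the contents have already changed.

Sometimes the if condition is never satisfied, because the count has already grown past 10 before the condition is checked.

Is there any solution for this?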

public static BlockingCollection<string> failedMessages = new BlockingCollection<string>();
static void Main(string[] args)
{
    var myCollection = new List<string>();
    myCollection.Add("test");
    //Consider myCollection having more than 100 items 
    Parallel.ForEach(myCollection, item =>
    {
        failedMessages.Add(item);
        if (failedMessages.Count == 10)
        {
            DoSomething();
        }
    });

}

static public void DoSomething()
{
    //dosome operation with failedMessages 
    failedMessages = new BlockingCollection<string>();
}   

    

This looks like a job for DataFlow:

Example using a BatchBlock<string> with a batch size of 10 and an ActionBlock<string[]> to consume the batches:

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
                    
public class Program
{
    public static void Main()
    {
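        // Set up DataFlow blocks: the BatchBlock groups messages into arrays of 10
        BatchBlock<string> batcher = new BatchBlock<string>( 10 );
        ActionBlock<string[]> consumer = 
            new ActionBlock<string[]>( 
                (msgs) => Console.WriteLine("Processed {0} messages.", msgs.Length)
            );
        // put them together; propagate completion so the consumer finishes as well
        batcher.LinkTo( consumer, new DataflowLinkOptions { PropagateCompletion = true } );
        
        // start posting
        Parallel.For( 0, 103, (i) => batcher.Post(string.Format("Test {0}",i)));
        
        // shutdown: Complete() also flushes the last, partial batch
        batcher.Complete();
        // wait for the consumer (not only the batcher) so every batch gets processed
        consumer.Completion.Wait();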
    }
}

In action: https://dotnetfiddle.net/Y9Ezg4

Further reading: https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/walkthrough-using-batchblock-and-batchedjoinblock-to-improve-efficiency
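
The 103 messages in the example do not divide evenly into batches of 10; the remainder is only emitted once Complete() is called. If a partial batch should be flushed earlier, BatchBlock<T> also has a TriggerBatch() method. The following is a minimal sketch of that idea, not part of the original answer: the class name TriggerBatchDemo and the method Run are made up for illustration, and the System.Threading.Tasks.Dataflow package is assumed to be referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

public static class TriggerBatchDemo
{
    // Hypothetical helper: returns the size of every batch the consumer received.
    public static List<int> Run()
    {
        var sizes = new List<int>();
        var batcher = new BatchBlock<string>(10);
        // ActionBlock processes one batch at a time by default, so the list
        // is never written to concurrently.
        var consumer = new ActionBlock<string[]>(msgs => sizes.Add(msgs.Length));
        batcher.LinkTo(consumer, new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < 13; i++)
        {
            batcher.Post("Test " + i);   // one full batch, 3 items left over
        }
        batcher.TriggerBatch();          // flush the 3 buffered items right away
        batcher.Post("Test 13");
        batcher.Complete();              // flushes the single remaining item
        consumer.Completion.Wait();
        return sizes;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run()));
    }
}
```

Calling TriggerBatch() from a timer would give behaviour similar to a timeout, at the cost of occasionally emitting small batches.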


EDIT: As requested, if you cannot or do not want to use DataFlow, you can of course do something similar:

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        // Every batch (full, timed out, or final) is reported through IProgress<string[]>
        FailedMessageHandler fmh = new FailedMessageHandler(
            new Progress<string[]>((list) =>
            {
                Console.WriteLine("Handling {0} messages. [{1}]", list.Length, string.Join(",", list));
            }));
        Parallel.For(0,52, (i) => {fmh.Add(string.Format("Test {0,3}",i));});
        Thread.Sleep(1500); // Demo: Timeout
        // Parallel.For blocks until all iterations have finished, so no polling is needed
        Parallel.For(53,107, (i) => {fmh.Add(string.Format("Test {0,3}",i));});
        // Graceful shutdown:
        fmh.CompleteAdding();
        fmh.AwaitCompletion();
    }
}

public class FailedMessageHandler
{
    private BlockingCollection<string> workQueue = new BlockingCollection<string>();
    private List<string> currentBuffer = new List<string>(10);
    private IProgress<string[]> progress;
    private Thread workThread;
    
    public FailedMessageHandler( IProgress<string[]> progress )
    {
        this.progress = progress;
        workThread = new Thread(WatchDog);
        workThread.Start();
    }
    
    public void Add( string failedMessage )
    {
        if ( workQueue.IsAddingCompleted )
        {
            throw new InvalidOperationException("Adding is completed!");
        }
        
        workQueue.Add(failedMessage);
    }
    
    private void WatchDog()
    {
        while(true)
        {
            // Demo: Include a timeout - If there are less than 10 items
            // for x amount of time, send whatever you got so far.
            CancellationTokenSource timeout = new CancellationTokenSource(TimeSpan.FromSeconds(1));
            try{
               var failedMsg = workQueue.Take(timeout.Token);
               currentBuffer.Add(failedMsg);
               if( currentBuffer.Count >= 10 ){
                   progress.Report(currentBuffer.ToArray());
                   currentBuffer.Clear();
               }
            }
            catch(OperationCanceledException)
            {
                Console.WriteLine("TIMEOUT!");
                // timeout.
                if( currentBuffer.Any() ) // handle items if there are
                {
                    progress.Report(currentBuffer.ToArray());
                    currentBuffer.Clear();
                }
            }
            catch(InvalidOperationException)
            {
                Console.WriteLine("COMPLETED!");
                // queue has been completed.
                if( currentBuffer.Any() ) // handle remaining items
                {
                    progress.Report(currentBuffer.ToArray());
                    currentBuffer.Clear();
                }
                break;
            }
            finally
            {
                // release the timer held by the per-iteration timeout source
                timeout.Dispose();
            }
        }
        Console.WriteLine("DONE!");
    }
    
    public void CompleteAdding()
    {
        workQueue.CompleteAdding();
    }
    
    public void AwaitCompletion()
    {
        if( workThread != null )
            workThread.Join();
    }
}

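In action: https://dotnetfiddle.net/H2Rg35

Note that Progress<T> invokes the handler through the SynchronizationContext captured when it is constructed, for example the UI thread in a UI application. In a console application like this one there is no synchronization context, so the callbacks are posted to the thread pool and may run after AwaitCompletion() has returned. If you pass an Action instead, it executes on workThread. So adjust the example to your requirements.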
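
If the batches should be handled on the worker thread itself, the handler can take an Action<string[]> directly. The following is a minimal sketch under stated assumptions: the names ActionBatchHandler, CompleteAndWait and ActionDemo are hypothetical, and the timeout from the example above is left out to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical variant of FailedMessageHandler that calls an Action<string[]>
// on its own worker thread instead of reporting through IProgress<string[]>.
public class ActionBatchHandler
{
    private readonly BlockingCollection<string> workQueue = new BlockingCollection<string>();
    private readonly Action<string[]> handleBatch;
    private readonly Thread workThread;
    private readonly int batchSize;

    public ActionBatchHandler(int batchSize, Action<string[]> handleBatch)
    {
        this.batchSize = batchSize;
        this.handleBatch = handleBatch;
        workThread = new Thread(Consume);
        workThread.Start();
    }

    public void Add(string message)
    {
        workQueue.Add(message);
    }

    public void CompleteAndWait()
    {
        workQueue.CompleteAdding();
        workThread.Join();
    }

    private void Consume()
    {
        var buffer = new List<string>(batchSize);
        // GetConsumingEnumerable blocks until an item arrives and ends once
        // CompleteAdding has been called and the queue is empty.
        foreach (var message in workQueue.GetConsumingEnumerable())
        {
            buffer.Add(message);
            if (buffer.Count == batchSize)
            {
                handleBatch(buffer.ToArray());
                buffer.Clear();
            }
        }
        if (buffer.Count > 0)
        {
            handleBatch(buffer.ToArray()); // flush the final partial batch
        }
    }
}

public static class ActionDemo
{
    public static void Main()
    {
        var handler = new ActionBatchHandler(10, batch =>
            Console.WriteLine("Handling {0} messages on thread {1}.",
                batch.Length, Thread.CurrentThread.ManagedThreadId));
        Parallel.For(0, 25, i => handler.Add("Test " + i));
        handler.CompleteAndWait();
    }
}
```

Because only the worker thread touches the buffer, no lock is needed, and producers are never blocked while a batch is being handled.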

This is also only meant to give you an idea; there are many variations of it, perhaps using Task/async ...
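
One possible Task/async variation, sketched here as an assumption rather than as part of the original answer, replaces the dedicated thread with an async consumer reading from a System.Threading.Channels channel (available in .NET Core 3.0 and later, or through the NuGet package of the same name). The names ChannelBatchDemo and ConsumeInBatchesAsync are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelBatchDemo
{
    // Reads until the channel is completed, handing over full batches and
    // finally whatever is left in the buffer.
    public static async Task ConsumeInBatchesAsync(
        ChannelReader<string> reader, int batchSize, Action<string[]> handleBatch)
    {
        var buffer = new List<string>(batchSize);
        // WaitToReadAsync returns false once the writer has completed
        // and no items remain.
        while (await reader.WaitToReadAsync())
        {
            while (reader.TryRead(out var message))
            {
                buffer.Add(message);
                if (buffer.Count == batchSize)
                {
                    handleBatch(buffer.ToArray());
                    buffer.Clear();
                }
            }
        }
        if (buffer.Count > 0)
        {
            handleBatch(buffer.ToArray());
        }
    }

    public static async Task Main()
    {
        var channel = Channel.CreateUnbounded<string>();
        Task consumer = ConsumeInBatchesAsync(channel.Reader, 10,
            batch => Console.WriteLine("Handling {0} messages.", batch.Length));

        // Several producers may write concurrently to the same channel.
        Parallel.For(0, 103, i => channel.Writer.TryWrite("Test " + i));

        channel.Writer.Complete(); // signal that no more items will arrive
        await consumer;            // wait until every batch has been handled
    }
}
```

As with the thread-based version, a single consumer owns the buffer, so the batch boundary is decided in one place and the producers never need a lock.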