VB.NET - Parallel.For is Slower than Sequential For

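I am trying to use parallel processing to separate data based on its content.

In the example below, I generate random numbers and, if they meet a condition, I want to store them in a DataTable.

To my disappointment, the sequential For runs faster than the parallel one.

Is it possible to make the parallel version faster?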

Imports System.Random
Imports System.Threading
Imports System.Threading.Tasks

Public Class Form1
    Public No As Integer = 5
    Public DT(No) As DataTable
    Public S(No) As String
    Public StartTimer As DateTime
    Private Sub ParrallelProc_Btn_Click(sender As Object, e As EventArgs) Handles ParrallelProc_Btn.Click
        For j = 1 To No
            DT(j).Rows.Clear()
        Next
        StartTimer = Now
        For k = 1 To 10000
            Parallel.For(1, No + 1, Sub(i)
                                        Dim CurrentNo As String = CStr(Math.Round(Rnd() * 1000000, 0))
                                        If CurrentNo.Contains(S(i)) Then DT(i).Rows.Add(CurrentNo)
                                    End Sub)
        Next
        Dim Interval = Now.Subtract(StartTimer).TotalSeconds
    End Sub

    Private Sub SequentialProc_Btn_Click(sender As Object, e As EventArgs) Handles SequentialProc_Btn.Click
        For j = 1 To No
            DT(j).Rows.Clear()
        Next
        StartTimer = Now
        For k = 1 To 10000
            For l = 1 To No
                Dim CurrentNo As String = CStr(Math.Round(Rnd() * 1000000, 0))
                If CurrentNo.Contains(S(l)) Then DT(l).Rows.Add(CurrentNo)
            Next
        Next
        Dim Interval = Now.Subtract(StartTimer).TotalSeconds
    End Sub
End Class

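First off, not to brag, but on my machine the parallel version runs in 160 ms and the sequential one in 40 ms.

Creating threads has some overhead, and with only 5 work items it isn't worth it; you're better off just doing those 5 things in sequence, especially with something as lightweight as what you have. Parallelization is meant for running multiple long-ish tasks at the same time.

Eventually, once the threading overhead is overcome, the parallel loop becomes faster. I tested with increasing values of No, and this happens at around 100.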

Public No As Integer = 100
Public DT(No) As DataTable
Public S(No) As String
Public StartTimer As DateTime
Private iterations As Integer = 10000

Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    For i = 1 To No
        DT(i) = New DataTable()
        DT(i).Columns.Add()
        S(i) = (i + 1).ToString()
    Next
End Sub

Private Sub ParallelProc_Btn_Click(sender As Object, e As EventArgs) Handles ParallelProc_Btn.Click
    clearDT()
    Dim sw As New Stopwatch()
    sw.Start()
    For k = 1 To iterations
        Parallel.For(
            1,
            No + 1,
            AddressOf process)
    Next
    sw.Stop()
    MessageBox.Show(sw.ElapsedMilliseconds.ToString())
End Sub

Private Sub SequentialProc_Btn_Click(sender As Object, e As EventArgs) Handles SequentialProc_Btn.Click
    clearDT()
    Dim sw As New Stopwatch()
    sw.Start()
    For k = 1 To iterations
        For i = 1 To No
            process(i)
        Next
    Next
    sw.Stop()
    MessageBox.Show(sw.ElapsedMilliseconds.ToString())
End Sub

Private Sub clearDT()
    For j = 1 To No
        DT(j).Rows.Clear()
    Next
End Sub

Private Sub process(i As Integer)
    Randomize()
    Dim CurrentNo As String = CStr(Math.Round(Rnd() * 1000000, 0))
    If CurrentNo.Contains(S(i)) Then DT(i).Rows.Add(CurrentNo)
End Sub

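I also moved the operation into a Sub that both routines can call. Reusing your code not only saves time and space, it also ensures that you are comparing only the two approaches (parallel vs. sequential), not two different routines.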

You should also call Randomize() before using Rnd(). See https://msdn.microsoft.com/en-us/library/y66ey2hh(v=vs.110).aspx
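
As a rough illustration of that advice, here is a minimal console sketch that seeds the generator once with Randomize() and then draws a few numbers with Rnd(). The module name RandomizeDemo and the helper NextNumber are hypothetical names made up for this example; the expression inside NextNumber is the one used in process(). It assumes a single seed at startup is sufficient, which is a common reading of the linked documentation rather than something the answer states.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Public Module RandomizeDemo
    ' Hypothetical helper: returns one random whole number (0..1000000) as a string,
    ' using the same expression as process().
    Public Function NextNumber() As String
        Return CStr(Math.Round(Rnd() * 1000000, 0))
    End Function

    Public Sub Main()
        ' Seed the generator once from the system timer (assumption: once is enough).
        Randomize()
        For n = 1 To 5
            Console.WriteLine(NextNumber())
        Next
    End Sub
End Module
```

Without the Randomize() call, Rnd() would be expected to produce the same sequence on each program run, which is usually not what a test like this wants.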

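A better test would be to put something substantial in the process() method, such as Thread.Sleep(1), and then play with No and iterations. You will find that sleeping in parallel is much more efficient than sleeping sequentially.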
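
A minimal console sketch of that experiment might look like the following. The module name SleepDemo and the names DoWork and TimeRun are hypothetical, invented for this example; DoWork stands in for process() with Thread.Sleep(1) as its body, and the values passed for No and iterations are arbitrary starting points to vary.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Threading
Imports System.Threading.Tasks

Public Module SleepDemo
    ' Stand-in for "substantial" work: each call blocks for roughly 1 ms.
    Private Sub DoWork(i As Integer)
        Thread.Sleep(1)
    End Sub

    ' Runs `iterations` passes over `No` work items, either sequentially or with
    ' Parallel.For, and returns the elapsed time in milliseconds.
    Public Function TimeRun(No As Integer, iterations As Integer, useParallel As Boolean) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For k = 1 To iterations
            If useParallel Then
                Parallel.For(1, No + 1, AddressOf DoWork)
            Else
                For i = 1 To No
                    DoWork(i)
                Next
            End If
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Public Sub Main()
        Console.WriteLine("Sequential: {0} ms", TimeRun(100, 10, False))
        Console.WriteLine("Parallel:   {0} ms", TimeRun(100, 10, True))
    End Sub
End Module
```

On a multi-core machine the parallel figure should come out noticeably lower, but the exact numbers depend on the hardware, the timer resolution, and the thread pool, so treat any single run as indicative only.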

Put the smaller loop inside the larger loop; that should make the parallel loop much faster than the sequential one.

Imports System
Imports System.Diagnostics
Imports System.Threading.Tasks

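Public Module Module1
    Public Sub Main()

        ' Note: System.Random is not thread-safe; a single instance is shared
        ' here only to keep the timing comparison short.
        Dim Rnd As New Random()
        Dim _sw As New Stopwatch

        ' Version 1: 1000 sequential iterations, each starting a tiny 5-item Parallel.For
        _sw.Restart()
        For k = 1 To 1000
            Parallel.For(1, 6, Sub(i)
                                   Dim CurrentNo As Double = Rnd.Next()
                                   ' Do other stuff
                               End Sub)
        Next

        Console.WriteLine(_sw.Elapsed)
        ' >> took 00:00:00.0659017 on dotnetfiddle.net

        ' Version 2: one Parallel.For over the 1000 iterations, with the small loop inside
        _sw.Restart()
        Parallel.For(1, 1000 + 1, Sub(k)
                                      For i As Integer = 1 To 5
                                          Dim CurrentNo As Double = Rnd.Next()
                                          ' Do other stuff
                                      Next
                                  End Sub)

        Console.WriteLine(_sw.Elapsed)
        ' >> took 00:00:00.0009715 on dotnetfiddle.net

    End Sub

End Module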
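
One caveat when running the large loop in parallel: instances of System.Random are not guaranteed to be thread-safe, so sharing one generator across worker threads can give unreliable results. A possible workaround, sketched below, is the Parallel.For overload that takes per-thread local state, so each worker gets its own Random. The module name ThreadLocalRandomDemo, the function CountMatches, and the Guid-based seeding are assumptions made for this illustration, not part of the answers above.

```vbnet
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Public Module ThreadLocalRandomDemo
    ' Hypothetical helper: draws `total` random numbers in 0..1000000 and counts
    ' how many of them contain the digit string `needle`.
    Public Function CountMatches(total As Integer, needle As String) As Integer
        Dim matches As Integer = 0
        Parallel.For(Of Random)(
            0, total,
            Function() New Random(Guid.NewGuid().GetHashCode()),
            Function(k, state, rng)
                ' rng is private to the current worker, so no locking is needed here.
                Dim currentNo As String = rng.Next(0, 1000001).ToString()
                ' The counter is shared, so it is updated atomically.
                If currentNo.Contains(needle) Then Interlocked.Increment(matches)
                Return rng
            End Function,
            Sub(rng)
                ' Nothing to clean up for a Random instance.
            End Sub)
        Return matches
    End Function

    Public Sub Main()
        Console.WriteLine(CountMatches(50000, "7"))
    End Sub
End Module
```

Seeding from a Guid hash is just one way to avoid workers starting with identical seeds; any scheme that gives each worker a distinct seed would serve the same purpose.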