Improve LINQ to DataTable Performance

I have a DataTable with 500K rows in the following format:

Int | Decimal | String

We are using the singleton pattern, and ultimately our DataTable needs to end up as a List(Of AssetAllocation), where AssetAllocation is:

Public Class AssetAllocation
    Property TpId() As Integer
    Property Allocation As List(Of Sector)
End Class

Public Class Sector
    Property Description() As String
    Property Weighting As Decimal
End Class

The LINQ I am using:

Private Shared Function LoadAll() As List(Of AssetAllocation)

        Dim rtn = New List(Of AssetAllocation)

        Using dt = GetRawData()

            Dim dist = (From x In dt.AsEnumerable Select x!TP_ID).ToList().Distinct()

            rtn.AddRange(From i As Integer In dist
                         Select New AssetAllocation With {
                            .TpId = i,
                            .Allocation = (From b In dt.AsEnumerable
                                           Where b!TP_ID = i Select New Sector With {
                                               .Description = b!DESCRIPTION.ToString(),
                                               .Weighting = b!WEIGHT
                                           }).ToList()})
        End Using

        Return rtn
    End Function

The LINQ takes a long time to execute because of the inner query that builds the list of sectors. The distinct list contains 80k IDs.

Can this be improved?

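If I understand what you are trying to do, this query should perform much better. The trick is to use GroupBy so that you do not have to search the whole table for matching IDs on every iteration. With roughly 80k distinct IDs and 500K rows, the nested query does on the order of 80,000 × 500,000 row comparisons, whereas GroupBy buckets the rows by key in a single pass over the table. I wrote it in C#, but I'm sure you can translate it to VB.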

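// Group the rows by TP_ID once, then project each group into an AssetAllocation.
var rtn = dt.AsEnumerable()
    .GroupBy(x => x.Field<int>("TP_ID"))
    .Select(g => new AssetAllocation
    {
        TpId = g.Key,
        // Each group already holds only the rows for this TP_ID.
        Allocation = g.Select(y => new Sector
        {
            Description = y.Field<string>("DESCRIPTION"),
            Weighting = y.Field<decimal>("WEIGHT")
        }).ToList()
    })
    .ToList();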
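
For reference, here is a rough VB translation of the GroupBy query above, offered as a sketch rather than a drop-in replacement. BuildSampleTable is a hypothetical stand-in for GetRawData() that builds a three-row table, and the column types (Integer, Decimal, String) are assumed from the format described in the question. If WEIGHT or DESCRIPTION can contain DBNull, Field(Of Decimal) will throw, so a nullable type would be needed there.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Public Class Sector
    Public Property Description As String
    Public Property Weighting As Decimal
End Class

Public Class AssetAllocation
    Public Property TpId As Integer
    Public Property Allocation As List(Of Sector)
End Class

Module Program
    ' Hypothetical stand-in for GetRawData(): a tiny table with the
    ' same three columns as in the question (types are assumed).
    Function BuildSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("TP_ID", GetType(Integer))
        dt.Columns.Add("WEIGHT", GetType(Decimal))
        dt.Columns.Add("DESCRIPTION", GetType(String))
        dt.Rows.Add(1, 60D, "Equities")
        dt.Rows.Add(1, 40D, "Bonds")
        dt.Rows.Add(2, 100D, "Cash")
        Return dt
    End Function

    ' Groups the rows by TP_ID in one pass, then projects each group
    ' into an AssetAllocation holding that group's sectors.
    Function LoadAll(dt As DataTable) As List(Of AssetAllocation)
        Return dt.AsEnumerable().
            GroupBy(Function(row) row.Field(Of Integer)("TP_ID")).
            Select(Function(g) New AssetAllocation With {
                .TpId = g.Key,
                .Allocation = g.Select(Function(row) New Sector With {
                    .Description = row.Field(Of String)("DESCRIPTION"),
                    .Weighting = row.Field(Of Decimal)("WEIGHT")
                }).ToList()
            }).ToList()
    End Function

    Sub Main()
        Using dt = BuildSampleTable()
            ' Prints one line per TP_ID with the number of sectors in it.
            For Each a In LoadAll(dt)
                Console.WriteLine($"{a.TpId}: {a.Allocation.Count} sector(s)")
            Next
        End Using
    End Sub
End Module
```

In the original LoadAll, the body of the Using block would reduce to the single Return expression shown in LoadAll here, with dt coming from GetRawData(). The final ToList() materializes the result before the table is disposed.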