Slow LINQ query against strongly typed Dataset

I have a database with roughly 5,000 rows and a number of many-to-many relationships. As part of an "advanced search" query, I need to do a free-text search across tables.

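I created a strongly typed DataSet and load all the data from SQL Server when the application starts. When I run a LINQ query against the DataSet, it executes very slowly (about 15 seconds). I assumed that querying an in-memory DataSet would be much faster than querying SQL Server, but that does not seem to be the case. I still need to add more joins and "searches" to the where clause, so things will only get worse.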

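Of the fields I'm searching, the longest is Summary, and the longest value in the database is under 2,000 bytes, so we're not talking about a lot of data to search. Am I barking up the wrong tree, or is there a way to improve the performance of this query?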

Here is the code:

var results = from e in _data.ds.Employee
      join es in _data.ds.EmployeeSkill on e.EmployeeId equals es.EmployeeId into esGroup from esItem in esGroup.DefaultIfEmpty()
      join s in _data.ds.Skill on esItem?.SkillId equals s.SkillId into sGroup from skillItem in sGroup.DefaultIfEmpty()
      join er in _data.ds.EmployeeRole on e.EmployeeId equals er.EmployeeId into erGroup from erItem in erGroup.DefaultIfEmpty()
      join r in _data.ds.Role on erItem?.RoleId equals r.RoleId into rGroup from rItem in rGroup.DefaultIfEmpty()
      join et in _data.ds.EmployeeTechnology on e.EmployeeId equals et.EmployeeId into etGroup from etItem in etGroup.DefaultIfEmpty()
      join t in _data.ds.Technology on etItem?.TechnologyId equals t.TechnologyId into tGroup from tItem in etGroup.DefaultIfEmpty()
      where
        e.FirstName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
        e.LastName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
        e.RMMarket.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
        !e.IsSummaryNull() && e.Summary.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
      select new SearchResult
      {
          EmployeeId = e.EmployeeId,
          Name = e.FirstName + " " + e.LastName,
          Title = e.Title,
          ImageUrl = e.IsImageUrlNull() ? string.Empty : e.ImageUrl,
          Market = e.RMMarket,
          Group = e.Group,
          Summary = e.IsSummaryNull() ? string.Empty : e.Summary.Substring(1, e.Summary.Length < summaryLength ? e.Summary.Length - 1 : summaryLength),
          AdUserName = e.AdUserName
      };

A few thoughts:

First, you are searching strings. If there is a lot of text to search, consider maintaining a full-text index to speed things up.
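As a rough illustration of what "maintaining an index" could look like in memory, here is a minimal word-level inverted index. Everything in it (the `WordIndex` class, its `Add` and `Find` methods, the separator list) is hypothetical and not part of the question's code. Note the trade-off: it matches whole words case-insensitively, whereas the `IndexOf` calls in the question match arbitrary substrings, so it is not a drop-in replacement.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: maps each word to the ids of the employees whose text contains it.
public sealed class WordIndex
{
    private readonly Dictionary<string, HashSet<int>> _postings =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    // Assumed separator set; adjust to the actual data.
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', ',', '.', ';', ':', '!', '?', '(', ')', '"' };

    // Splits every non-empty field into words and records the employee id under each word.
    public void Add(int employeeId, params string?[] fields)
    {
        foreach (var field in fields)
        {
            if (string.IsNullOrEmpty(field)) continue;
            foreach (var word in field.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            {
                if (!_postings.TryGetValue(word, out var ids))
                    _postings[word] = ids = new HashSet<int>();
                ids.Add(employeeId);
            }
        }
    }

    // Looks up a whole word (case-insensitive); returns an empty collection when unknown.
    public IReadOnlyCollection<int> Find(string word) =>
        _postings.TryGetValue(word, out var ids)
            ? (IReadOnlyCollection<int>)ids
            : Array.Empty<int>();
}
```

The index would be built once after the DataSet is loaded, and each search then becomes a dictionary lookup instead of a scan over every row.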

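Second, put the where clause before the join clauses. Anything that filters out data should appear as early as possible in a LINQ statement. As written, the query joins a pile of data onto every row, including rows that the where clause then rejects, so the joined data is never used.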
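To make the ordering concrete, here is a small LINQ to Objects sketch with the where clause placed ahead of the join. The `Employee` and `EmployeeSkill` records and the `FilterFirst.Search` method are hypothetical stand-ins for the typed DataSet rows, and only one join is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the typed DataSet rows; only the shape matters here.
public record Employee(int EmployeeId, string FirstName, string LastName);
public record EmployeeSkill(int EmployeeId, int SkillId);

public static class FilterFirst
{
    // Filters employees by name first, then left-joins only the rows that passed.
    public static List<(int EmployeeId, int? SkillId)> Search(
        IEnumerable<Employee> employees,
        IEnumerable<EmployeeSkill> employeeSkills,
        string searchTerm)
    {
        var query =
            from e in employees
            where e.FirstName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
               || e.LastName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
            join es in employeeSkills on e.EmployeeId equals es.EmployeeId into esGroup
            from esItem in esGroup.DefaultIfEmpty()
            select (e.EmployeeId, SkillId: esItem?.SkillId);

        return query.ToList();
    }
}
```

With the filter first, rows that fail the test never reach the join, so the joined rows for them are never produced. With the filter last, every joined row is built and only then tested.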

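Assuming you are still loading into DataSets rather than lists of objects (there isn't enough information to translate that part), here is my suggestion: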

Pre-join the data once to use as your search index:

var searchBase = (from e in _data.ds.Employee
             join es in _data.ds.EmployeeSkill on e.EmployeeId equals es.EmployeeId into esGroup
             from esItem in esGroup.DefaultIfEmpty()
             join s in _data.ds.Skill on esItem?.SkillId equals s.SkillId into sGroup
             from skillItem in sGroup.DefaultIfEmpty()
             join er in _data.ds.EmployeeRole on e.EmployeeId equals er.EmployeeId into erGroup
             from erItem in erGroup.DefaultIfEmpty()
             join r in _data.ds.Role on erItem?.RoleId equals r.RoleId into rGroup
             from rItem in rGroup.DefaultIfEmpty()
             join et in _data.ds.EmployeeTechnology on e.EmployeeId equals et.EmployeeId into etGroup
             from etItem in etGroup.DefaultIfEmpty()
             join t in _data.ds.Technology on etItem?.TechnologyId equals t.TechnologyId into tGroup
             // tGroup, not etGroup: enumerating etGroup a second time multiplies the rows per employee
             from tItem in tGroup.DefaultIfEmpty()
             select new {
                e.FirstName, e.LastName, e.RMMarket,
                // Typed-DataSet column properties throw on DBNull by default, so map nullable columns to null here
                Summary = e.IsSummaryNull() ? null : e.Summary,
                ImageUrl = e.IsImageUrlNull() ? null : e.ImageUrl,
                e.EmployeeId, e.Title, e.Group, e.AdUserName
             }).ToList();

Then run the search against the loaded, joined data:

var results = from e in searchBase
          where
                e.FirstName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                e.LastName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                e.RMMarket.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                // The anonymous type has no IsSummaryNull(); null was projected above instead
                e.Summary != null && e.Summary.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
          select new SearchResult {
              EmployeeId = e.EmployeeId,
              Name = e.FirstName + " " + e.LastName,
              Title = e.Title,
              ImageUrl = e.ImageUrl ?? string.Empty,
              Market = e.RMMarket,
              Group = e.Group,
              // Math.Min avoids an out-of-range length when Summary.Length == summaryLength
              Summary = e.Summary == null ? string.Empty : e.Summary.Substring(1, Math.Min(e.Summary.Length - 1, summaryLength)),
              AdUserName = e.AdUserName
          };

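By the way, your sample code shows no reason to do the joins at all: none of the joined range variables is used in the condition or in the result, and every join is a left join anyway (so none of them removes rows). Leaving them out would be the fastest solution.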
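A minimal sketch of the join-free version, under some assumptions: `EmployeeRow` and `SearchHit` are hypothetical plain-object stand-ins for the typed Employee row and the question's `SearchResult`, only a few of the fields are carried over, and the summary is truncated from index 0 (the question's code starts at index 1, which may or may not be intentional).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; Summary may be null, like the nullable column in the question.
public record EmployeeRow(int EmployeeId, string FirstName, string LastName, string RMMarket, string? Summary);
public record SearchHit(int EmployeeId, string Name, string Market, string Summary);

public static class EmployeeSearch
{
    // Case-insensitive substring test that treats null text as "no match".
    private static bool Has(string? text, string term) =>
        text != null && text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;

    // Assumption: truncate from the start of the string, at most max characters.
    private static string Truncate(string? text, int max) =>
        text == null ? string.Empty : text.Substring(0, Math.Min(text.Length, max));

    // Searches the Employee rows only; no joins, so exactly one result per matching employee.
    public static List<SearchHit> Search(IEnumerable<EmployeeRow> employees, string searchTerm, int summaryLength)
    {
        return (from e in employees
                where Has(e.FirstName, searchTerm)
                   || Has(e.LastName, searchTerm)
                   || Has(e.RMMarket, searchTerm)
                   || Has(e.Summary, searchTerm)
                select new SearchHit(
                    e.EmployeeId,
                    e.FirstName + " " + e.LastName,
                    e.RMMarket,
                    Truncate(e.Summary, summaryLength))).ToList();
    }
}
```

Because nothing is joined, each employee appears at most once in the results, which also sidesteps the duplicate rows that the left joins would otherwise introduce.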