Slow LINQ query against strongly typed Dataset
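I have a database that contains about 5,000 rows, along with a number of many-to-many relationships. As part of an "advanced search" query, I need to do a free-text search across tables.
I created a strongly typed DataSet and import all the data from SQL Server when the application starts. When I run a LINQ query against the DataSet, it executes really slowly (about 15 seconds). I thought that querying an in-memory DataSet would be much faster than querying SQL Server, but that doesn't seem to be the case. I still need to add more joins and more "searches" to the where clause, so things are only going to get worse.
Of the fields I'm searching, the longest is Summary, and the longest value in the database is under 2,000 bytes, so we're not talking about a large amount of data to search. Am I barking up the wrong tree, or is there a way to improve the performance of this query?
Here is the code: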
var results = from e in _data.ds.Employee
              join es in _data.ds.EmployeeSkill on e.EmployeeId equals es.EmployeeId into esGroup
              from esItem in esGroup.DefaultIfEmpty()
              join s in _data.ds.Skill on esItem?.SkillId equals s.SkillId into sGroup
              from skillItem in sGroup.DefaultIfEmpty()
              join er in _data.ds.EmployeeRole on e.EmployeeId equals er.EmployeeId into erGroup
              from erItem in erGroup.DefaultIfEmpty()
              join r in _data.ds.Role on erItem?.RoleId equals r.RoleId into rGroup
              from rItem in rGroup.DefaultIfEmpty()
              join et in _data.ds.EmployeeTechnology on e.EmployeeId equals et.EmployeeId into etGroup
              from etItem in etGroup.DefaultIfEmpty()
              join t in _data.ds.Technology on etItem?.TechnologyId equals t.TechnologyId into tGroup
              from tItem in tGroup.DefaultIfEmpty()
              where
                  e.FirstName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                  e.LastName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                  e.RMMarket.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                  !e.IsSummaryNull() && e.Summary.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
              select new SearchResult
              {
                  EmployeeId = e.EmployeeId,
                  Name = e.FirstName + " " + e.LastName,
                  Title = e.Title,
                  ImageUrl = e.IsImageUrlNull() ? string.Empty : e.ImageUrl,
                  Market = e.RMMarket,
                  Group = e.Group,
                  Summary = e.IsSummaryNull() ? string.Empty : e.Summary.Substring(1, e.Summary.Length < summaryLength ? e.Summary.Length - 1 : summaryLength),
                  AdUserName = e.AdUserName
              };
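Some thoughts:
First, you are searching strings. If there is a lot to search, consider maintaining a full-text index to speed things up.
Second, put the where clause before the join clauses. Anything that filters out data should sit as high as possible in the LINQ statement. Right now the query joins a bunch of data onto every row, even though that data is never used when the where clause is false.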
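To make the second point concrete, here is a minimal sketch of filtering before joining. It assumes plain in-memory classes (`Employee`, `EmployeeSkill`) as hypothetical stand-ins for the typed DataSet rows, and it keeps only one of the left joins from the question; the helper name `FilterFirstSearch.Search` is also made up for illustration. LINQ to Objects evaluates the clauses in the order they are written, so only employees that pass the where clause flow into the join.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the typed DataSet rows in the question.
public sealed class Employee
{
    public int EmployeeId { get; set; }
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
}

public sealed class EmployeeSkill
{
    public int EmployeeId { get; set; }
    public int SkillId { get; set; }
}

public static class FilterFirstSearch
{
    // Filters employees by the search term first, then left-joins only the survivors.
    public static List<(int EmployeeId, int? SkillId)> Search(
        IEnumerable<Employee> employees,
        IEnumerable<EmployeeSkill> employeeSkills,
        string searchTerm)
    {
        var query =
            from e in employees
            where e.FirstName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
               || e.LastName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
            join es in employeeSkills on e.EmployeeId equals es.EmployeeId into esGroup
            from esItem in esGroup.DefaultIfEmpty()   // left join: esItem is null when there is no match
            select (EmployeeId: e.EmployeeId, SkillId: esItem?.SkillId);

        return query.ToList();
    }
}
```

An employee with two skills still produces two result rows here, because the left join is kept; the only change is that non-matching employees never reach the join.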
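Assuming you are still loading into DataSets rather than a list of objects (there is not enough information to translate that part), here is my suggestion:
Pre-join the data once to use as your search index: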
var searchBase = (from e in _data.ds.Employee
                  join es in _data.ds.EmployeeSkill on e.EmployeeId equals es.EmployeeId into esGroup
                  from esItem in esGroup.DefaultIfEmpty()
                  join s in _data.ds.Skill on esItem?.SkillId equals s.SkillId into sGroup
                  from skillItem in sGroup.DefaultIfEmpty()
                  join er in _data.ds.EmployeeRole on e.EmployeeId equals er.EmployeeId into erGroup
                  from erItem in erGroup.DefaultIfEmpty()
                  join r in _data.ds.Role on erItem?.RoleId equals r.RoleId into rGroup
                  from rItem in rGroup.DefaultIfEmpty()
                  join et in _data.ds.EmployeeTechnology on e.EmployeeId equals et.EmployeeId into etGroup
                  from etItem in etGroup.DefaultIfEmpty()
                  join t in _data.ds.Technology on etItem?.TechnologyId equals t.TechnologyId into tGroup
                  from tItem in tGroup.DefaultIfEmpty()
                  select new {
                      e.EmployeeId, e.FirstName, e.LastName, e.RMMarket,
                      e.Title, e.Group, e.AdUserName,
                      // Typed DataSet rows throw by default when a DBNull column is read,
                      // so resolve the nullable columns to null here.
                      Summary = e.IsSummaryNull() ? null : e.Summary,
                      ImageUrl = e.IsImageUrlNull() ? null : e.ImageUrl
                  }).ToList();
Then run the search against the loaded and joined data:
var results = from e in searchBase
              where
                  e.FirstName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                  e.LastName.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                  e.RMMarket.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 ||
                  e.Summary != null && e.Summary.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0
              select new SearchResult {
                  EmployeeId = e.EmployeeId,
                  Name = e.FirstName + " " + e.LastName,
                  Title = e.Title,
                  ImageUrl = e.ImageUrl ?? string.Empty,
                  Market = e.RMMarket,
                  Group = e.Group,
                  Summary = e.Summary == null ? string.Empty : e.Summary.Substring(1, e.Summary.Length < summaryLength ? e.Summary.Length - 1 : summaryLength),
                  AdUserName = e.AdUserName
              };
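By the way, your sample code shows no reason to do the joins at all: none of the joined range variables is used in the condition or in the result, and you left-join every table anyway, so leaving the joins out would be the fastest solution.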
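A minimal sketch of that join-free variant follows. It assumes a plain `EmployeeRecord` class as a hypothetical stand-in for the typed Employee row (with `Summary` allowed to be null), and the names `EmployeeOnlySearch`, `ContainsTerm`, and `Search` are invented for illustration. Besides skipping the join work, scanning Employee alone should also avoid the repeated rows that chained left joins over several many-to-many tables produce for an employee with multiple skills, roles, or technologies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain-object stand-in for the typed Employee row; Summary may be null.
public sealed class EmployeeRecord
{
    public int EmployeeId { get; set; }
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public string RMMarket { get; set; } = "";
    public string Summary { get; set; }
}

public static class EmployeeOnlySearch
{
    // Null-safe, case-insensitive substring test.
    private static bool ContainsTerm(string text, string term) =>
        text != null && text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;

    // Scans the employee rows once; with no joins, each matching employee appears once.
    public static List<int> Search(IEnumerable<EmployeeRecord> employees, string searchTerm) =>
        employees
            .Where(e => ContainsTerm(e.FirstName, searchTerm)
                     || ContainsTerm(e.LastName, searchTerm)
                     || ContainsTerm(e.RMMarket, searchTerm)
                     || ContainsTerm(e.Summary, searchTerm))
            .Select(e => e.EmployeeId)
            .ToList();
}
```

If the skill, role, or technology names later need to be searchable too, the joins would have to come back in some form, so this only applies to the query as posted.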