C# NEST/Elasticsearch - Create index with analysis not working
I'm fairly new to Elasticsearch and NEST, and I have run into a problem.
I'm trying to add analyzers and tokenizers so that I can search for substrings in my code.
Example:
User user1 = new User() { FirstName = "John", LastName = "Boat", Number = "45678" };
User user2 = new User() { FirstName = "Michael", LastName = "Johansen", Number = "123456" };
Searching for "12345" returns user2, "456" returns user1 and user2, "Joh" returns user1 and user2, and so on.
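To make the expected matches concrete, here is a minimal sketch in plain C# of what an n-gram filter with a minimum length of 2 and a maximum length of 15 would emit for a single lowercased term. The names `NGramSketch` and `NGrams` are hypothetical and exist only for this illustration; this is not Elasticsearch's implementation, just the same idea of indexing every substring within a length range.

```csharp
using System;
using System.Collections.Generic;

public static class NGramSketch
{
    // Hypothetical helper: returns every lowercased substring of `term`
    // whose length lies in the range [minGram, maxGram].
    public static HashSet<string> NGrams(string term, int minGram, int maxGram)
    {
        var grams = new HashSet<string>();
        var lower = term.ToLowerInvariant();

        for (var start = 0; start < lower.Length; start++)
        {
            // Stop growing the gram once it would run past the end of the term.
            for (var len = minGram; len <= maxGram && start + len <= lower.Length; len++)
            {
                grams.Add(lower.Substring(start, len));
            }
        }

        return grams;
    }
}
```

With these grams indexed, "456" is among the grams of both "45678" and "123456", "12345" is only among the grams of "123456", and "joh" is among the grams of both "John" and "Johansen", which is the behaviour described in the example above.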
However, when I try to add the analyzer and token filter to my settings while creating the index, they are not saved in Elasticsearch.
This works:
client.Indices.Create("customers",
    index => index
        .Settings(se => se
            .Setting("index.mapping.total_fields.limit", "2000"))
        .Map<Customer>(x => x.AutoMap())
);
This does not work:
client.Indices.Create("crmleads",
    index => index
        .Settings(se => se
            .Analysis(a => a
                .Analyzers(analyzer => analyzer
                    .Custom("substring_analyzer", analyzerDescriptor => analyzerDescriptor
                        .Tokenizer("standard")
                        .Filters("lowercase", "substring")))
                .TokenFilters(tf => tf
                    .NGram("substring", filterDescriptor => filterDescriptor
                        .MinGram(2)
                        .MaxGram(15))))
            .Setting("index.mapping.total_fields.limit", "2000"))
        .Map<CRMLead>(x => x
            .AutoMap()
            .Properties(p => p
                .Text(t => t
                    .Name(f => f.Name)
                    .Analyzer("substring_analyzer"))
                .Text(t => t
                    .Name(f => f.CVRNumber)
                    .Analyzer("substring_analyzer"))
                .Boolean(t => t
                    .Name(f => f.IsConverted))
                .Text(t => t
                    .Name(f => f.ContactPersonName)
                    .Analyzer("substring_analyzer"))
                .Text(t => t
                    .Name(f => f.ContactPersonEmail)
                    .Analyzer("substring_analyzer"))))
);
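The Elasticsearch server's command line does not show any errors either:
[Screenshot: Command line for ElasticSearch]
[Screenshot: Created indexes shown in Kibana]
Also, I have not added anything to my model classes: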
public class CRMLead
{
    [Key]
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public Guid Id { get; set; }
    public Company Company { get; set; }
    public Customer Customer { get; set; }
    public string CVRNumber { get; set; }
    public DateTime CreateDate { get; set; }
    public string Name { get; set; }
    public string Website { get; set; }
    public string Country { get; set; }
    public string Address { get; set; }
    public string ZipCode { get; set; }
    public string City { get; set; }
    public string ContactPersonName { get; set; }
    public string ContactPersonEmail { get; set; }
    public string ContactPersonPhoneNumber { get; set; }
    public string PhaseOneDescription { get; set; }
    public CustomerContact CustomerContact { get; set; }
    public ApplicationUser Seller { get; set; }
    public CRMLeadStatus CRMStatus { get; set; }
    public List<UploadedFile> UploadedFiles { get; set; }
    public bool IsConverted { get; set; }
    public bool IsDone { get; set; }
    public bool IsSold { get; set; }
}
The full code is here:
public static void AddElasticsearch(this IServiceCollection services, IConfiguration configuration)
{
    var url = configuration["Elasticsearch:url"];
    var settings = new ConnectionSettings(new Uri(url));
    AddDefaultMappings(settings);
    settings.DefaultFieldNameInferrer(f => f);
    var client = new ElasticClient(settings);
    services.AddSingleton<IElasticClient>(client);
    CreateIndices(client);
}
private static void AddDefaultMappings(ConnectionSettings settings)
{
    settings
        .DefaultMappingFor<ApplicationUser>(m => m
            .IndexName("users")
            .Ignore(au => au.AccessFailedCount)
            .Ignore(au => au.Address)
            .Ignore(au => au.AppInstalled)
            .Ignore(au => au.BirthDay)
            .Ignore(au => au.BorrowedEquipment)
            .Ignore(au => au.ConcurrencyStamp)
            .Ignore(au => au.CostPrice)
            .Ignore(au => au.Culture)
            .Ignore(au => au.CustomUserfields)
            .Ignore(au => au.EmailConfirmed)
            .Ignore(au => au.HireDate)
            .Ignore(au => au.IceRelatives)
            .Ignore(au => au.Id)
            .Ignore(au => au.Initials)
            .Ignore(au => au.IsCompanyOwner)
            .Ignore(au => au.LastLogin)
            .Ignore(au => au.LockoutEnabled)
            .Ignore(au => au.LockoutEnd)
            .Ignore(au => au.NormalizedEmail)
            .Ignore(au => au.NormalizedUserName)
            .Ignore(au => au.PasswordHash)
            .Ignore(au => au.PhoneNumberConfirmed)
            .Ignore(au => au.PrivateEmail)
            .Ignore(au => au.PrivatePhoneNumber)
            .Ignore(au => au.ProfileImagePath)
            .Ignore(au => au.SecurityStamp)
            .Ignore(au => au.TwoFactorEnabled)
            .Ignore(au => au.UploadedFile)
        )
        .DefaultMappingFor<CRMLead>(m => m
            .IndexName("crmleads")
            .Ignore(crml => crml.Address)
            .Ignore(crml => crml.City)
            .Ignore(crml => crml.Company)
            .Ignore(crml => crml.ContactPersonPhoneNumber)
            .Ignore(crml => crml.Country)
            .Ignore(crml => crml.CreateDate)
            .Ignore(crml => crml.Customer)
            .Ignore(crml => crml.CustomerContact)
            .Ignore(crml => crml.IsDone)
            .Ignore(crml => crml.IsSold)
            .Ignore(crml => crml.PhaseOneDescription)
            .Ignore(crml => crml.Seller)
            .Ignore(crml => crml.UploadedFiles)
            .Ignore(crml => crml.Website)
            .Ignore(crml => crml.ZipCode)
        )
        .DefaultMappingFor<Customer>(m => m
            .IndexName("customers")
            .Ignore(cust => cust.Activities)
            .Ignore(cust => cust.Address)
            .Ignore(cust => cust.City)
            .Ignore(cust => cust.Company)
            .Ignore(cust => cust.Contacts)
            .Ignore(cust => cust.Country)
            .Ignore(cust => cust.CreateDate)
            .Ignore(cust => cust.Id)
            .Ignore(cust => cust.Projects)
            .Ignore(cust => cust.ZipCode)
        );
}
private static void CreateIndices(IElasticClient client)
{
    client.Indices.Create("users",
        index => index
            .Map<ApplicationUser>(x => x.AutoMap())
    );
    client.Indices.Create("crmleads",
        index => index
            .Settings(se => se
                .Analysis(a => a
                    .Analyzers(analyzer => analyzer
                        .Custom("substring_analyzer", analyzerDescriptor => analyzerDescriptor
                            .Tokenizer("standard")
                            .Filters("lowercase", "substring")))
                    .TokenFilters(tf => tf
                        .NGram("substring", filterDescriptor => filterDescriptor
                            .MinGram(2)
                            .MaxGram(15))))
                .Setting("index.mapping.total_fields.limit", "2000"))
            .Map<CRMLead>(x => x
                .AutoMap()
                .Properties(p => p
                    .Text(t => t
                        .Name(f => f.Name)
                        .Analyzer("substring_analyzer"))
                    .Text(t => t
                        .Name(f => f.CVRNumber)
                        .Analyzer("substring_analyzer"))
                    .Boolean(t => t
                        .Name(f => f.IsConverted))
                    .Text(t => t
                        .Name(f => f.ContactPersonName)
                        .Analyzer("substring_analyzer"))
                    .Text(t => t
                        .Name(f => f.ContactPersonEmail)
                        .Analyzer("substring_analyzer"))))
    );
    client.Indices.Create("customers",
        index => index
            .Settings(se => se
                .Setting("index.mapping.total_fields.limit", "2000"))
            .Map<Customer>(x => x.AutoMap())
    );
}
The technologies are:
- NEST 7.10.1
- .NET Core 3.1
- Visual Studio 2019 Community Edition
- Elasticsearch 7.10.1
- Kibana 7.10.1
What am I doing wrong here? Thanks in advance.
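Edit
After examining the response from the index creation call (as @Milan Gatyas suggested), I found that for my NGram token filter the difference between MaxGram and MinGram was expected to be at most 1, but was 13: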
.NGram("substring", filterDescriptor => filterDescriptor
.MinGram(2)
.MaxGram(15))))
So to fix this, I set the index.max_ngram_diff setting to 15:
.Setting("index.max_ngram_diff", "15"))
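The diagnosis and fix can be sketched together as follows. This is a minimal sketch, assuming NEST 7.x and an already configured `IElasticClient`; the class name `CrmLeadIndexSketch` and the method name `CreateCrmLeadsIndex` are hypothetical, and the mapping part of the original call is left out for brevity. The point is that a rejected create-index request does not throw by default, so the response has to be inspected to see the server's error.

```csharp
using System;
using Nest;

public static class CrmLeadIndexSketch
{
    // Hypothetical helper: creates the index with the n-gram analyzer
    // and reports why the request failed, if it did.
    public static void CreateCrmLeadsIndex(IElasticClient client)
    {
        var response = client.Indices.Create("crmleads", index => index
            .Settings(se => se
                // MaxGram - MinGram = 15 - 2 = 13, so the limit must be at least 13.
                // 15 is the value used in the fix above.
                .Setting("index.max_ngram_diff", "15")
                .Analysis(a => a
                    .Analyzers(analyzer => analyzer
                        .Custom("substring_analyzer", ad => ad
                            .Tokenizer("standard")
                            .Filters("lowercase", "substring")))
                    .TokenFilters(tf => tf
                        .NGram("substring", fd => fd
                            .MinGram(2)
                            .MaxGram(15))))));

        // NEST does not throw on a failed request unless configured to do so,
        // which is why the failure went unnoticed in the original code.
        if (!response.IsValid)
        {
            Console.WriteLine(response.DebugInformation);
        }
    }
}
```

Checking `IsValid` on every index creation response, or logging `DebugInformation` when it is false, would likely have surfaced the n-gram limit error on the first run.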