ElasticSearch NEST term query returns no results
This is my schema:
[ElasticType(Name = "importFile")]
public class ImportFile : DocumentMapping
{
    [ElasticProperty(Store = false, Index = FieldIndexOption.NotAnalyzed)]
    public string FileName { get; set; }
    [ElasticProperty(Store = false, Index = FieldIndexOption.NotAnalyzed)]
    public string GroupId { get; set; }
    [ElasticProperty(Store = false, Index = FieldIndexOption.Analyzed)]
    public string FilePath { get; set; }
}
I made a NEST query like this:
var res = ElasticClient.Search<ImportFile>(s => s
    .Index(ElasticIndexName)
    .Filter(f =>
        f.Term(t => t.FileName, "Group-1.uhh"))).Documents.ToArray();
and it returns zero elements!
If I look inside the database (using Postman), I can see my documents:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 14.069489,
    "hits": [
      {
        "_index": "reviewer-bdd-test-index",
        "_type": "importFile",
        "_id": "AU9kUka2hr5Jg98UXOae",
        "_score": 14.069489,
        "_source": {
          "fileName": "Group-1.uhh",
          "groupId": "0ae1206d0644eabd82ae490e612732df5da2cd141fdee70dc64207f86c96094f",
          "filePath": ""
        }
      },
      {
        "_index": "reviewer-bdd-test-index",
        "_type": "importFile",
        "_id": "AU9kZO25hr5Jg98UXRnk",
        "_score": 14.069489,
        "_source": {
          "fileName": "group-1.uhh",
          "groupId": "0ae1206d0644eabd82ae490e612732df5da2cd141fdee70dc64207f86c96094f",
          "filePath": ""
        }
      }
    ]
  }
}
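It sounds like you may not have explicitly put the mapping for the type into the index before indexing your documents, so Elasticsearch has inferred the mapping from the default mappings for the fields in the documents it has seen. As an example, given the following type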
[ElasticType(Name = "importFile")]
public class ImportFile
{
    [ElasticProperty(Store = false, Index = FieldIndexOption.NotAnalyzed)]
    public string FileName { get; set; }
    [ElasticProperty(Store = false, Index = FieldIndexOption.NotAnalyzed)]
    public string GroupId { get; set; }
    [ElasticProperty(Store = true, Index = FieldIndexOption.Analyzed)]
    public string FilePath { get; set; }
}
if we index some documents as follows
void Main()
{
    var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
    var client = new ElasticClient(settings);
    client.Index<ImportFile>(
        new ImportFile
        {
            FileName = "Group-1.uhh",
            FilePath = "",
            GroupId = "0ae1206d0644eabd82ae490e612732df" +
                      "5da2cd141fdee70dc64207f86c96094"
        },
        index => index
            .Index("reviewer-bdd-test-index")
            .Type("importFile")
            .Refresh());
    client.Index<ImportFile>(
        new ImportFile
        {
            FileName = "group-1.uhh",
            FilePath = "",
            GroupId = "0ae1206d0644eabd82ae490e612732df" +
                      "5da2cd141fdee70dc64207f86c96094"
        },
        index => index
            .Index("reviewer-bdd-test-index")
            .Type("importFile")
            .Refresh());
    var results = client.Search<ImportFile>(s => s
        .Index("reviewer-bdd-test-index")
        .Type("importFile")
        .Query(q => q
            .Filtered(fq => fq
                .Filter(f => f
                    .Term(p => p.FileName, "Group-1.uhh")
                )
            )
        )
    );
    Console.WriteLine(string.Format("{0} {1}", results.RequestInformation.RequestMethod, results.RequestInformation.RequestUrl));
    Console.WriteLine(Encoding.UTF8.GetString(results.RequestInformation.Request));
    Console.WriteLine("Matching document count: {0}", results.Documents.Count());
}
the following is output to the console
POST http://localhost:9200/reviewer-bdd-test-index/importFile/_search
{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "fileName": "Group-1.uhh"
        }
      }
    }
  }
}
Matching document count: 0
We get no matching documents. Checking the mapping in Elasticsearch with
curl -XGET "http://localhost:9200/reviewer-bdd-test-index/_mapping"
we see that the mapping for the type importFile is
{
  "reviewer-bdd-test-index": {
    "mappings": {
      "importFile": {
        "properties": {
          "fileName": {
            "type": "string"
          },
          "groupId": {
            "type": "string"
          }
        }
      }
    }
  }
}
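This is not what we expect; both fileName and groupId should also have "index": "not_analyzed", and filePath is not even in the mapping. Both of these are because Elasticsearch has inferred the mapping from the documents it has been passed. fileName and groupId have been mapped as string types and will be analyzed with the standard analyzer, so the value "Group-1.uhh" is stored in the inverted index as several lowercased tokens rather than as one exact term, and a term filter, which does not analyze its input, looks for the exact term "Group-1.uhh" and finds nothing. I believe filePath has not been mapped because both documents seen had an empty string value for the field, so the standard analyzer applied to the field would not produce any tokens for the inverted index, hence the field is not included in the mapping.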
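To make the difference between an analyzed and a not_analyzed field concrete, here is a minimal, self-contained sketch that uses neither NEST nor Elasticsearch. All of the names in it (`TermLookupSketch`, `AnalyzeRoughly`, `KeepAsIs`, `BuildIndex`, `CountTermMatches`) are made up for illustration, and `AnalyzeRoughly` is only a crude approximation of the standard analyzer (split on non-alphanumeric characters, then lowercase), not its real tokenization rules.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TermLookupSketch
{
    // Crude stand-in for the standard analyzer (an assumption, not the real
    // implementation): split on non-alphanumeric characters and lowercase.
    public static IEnumerable<string> AnalyzeRoughly(string value)
    {
        return Regex.Split(value, "[^A-Za-z0-9]+")
            .Where(token => token.Length > 0)
            .Select(token => token.ToLowerInvariant());
    }

    // not_analyzed: the whole value is indexed as one term, unchanged.
    public static IEnumerable<string> KeepAsIs(string value)
    {
        return new[] { value };
    }

    // Builds a toy inverted index: term -> ids of the documents containing it.
    public static Dictionary<string, HashSet<int>> BuildIndex(
        IList<string> values, Func<string, IEnumerable<string>> toTerms)
    {
        var index = new Dictionary<string, HashSet<int>>(StringComparer.Ordinal);
        for (var id = 0; id < values.Count; id++)
        {
            foreach (var term in toTerms(values[id]))
            {
                HashSet<int> ids;
                if (!index.TryGetValue(term, out ids))
                {
                    ids = new HashSet<int>();
                    index[term] = ids;
                }
                ids.Add(id);
            }
        }
        return index;
    }

    // A term filter does no analysis: the term is looked up exactly as given.
    public static int CountTermMatches(
        Dictionary<string, HashSet<int>> index, string term)
    {
        HashSet<int> ids;
        return index.TryGetValue(term, out ids) ? ids.Count : 0;
    }

    public static void Main()
    {
        var fileNames = new List<string> { "Group-1.uhh", "group-1.uhh" };
        var analyzed = BuildIndex(fileNames, AnalyzeRoughly);
        var notAnalyzed = BuildIndex(fileNames, KeepAsIs);

        Console.WriteLine("analyzed match count: "
            + CountTermMatches(analyzed, "Group-1.uhh"));
        Console.WriteLine("not_analyzed match count: "
            + CountTermMatches(notAnalyzed, "Group-1.uhh"));
    }
}
```

In this toy model the analyzed index has no term "Group-1.uhh", so the lookup counts 0 matches, while the not_analyzed index keeps the value whole and counts 1, mirroring the two results in this answer. An empty string produces no terms at all in `AnalyzeRoughly`, which is the same reasoning as the guess about filePath above.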
So, to ensure that things work as expected, we need to add a mapping to the index before indexing any documents
void Main()
{
    var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
    var client = new ElasticClient(settings);
    // Add the mapping for ImportFile to the index
    client.CreateIndex(indexSelector => indexSelector
        .Index("reviewer-bdd-test-index")
        .AddMapping<ImportFile>(mapping => mapping
            .MapFromAttributes()
        )
    );
    // ... Same as above after this point
}
which results in
POST http://localhost:9200/reviewer-bdd-test-index/importFile/_search
{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "fileName": "Group-1.uhh"
        }
      }
    }
  }
}
Matching document count: 1
Success! We have one matching document. Checking the mapping in Elasticsearch yields what we expect
{
  "reviewer-bdd-test-index": {
    "mappings": {
      "importFile": {
        "properties": {
          "fileName": {
            "type": "string",
            "index": "not_analyzed"
          },
          "filePath": {
            "type": "string",
            "store": true
          },
          "groupId": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
Additionally, the attribute mapping could be replaced with a fluent mapping instead
var indexResult = client.CreateIndex(indexDescriptor => indexDescriptor
    .Index("reviewer-bdd-test-index")
    .AddMapping<ImportFile>(mapping => mapping
        .Type("importFile")
        .MapFromAttributes()
        .Properties(properties => properties
            .String(s => s
                .Name(file => file.FileName)
                .Store(false)
                .Index(FieldIndexOption.NotAnalyzed))
            .String(s => s
                .Name(file => file.GroupId)
                .Store(false)
                .Index(FieldIndexOption.NotAnalyzed))
            .String(s => s
                .Name(file => file.FilePath)
                .Store(true))
        )
    )
);
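Either the attribute mapping or the fluent mapping will do in this case; however, there are some things that can only be achieved with fluent mappings, such as multi_fields.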