How to get Azure Search results that contain at least one complete word?

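I am testing the following query in the Azure Search portal, but it does not return the results I expect. I want every document in which the word "algo" appears at least once as a complete word.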

search=algo&queryType=full&searchMode=any

Important: MyVal is searchable and uses the Spanish Lucene analyzer.

Expected result (this element should match):

{
    "@odata.context": "https://....windows.net/indexes(....)/$metadata#docs(*)",
    "value": [
        {
            "MyKey":"1",
            "MyValues":[
                {
                    "MyVal":"algo aqui"
                },
                {
                    "MyVal":"lala"
                }
            ]
        }
    ]
}

Unexpected result (this element should not match):

{
    "@odata.context": "https://....windows.net/indexes(....)/$metadata#docs(*)",
    "value": [
        {
            "MyKey":"1",
            "MyValues":[
                {
                    "MyVal":"algoOtherStuff aqui"
                },
                {
                    "MyVal":"lala"
                }
            ]
        }
    ]
}

Actual result:

{
    "@odata.context": "https://....windows.net/indexes(....)/$metadata#docs(*)",
    "value": []
}

More example queries and their results:

search=algo*&queryType=full&searchMode=any

[No results]


search=/.algo./&queryType=full&searchMode=any

[No results]


search=algo aqui&queryType=full&searchMode=any

[Expected result: the element is found]


search=aqui&queryType=full&searchMode=any

[Expected result: the element is found]


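Important: if I replace the value with two other words for testing, for example "some data" or "something special", and search for either word, Azure Search returns the expected result. The problem seems to be specific to the word "algo".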

OK, I was able to reproduce the issue with the following code:

// Requires: using System.Collections.Generic; using Microsoft.Azure.Search; using Microsoft.Azure.Search.Models;
var client = new SearchServiceClient("xxxx", new SearchCredentials("abcabc"));

// Create an index whose MyVal sub-field is analyzed with the Spanish Lucene analyzer.
client.Indexes.Create(new Microsoft.Azure.Search.Models.Index
{
    Name = "index",
    Fields = new List<Field>
    {
        new Field("Id", DataType.String) { IsKey = true, IsRetrievable = true, IsFilterable = true },
        Field.NewComplex("MyValues", true, new List<Field>
        {
            new Field("MyVal", DataType.String)
            {
                IsRetrievable = true,
                IsFilterable = true,
                IsSearchable = true,
                Analyzer = AnalyzerName.EsLucene
            }
        })
    }
});

// CustomDoc and MyValues are plain model classes (definitions not shown here).
// Document 1 contains the word "algo"; document 2 does not.
var docs = new List<CustomDoc>
{
    new CustomDoc { Id = "1", MyValues = new MyValues[] { new MyValues { MyVal = "algo aqui" }, new MyValues { MyVal = "lala" } } },
    new CustomDoc { Id = "2", MyValues = new MyValues[] { new MyValues { MyVal = "something else" }, new MyValues { MyVal = "xxx" } } },
};

// Upload both documents to the new index.
var indexClient = client.Indexes.GetClient("index");
indexClient.Documents.Index(IndexBatch.Upload(docs));

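Yes, you are right. "algo" is treated as a stop word by the Spanish Lucene analyzer (EsLucene), so it is removed at both indexing and query time. See the stop word list: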

https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/spanish_stop.txt
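
To make the stop word effect concrete, here is a minimal, self-contained sketch of what an analyzer with a stop list does to the indexed text and to the query. It is a toy model, not Lucene's or Azure Search's implementation: the StopWordDemo class, its Analyze and Matches helpers, and the three-word stop list are all hypothetical, and the sketch ignores everything else a real analyzer does (such as stemming).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopWordDemo
{
    // Hypothetical, heavily abbreviated stop list; the real Spanish list is much longer.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "algo", "de", "la" };

    // Toy analyzer: lowercase, split on spaces, drop stop words.
    public static List<string> Analyze(string text)
    {
        return text
            .ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => !StopWords.Contains(token))
            .ToList();
    }

    // Mimics searchMode=any: the document matches when at least one
    // analyzed query term is also an indexed term.
    public static bool Matches(string indexedText, string query)
    {
        var indexedTerms = new HashSet<string>(Analyze(indexedText));
        return Analyze(query).Any(term => indexedTerms.Contains(term));
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Analyze("algo aqui"))); // aqui
        Console.WriteLine(Matches("algo aqui", "algo"));           // False
        Console.WriteLine(Matches("algo aqui", "algo aqui"));      // True
        Console.WriteLine(Matches("algo aqui", "aqui"));           // True
    }
}
```

In this model the query "algo" analyzes to an empty term list, so nothing can match, while "algo aqui" still matches through the surviving term "aqui". That is consistent with the results reported in the question. It is also consistent with "algo*" finding nothing: the term was never stored in the index, so there is no indexed term for the prefix to match.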

After switching to the EsMicrosoft analyzer, a search for "algo" returns the document:

client.Indexes.Create(new Microsoft.Azure.Search.Models.Index
{
    Name = "index",
    Fields = new List<Field>
    {
        new Field("Id", DataType.String) { IsKey = true, IsRetrievable = true, IsFilterable = true },
        Field.NewComplex("MyValues", true, new List<Field>
        {
            new Field("MyVal", DataType.String)
            {
                IsRetrievable = true,
                IsFilterable = true,
                IsSearchable = true,
                Analyzer = AnalyzerName.EsMicrosoft
            }
        })
    }
});