How to match this query in Azure Search

I have this index:

{
  "name": "testentities",
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "key": true,
      "retrievable": true,
      "filterable": true,
      "sortable": true
    },
    {
      "name": "entity_id",
      "type": "Edm.String",
      "searchable": true,
      "sortable": true,
      "facetable": false,
      "retrievable": true,
      "filterable": true,
      "searchAnalyzer":"standard",
      "indexAnalyzer": "custom_analyzer"
    },
    {
      "name": "description",
      "type": "Edm.String",
      "searchable": true,
      "sortable": false,
      "facetable": false,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "name",
      "type": "Edm.String",
      "searchable": true,
      "sortable": true,
      "facetable": false,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "entity_type",
      "type": "Edm.String",
      "searchable": true,
      "sortable": true,
      "facetable": true,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "ancestors",
      "type": "Collection(Edm.String)",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "calendar_id",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "currency",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "timezone",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "active",
      "type": "Edm.Boolean",
      "retrievable": true,
      "facetable": true,
      "filterable": true
    },
    {
      "name": "kpi_collection",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "rid",
      "type": "Edm.String"
    }
  ],
  "scoringProfiles": [
    {
      "name": "boostEntity",
      "text": {
        "weights": {
          "entity_id": 9,
          "name": 8,
          "description": 1
        }
      }
    }
  ],
  "analyzers": [
    {
      "name": "custom_analyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "token1",
      "tokenFilters": [
        "lowercase",
        "entityID_stopWords",
        "entityID_edgeNGram"
      ]
    }
  ],
  "tokenizers": [
    {
      "name": "token1",
      "@odata.type": "#Microsoft.Azure.Search.StandardTokenizerV2"
    }
  ],
  "tokenFilters": [
    {
      "name": "entityID_edgeNGram",
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "minGram": 1,
      "maxGram": 6
    },
    {
      "name": "entityID_stopWords",
      "@odata.type": "#Microsoft.Azure.Search.StopwordsTokenFilter",
      "stopwords": [
        "store",
        "region",
        "zone",
        "field_org",
        ":"
      ]
    }
  ]
}

If I run this query:

{
  "search": "0001",
  "filter": "entity_type eq 'store' ",
  "select":"name,entity_id,entity_type,description,active,ancestors",
  "count": "true"
}

I get this result, which is correct, because it matches on name, the field with the highest weight after entity_id.

"@odata.count": 1,
"value": [
    {
        "@search.score": 1.6654625,
        "name": "LensCrafters 0001",
        "entity_id": "store:1",
        "entity_type": "store",
        "description": "2130 Mall Road, Florence, 41042, KY, US",
        "active": true,
        "ancestors": [
            "region:1021",
            "zone:1123",
            "field_org:lenscrafters_na",
            "ROOT"
        ]
    }
]

}

But if I run this query:

{
  "search": "1",
  "filter": "entity_type eq 'store' ",
  "select":"name,entity_id,entity_type,description,active,ancestors",
  "count": "true"

}

I get results that are not what I expect:

 {
            "@search.score": 1.4522386,
            "name": "LensCrafters 1622",
            "entity_id": "store:1622",
            "entity_type": "store",
            "description": "31625 Pacific Hwy S, Spc #E-1, Federal Way, 98003-5645, WA, US",
            "active": true,
            "ancestors": [
                "region:1024",
                "zone:1107",
                "field_org:lenscrafters_na",
                "ROOT"
            ]
        },
        {
            "@search.score": 1.3403159,
            "name": "LensCrafters 1178",
            "entity_id": "store:1178",
            "entity_type": "store",
            "description": "1 W FlatIron Crossing Dr #1104, Broomfield, 80021-8881, CO, US",
            "active": true,
            "ancestors": [
                "region:1019",
                "zone:1122",
                "field_org:lenscrafters_na",
                "ROOT"
            ]
        },
        { 
...............

Why isn't this the result, even though entity_id has a weight of 9 in the scoring profile?

 "@odata.count": 1,
    "value": [
        {
            "@search.score": 1.6654625,
            "name": "LensCrafters 0001",
            "entity_id": "store:1",
            "entity_type": "store",
            "description": "2130 Mall Road, Florence, 41042, KY, US",
            "active": true,
            "ancestors": [
                "region:1021",
                "zone:1123",
                "field_org:lenscrafters_na",
                "ROOT"
            ]
        }
    ]
}

Is it the scoring profile?

"scoringProfiles": [
        {
            "name": "boostEntity",
            "text": {
                "weights": {
                    "entity_id": 9,
                    "name": 8,
                    "description": 1
                }
            },
            "functions": [],
            "functionAggregation": null
        }
    ],.............

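You are using a custom analyzer on the entity_id field. For the text store:1178 it produces the tokens 1, 11, 117, 1178 (you can test your analyzer configuration with the Analyze API). This means that the documents LensCrafters 1622 and LensCrafters 1178 match the query just as LensCrafters 0001 does: they all have the token 1 in entity_id. However, LensCrafters 1622 and LensCrafters 1178 also match 1 in description, so they score higher than LensCrafters 0001.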
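To see where these tokens come from, the index-time chain on entity_id can be approximated in a few lines of Python. This is only a rough sketch, not Azure Search's implementation: the \w+ regex stands in for StandardTokenizerV2 (an assumption; the real tokenizer follows Unicode word-break rules), and analyze_entity_id is a hypothetical helper name. The stopword list and the minGram/maxGram values are copied from the index definition in the question.

```python
import re

# Values copied from the index definition in the question.
STOPWORDS = {"store", "region", "zone", "field_org", ":"}
MIN_GRAM, MAX_GRAM = 1, 6


def analyze_entity_id(text):
    """Approximate custom_analyzer: tokenize, lowercase, drop stopwords, edge n-grams."""
    # Assumption: splitting on word characters roughly mimics the standard
    # tokenizer for IDs such as "store:1178" (the colon separates tokens).
    tokens = re.findall(r"\w+", text)
    tokens = [t.lower() for t in tokens]
    tokens = [t for t in tokens if t not in STOPWORDS]
    grams = []
    for token in tokens:
        longest = min(len(token), MAX_GRAM)
        for size in range(MIN_GRAM, longest + 1):
            grams.append(token[:size])  # prefixes only, hence "edge" n-grams
    return grams


for entity_id in ("store:1", "store:1178", "store:1622"):
    print(entity_id, "->", analyze_entity_id(entity_id))
```

In this model every store ID that starts with 1 ends up with the term 1 in the index, which is why the query 1 cannot single out store:1 on entity_id alone.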

To learn more about query processing and custom analyzers in Azure Search, read: How full text search works in Azure Search

Do you want to keep the edgeNGram token filter in your analysis chain? Why?
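The effect of that filter on the query 1 can be illustrated with a small Python sketch. It is an approximation under stated assumptions, not how Azure Search works internally: index_terms and matching_ids are hypothetical helpers, the \w+ regex stands in for the standard tokenizer, only the entity_id field is modelled, and scoring is ignored.

```python
import re

# Stopwords copied from the index definition in the question.
STOPWORDS = {"store", "region", "zone", "field_org", ":"}


def index_terms(entity_id, edge_ngrams, max_gram=6):
    """Return the set of terms indexed for one entity_id value (simplified model)."""
    tokens = [t.lower() for t in re.findall(r"\w+", entity_id)]
    tokens = [t for t in tokens if t not in STOPWORDS]
    if not edge_ngrams:
        return set(tokens)
    # With the edge n-gram filter, every prefix up to max_gram becomes a term.
    return {t[:n] for t in tokens for n in range(1, min(len(t), max_gram) + 1)}


def matching_ids(query_term, entity_ids, edge_ngrams):
    """List the entity_ids whose indexed terms contain the query term exactly."""
    return [e for e in entity_ids if query_term in index_terms(e, edge_ngrams)]


ids = ["store:1", "store:1178", "store:1622"]
print(matching_ids("1", ids, edge_ngrams=True))   # all three IDs match
print(matching_ids("1", ids, edge_ngrams=False))  # only store:1 matches
```

The trade-off in this model is that, without edge n-grams, partial input such as 11 would no longer match store:1178 on entity_id.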