Why does Retrieve and Rank ignore my indexes when querying a collection?

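We have a Solr collection in Retrieve and Rank that contains a field called document_sub_type. This field is indexed in the Solr schema but has no fieldType value (I know that fields the ranker is meant to use must have a fieldType value of "Watson_text_en"; this field does not). We would like to filter results on this document_sub_type metadata field.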

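If I send the query power systems client reference AND (document_sub_type:"Client Reference*" OR document_sub_type:"Case Study*") to R&R's /select endpoint, I get back only documents with a document_sub_type value of "Client Reference Book" or "Client Reference Brief", as expected. However, if I send the same query to the /fcselect endpoint, the returned documents have document_sub_type values that can apparently be anything.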

I'll admit that our ranker is not well trained, but this happens even if we omit the ranker from the query.

Why does /fcselect ignore the metadata portion of the query?

Here are the full response bodies for the two queries:

From /select:

{
  "responseHeader": {
    "status": 0,
    "QTime": 2,
    "params": {
      "q": "power systems client reference AND (document_sub_type:\"Client Reference*\" OR document_sub_type:\"Case Study*\")",
      "fl": "document_sub_type",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 89,
    "start": 0,
    "docs": [
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Brief"
      }
    ]
  }
}

From /fcselect:

{
  "responseHeader": {
    "status": 0,
    "QTime": 65,
    "params": {
      "q": "power systems client reference AND (document_sub_type:\"Client Reference*\" OR document_sub_type:\"Case Study*\")",
      "ranker_id": "c852c8x19-rank-422",
      "fl": "document_sub_type",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 39428,
    "start": 0,
    "maxScore": 10,
    "docs": [
      {
        "document_sub_type": "Sales guidance"
      },
      {
        "document_sub_type": "Other sales tool or Utility"
      },
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "Client Reference Brief"
      },
      {
        "document_sub_type": "Client Reference Book"
      },
      {
        "document_sub_type": "At a Glance"
      },
      {
        "document_sub_type": "Brief or Template for Marketing"
      },
      {
        "document_sub_type": "text/plain"
      },
      {
        "document_sub_type": "Brief or Template for Marketing"
      },
      {
        "document_sub_type": "QRG"
      }
    ]
  }
}

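The /fcselect endpoint does not support combining terms with Boolean operators in the query parameter itself. For this type of operation, you should be able to use a filter query to get the expected results. For details, see the documentation here: https://www.ibm.com/watson/developercloud/doc/retrieve-rank/plugin_query_syntax.shtml#top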
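
As a rough illustration of the filter-query approach, the sketch below moves the document_sub_type restriction out of q and into Solr's standard fq parameter, leaving only the free-text terms in q. The helper name build_fcselect_url and the base URL are hypothetical placeholders, not part of the Retrieve and Rank API; substitute the real collection URL for your own cluster. The filter expression is copied unchanged from the question, and I have not verified how /fcselect treats the wildcard inside the quoted phrase.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_fcselect_url(base_url, text_query, filter_query,
                       ranker_id=None, fields=None):
    """Build an /fcselect URL with the metadata filter in fq, not in q.

    base_url is assumed to be the Solr collection URL (placeholder here).
    """
    params = [
        ("q", text_query),      # free-text terms only, no Boolean operators
        ("fq", filter_query),   # standard Solr filter query parameter
        ("wt", "json"),
    ]
    if ranker_id:
        params.append(("ranker_id", ranker_id))
    if fields:
        params.append(("fl", fields))
    return f"{base_url.rstrip('/')}/fcselect?{urlencode(params)}"


# Placeholder base URL; replace with your own cluster and collection URL.
url = build_fcselect_url(
    "https://example.invalid/solr/my_collection",
    "power systems client reference",
    'document_sub_type:"Client Reference*" OR document_sub_type:"Case Study*"',
    ranker_id="c852c8x19-rank-422",
    fields="document_sub_type",
)

# Decode the query string again to confirm what would be sent.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
print(decoded["fq"][0])
```

The resulting URL can then be requested with whatever HTTP client and credentials you already use for /select. Keeping the filter in fq means it restricts the candidate set without being parsed as part of the text that /fcselect hands to the ranker.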