Multi field text and keyword fields in elasticsearch

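I'm looking at switching from solr to elasticsearch and have indexed a bunch of documents without providing a schema/mapping. A lot of the fields I had previously set up as indexed strings in solr have been set up as both text and keyword fields using multi-fields.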

Is there any benefit to having a keyword field also as a text field using multi-fields? In my case most values in fields are single words, so I'd imagine it wouldn't matter if they are sent to the analyzer, but the ES docs seem to imply that keyword fields aren't considered when searching, or at least are treated differently?

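If I search for the term "ipad", does a document get a higher score if it has "ipad" in a keyword field as well as some other text fields, compared with the same document without the keyword field? Will a document still match if, say, "ipad" is only in keyword fields?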

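To answer my own question: I created a quick test, and pretty much all keyword and text fields are equivalent when searching. A multi-field also seems to get the same score as its main type, so I'm guessing the second field has no effect on the search score.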

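Strangely enough, multi-word values in keyword and text fields got the same score. I'd expect keyword fields to score lower or not match at all, but for my purposes this is fine, so I'm not going to investigate any further.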

Index creation

PUT test_index
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "test_type" : {
            "properties" : {
                "multifield": {
                  "type": "text",
                  "fields": {
                     "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                     }
                  }
                },

                "keywordfield": {
                  "type": "keyword"
                },

                "textfield": {
                  "type": "text"
                }

            }
        }
    }
}
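
A possible way to check how this mapping turns a value into index terms is the `_analyze` API with its `field` parameter. The two requests below are only a sketch against the `test_index` defined above; the sample text is taken from the test data. The expectation, which is an assumption to verify rather than a measured result, is that the keyword field yields the whole value as a single term, while the text field yields one term per word.

GET /test_index/_analyze
{
    "field": "keywordfield",
    "text": "a green ipad"
}

GET /test_index/_analyze
{
    "field": "textfield",
    "text": "a green ipad"
}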

Data insertion

POST /_bulk
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 1 } }
{ "doc" : { "multifield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 2 } }
{ "doc" : { "keywordfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 3 } }
{ "doc" : { "keywordfield" : "a green ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 4 } }
{ "doc" : { "textfield" : "a yellow ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 5 } }
{ "doc" : { "keywordfield" : "ipad", "textfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 6 } }
{ "doc" : { "keywordfield" : "unrelated", "textfield" : "hopefully this wont show up"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 7 } }
{ "doc" : { "textfield" : "ipad"  }, "doc_as_upsert" : true }

Results

GET /test_index/_search?q=ipad
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0.28122374,
      "hits": [
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "5",
            "_score": 0.28122374,
            "_source": {
               "keywordfield": "ipad",
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "1",
            "_score": 0.2734406,
            "_source": {
               "multifield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "2",
            "_score": 0.2734406,
            "_source": {
               "keywordfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "7",
            "_score": 0.2734406,
            "_source": {
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "3",
            "_score": 0.16417998,
            "_source": {
               "keywordfield": "a green ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "4",
            "_score": 0.16417998,
            "_source": {
               "textfield": "a yellow ipad"
            }
         }
      ]
   }
}
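
One caveat about the test above: `q=ipad` does not name a field, so the query runs against the index's default search field. On an index with `_type` mappings like this one, that is most likely the analyzed `_all` field unless configured otherwise, which might be why the keyword and text values scored alike. A rough sketch for comparing the fields individually is to target each one with a `match` query; the field names are the ones from the mapping above, and no output is shown because these requests were not part of the original test.

GET /test_index/_search
{
    "query": {
        "match": { "keywordfield": "ipad" }
    }
}

GET /test_index/_search
{
    "query": {
        "match": { "textfield": "ipad" }
    }
}

GET /test_index/_search
{
    "query": {
        "match": { "multifield.keyword": "ipad" }
    }
}

If keyword values really are indexed as a single term, the first request would be expected to skip document 3 ("a green ipad"), while the second should still find document 4 ("a yellow ipad"). Adding `"explain": true` to a request body shows how each hit's score was computed.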