Can ElasticSearch stitch together a field to show matches from different parts of a field?

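When running a query against a long field, e.g. Description, the field itself may be 200 or more characters long.

To show relevance in the search results, can ES stitch together different parts of the field to show the matches?

For example: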

There was a red car with four doors driving down the brick road ... and another red balloon was floating.

If the query searches for "red", is there a way to display something like this:

There was a [em]red[/em] car with four doors . . . and another [em]red[/em] balloon was floating.

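I realize we can use highlighting to wrap the matched keyword fragments in emphasis tags.

I'm wondering whether ES can stitch together the relevant pieces of the field that surround the matched keywords.

Yes, you're on the right path: this is exactly what highlighting is for. Let's try it on your example.

First, let's create an index named highlights whose mapping type has a single string field called content. For this example we use the fast vector highlighter, which is why the field stores term vectors with positions and offsets; it does the job we want here.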

curl -XPUT localhost:9200/highlights -d '{
  "mappings": {
    "highlight": {
      "properties": {
        "content": {
          "type": "string",
          "term_vector": "with_positions_offsets"
        }
      }
    }
  }
}'

Then we index a new document with the content you suggested:

curl -XPUT localhost:9200/highlights/highlight/1 -d '{
    "content": "There was a red car with four doors driving down the brick road bla bla bla bla bla bla bla bla bla bla bla bla and another red balloon was floating."
}'

Now we can query it and highlight the term red, like this:

curl -XPOST localhost:9200/highlights/highlight/_search -d '{
  "_source": false,
  "query": {
    "match": {
      "content": "red"
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "fragment_size": 30
      }
    }
  }
}'

This yields the following result, with one fragment for each part of the field that contains a match:

{
  ...
  "hits" : {
    "total" : 1,
    "max_score" : 0.06780553,
    "hits" : [ {
      "_index" : "highlights",
      "_type" : "highlight",
      "_id" : "1",
      "_score" : 0.06780553,
      "highlight" : {
        "content" : [ 
          "There was a <em>red</em> car with four doors", 
          "bla and another <em>red</em> balloon was floating" 
        ]
      }
    } ]
  }
}
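Elasticsearch returns the fragments as a list, so joining them into one display string is left to the client. Below is a minimal Python sketch of that step, assuming the response has already been parsed into a dict. The function name stitch_fragments and the " ... " separator are illustrative choices, not part of Elasticsearch.

```python
def stitch_fragments(hit, field, separator=" ... "):
    """Join the highlight fragments of one hit into a single display string.

    `hit` is one entry of response["hits"]["hits"], already parsed from JSON.
    The helper name and the separator are assumptions made for illustration.
    """
    fragments = hit.get("highlight", {}).get(field, [])
    return separator.join(fragments)


# This hit mirrors the relevant part of the response shown above.
hit = {
    "_id": "1",
    "highlight": {
        "content": [
            "There was a <em>red</em> car with four doors",
            "bla and another <em>red</em> balloon was floating",
        ]
    },
}

snippet = stitch_fragments(hit, "content")
print(snippet)
```

A hit without a highlight section simply produces an empty string, so the helper can be applied to every hit in the response.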

Also note that the tags can be customized and changed to whatever you like, if needed.
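As a sketch of that customization, the highlight section accepts pre_tags and post_tags arrays. The Python snippet below only builds the request body, here with the [em] ... [/em] markers from the question; the helper name build_highlight_query and its defaults are assumptions for illustration, and the resulting JSON would be sent to the same _search endpoint used earlier.

```python
import json


def build_highlight_query(field, term, pre_tag="[em]", post_tag="[/em]",
                          fragment_size=30):
    """Build a search body that highlights `term` in `field` with custom tags.

    The helper itself is hypothetical; pre_tags, post_tags and fragment_size
    are the highlight options being set in the request body.
    """
    return {
        "_source": False,
        "query": {"match": {field: term}},
        "highlight": {
            "pre_tags": [pre_tag],
            "post_tags": [post_tag],
            "fields": {field: {"fragment_size": fragment_size}},
        },
    }


body = build_highlight_query("content", "red")
# Serialize to JSON, ready to be passed as the request body of the search.
print(json.dumps(body, indent=2))
```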