Elasticsearch bulk load strangely missing 1 of 3 documents

I was following the shingles example at https://www.elastic.co/guide/en/elasticsearch/guide/current/shingles.html and ran into a strange problem.

When I try to index the three documents from that tutorial, only two of them get indexed; the document with ID 3 is never indexed.

The request sent to http://elastic:9200/myIndex/page/_bulk is:

{ "index": { "_id": 1 }}
{ "text": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "text": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "text": "Sue never goes anywhere without her alligator skin purse" }

But the response is:

{
"took": 18,
"errors": false,
"items": [
    {
        "index": {
            "_index": "myIndex",
            "_type": "page",
            "_id": "1",
            "_version": 1,
            "_shards": {
                "total": 1,
                "successful": 1,
                "failed": 0
            },
            "status": 201
        }
    },
    {
        "index": {
            "_index": "myIndex",
            "_type": "page",
            "_id": "2",
            "_version": 1,
            "_shards": {
                "total": 1,
                "successful": 1,
                "failed": 0
            },
            "status": 201
        }
    }
]}

Index and mapping definition:

{
"settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
        "filter": {
            "filter_shingle": {
                "type": "shingle",
                "max_shingle_size": 5,
                "min_shingle_size": 2,
                "output_unigrams": "false"
            },
            "filter_stop": {
                "type": "stop"
            }
        },
        "analyzer": {
            "analyzer_shingle": {
                "tokenizer": "standard",
                "filter": ["standard", "lowercase", "filter_stop", "filter_shingle"]
            }
        }
    }
},
"mappings": {
    "page": {
        "properties": {
            "text": {
                "type": "string",
                "index_options": "offsets",
                "analyzer": "standard",
                "fields": {
                    "shingles": {
                        "search_analyzer": "analyzer_shingle",
                        "analyzer": "analyzer_shingle",
                        "type": "string"
                    }
                }
            },
            "title": {
                "type": "string",
                "index_options": "offsets",
                "analyzer": "standard",
                "search_analyzer": "standard"
            }
        }
    }
}}

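When posting documents in bulk, you need to make sure the body ends with a newline character after the last line, as explained in the official docs. Without it, the final action/source pair is not terminated and is dropped, which matches what you see: document 3 is missing from the response while "errors" is still false.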

curl -XPOST http://elastic:9200/myIndex/page/_bulk -d '
{ "index": { "_id": 1 }}
{ "text": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "text": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "text": "Sue never goes anywhere without her alligator skin purse" }
'      <--- new line
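If the bulk body is built in code rather than typed into curl, the same rule applies. Below is a minimal sketch in Python (standard library only) of assembling the newline-delimited body so that it always ends with a newline. The helper name `build_bulk_body` is hypothetical and only for illustration; actually sending the body to the `_bulk` endpoint is left out.

```python
import json


def build_bulk_body(docs):
    """Build a newline-delimited bulk body from (doc_id, source) pairs.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    lines = []
    for doc_id, source in docs:
        # Action line, followed by the document source on its own line.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # Join with newlines and terminate the last line with a newline as well.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"text": "Sue ate the alligator"}),
    (2, {"text": "The alligator ate Sue"}),
    (3, {"text": "Sue never goes anywhere without her alligator skin purse"}),
]

body = build_bulk_body(docs)
```

A quick sanity check before sending is `body.endswith("\n")`. It is also worth comparing the number of entries in the response's "items" array with the number of actions sent, since a dropped last action does not set "errors" to true.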