Elasticsearch 批量加载奇怪地丢失了 3 个文档中的 1 个
Elasticseach bulk load strangely missing 1 of 3 documents
根据 https://www.elastic.co/guide/en/elasticsearch/guide/current/shingles.html
的带状疱疹示例,我 运行 遇到了奇怪的问题
当我尝试为该教程中的三个文档编制索引时,只有其中两个被编入索引,ID 为 3 的文档从未被编入索引。
发送到 http://elastic:9200/myIndex/page/_bulk 的请求是:
{ "index": { "_id": 1 }}
{ "text": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "text": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "text": "Sue never goes anywhere without her alligator skin purse" }
但响应是:
{
"took": 18,
"errors": false,
"items": [
{
"index": {
"_index": "myIndex",
"_type": "page",
"_id": "1",
"_version": 1,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"status": 201
}
},
{
"index": {
"_index": "myIndex",
"_type": "page",
"_id": "2",
"_version": 1,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"status": 201
}
}
]}
索引和映射定义:
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"filter": {
"filter_shingle": {
"type": "shingle",
"max_shingle_size": 5,
"min_shingle_size": 2,
"output_unigrams": "false"
},
"filter_stop": {
"type": "stop"
}
},
"analyzer": {
"analyzer_shingle": {
"tokenizer": "standard",
"filter": ["standard", "lowercase", "filter_stop", "filter_shingle"]
}
}
}
},
"mappings": {
"page": {
"properties": {
"text": {
"type": "string",
"index_options": "offsets",
"analyzer": "standard",
"fields": {
"shingles": {
"search_analyzer": "analyzer_shingle",
"analyzer": "analyzer_shingle",
"type": "string"
}
}
},
"title": {
"type": "string",
"index_options": "offsets",
"analyzer": "standard",
"search_analyzer": "standard"
}
}
}
}}
当批量发布文档时,您需要确保在最后一行之后包含一个换行符,如explained in the official docs
curl -XPOST http://elastic:9200/myIndex/page/_bulk -d '
{ "index": { "_id": 1 }}
{ "text": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "text": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "text": "Sue never goes anywhere without her alligator skin purse" }
' <--- new line
根据 https://www.elastic.co/guide/en/elasticsearch/guide/current/shingles.html
的带状疱疹示例,我 运行 遇到了奇怪的问题当我尝试为该教程中的三个文档编制索引时,只有其中两个被编入索引,ID 为 3 的文档从未被编入索引。
发送到 http://elastic:9200/myIndex/page/_bulk 的请求是:
{ "index": { "_id": 1 }}
{ "text": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "text": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "text": "Sue never goes anywhere without her alligator skin purse" }
但响应是:
{
"took": 18,
"errors": false,
"items": [
{
"index": {
"_index": "myIndex",
"_type": "page",
"_id": "1",
"_version": 1,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"status": 201
}
},
{
"index": {
"_index": "myIndex",
"_type": "page",
"_id": "2",
"_version": 1,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"status": 201
}
}
]}
索引和映射定义:
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"analysis": {
"filter": {
"filter_shingle": {
"type": "shingle",
"max_shingle_size": 5,
"min_shingle_size": 2,
"output_unigrams": "false"
},
"filter_stop": {
"type": "stop"
}
},
"analyzer": {
"analyzer_shingle": {
"tokenizer": "standard",
"filter": ["standard", "lowercase", "filter_stop", "filter_shingle"]
}
}
}
},
"mappings": {
"page": {
"properties": {
"text": {
"type": "string",
"index_options": "offsets",
"analyzer": "standard",
"fields": {
"shingles": {
"search_analyzer": "analyzer_shingle",
"analyzer": "analyzer_shingle",
"type": "string"
}
}
},
"title": {
"type": "string",
"index_options": "offsets",
"analyzer": "standard",
"search_analyzer": "standard"
}
}
}
}}
当批量发布文档时,您需要确保在最后一行之后包含一个换行符,如explained in the official docs
curl -XPOST http://elastic:9200/myIndex/page/_bulk -d '
{ "index": { "_id": 1 }}
{ "text": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "text": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "text": "Sue never goes anywhere without her alligator skin purse" }
' <--- new line