While indexing large data, if I delete the index, Elasticsearch re-creates it and keeps indexing the docs instead of reporting that the index does not exist
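I am indexing a large data set in batches of 1000; I have 100000 documents in total. If I delete the index directly from Elasticsearch while this is running (from: http://localhost:9200/_plugin/head/), the next batch re-creates the index and indexes the new documents into it, so the old documents are lost. I need an error to be thrown if the index does not exist.
Below is the bulk JSON request for two records (in practice there are 1000):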
{ "index" : {"_index":"1020","_type":"PROGRAMS","_id":"3149012","_routing":"PROGRAMS"} }
{
"OBJID": 3149015,
"MAINTITLE": "SPEAR TALK",
"A_S_DESC": null,
"A_S_ORIG_NA": "PFT",
"EPISODE_NAME": "DEMOPROG",
"A_S_DURATION": null,
"A_S_EP_NU": "111",
"A_S_EP_NA": "DEMOPROG",
"S_SOM": "10:00:00:00",
"S_FRAMERAT": 25,
"A_S_QUALITY": "HD",
"A_S_TX_TIME": "150200",
"A_S_TX_DATE": "20150228",
"TX_DATE_TIME": "20150228150200",
"REGISTRATION": "20150228",
"REGISTRATIO2": "150240",
"REGISTRATION_DATE_TIME": "20150228150240",
"CreatedOn": "2015-02-28T15:02:40",
"A_S_VERSION": "MIX",
"SUG_MAINTITLE": "DEMO PROGRAM 1 EP 111",
"A_DISPLAY_NA": "DEMO PROGRAM 1",
"channel": "DEMO-CHANNEL1",
"CHANNEL_DISPLAY_NA": "DEMO CHANNEL 1",
"IS_ARCHIVED": "NOT ARCHIVED",
"IS_TXRC": "TX",
"OBJECTCLASS": "PROGRAMS",
"SUB_OBJECTCLASS_FACET": "Assets",
"OBJECTCLASS_FACET": "Programs",
"kxjrt94fbr": "kxjrt94fbr",
"SortOrderValue": "1",
"VideoURL": null,
"ThumbURL": null,
"GENRE": null,
"S_ArchivedInstanceInfo": null,
"searchColumn": "SPEAR TALK DEMOPROG DEMO PROGRAM 1",
"RowNum": 2
}
{ "index" : {"_index":"1020","_type":"PROGRAMS","_id":"3149015","_routing":"PROGRAMS"} }
{
"OBJID": 3149015,
"MAINTITLE": "SPEAR TALK",
"A_S_DESC": null,
"A_S_ORIG_NA": "PFT",
"EPISODE_NAME": "DEMOPROG",
"A_S_DURATION": null,
"A_S_EP_NU": "111",
"A_S_EP_NA": "DEMOPROG",
"S_SOM": "10:00:00:00",
"S_FRAMERAT": 25,
"A_S_QUALITY": "HD",
"A_S_TX_TIME": "150200",
"A_S_TX_DATE": "20150228",
"TX_DATE_TIME": "20150228150200",
"REGISTRATION": "20150228",
"REGISTRATIO2": "150240",
"REGISTRATION_DATE_TIME": "20150228150240",
"CreatedOn": "2015-02-28T15:02:40",
"A_S_VERSION": "MIX",
"SUG_MAINTITLE": "DEMO PROGRAM 1 EP 111",
"A_DISPLAY_NA": "DEMO PROGRAM 1",
"channel": "DEMO-CHANNEL1",
"CHANNEL_DISPLAY_NA": "DEMO CHANNEL 1",
"IS_ARCHIVED": "NOT ARCHIVED",
"IS_TXRC": "TX",
"OBJECTCLASS": "PROGRAMS",
"SUB_OBJECTCLASS_FACET": "Assets",
"OBJECTCLASS_FACET": "Programs",
"kxjrt94fbr": "kxjrt94fbr",
"SortOrderValue": "1",
"VideoURL": null,
"ThumbURL": null,
"GENRE": null,
"S_ArchivedInstanceInfo": null,
"searchColumn": "SPEAR TALK DEMOPROG DEMO PROGRAM 1",
"RowNum": 2
}
The next batches of 1000 are indexed in the same way. Do I need to check whether the index exists before each batch, or is there a way to find this out from Elasticsearch?
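Elasticsearch automatically creates the index based on the structure of your documents, but keep in mind that any custom field settings (mappings) you had defined will not be preserved this way.
To check whether an index exists (a 200 response means it exists, a 404 means it does not), use: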
curl -I 'http://localhost:9200/your_index'
or the equivalent in whichever client you use.
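As a rough illustration of "the equivalent in your client", the same HEAD check can be done from Python with only the standard library. This is a minimal sketch, not a recommended client: the function name `index_exists`, the default port fallback, and the timeout are assumptions made for the example, and it only handles plain `http://` URLs.

```python
import http.client
from urllib.parse import urlsplit


def index_exists(base_url: str, index: str, timeout: float = 5.0) -> bool:
    """Send HEAD /<index> and map the status code to a boolean.

    200 -> the index exists, 404 -> it does not.
    Any other status is treated as unexpected and raised.
    """
    parts = urlsplit(base_url)
    # Assumption: fall back to Elasticsearch's default HTTP port if none is given.
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 9200, timeout=timeout)
    try:
        conn.request("HEAD", f"/{index}")
        status = conn.getresponse().status
    finally:
        conn.close()

    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected status {status} for HEAD /{index}")
```

Calling `index_exists("http://localhost:9200", "1020")` before each batch would let the loader stop as soon as the index disappears. Note that a check followed by a separate bulk request still leaves a small window in which the index can be deleted, so disabling automatic index creation is the more reliable guard.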
To stop indices from being created automatically, add this to config/elasticsearch.yml:
action.auto_create_index: false
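With this setting, indexing into a missing index fails instead of silently re-creating it. It is documented here: Automatic index creation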
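Once automatic creation is disabled, the loader still has to notice the failure. The sketch below inspects a parsed bulk response and raises if anything failed. It assumes the response is a dict that either carries a top-level `error` key (the whole request was rejected) or an `items` list whose entries hold a per-document `status`; which of the two you see, and the exact error text, depends on the Elasticsearch version, so check a real response from your cluster. The names `BulkIndexError`, `failed_items`, and `raise_on_failure` are hypothetical helpers for this example.

```python
from typing import Any, Dict, List


class BulkIndexError(RuntimeError):
    """Raised when a bulk request, or any item in it, failed."""


def failed_items(bulk_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect per-item results whose HTTP status is 400 or higher."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a one-key dict such as {"index": {...}}.
        for action, result in item.items():
            if result.get("status", 200) >= 400:
                failures.append({"action": action, **result})
    return failures


def raise_on_failure(bulk_response: Dict[str, Any]) -> None:
    """Raise BulkIndexError if the request or any of its items failed."""
    # Assumption: a rejected request comes back with a top-level "error" key.
    if "error" in bulk_response:
        raise BulkIndexError(f"bulk request rejected: {bulk_response['error']}")
    failures = failed_items(bulk_response)
    if failures:
        first = failures[0]
        raise BulkIndexError(
            f"{len(failures)} bulk item(s) failed; "
            f"first: status {first.get('status')}, error {first.get('error')!r}"
        )
```

Calling `raise_on_failure(response)` after every batch of 1000 makes the import stop at the first batch that hits the missing index, rather than continuing to send the remaining documents.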