Elasticsearch 1.5: delete with curl without the Delete by Query API
On Elasticsearch 1.4, I used the Delete by Query API to delete documents like this:
curl -XDELETE http://my_elasticsearch:9200/_all/_query?q=some_field:some_value
This was not perfect (it often caused an OutOfMemoryError), but it was good enough for my needs at the time.
Now I am on the new Elasticsearch 1.5, and the documentation says:
Deprecated in 1.5.0.
"Delete by Query will be removed in 2.0: it is problematic since it silently forces a refresh which can quickly cause OutOfMemoryError during concurrent indexing, and can also cause primary and replica to become inconsistent. Instead, use the scroll/scan API to find all matching ids and then issue a bulk request to delete them."
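So I want to do the same thing with the scroll/scan API. But how do I delete with it? I don't understand how to go about it. The API documentation and the Java API documentation seem incomplete to me (there is no delete example).
PS: I'm looking for an explanation in either Java or curl (either is fine; in the end I need both).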
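I ran into this problem too and could not find a good code example. Here is what I came up with. I'm not sure it is the best approach, so feel free to comment on how to improve it. Note that I set the query's result size to Integer.MAX_VALUE so that the query returns all (or as many as possible) of the results that need to be deleted.
- Run the query to get all the IDs to delete
- Add a delete request for each ID to a bulk request
- Run the bulk request
- Re-run the query to check whether any records still need to be deleted
- Repeat as necessary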
private void deleteAllByQuery(final String index, final String type, final QueryBuilder query) {
    SearchResponse response = elasticSearchClient.prepareSearch(index)
            .setTypes(type)
            .setQuery(query)
            .setSize(Integer.MAX_VALUE)
            .execute().actionGet();
    SearchHit[] searchHits = response.getHits().getHits();
    while (searchHits.length > 0) {
        LOGGER.debug("Need to delete " + searchHits.length + " records");
        // Create bulk request
        final BulkRequestBuilder bulkRequest = elasticSearchClient.prepareBulk().setRefresh(true);
        // Add search results to bulk request
        for (final SearchHit searchHit : searchHits) {
            final DeleteRequest deleteRequest = new DeleteRequest(index, type, searchHit.getId());
            bulkRequest.add(deleteRequest);
        }
        // Run bulk request
        final BulkResponse bulkResponse = bulkRequest.execute().actionGet();
        if (bulkResponse.hasFailures()) {
            LOGGER.error(bulkResponse.buildFailureMessage());
        }
        // After deleting, we should check for more records
        response = elasticSearchClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(query)
                .setSize(Integer.MAX_VALUE)
                .execute().actionGet();
        searchHits = response.getHits().getHits();
    }
}
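For the curl side of the question, the same idea should work against the REST `_bulk` endpoint, which takes one JSON action line per document, each terminated by a newline. Below is a minimal sketch that only builds that request body from a list of IDs; it does not talk to Elasticsearch. The class name `BulkDeleteBody`, its methods, and the sample values `my_index` and `my_type` are made up for illustration, and the IDs are assumed to have been collected beforehand (for example from a scan/scroll search).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of any Elasticsearch library): builds the
// newline-delimited body for a bulk request containing only delete actions.
final class BulkDeleteBody {

    // Escapes a value so it can sit inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // One {"delete": {...}} action line per ID; every line, including the
    // last one, ends with '\n' as the bulk endpoint expects.
    static String build(String index, String type, List<String> ids) {
        StringBuilder body = new StringBuilder();
        for (String id : ids) {
            body.append("{\"delete\":{\"_index\":\"").append(jsonEscape(index))
                .append("\",\"_type\":\"").append(jsonEscape(type))
                .append("\",\"_id\":\"").append(jsonEscape(id))
                .append("\"}}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Sample values only; replace with your own index, type and IDs.
        System.out.print(build("my_index", "my_type", Arrays.asList("1", "2")));
    }
}
```

If the output is saved to a file (say `bulk.txt`, a name chosen here), it could then be sent with something like `curl -XPOST http://my_elasticsearch:9200/_bulk --data-binary @bulk.txt`. `--data-binary` matters because plain `-d` strips the newlines that separate the actions.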