Elasticsearch 1.5: deleting documents with curl without the Delete By Query API

On my Elasticsearch 1.4, I used to delete documents with the Delete By Query API like this:

curl -XDELETE http://my_elasticsearch:9200/_all/_query?q=some_field:some_value

It wasn't perfect (it often ended in an OutOfMemoryError), but it was good enough for my needs at the time.

But now I'm using the new Elasticsearch 1.5, and in the documentation I read:

Deprecated in 1.5.0.

"Delete by Query will be removed in 2.0: it is problematic since it silently forces a refresh which can quickly cause OutOfMemoryError during concurrent indexing, and can also cause primary and replica to become inconsistent. Instead, use the scroll/scan API to find all matching ids and then issue a bulk request to delete them."

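So I would like to do the same thing with the scroll/scan API. But how do I delete with it? I don't understand how. The API documentation and the Java API documentation both seem incomplete to me (there is no delete example).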

PS: I'm looking for an explanation in either Java or curl (either is fine; in the end I will need both).

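I ran into this problem too and couldn't find a good code example. I'll show you what I came up with. I'm not sure it is the best approach, so feel free to comment on how to improve it. Note that I set the size of the query result to Integer.MAX_VALUE so that the query returns all (or as many as possible) of the results that need to be deleted.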

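  1. Run the query to get all the IDs to delete
  2. Add a delete request for each ID to a bulk request
  3. Run the bulk request
  4. Re-run the query to check whether there are still records to delete
  5. Repeat if necessary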

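    // Imports for the Elasticsearch 1.x Java client
    import org.elasticsearch.action.bulk.BulkRequestBuilder;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.action.delete.DeleteRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.index.query.QueryBuilder;
    import org.elasticsearch.search.SearchHit;

    // elasticSearchClient (a Client) and LOGGER are fields of the enclosing class
    private void deleteAllByQuery(final String index, final String type, final QueryBuilder query) {
        // Run the query to get the IDs of the documents to delete
        SearchResponse response = elasticSearchClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(query)
                .setSize(Integer.MAX_VALUE)
                .execute().actionGet();

        SearchHit[] searchHits = response.getHits().getHits();

        while (searchHits.length > 0) {
            LOGGER.debug("Need to delete " + searchHits.length + " records");

            // Create bulk request; refresh so the next search no longer sees the deleted documents
            final BulkRequestBuilder bulkRequest = elasticSearchClient.prepareBulk().setRefresh(true);

            // Add one delete request per search hit to the bulk request
            for (final SearchHit searchHit : searchHits) {
                final DeleteRequest deleteRequest = new DeleteRequest(index, type, searchHit.getId());
                bulkRequest.add(deleteRequest);
            }

            // Run bulk request
            final BulkResponse bulkResponse = bulkRequest.execute().actionGet();
            if (bulkResponse.hasFailures()) {
                LOGGER.error(bulkResponse.buildFailureMessage());
                // Stop here: documents that cannot be deleted would otherwise
                // be returned by every following search and loop forever
                return;
            }

            // After deleting, check whether more records still match
            response = elasticSearchClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(query)
                .setSize(Integer.MAX_VALUE)
                .execute().actionGet();

            searchHits = response.getHits().getHits();
        }
    }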
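
The deprecation note recommends walking the matches with scroll/scan instead of pulling them all in one search, so here is a minimal sketch of just that control flow: drain pages of ids, then send them off in bounded batches. It deliberately uses no Elasticsearch classes so it can be run on its own. `ScrollBulkDelete`, `deleteAll`, `idPages`, `bulkDelete` and `batchSize` are names I made up for illustration. In a 1.5 client I would expect `idPages` to be backed by a search started with `setSearchType(SearchType.SCAN)` and `setScroll(...)` and continued with `prepareSearchScroll(scrollId)` until a page comes back with no hits, and `bulkDelete` to wrap a `prepareBulk()` call like the one above. Treat that wiring as an assumption, not tested code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Sketch of "scroll for ids, then bulk-delete them"; no Elasticsearch classes are used. */
class ScrollBulkDelete {

    /**
     * Drains pages of document ids and hands them to bulkDelete in batches.
     *
     * @param idPages    hypothetical stand-in for a scan/scroll cursor: each element is
     *                   the ids of one page of hits; it ends when a page has no hits
     * @param bulkDelete hypothetical stand-in for building and executing one bulk
     *                   request containing a delete action per id
     * @param batchSize  maximum number of ids per bulk request (must be positive)
     * @return total number of ids handed to bulkDelete
     */
    static int deleteAll(Iterator<List<String>> idPages,
                         Consumer<List<String>> bulkDelete,
                         int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int total = 0;
        List<String> batch = new ArrayList<>();
        while (idPages.hasNext()) {
            for (String id : idPages.next()) {
                batch.add(id);
                if (batch.size() == batchSize) {
                    // Batch is full: send a copy, then start a new batch
                    bulkDelete.accept(new ArrayList<>(batch));
                    total += batch.size();
                    batch.clear();
                }
            }
        }
        if (!batch.isEmpty()) {
            // Send whatever is left over after the last page
            bulkDelete.accept(new ArrayList<>(batch));
            total += batch.size();
        }
        return total;
    }
}
```

Compared with the `Integer.MAX_VALUE` search, only one page of ids plus one batch is held in memory at a time, and no re-query is needed because the scroll is a fixed snapshot of the matches. For the curl side, the same ids can be sent to the `_bulk` endpoint as newline-delimited action lines of the form `{ "delete" : { "_index" : "...", "_type" : "...", "_id" : "..." } }`, with a newline after the last line.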