Why does the ElasticSearch Java client index Future complete before the record is searchable?

Question

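I am using the elastic4s client, which returns a Future in response to an index request. When that future completes, I still have to call Thread.sleep(1000) before I can query for the indexed record. Most of the time it takes exactly 1 second. Is there an Elasticsearch setting I can change so that the record is available when the Future completes?

I also tried using the Java client directly, client.prepareIndex....execute().actionGet(); with exactly the same result: I have to call Thread.sleep(1000).

Is there any setting I can change so that the record is ready once the future completes?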

execute(index into(foo, bar) id uuid fields baz).await
Thread.sleep(1000) // This is mandatory for search to find it
execute {search in foo}.await // returns empty without Thread.sleep(1000)

Answer 1

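It sounds like you may have to wait for the default index refresh interval to elapse before you can query newly indexed data. The refresh interval is 1 second by default and can be changed as follows: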

curl -XPUT localhost:9200/test/_settings -d '{
    "index" : {
        "refresh_interval" : "1s"
    }
}'

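Alternatively, you can refresh the shard immediately after the index operation by including the refresh parameter in the query string of the index operation. This may be more useful than changing the refresh interval globally: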

curl -XPUT 'http://localhost:9200/{index}/{type}/{id}?refresh=true' -d '{
  "property" : "value"
}'

Answer 2

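Russ's answer is correct, but I want to add a bit about the Scala side.

When you perform an index operation, the returned future completes as soon as the Elasticsearch cluster has processed the command. That is not the same moment at which the document becomes available for search. As Russ points out, that happens 1 second later (by default).

So your future completes at k. Your document is available at k+1 seconds.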
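To make the gap between "indexed" and "searchable" concrete, here is a minimal toy model in plain Java. It is only a sketch and does not use the Elasticsearch client: ToyIndex, its buffer, and the refresh flag are hypothetical names chosen for illustration. The assumption is simply that newly indexed documents sit in a buffer until a refresh makes them visible to search, and that the flag plays the role of refresh=true on the index request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Toy model of near-real-time search. Hypothetical, NOT the Elasticsearch API. */
final class ToyIndex {
    // Documents accepted by the index but not yet visible to search.
    private final List<String> buffer = new ArrayList<>();
    // Documents made visible by a refresh.
    private final List<String> searchable = new ArrayList<>();

    /** Accepts a document; the returned future is complete once it is buffered. */
    synchronized CompletableFuture<Void> index(String doc, boolean refresh) {
        buffer.add(doc);
        if (refresh) {
            refresh(); // stands in for refresh=true on the index request
        }
        return CompletableFuture.completedFuture(null);
    }

    /** Stands in for the periodic refresh (every 1s by default). */
    synchronized void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
    }

    /** Only refreshed documents can be found. */
    synchronized boolean search(String doc) {
        return searchable.contains(doc);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        idx.index("baz", false).join(); // the future is complete here
        System.out.println("after future completes: " + idx.search("baz")); // false
        idx.refresh(); // what the 1 second wait is really waiting for
        System.out.println("after refresh: " + idx.search("baz")); // true
    }
}
```

In this model, Thread.sleep(1000) in the question is just a way of waiting for refresh() to run on its own schedule, while passing true for the flag trades some indexing throughput for immediate visibility.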

You can adjust the refresh interval when creating the index, e.g. in Elastic4s:

create index "myindex" refreshInterval "200ms" mappings ...

In the next release you will be able to use Scala durations, e.g.

create index "myindex" refreshInterval 200.millis mappings ...

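Be aware, though, that if you tune this too aggressively you lose some of the optimizations that the refresh interval provides. If you are indexing many documents, take a look at the bulk API. (In Elastic4s, just wrap your calls in bulk(seq).)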


elasticsearch

elastic4s