Why could a simple query by id cause a timeout exception?

The following exception occasionally occurs in production:

2020-01-29 17:10:46.085 ERROR 2852 --- [o-8022-exec-258] c.c.p.common.dao.SearchDao               : Search person by id failed

java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-832 [ACTIVE]
        at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1403) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1373) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:699) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]

Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-832 [ACTIVE]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.4.jar!/:4.1.4]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.4.jar!/:4.1.4]
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]

But this is only a simple get by id, not a complex query:

 curl 'http://localhost:9201/person/_doc/30154410564?pretty'

The load was also low during this period.

So why do these timeout exceptions occur? The application runs many search queries, so why does only this simple query by id cause the exception?

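The person index is synced from an Oracle DB. A scheduled job syncs the changed person records every 10 minutes, and if the person index is accessed while that job is running, the request fails with the 30,000 milliseconds timeout. How can this be solved? It also seems that the problem only shows up when accessing the index through the Java client; accessing it with curl on the command line does not reproduce it.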

PS:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   person              jb3msRw5S9ixgXN5SLd6bw   1   0  140754205     19239587     19.8gb         19.8gb

Documents were also being indexed into the person index during this period.

RestClient configuration:

private final RestHighLevelClient restHighLevelClient;
restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost(host, port)));

Calling the hot_threads API

 curl 'http://localhost:9201/_nodes/hot_threads?pretty'

returns the following:

   100.9% (504.4ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][get][T#6]'
     8/10 snapshots sharing following 33 elements
       app//org.apache.lucene.index.SingletonSortedNumericDocValues.nextDoc(SingletonSortedNumericDocValues.java:53)
       app//org.apache.lucene.codecs.lucene80.IndexedDISI.writeBitSet(IndexedDISI.java:196)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.writeValues(Lucene80DocValuesConsumer.java:214)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.addNumericField(Lucene80DocValuesConsumer.java:111)
       app//org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addNumericField(PerFieldDocValuesFormat.java:109)
       app//org.apache.lucene.index.ReadersAndUpdates.handleDVUpdates(ReadersAndUpdates.java:368)
       app//org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:570)
       app//org.apache.lucene.index.ReaderPool.writeAllDocValuesUpdates(ReaderPool.java:228)
       app//org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3308)
       app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:520)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:294)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:269)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:259)
       app//org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112)
       app//org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
       app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
       app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
       app//org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
       app//org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
       app//org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1548)
       app//org.elasticsearch.index.engine.InternalEngine.get(InternalEngine.java:652)
       app//org.elasticsearch.index.shard.IndexShard.get(IndexShard.java:916)
       app//org.elasticsearch.index.get.ShardGetService.innerGet(ShardGetService.java:169)
       app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:93)
       app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:84)
       app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:106)
       app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:45)

It appears that getting a person by id triggers a refresh internally. The official documentation says:

By default, the get API is realtime, and is not affected by the refresh rate of the index (when data will become visible for search). If a document has been updated but is not yet refreshed, the get API will issue a refresh call in-place to make the document visible. This will also make other documents changed since the last refresh visible. In order to disable realtime GET, one can set the realtime parameter to false.

Note: every visit to a person detail page updates that person's view count.
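The interaction between the view-count update and the realtime get can be illustrated with a toy model. This is only a sketch of the behavior described in the documentation quote above, not Elasticsearch's actual implementation; the class and method names (`ToyShard`, `index`, `refresh`, `get`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of one shard; a simplification, not Elasticsearch code. */
class ToyShard {
    // Documents visible to readers as of the last refresh.
    private final Map<String, String> refreshed = new HashMap<>();
    // Documents written since the last refresh and not yet visible.
    private final Map<String, String> pending = new HashMap<>();
    private int refreshCount = 0;

    /** Writes a document; it stays invisible until the next refresh. */
    void index(String id, String source) {
        pending.put(id, source);
    }

    /** Makes every pending write visible (the expensive step). */
    void refresh() {
        refreshed.putAll(pending);
        pending.clear();
        refreshCount++;
    }

    /** A realtime get forces a refresh if the document has a pending update. */
    String get(String id, boolean realtime) {
        if (realtime && pending.containsKey(id)) {
            refresh();
        }
        return refreshed.get(id);
    }

    int refreshCount() {
        return refreshCount;
    }
}

class ToyShardDemo {
    public static void main(String[] args) {
        ToyShard shard = new ToyShard();
        shard.index("30154410564", "{\"views\":1}");
        shard.refresh();

        // A page view updates the view count; the update is still pending.
        shard.index("30154410564", "{\"views\":2}");

        System.out.println(shard.get("30154410564", false)); // {"views":1}
        System.out.println(shard.refreshCount());            // 1
        System.out.println(shard.get("30154410564", true));  // {"views":2}
        System.out.println(shard.refreshCount());            // 2
    }
}
```

In this model the non-realtime get returns the older view count without touching the pending writes, while the realtime get pays for a refresh that also publishes every other pending change. With a bulk sync writing to the same index, that forced refresh is plausibly the expensive step seen in the hot_threads output.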

So I explicitly disabled realtime:

        GetRequest getRequest = new GetRequest(personIndex, id.toString());
        // Disable realtime GET so this request does not force an in-place refresh
        getRequest.realtime(false);

After making this change, the timeout problem was resolved.
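For comparison outside the high-level client, the same non-realtime get can be issued over plain HTTP by adding the `realtime=false` query parameter to the get URL. The sketch below uses the JDK's `java.net.http` client (Java 11+); the host, port, index, and id are taken from the curl command above, while the class name `NonRealtimeGet`, the helper `getUri`, and the 30-second request timeout are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class NonRealtimeGet {
    /** Builds the get-by-id URL with an explicit realtime flag. */
    static URI getUri(String baseUrl, String index, String id, boolean realtime) {
        return URI.create(baseUrl + "/" + index + "/_doc/" + id + "?realtime=" + realtime);
    }

    public static void main(String[] args) throws Exception {
        URI uri = getUri("http://localhost:9201", "person", "30154410564", false);

        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(30)) // assumed value, mirrors the 30,000 ms in the log
                .GET()
                .build();

        // Requires a reachable Elasticsearch node at the address above.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Running it once with `realtime` set to `true` and once with `false` while the sync job is writing is one way to check whether the forced refresh accounts for the latency difference.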