Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible (HIVE)

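I am currently trying to run a simple "SELECT * FROM table" from Hive against ElasticSearch. I am using Cloudera CDH 6.0.1 and have added the elasticsearch-hadoop-hive-7.1.1 jar to my Hive path. I have ElasticSearch 7.1.1. The Cloudera stack and Elastic run on different servers but on the same network.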

CREATE EXTERNAL TABLE ctrl_rater_resumen_lla_es  
( 
fecha_registro string, 
direccion string, 
linea_b_codigo_prestadora string, 
linea_b_tipo_numero string, 
es_roaming string,
call_duration string, 
linea_b_routing_number string, 
minutos string, fecha string 
) 
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
'es.resource' = 'ctrl_rater_resumen_lla/hb',
'es.node' = 'http://10.129.x.xxx',
'es.port' = '9200',
'es.index.auto.create' = 'true',
'es.index.read.missing.as.empty' = 'true',
'es.nodes.discovery'='true',
'es.net.ssl'='false',
'es.nodes.client.only'='false',
'es.nodes.wan.only' = 'true',
'es.net.http.auth.user'='xxxxx',
'es.net.http.auth.pass' = 'xxxxx'
);

The table is created successfully.

SELECT * FROM ctrl_rater_resumen_lla_es;

Bad status for request TFetchResultsReq(fetchType=0, operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='\xbaYG*\xd4wI\xc0\xb8\xf6\x94Q\xa3\xa4IY ', guid='\xff\xca\xdb\xb5\x040E\x0e\x8eE\xe4\xf7?t\x1b\x01')), orientation=4, maxRows=100): TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'", sqlState=None, infoMessages=["*org.apache.hive.service.cli.HiveSQLException:java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only':25:24", 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:492', 'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:297', 'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:852', 'sun.reflect.GeneratedMethodAccessor24:invoke::-1', 'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43', 'java.lang.reflect.Method:invoke:Method.java:498', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78', 'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36', 'org.apache.hive.service.cli.session.HiveSessionProxy:run:HiveSessionProxy.java:63', 'java.security.AccessController:doPrivileged:AccessController.java:-2', 'javax.security.auth.Subject:doAs:Subject.java:422', 'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1726', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59', 
'com.sun.proxy.$Proxy38:fetchResults::-1', 'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:505', 'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:702', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1717', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1702', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624', 'java.lang.Thread:run:Thread.java:748', "*java.io.IOException:org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only':29:4", 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:521', 'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:428', 'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:146', 'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:2196', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:487', "*org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only':35:6", 'org.elasticsearch.hadoop.rest.InitializationUtils:discoverClusterInfo:InitializationUtils.java:340', 'org.elasticsearch.hadoop.hive.EsHiveInputFormat:getSplits:EsHiveInputFormat.java:112', 'org.elasticsearch.hadoop.hive.EsHiveInputFormat:getSplits:EsHiveInputFormat.java:51', 
'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextSplits:FetchOperator.java:372', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:304', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:459', '*org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException:Connection error (check network and/or proxy settings)- all nodes failed; tried [[localhost:9200]] :41:6', 'org.elasticsearch.hadoop.rest.NetworkClient:execute:NetworkClient.java:152', 'org.elasticsearch.hadoop.rest.RestClient:execute:RestClient.java:424', 'org.elasticsearch.hadoop.rest.RestClient:execute:RestClient.java:388', 'org.elasticsearch.hadoop.rest.RestClient:execute:RestClient.java:392', 'org.elasticsearch.hadoop.rest.RestClient:get:RestClient.java:168', 'org.elasticsearch.hadoop.rest.RestClient:mainInfo:RestClient.java:735', 'org.elasticsearch.hadoop.rest.InitializationUtils:discoverClusterInfo:InitializationUtils.java:330'], statusCode=3), results=None, hasMoreRows=None)

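The correct property is "es.nodes", not "es.node". Its default value is "localhost", so you are trying to connect to your local host rather than the node you intend to reach; the "tried [[localhost:9200]]" at the end of your stack trace confirms this. See the documentation for details.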
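As a minimal sketch of that fix, the table from the question could be recreated with the property renamed to "es.nodes". Everything else is carried over from the question unchanged, including the masked address and credentials; the DROP TABLE step is only an assumption about how you would replace the existing definition.

```sql
-- Remove the old Hive definition first.
-- For an EXTERNAL table this drops only the Hive metadata.
DROP TABLE IF EXISTS ctrl_rater_resumen_lla_es;

CREATE EXTERNAL TABLE ctrl_rater_resumen_lla_es
(
fecha_registro string,
direccion string,
linea_b_codigo_prestadora string,
linea_b_tipo_numero string,
es_roaming string,
call_duration string,
linea_b_routing_number string,
minutos string,
fecha string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
'es.resource' = 'ctrl_rater_resumen_lla/hb',
-- was 'es.node'; with that key the connector fell back to localhost
'es.nodes' = 'http://10.129.x.xxx',
'es.port' = '9200',
'es.index.auto.create' = 'true',
'es.index.read.missing.as.empty' = 'true',
'es.nodes.discovery' = 'true',
'es.net.ssl' = 'false',
'es.nodes.client.only' = 'false',
'es.nodes.wan.only' = 'true',
'es.net.http.auth.user' = 'xxxxx',
'es.net.http.auth.pass' = 'xxxxx'
);
```

After recreating the table, the same SELECT * FROM ctrl_rater_resumen_lla_es; should target the configured node instead of localhost:9200.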

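You may also want to consider whether you really need to set the property "es.nodes.wan.only" to true if you are not connecting to a cloud environment, because it disables the automatic discovery of the other nodes in the network, as described in the documentation (you need to scroll down a bit). It forces the connector to use only the "es.nodes" property, which by default points at localhost. That is why you got the error, but even once you get it working, that setting will have an impact on your performance (emphasis in the original):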

Note that in this mode, performance is highly affected.
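If the Hive hosts can reach every Elasticsearch node directly, which seems plausible here since both stacks are on the same network, one option is to turn the WAN-only mode off again. A hedged sketch using Hive's ALTER TABLE ... SET TBLPROPERTIES, assuming the table already exists and "es.nodes" already points at the right host:

```sql
-- Assumption: all Elasticsearch nodes are directly reachable from the Hive hosts.
ALTER TABLE ctrl_rater_resumen_lla_es SET TBLPROPERTIES (
  'es.nodes.wan.only' = 'false'  -- let the connector discover and talk to the other nodes
);
```

Whether this helps depends on your network layout. If the cluster is only reachable through a single address (a proxy or a cloud endpoint), leaving "es.nodes.wan.only" set to true is the safer choice.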