Impyla return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask when querying HiveServer2
I'm using Impyla to query some results from Hive, but I ran into this problem:

From Impyla:
impala.error.OperationalError: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
From HiveServer2:
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hduser_20180827031927_fdb148b0-725b-434c-a0f8-98b6843d4348
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
My source code is:
from pprint import pprint  # was missing in the original script
import sys

from impala.dbapi import connect

dbName = sys.argv[1]
query = sys.argv[2]

conn = connect(host='192.168.0.10', port=10000, database=dbName,
               auth_mechanism='NOSASL', use_ssl=True)
cursor = conn.cursor()
cursor.execute(query, configuration={
    'hive.exec.reducers.bytes.per.reducer': '100000',
    'hive.auto.convert.join.noconditionaltask': 'false',
    'mapreduce.job.reduces': '1',
    'hive.auto.convert.join': 'false',
})

returnData = []
for row in cursor:
    returnData.append(row[0])
pprint(returnData)
As you can see, I have added quite a few configuration settings, but none of them helped.
From your error alone there is no way to tell what went wrong.
I'm not sure how to enable debug logging in impyla, so you would need to go to the YARN UI to find the query.
If YARN is not running, I think you would get a more descriptive error such as "unable to submit job", though that error may not propagate back from HiveServer.
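On the "not sure how to enable debug logging in impyla" point: a minimal sketch, assuming impyla logs through Python's standard logging module with its loggers namespaced under "impala" (an assumption based on the package layout, e.g. impala.hiveserver2):

```python
import logging

# Send all log records to stderr at DEBUG level.
logging.basicConfig(level=logging.DEBUG)

# Assumption: impyla's internal loggers live under the "impala" namespace
# (e.g. "impala.hiveserver2"); raising that logger to DEBUG surfaces the
# HiveServer2/Thrift round-trips made while a query runs.
logging.getLogger("impala").setLevel(logging.DEBUG)
```

Run this before calling connect()/execute(); if the logger names differ in your impyla version, the root-level basicConfig call alone should still capture them.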
I ran into the same problem. After browsing thousands of pages on Google, I realized that the error message impyla returns is unhelpful, so I started checking the Hive logs instead. I searched the hive-site.xml file for the keyword "log" and, luckily, found this parameter:
hive.server2.logging.operation.log.location
So I went into that directory and found the log messages generated by my impyla code. The error message was:
ERROR : Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=APP, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
At that point everything was clear: the username I was passing in my impyla code had no permission to run the query. I switched to a different username and all was well again :)
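If you want to script this diagnosis, the rejected user and the HDFS path can be pulled straight out of the AccessControlException line. A small sketch; the parse_access_error helper is hypothetical, written against the log format shown above:

```python
import re

def parse_access_error(line):
    """Extract user, access type, and inode from an HDFS
    AccessControlException message (hypothetical helper)."""
    m = re.search(r'user=(\w+), access=(\w+), inode="([^"]+)"', line)
    if m is None:
        return None
    return {"user": m.group(1), "access": m.group(2), "inode": m.group(3)}

# The exact error line from the operation log above.
log_line = ('ERROR : Job Submission failed with exception '
            "'org.apache.hadoop.security.AccessControlException"
            '(Permission denied: user=APP, access=WRITE, '
            'inode="/user":hdfs:supergroup:drwxr-xr-x')
print(parse_access_error(log_line))
# → {'user': 'APP', 'access': 'WRITE', 'inode': '/user'}
```

Once you know which user is being rejected, the fix from this answer amounts to passing a user that has write access on HDFS via the user argument of impala.dbapi.connect (or granting that user a home directory under /user).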