Hive archive partition(dynamic) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
I am trying to archive some old data from a table using an ALTER TABLE TABLE_NAME ARCHIVE PARTITION(part_col) query.
Hadoop version - 2.7.3
Hive version - 1.2.1
The table structure is as follows:
hive> desc clicks_fact;
OK
time timestamp
user_id varchar(32)
advertiser_id int
buy_id int
ad_id int
creative_id int
creative_version smallint
creative_size varchar(10)
site_id int
page_id int
keyword varchar(48)
country_id varchar(10)
state varchar(10)
area_code int
browser_id smallint
browser_version varchar(10)
os_id int
zip varchar(10)
site_data varchar(20)
sv1 varchar(10)
day date
file_date varchar(8)
# Partition Information
# col_name data_type comment
day date
file_date varchar(8)
Time taken: 0.112 seconds, Fetched: 28 row(s)
Now I am trying to archive the data for a specific partition, as shown below:
hive> ALTER TABLE clicks_fact ARCHIVE partition(day='2017-06-30', file_date='20170629');
intermediate.archived is hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629_INTERMEDIATE_ARCHIVED
intermediate.original is hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629_INTERMEDIATE_ORIGINAL
Creating data.har for hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629
in hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629/.hive-staging_hive_2017-10-12_22-03-17_129_6395228918576649008-1/-ext-10000/partlevel
Please wait... (this may take a while)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/tools/HadoopArchives
I can create a HAR directly in Hadoop using
$ hadoop archive -archiveName archive.har -p /mydir_* /
So this does not look like a dependency problem inside Hadoop itself.
Any help would be appreciated.
Logs:
2017-10-23 22:26:39,210 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,211 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,211 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,213 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,213 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: alter table clicks_fact archive partition(day='2017-06-30', file_date='20170629')
2017-10-23 22:26:39,223 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(209)) - Parse Completed
2017-10-23 22:26:39,224 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=parse start=1508777799213 end=1508777799224 duration=11 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,225 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,234 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(746)) - 0: get_table : db=scheme tbl=clicks_fact
2017-10-23 22:26:39,235 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(371)) - ugi=sridhar ip=unknown-ip-addr cmd=get_table : db=scheme tbl=clicks_fact
2017-10-23 22:26:39,410 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(746)) - 0: get_partitions_ps_with_auth : db=scheme tbl=clicks_fact[2017-06-30,20170629]
2017-10-23 22:26:39,410 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(371)) - ugi=sridhar ip=unknown-ip-addr cmd=get_partitions_ps_with_auth : db=scheme tbl=clicks_fact[2017-06-30,20170629]
2017-10-23 22:26:39,463 INFO [main]: ql.Driver (Driver.java:compile(436)) - Semantic Analysis Completed
2017-10-23 22:26:39,463 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=semanticAnalyze start=1508777799225 end=1508777799463 duration=238 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,463 INFO [main]: ql.Driver (Driver.java:getSchema(240)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
2017-10-23 22:26:39,463 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=compile start=1508777799211 end=1508777799463 duration=252 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,463 INFO [main]: ql.Driver (Driver.java:checkConcurrency(160)) - Concurrency mode is disabled, not creating a lock manager
2017-10-23 22:26:39,464 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,464 INFO [main]: ql.Driver (Driver.java:execute(1328)) - Starting command(queryId=sridhar_20171023222639_d1453a90-0340-411c-b131-77d112862acc): alter table clicks_fact archive partition(day='2017-06-30', file_date='20170629')
2017-10-23 22:26:39,465 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=TimeToSubmit start=1508777799211 end=1508777799465 duration=254 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,465 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,465 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,465 INFO [main]: ql.Driver (Driver.java:launchTask(1651)) - Starting task [Stage-0:DDL] in serial mode
2017-10-23 22:26:39,465 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(746)) - 0: get_table : db=scheme tbl=clicks_fact
2017-10-23 22:26:39,466 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(371)) - ugi=sridhar ip=unknown-ip-addr cmd=get_table : db=scheme tbl=clicks_fact
2017-10-23 22:26:39,489 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(746)) - 0: get_partitions_ps_with_auth : db=scheme tbl=clicks_fact[2017-06-30,20170629]
2017-10-23 22:26:39,489 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(371)) - ugi=sridhar ip=unknown-ip-addr cmd=get_partitions_ps_with_auth : db=scheme tbl=clicks_fact[2017-06-30,20170629]
2017-10-23 22:26:39,526 INFO [main]: exec.Task (SessionState.java:printInfo(951)) - intermediate.archived is hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629_INTERMEDIATE_ARCHIVED
2017-10-23 22:26:39,526 INFO [main]: exec.Task (SessionState.java:printInfo(951)) - intermediate.original is hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629_INTERMEDIATE_ORIGINAL
2017-10-23 22:26:39,542 INFO [main]: common.FileUtils (FileUtils.java:mkdir(501)) - Creating directory if it doesn't exist: hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629/.hive-staging_hive_2017-10-23_22-26-39_212_2574575409261622278-1
2017-10-23 22:26:39,616 INFO [main]: exec.Task (SessionState.java:printInfo(951)) - Creating data.har for hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629
2017-10-23 22:26:39,616 INFO [main]: exec.Task (SessionState.java:printInfo(951)) - in hdfs://localhost:54310/user/hive/warehouse/scheme.db/clicks_fact/day=2017-06-30/file_date=20170629/.hive-staging_hive_2017-10-23_22-26-39_212_2574575409261622278-1/-ext-10000/partlevel
2017-10-23 22:26:39,616 INFO [main]: exec.Task (SessionState.java:printInfo(951)) - Please wait... (this may take a while)
2017-10-23 22:26:39,645 INFO [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2017-10-23 22:26:39,646 INFO [main]: jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2017-10-23 22:26:39,656 ERROR [main]: exec.DDLTask (DDLTask.java:failed(520)) - java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(Lorg/apache/hadoop/mapred/JobClient;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/fs/Path;
at org.apache.hadoop.tools.HadoopArchives.archive(HadoopArchives.java:476)
at org.apache.hadoop.tools.HadoopArchives.run(HadoopArchives.java:862)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.hive.ql.exec.DDLTask.archive(DDLTask.java:1359)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:360)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
2017-10-23 22:26:39,656 ERROR [main]: ql.Driver (SessionState.java:printError(960)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(Lorg/apache/hadoop/mapred/JobClient;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/fs/Path;
2017-10-23 22:26:39,656 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=Driver.execute start=1508777799464 end=1508777799656 duration=192 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,656 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,656 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=releaseLocks start=1508777799656 end=1508777799656 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,673 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2017-10-23 22:26:39,673 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=releaseLocks start=1508777799673 end=1508777799673 duration=0 from=org.apache.hadoop.hive.ql.Driver>
A few things to check:
set hive.archive.enabled=true;
set hive.metastore.schema.verification=true;
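Hive now records the schema version in the metastore database and verifies that the metastore schema version is compatible with the Hive binaries that are going to access the metastore. Note that the Hive properties that implicitly create or alter the existing schema are disabled by default, so Hive will not attempt to change the metastore schema implicitly. When you run a Hive query against an old schema, it will fail to access the metastore.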
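As a rough sketch of the first check, the archive setting can be enabled in the same session as the ALTER TABLE statement. The helper name archive_partition_sql is hypothetical; it only prints the HiveQL text and does not call Hive. Only hive.archive.enabled is set here, since the schema verification property concerns the metastore rather than the archive step.

```shell
# Print the HiveQL needed to archive one partition with archiving enabled.
# archive_partition_sql is a hypothetical helper name, not part of Hive.
archive_partition_sql() {
  local table="$1"      # table name, e.g. clicks_fact
  local partspec="$2"   # partition spec, e.g. day='2017-06-30', file_date='20170629'
  printf "set hive.archive.enabled=true;\n"
  printf "ALTER TABLE %s ARCHIVE PARTITION(%s);\n" "$table" "$partspec"
}

# Usage (not run here; requires a working Hive installation):
#   hive -e "$(archive_partition_sql clicks_fact "day='2017-06-30', file_date='20170629'")"
```

Passing both statements through one hive -e call keeps the setting and the DDL in a single session.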
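It looks like a dependency was the problem.
I had first added hadoop-tools.jar as a dependency (inside hive_home/lib), and that is what caused the problem. It was resolved after I added hadoop-archives.jar as a dependency instead of hadoop-tools.jar.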
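To spot this kind of mix-up, a small helper can list which of the two jars are present in a lib directory. This is only a sketch: the function name list_archive_jars is hypothetical, it looks at file names only, and the assumption is that the jars follow the usual hadoop-tools*.jar and hadoop-archives*.jar naming.

```shell
# List hadoop-tools and hadoop-archives jars found in a directory,
# for example the lib directory of the Hive installation.
# list_archive_jars is a hypothetical helper; it checks file names only.
list_archive_jars() {
  local dir="$1"
  local jar
  for jar in "$dir"/hadoop-tools*.jar "$dir"/hadoop-archives*.jar; do
    # An unmatched glob stays a literal pattern, so confirm the file exists.
    [ -e "$jar" ] && basename "$jar"
  done
  return 0
}

# Usage (assumes HIVE_HOME points at the Hive installation):
#   list_archive_jars "$HIVE_HOME/lib"
```

If a hadoop-tools jar shows up in the output, it is a candidate for the conflict described above.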
Thanks to @Joby and @Max08 for the help.
I used hive --auxpath $HADOOP_HOME/share/hadoop/tools/lib/hadoop-archives-2.7.2.jar
and it worked.
Hive uses --auxpath to specify auxiliary jars, which are loaded when a new session is created. Without --auxpath, this jar is not loaded by default.
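Because the jar version depends on the installed Hadoop release, one option is to look the jar up instead of hard-coding the file name. The following is a minimal sketch under that assumption; find_archives_jar is a hypothetical helper, and the directory layout is taken from the path quoted above.

```shell
# Print the path of the first hadoop-archives jar under a Hadoop install.
# find_archives_jar is a hypothetical helper name.
find_archives_jar() {
  local hadoop_home="$1"
  local jar
  for jar in "$hadoop_home"/share/hadoop/tools/lib/hadoop-archives-*.jar; do
    if [ -e "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  # No matching jar was found.
  return 1
}

# Usage (not run here; requires Hive and Hadoop to be installed):
#   hive --auxpath "$(find_archives_jar "$HADOOP_HOME")"
```

The function returns a non-zero status when no jar is found, so a calling script can stop before starting Hive with an empty --auxpath.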