Hive shell throws FileNotFoundException while executing queries, despite adding jar files using "ADD JAR"
1) I added the SerDe jar file using "ADD JAR /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar;"
2) Created the table
3) The table was created successfully
4) But when I execute any select query, it throws a FileNotFoundException
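For reference, step 2 would look something like the sketch below. The column list is hypothetical; com.cloudera.hive.serde.JSONSerDe is the SerDe class shipped in hive-serdes-1.0-SNAPSHOT.jar from Cloudera's Twitter-analysis example, but the actual DDL may differ:

hive> CREATE TABLE tab_tweets (
        id BIGINT,      -- tweet id (hypothetical column)
        text STRING     -- tweet body (hypothetical column)
      )
      ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe';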
hive> select count(*) from tab_tweets;
Query ID = hduser_20150604145353_51b4def4-11fb-4638-acac-77301c1c1806
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.io.FileNotFoundException: File does not exist: hdfs://node1:9000/home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapred.JobClient.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:428)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://node1:9000/home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Check that the jar exists at /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar
Method 1: Copy the hive-serdes-1.0-SNAPSHOT.jar file from the local file system to HDFS.
hadoop fs -mkdir /home/hduser/softwares/hive/
hadoop fs -put /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar /home/hduser/softwares/hive/
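To confirm the copy succeeded, you can list the file in HDFS (a quick sanity check using the same path as above):

hadoop fs -ls /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar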
Note: Use hdfs dfs instead of hadoop fs if you are on a recent Hadoop version.
Method 2: Change the value of hive.aux.jars.path in hive-site.xml to:
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar</value>
</property>
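If you need to register more than one jar this way, hive.aux.jars.path accepts a comma-separated list of paths. For example (the second jar name is purely illustrative):

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar,file:///home/hduser/softwares/hive/another-serde.jar</value>
</property>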
Method 3: Add hive-serdes-1.0-SNAPSHOT.jar to the Hadoop classpath, i.e., add this line in hadoop-env.sh:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar
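After editing hadoop-env.sh, you can verify the jar made it onto the classpath (the grep pattern is just for illustration):

hadoop classpath | tr ':' '\n' | grep serdes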
NOTE: The paths above assume you have installed Hive in /home/hduser/softwares/hive. If Hive is installed elsewhere, change /home/hduser/softwares/hive to point to your Hive installation folder.
Note: There is no need to copy hive-serdes-1.0-SNAPSHOT.jar to HDFS; keep it on the local FS. At query execution time, Hive will take care of making it available on all nodes via the Distributed Cache.
For more details, refer to this link: official link
FYI - refer to Hive Resources:
Once a resource is added to a session, Hive queries can refer to it by its name (in map/reduce/transform clauses), and the resource is available locally at execution time across the entire Hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added resources to all machines in the cluster at query-execution time.
You can add extra jars in several ways:
- In the current Hive session:
hive> add jar /local/fs/path/to/your/file.jar;
hive> list jars;  -- to check
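Applied to the question's setup, the session would look like this (same jar and table names as above):

hive> add jar /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar;
hive> list jars;
hive> select count(*) from tab_tweets;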
- On the node where you run Hive, add it in .hiverc (which works like .bashrc):

cd $HOME
# create the file .hiverc if it does not exist
cat $HOME/.hiverc
add jar /local/fs/path/to/your/file.jar   -- add this line
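A one-liner that does the same thing (assuming $HOME/.hiverc; the Hive CLI runs the commands in this file at startup):

echo 'add jar /local/fs/path/to/your/file.jar;' >> $HOME/.hiverc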
- Add the jar file to hive-site.xml:

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/user/path/to/your/hive-serdes-1.0-SNAPSHOT.jar</value>
</property>
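Keep in mind that hive.aux.jars.path is read when the Hive shell starts, so restart your Hive session after editing hive-site.xml for the change to take effect.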