Hive shell throws FileNotFoundException while executing queries, despite adding jar files using "ADD JAR"
1) I added the SerDe jar file using "ADD JAR /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar;"
2) Created the table
3) The table was created successfully
4) But when I execute any select query, it throws a FileNotFoundException
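For reference, step 2 would look something like the sketch below. The column list is hypothetical; com.cloudera.hive.serde.JSONSerDe is the SerDe class shipped in hive-serdes-1.0-SNAPSHOT.jar from Cloudera's Twitter-analysis example, but the actual DDL may differ:

hive> CREATE TABLE tab_tweets (
        id BIGINT,      -- tweet id (hypothetical column)
        text STRING     -- tweet body (hypothetical column)
      )
      ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe';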
hive> select count(*) from tab_tweets;
Query ID = hduser_20150604145353_51b4def4-11fb-4638-acac-77301c1c1806
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.io.FileNotFoundException: File does not exist: hdfs://node1:9000/home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapred.JobClient.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:428)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://node1:9000/home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Check that the jar exists at /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar
Method 1: Copy the hive-serdes-1.0-SNAPSHOT.jar file from the local file system to HDFS.
hadoop fs -mkdir /home/hduser/softwares/hive/
hadoop fs -put /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar /home/hduser/softwares/hive/
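To confirm the copy succeeded, you can list the file in HDFS (a quick sanity check using the same path as above):

hadoop fs -ls /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar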
Note: Use hdfs dfs instead of hadoop fs if you are on a recent Hadoop version.
Method 2: Change the value of hive.aux.jars.path in hive-site.xml to:
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar</value>
</property>
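If you need to register more than one jar this way, hive.aux.jars.path accepts a comma-separated list of paths. For example (the second jar name is purely illustrative):

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar,file:///home/hduser/softwares/hive/another-serde.jar</value>
</property>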
Method 3: Add hive-serdes-1.0-SNAPSHOT.jar to the Hadoop classpath, i.e., add this line in hadoop-env.sh:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar
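After editing hadoop-env.sh, you can verify the jar made it onto the classpath (the grep pattern is just for illustration):

hadoop classpath | tr ':' '\n' | grep serdes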
NOTE: The paths above assume you have installed Hive in /home/hduser/softwares/hive. If Hive is installed elsewhere, change /home/hduser/softwares/hive to point to your Hive installation folder.
Note: There is no need to copy hive-serdes-1.0-SNAPSHOT.jar to HDFS; keep it on the local FS. At query execution time, Hive will take care of making it available on all nodes via the Distributed Cache.
For more details, refer to this link: official link
FYI - refer to Hive Resources:
Once a resource is added to a session, Hive queries can refer to it by its name (in map/reduce/transform clauses), and the resource is available locally at execution time across the entire Hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added resources to all machines in the cluster at query-execution time.
You can add extra jars in several ways:
- In the current Hive session:
hive> add jar /local/fs/path/to/your/file.jar;
hive> list jars;  -- to check
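Applied to the question's setup, the session would look like this (same jar and table names as above):

hive> add jar /home/hduser/softwares/hive/hive-serdes-1.0-SNAPSHOT.jar;
hive> list jars;
hive> select count(*) from tab_tweets;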
- On the node where you run Hive, add it in .hiverc (which works like .bashrc):

cd $HOME
# create the file .hiverc if it does not exist
cat $HOME/.hiverc
add jar /local/fs/path/to/your/file.jar   -- add this line
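A one-liner that does the same thing (assuming $HOME/.hiverc; the Hive CLI runs the commands in this file at startup):

echo 'add jar /local/fs/path/to/your/file.jar;' >> $HOME/.hiverc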
- Add the jar file to hive-site.xml:

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/user/path/to/your/hive-serdes-1.0-SNAPSHOT.jar</value>
</property>
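Keep in mind that hive.aux.jars.path is read when the Hive shell starts, so restart your Hive session after editing hive-site.xml for the change to take effect.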