Error when selecting a specific field in a Hive query
I have an Orion Context Broker connected to Cosmos through Cygnus.
It works fine: I send new elements to the Context Broker, Cygnus forwards them to Cosmos, and they are saved in files.
The problem appears when I try to run some searches.
I start Hive and see that tables have been created for the files Cosmos produced, so I launch some queries.
Simple ones work fine:
select * from Table_name;
Hive does not launch any MapReduce job for these.
But as soon as I try to filter, join, count, or select specific fields, this is what happens:
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = JOB_NAME, Tracking URL = JOB_DETAILS_URL
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -kill JOB_NAME
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2015-07-08 14:35:12,723 Stage-1 map = 0%, reduce = 0%
2015-07-08 14:35:38,943 Stage-1 map = 100%, reduce = 100%
Ended Job = JOB_NAME with errors
Error during job, obtaining debugging information...
Examining task ID: TASK_NAME (and more) from job JOB_NAME
Task with the most failures(4):
-----
Task ID:
task_201409031055_6337_m_000000
URL: TASK_DETAIL_URL
-----
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
I have noticed that the files created by Cygnus differ from the others: in the Cygnus case they must be deserialized with a jar.
So my doubt is whether I have to apply some MapReduce method myself in those cases, or whether there is already a generic way to do this.
Before executing any Hive statement, run:
hive> add jar /usr/local/hive-0.9.0-shark-0.8.0-bin/lib/json-serde-1.1.9.3-SNAPSHOT.jar;
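Once the jar is registered, queries that launch MapReduce jobs should succeed within the same session. A minimal sketch of such a session (the table and column names below are hypothetical, not taken from the question):

```
hive> add jar /usr/local/hive-0.9.0-shark-0.8.0-bin/lib/json-serde-1.1.9.3-SNAPSHOT.jar;
hive> select recvtime from cygnus_table limit 10;
```

Note that "add jar" is session-scoped: it must be repeated each time a new Hive CLI or JDBC session is opened.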
If you are using Hive through JDBC, execute it like any other statement:
Connection con = ... // obtain a JDBC connection to the Hive server
Statement stmt = con.createStatement();
// "add jar" returns no result set, so use execute() rather than executeQuery()
stmt.execute("add jar /usr/local/hive-0.9.0-shark-0.8.0-bin/lib/json-serde-1.1.9.3-SNAPSHOT.jar");
stmt.close();
stmt = con.createStatement();
ResultSet rs = stmt.executeQuery("select ...");