Impala query cannot retrieve result with NullPointerException
I have the following query, which I run on Hive/Impala:
select count(p.id) as tweet_count, p.author as author,p.profile_image_url as profile_image_url,p.screen_name as screen_name,
concat_ws('/',min(p.postday),min(p.postmonth),min(p.postyear) ) as creation_date,p.message message,af.followerid as follower
from post p
inner join author_follower af on af.id like if(p.author= null, '', concat(p.author,'%'))
where p.hashtaglist like 'hashtagtobeused'
group by author,profile_image_url,screen_name,message,follower
ORDER BY cast(min(postyear) as int),cast(min(postmonth) as int),cast(min(postday) as int),cast(min(posthour) as int) ASC;
But for some reason I get the following error instead of a result:
Your query has the following error(s):
Bad status for request 3304: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)
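I have checked the query and cannot find anything wrong with it. Can anyone point out where the problem is? Why do I get this error instead of a result set?
Consider reformatting the query. In some cases, when SQL parsing itself fails because of something as simple as whitespace, Impala crashes with a SEGV. If you are running Cloudera, you will find the logs under /run/cloudera-scm-agent/process on the node where the query ran.
We resolved these problems by paying attention to SQL formatting (which is good practice anyway, since it makes query errors easier to spot), for example: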
SELECT
    COUNT(p.id) AS tweet_count,
    p.author AS author,
    p.profile_image_url AS profile_image_url,
    p.screen_name AS screen_name,
    concat_ws('/', MIN(p.postday), MIN(p.postmonth), MIN(p.postyear) ) AS creation_date,
    p.message AS MESSAGE,
    af.followerid AS follower
FROM
    post p
INNER JOIN
    author_follower af
ON
    af.id LIKE IF(p.author = NULL, '', concat(p.author, '%'))
WHERE
    p.hashtaglist LIKE 'hashtagtobeused'
GROUP BY
    author,
    profile_image_url,
    screen_name,
    MESSAGE,
    follower
ORDER BY
    CAST(MIN(postyear) AS INT),
    CAST(MIN(postmonth) AS INT),
    CAST(MIN(postday) AS INT),
    CAST(MIN(posthour) AS INT) ASC;
(By the way, I use dbVisualizer to validate and reformat query syntax. It is a good tool and worth considering.)
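Separately from formatting, one detail in the join condition may be worth a look. In SQL, a comparison such as p.author = NULL evaluates to NULL rather than TRUE, so the '' branch of the IF is never taken. The sketch below is only a guess at the intended condition, written with IS NULL. It is trimmed to the join itself, and it is not confirmed to be the cause of the error above.

```sql
-- Sketch only: the same join condition, testing for NULL with IS NULL.
-- Assumption: the intent was to fall back to '' when p.author is NULL.
SELECT
    af.followerid AS follower
FROM
    post p
INNER JOIN
    author_follower af
ON
    af.id LIKE IF(p.author IS NULL, '', concat(p.author, '%'));
```

With this form, rows whose author is NULL are compared against the empty pattern '', which only matches an empty af.id, instead of silently falling through to the concat branch.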