Impala query cannot retrieve result with NullPointerException

I have the following query, which I run on Hive/Impala:

select count(p.id) as tweet_count, p.author as author,p.profile_image_url as profile_image_url,p.screen_name as screen_name,
concat_ws('/',min(p.postday),min(p.postmonth),min(p.postyear) ) as creation_date,p.message message,af.followerid as follower 
from post p 
inner join author_follower af on af.id like if(p.author= null, '', concat(p.author,'%'))
where p.hashtaglist like 'hashtagtobeused' 
group by author,profile_image_url,screen_name,message,follower
ORDER BY cast(min(postyear) as int),cast(min(postmonth) as int),cast(min(postday) as int),cast(min(posthour) as int) ASC;

But for some reason I get the following error instead of a result:

Your query has the following error(s):

Bad status for request 3304: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)

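I have checked the query and cannot find anything wrong with it. Can anyone point out where the problem is? Why do I get this error rather than a result set?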

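Consider reformatting the query: in some cases Impala crashes with a SEGV when SQL parsing itself fails over something as simple as whitespace. If you are running Cloudera, you will find the logs under /run/cloudera-scm-agent/process on the node where the query was run.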

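We resolved these issues by paying attention to SQL formatting (which is also good practice, since it makes errors in the query easier to spot), for example: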

SELECT
    COUNT(p.id)                                                     AS tweet_count,
    p.author                                                        AS author,
    p.profile_image_url                                             AS profile_image_url,
    p.screen_name                                                   AS screen_name,
    concat_ws('/', MIN(p.postday), MIN(p.postmonth), MIN(p.postyear) ) AS creation_date,
    p.message                                                       AS MESSAGE,
    af.followerid                                                   AS follower
FROM
    post p
INNER JOIN
    author_follower af
ON
    af.id LIKE IF(p.author = NULL, '', concat(p.author, '%'))
WHERE
    p.hashtaglist LIKE 'hashtagtobeused'
GROUP BY
    author,
    profile_image_url,
    screen_name,
    MESSAGE,
    follower
ORDER BY
    CAST(MIN(postyear) AS INT),
    CAST(MIN(postmonth) AS INT),
    CAST(MIN(postday) AS INT),
    CAST(MIN(posthour) AS INT) ASC;
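
Separately from formatting, note that the join condition above tests `p.author = NULL`. In SQL, comparing a value to NULL with `=` yields NULL rather than true, so the IF never takes its first branch. Below is a minimal sketch of the same join written with `IS NULL`. It assumes the intent was to fall back to an empty LIKE pattern when the author is missing. That intent is a guess, and this is not confirmed to be the cause of the error shown in the question.

```sql
-- Sketch only: the join from the query above, reduced to two columns,
-- with the NULL test written as IS NULL instead of "= NULL".
-- Assumption: an empty LIKE pattern is wanted when p.author is missing.
SELECT
    p.author      AS author,
    af.followerid AS follower
FROM
    post p
INNER JOIN
    author_follower af
ON
    af.id LIKE IF(p.author IS NULL, '', concat(p.author, '%'));
```

With an empty pattern, LIKE matches only rows whose `af.id` is itself an empty string, so it is worth checking whether rows with a missing author should be joined at all.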

(By the way, I use dbVisualizer to validate and reformat query syntax; it is a good tool and worth considering.)