Impala returns NULL values after an 'is not null' condition

I am trying to print the top 5 sports by duration. I run the following query:

with t1
AS
(
 select cast(duration as int) as duration_int,race,lap,sport
 from db.table1
 where exchange_code in ("tennis", "golf", "football")
 and table1_date = 20201010
 and duration is not null
 and race is not null 
 and lap is not null 
 and sport is not null
)
select sum(duration_int) ,race,lap,sport
from t1
group by race,lap,sport
order by sum(duration_int) desc
limit 5;

The result is as follows:

 sum(duration_int) race   lap      sport
 [null]            first  second   golf
 408439363026         
 65886284          fourth third    football
 33687102          fifth  first    american-football    
 22642805          tenth  fifth    english-football 

As you can see, I still get null values even after the IS NOT NULL conditions.

As the Impala documentation explains, CAST() does not return an error when it cannot convert a value; it returns NULL:

If the expression value is of a type that cannot be converted to the target type, the result is NULL.

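So the condition "duration is not null" is not sufficient: it tests the original value, not the result of the cast, so a non-null duration that cannot be converted to an integer still becomes NULL. Instead, filter on: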

cast(duration as int) is not null
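
For reference, here is a sketch of how the query from the question might look with that filter in place. It assumes the cast column is aliased as duration_int, which the outer query in the question already refers to; everything else is copied from the question and has not been run against the real table.

```sql
-- Sketch only: same query as in the question, but filtering on the cast result.
with t1 as (
  -- duration_int is the alias the outer query expects
  select cast(duration as int) as duration_int, race, lap, sport
  from db.table1
  where exchange_code in ("tennis", "golf", "football")
    and table1_date = 20201010
    -- drops rows whose duration is NULL or cannot be converted to INT
    and cast(duration as int) is not null
    and race is not null
    and lap is not null
    and sport is not null
)
select sum(duration_int), race, lap, sport
from t1
group by race, lap, sport
order by sum(duration_int) desc
limit 5;
```

Because the cast of a NULL is itself NULL, the single condition on the cast covers both the original "duration is not null" check and the failed-conversion case.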

I am not sure why race, lap, and sport would have NULL values. I suspect that those values might actually be the string '[null]'.
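
One way to check that suspicion is to look for placeholder strings directly. The query below is only a sketch: it assumes race, lap, and sport are STRING columns, and the empty string is an extra guess on my part (the blank row in the output hints at it), not something the question confirms.

```sql
-- Sketch only: count rows whose race/lap/sport hold a placeholder string
-- instead of a real NULL. Assumes the three columns are STRING.
select race, lap, sport, count(*) as row_count
from db.table1
where table1_date = 20201010
  and (race  in ('', '[null]')
    or lap   in ('', '[null]')
    or sport in ('', '[null]'))
group by race, lap, sport;
```

If this returns rows, those values pass "is not null" because they are ordinary strings, and they would need to be excluded explicitly.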