Getting an error while trying to find LAST_VALUE() in Impala
I am trying to find the last blnc value for each id, but the query throws an error:
AnalysisException: select list expression not produced by aggregation
output (missing from GROUP BY clause?): last_value(blnc) OVER
(PARTITION BY id ORDER BY id date ASC ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) lasted.
SELECT id, number, type,
LAST_VALUE(blnc) OVER (PARTITION BY id ORDER BY date rows between unbounded preceding and unbounded following ) AS lasted ,
to_timestamp(MAX(date),'yyyyMMdd') as end_date,
concat(substr(date,1,6),"01") as start_date,
substr(date,1,6) as id_month
FROM table
GROUP BY id,number,type,concat(substr(date,1,6),"01"),substr(date,1,6)
I also tried putting the whole LAST_VALUE() expression in the GROUP BY, but that raised a different error.
The problem is that your expression:
LAST_VALUE(blnc) OVER (PARTITION BY id
ORDER BY date
rows between unbounded preceding and unbounded following
) AS lasted ,
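is evaluated after the aggregation. So only expressions that are meaningful after the aggregation are valid, and at that point there is no bare date or blnc column left. You can fix this with aggregation functions: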
LAST_VALUE(MAX(blnc)) OVER (PARTITION BY id
ORDER BY MAX(date)
rows between unbounded preceding and unbounded following
) AS lasted ,
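Although this answers your question and fixes the syntax error, it probably doesn't do anything useful. I think you want conditional aggregation. You don't explain the logic you want or provide sample data, but the idea is: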
SELECT id, number, type,
to_timestamp(MAX(date), 'yyyyMMdd') as end_date,
concat(substr(date,1,6),"01") as start_date,
substr(date, 1, 6) as id_month,
MAX(CASE WHEN seqnum = 1 THEN blnc END) as lasted
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY id, number, type, concat(substr(date, 1, 6), '01'), substr(date,1,6)
ORDER BY date DESC
) as seqnum
FROM table t
) t
GROUP BY id, number, type, concat(substr(date, 1, 6), '01'), substr(date,1,6)
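If what you actually want is the last blnc per id over all months, rather than per group, another option is to compute the window function in a subquery before the aggregation and carry the result through the GROUP BY. This is only a sketch under that assumption, reusing the table and column names from the question. If several rows share the same latest date for an id, which one is picked is not deterministic.

```sql
-- Sketch: evaluate LAST_VALUE() on the raw rows first, then aggregate.
SELECT id, number, type,
       MAX(lasted) AS lasted,  -- lasted is constant within each id, so MAX() just returns it
       to_timestamp(MAX(date), 'yyyyMMdd') AS end_date,
       concat(substr(date, 1, 6), '01') AS start_date,
       substr(date, 1, 6) AS id_month
FROM (SELECT t.*,
             LAST_VALUE(blnc) OVER (PARTITION BY id
                                    ORDER BY date
                                    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                                   ) AS lasted
      FROM table t
     ) t
GROUP BY id, number, type, concat(substr(date, 1, 6), '01'), substr(date, 1, 6)
```

Because the window function runs inside the subquery, it sees the original date and blnc columns, and the outer query only has to aggregate a value that is already the same for every row of an id.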
Note: The string operations on the date look wrong. If the column is stored correctly, you should be using the built-in date/time functions.
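As a rough illustration of that note, the month boundaries could be derived with Impala's date/time functions instead of substr() and concat(). This sketch assumes date is a STRING in yyyyMMdd form, which is what the to_timestamp(MAX(date), 'yyyyMMdd') call in the question suggests. If the column is already a TIMESTAMP, the to_timestamp() calls can be dropped.

```sql
-- Assumption: date holds strings such as '20200315'.
SELECT id,
       -- first day of the month, returned as a TIMESTAMP
       trunc(to_timestamp(date, 'yyyyMMdd'), 'MM') AS start_date,
       -- month key formatted back to a string such as '202003'
       from_timestamp(to_timestamp(date, 'yyyyMMdd'), 'yyyyMM') AS id_month
FROM table
```

The same expressions would then replace the substr()-based ones in both the select list and the GROUP BY.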