1st Sept coming before 31st Aug in a Zeppelin bar chart ordered by date, how to fix please?

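In Zeppelin I have a simple bar chart of values (Y axis) against dates (X axis). It worked fine until the new month started (today), when it began placing "1 Sep" before "31 Aug". I am ordering by the date string, because that is the string I need displayed on the chart.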

Query:

%impala
SELECT FROM_TIMESTAMP(DATE_TRUNC('HOUR', concat(replace(my_timestamp,'"',''), "Z")), 'd MMM HH:mm') AS hours, COUNT(my_number) AS "number per hour"
FROM my_table
WHERE unix_timestamp(my_timestamp) > (unix_timestamp(now()) - 86400)
GROUP BY 1
ORDER BY 1 ASC
LIMIT 24;

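I realise the problem is caused by the alphanumeric comparison of the date strings. I thought I could fix it by adding a third column with the unix_timestamp() of the date and ordering by that, but this produces a grouping error: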

java.sql.SQLException: [Cloudera][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?): unix_timestamp(my_timestamp)

For this query:

%impala
SELECT FROM_TIMESTAMP(DATE_TRUNC('HOUR', concat(replace(my_timestamp,'"',''), "Z")), 'd MMM HH:mm') AS hours, COUNT(my_number) AS "number per hour", unix_timestamp(my_timestamp)
FROM my_table
WHERE unix_timestamp(my_timestamp) > (unix_timestamp(now()) - 86400)
GROUP BY 1
ORDER BY 3 ASC
LIMIT 24;

How can I fix this so that the chart comes out in the correct order?

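The third column fails because unix_timestamp(my_timestamp) is neither aggregated nor part of the GROUP BY. Instead, compute an additional column in yyyy-MM-dd HH:mm format (the same hourly granularity as hours, but in a sortable format), add it to the GROUP BY (before the hours column), and use it in the ORDER BY (instead of the hours column):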

SELECT FROM_TIMESTAMP(DATE_TRUNC('HOUR', concat(replace(my_timestamp,'"',''), "Z")), 'd MMM HH:mm') AS hours, 
       FROM_TIMESTAMP(DATE_TRUNC('HOUR', concat(replace(my_timestamp,'"',''), "Z")), 'yyyy-MM-dd HH:mm') as dt,
       COUNT(my_number) AS "number per hour"
FROM my_table
WHERE unix_timestamp(my_timestamp) --also it seems Z should be removed, etc 
      > (unix_timestamp(now()) - 86400)
GROUP BY dt, hours
ORDER BY dt
LIMIT 24;
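
To see why ordering by dt helps, here is a minimal sketch in Python rather than Impala. The sample hours, the year 2021, and the helper names label and sort_key are made up for illustration; label imitates the 'd MMM HH:mm' display string and sort_key imitates the 'yyyy-MM-dd HH:mm' column.

```python
from datetime import datetime

# Hypothetical hourly buckets spanning a month boundary (year chosen arbitrarily).
buckets = [
    datetime(2021, 8, 31, 22),
    datetime(2021, 8, 31, 23),
    datetime(2021, 9, 1, 0),
    datetime(2021, 9, 1, 1),
]

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def label(ts):
    # Display string, imitating the 'd MMM HH:mm' pattern (day without zero padding).
    return f"{ts.day} {MONTHS[ts.month - 1]} {ts:%H:%M}"

def sort_key(ts):
    # Sortable string, imitating 'yyyy-MM-dd HH:mm': string order equals time order.
    return f"{ts:%Y-%m-%d %H:%M}"

# Sorting by the display string compares characters, so '1 Sep' sorts before '31 Aug'.
labels_sorted_by_label = [label(t) for t in sorted(buckets, key=label)]

# Sorting by the zero-padded, year-first string keeps chronological order.
labels_sorted_by_key = [label(t) for t in sorted(buckets, key=sort_key)]

print(labels_sorted_by_label)
print(labels_sorted_by_key)
```

The first list starts with the September hours because the character '1' compares lower than '3'. The second list is chronological, which is the effect ORDER BY dt has in the query above while hours is still the string shown on the chart.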