Generate a date range between min and max dates in Athena (Presto SQL): sequence error
I am trying to generate a series of dates in Presto SQL (Athena) using unnest and sequence, similar to generate_series in Postgres.
My table looks like this:
job_name | run_date
A | '2021-08-21'
A | '2021-08-25'
B | '2021-08-07'
B | '2021-08-24'
SELECT d.job_name, d.run_date
FROM (
VALUES
('A', '2021-08-21'), ('A', '2021-08-25'),
('B', '2021-08-07'), ('B', '2021-08-24')
) d(job_name, run_date)
The output I am aiming for is:
job_name | run_date
A | 2021-08-21
A | 2021-08-22
A | 2021-08-23
A | 2021-08-24
A | 2021-08-25
B | 2021-08-07
B | 2021-08-08
B | 2021-08-09
B | 2021-08-10
B | 2021-08-11
B | 2021-08-12
B | 2021-08-13
B | 2021-08-14
B | 2021-08-15
B | 2021-08-16
B | 2021-08-17
B | 2021-08-18
B | 2021-08-19
B | 2021-08-20
B | 2021-08-21
B | 2021-08-22
B | 2021-08-23
B | 2021-08-24
I tried the following query to achieve this, but I get an error when trying to unnest my date sequence:
SELECT t.job_name, d.dte
FROM (SELECT job_name
, min(run_date) as mind
, max(run_date) as maxd
, SEQUENCE(min(run_date), max(run_date)) as date_arr
FROM job_log_table t
GROUP BY job_name
) jd
CROSS JOIN
UNNEST(jd.date_arr) d(dte)
LEFT JOIN job_log_table t
ON t.job_name = jd.job_name
AND t.latest_date = d.dte;
This produces the following error:
[HY000][100071] [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. [ErrorCategory:USER_ERROR, ErrorCode:SYNTAX_ERROR], Detail:SYNTAX_ERROR: line 5:14: Unexpected parameters (date, date) for function sequence. Expected: sequence(bigint, bigint, bigint) , sequence(bigint, bigint) , sequence(timestamp, timestamp, interval day to second) , sequence(timestamp, timestamp, interval year to month)
Is this a limitation of Athena's flavour of Presto SQL, or have I made a schoolboy error somewhere?
You need to provide an interval to generate a date sequence (in this case interval '1' day):
WITH dataset AS (
SELECT *
FROM
( VALUES
('A', DATE '2021-08-21'), ('A', DATE '2021-08-25'),
('B', DATE '2021-08-07'), ('B', DATE '2021-08-24')
) AS d (job_name, run_date)
)
select job_name, sequence(min(run_date), max(run_date), interval '1' day) seq
from dataset
group by job_name
Output:
job_name | seq |
---|---|
A | [2021-08-21 00:00:00.000, 2021-08-22 00:00:00.000, 2021-08-23 00:00:00.000, 2021-08-24 00:00:00.000, 2021-08-25 00:00:00.000] |
B | [2021-08-07 00:00:00.000, 2021-08-08 00:00:00.000, 2021-08-09 00:00:00.000, 2021-08-10 00:00:00.000, 2021-08-11 00:00:00.000, 2021-08-12 00:00:00.000, 2021-08-13 00:00:00.000, 2021-08-14 00:00:00.000, 2021-08-15 00:00:00.000, 2021-08-16 00:00:00.000, 2021-08-17 00:00:00.000, 2021-08-18 00:00:00.000, 2021-08-19 00:00:00.000, 2021-08-20 00:00:00.000, 2021-08-21 00:00:00.000, 2021-08-22 00:00:00.000, 2021-08-23 00:00:00.000, 2021-08-24 00:00:00.000] |
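To get from these per-job arrays to the one-row-per-date output the question asks for, the sequence can be unnested with CROSS JOIN UNNEST. The following is only a sketch under a few assumptions: it reuses the inline dataset CTE from above in place of the real table, the ranges CTE name and the u (dte) alias are made up for illustration, and the CAST to date is added because the output above shows the elements coming back as timestamps.

```sql
-- Sketch only: "ranges" and "u (dte)" are illustrative names, not from the original post.
WITH dataset AS (
    SELECT *
    FROM (
        VALUES
            ('A', DATE '2021-08-21'), ('A', DATE '2021-08-25'),
            ('B', DATE '2021-08-07'), ('B', DATE '2021-08-24')
    ) AS d (job_name, run_date)
),
ranges AS (
    -- One array of consecutive days per job, from its first to its last run date
    SELECT job_name,
           sequence(min(run_date), max(run_date), interval '1' day) AS date_arr
    FROM dataset
    GROUP BY job_name
)
SELECT r.job_name,
       CAST(u.dte AS date) AS run_date  -- assumption: elements may be timestamps, so cast back to date
FROM ranges r
CROSS JOIN UNNEST(r.date_arr) AS u (dte)  -- expand each array into one row per element
ORDER BY 1, 2;
```

With the sample data this should give 5 rows for job A and 18 rows for job B. If the result needs to be matched back to the original log rows, a LEFT JOIN on job_name and the date column could be added after the UNNEST, as in the query from the question.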