SQL Select only missing months

Notice that the months 2017-04-01, 2018-02-01, 2018-07-01 and 2019-01-01 are missing from the output. I want to display only those missing months. Does anyone know how to do this?

Query:

SELECT TO_DATE("Month", 'mon''yy') as dates FROM sample_sheet
group by dates
order by dates asc;

Output:

2017-01-01
2017-02-01
2017-03-01
2017-05-01
2017-06-01
2017-07-01
2017-08-01
2017-09-01
2017-10-01
2017-11-01
2017-12-01
2018-01-01
2018-03-01
2018-04-01
2018-05-01
2018-06-01
2018-08-01
2018-09-01
2018-10-01
2018-11-01
2018-12-01
2019-02-01
2019-03-01
2019-04-01

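I don't know Vertica, so I wrote a working proof of concept in Microsoft SQL Server and then tried to convert it to Vertica syntax using the online documentation.

It should look something like this: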

with 
months as (
   select 2017 as date_year, 1 as date_month, to_date('2017-01-01', 'YYYY-MM-DD') as first_date, to_date('2017-01-31', 'yyyy-mm-dd') as last_date
   union all
   select
      year(add_months(first_date, 1)) as date_year,
      month(add_months(first_date, 1)) as date_month, 
      add_months(first_date, 1) as first_date, 
      last_day(add_months(first_date, 1)) as last_date 
   from months
   where first_date < current_date
),
sample_dates (a_date) as (
   select to_date('2017-01-15', 'YYYY-MM-DD') union all
   select to_date('2017-01-22', 'YYYY-MM-DD') union all
   select to_date('2017-02-01', 'YYYY-MM-DD') union all
   select to_date('2017-04-15', 'YYYY-MM-DD') union all
   select to_date('2017-06-15', 'YYYY-MM-DD') 
)
select * 
from sample_dates right join months on sample_dates.a_date between first_date and last_date
where sample_dates.a_date is null

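months is a recursive, dynamically generated table that contains every month since 2017-01, together with the first and last day of each month. sample_dates is just a list of dates for testing the logic; you should replace it with your own table.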

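Once you have built that month calendar table, all you need to do is check your dates against it with an outer join to see which months have no date falling between their first_date and last_date columns.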

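You can build a TIMESERIES of all the dates between the first and last input dates (the coarsest granularity TIMESERIES supports is one day) and keep only the first day of each month. Then left join that series of month firsts to your input and check for NULLs on the input side of the join to find where the join fails: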

WITH
-- your input
input(mth1st) AS (
          SELECT DATE '2017-01-01'
UNION ALL SELECT DATE '2017-02-01'
UNION ALL SELECT DATE '2017-03-01'
UNION ALL SELECT DATE '2017-05-01'
UNION ALL SELECT DATE '2017-06-01'
UNION ALL SELECT DATE '2017-07-01'
UNION ALL SELECT DATE '2017-08-01'
UNION ALL SELECT DATE '2017-09-01'
UNION ALL SELECT DATE '2017-10-01'
UNION ALL SELECT DATE '2017-11-01'
UNION ALL SELECT DATE '2017-12-01'
UNION ALL SELECT DATE '2018-01-01'
UNION ALL SELECT DATE '2018-03-01'
UNION ALL SELECT DATE '2018-04-01'
UNION ALL SELECT DATE '2018-05-01'
UNION ALL SELECT DATE '2018-06-01'
UNION ALL SELECT DATE '2018-08-01'
UNION ALL SELECT DATE '2018-09-01'
UNION ALL SELECT DATE '2018-10-01'
UNION ALL SELECT DATE '2018-11-01'
UNION ALL SELECT DATE '2018-12-01'
UNION ALL SELECT DATE '2019-02-01'
UNION ALL SELECT DATE '2019-03-01'
UNION ALL SELECT DATE '2019-04-01'
)
,
-- need a series of month's firsts
-- TIMESERIES works for INTERVAL DAY TO SECOND
-- so build that timeseries, and filter out
-- the month's firsts
limits(mth1st) AS (
          SELECT MIN(mth1st) FROM input
UNION ALL SELECT MAX(mth1st) FROM input
)
,
alldates AS (
  SELECT dt::DATE FROM limits
  TIMESERIES dt AS '1 day' OVER(ORDER BY mth1st::TIMESTAMP)
)
,
allfirsts(mth1st) AS (
  SELECT dt FROM alldates WHERE DAY(dt)=1
)
SELECT
  allfirsts.mth1st
FROM allfirsts
LEFT JOIN input USING(mth1st)
WHERE input.mth1st IS NULL;
-- out    mth1st   
-- out ------------
-- out  2017-04-01
-- out  2018-02-01
-- out  2018-07-01
-- out  2019-01-01
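
To run the same idea against the table from the question instead of a hard-coded list, the input CTE could be derived from the data itself. This is only a sketch: it assumes the table is named sample_sheet and that its "Month" column holds strings that parse with the 'mon''yy' format used in the question's query, and it has not been run against the asker's data.

```sql
WITH
-- distinct month firsts taken from the real table
-- (table name, column name and format string are copied from the question)
input(mth1st) AS (
  SELECT DISTINCT TO_DATE("Month", 'mon''yy')
  FROM sample_sheet
)
,
-- earliest and latest month present in the data
limits(mth1st) AS (
          SELECT MIN(mth1st) FROM input
UNION ALL SELECT MAX(mth1st) FROM input
)
,
-- one row per day between the two limits
alldates AS (
  SELECT dt::DATE FROM limits
  TIMESERIES dt AS '1 day' OVER(ORDER BY mth1st::TIMESTAMP)
)
,
-- keep only the first day of each month
allfirsts(mth1st) AS (
  SELECT dt FROM alldates WHERE DAY(dt)=1
)
-- months in the calendar that have no match in the data
SELECT
  allfirsts.mth1st
FROM allfirsts
LEFT JOIN input USING(mth1st)
WHERE input.mth1st IS NULL
ORDER BY allfirsts.mth1st;
```

Only the input CTE differs from the query above; the rest is unchanged apart from the final ORDER BY, which is added so the missing months come back in chronological order.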