PostgreSQL count of each status per day
I have the following table:
Reservations
| id | status | created_at |
| 1 | Opened | 2019-11-12 11:46:11 |
| 1 | Completed | 2019-11-19 23:03:24 |
| 1 | Pending | 2019-11-15 12:04:13 |
| 2 | Opened | 2019-11-14 11:46:11 |
| 2 | Completed | 2019-11-20 23:03:24 |
| 2 | Pending | 2019-11-17 12:04:13 |
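I also have a table with every calendar day from 2019-11-01 to 2019-12-31.
I need to find how many times each status occurs on each calendar day within the time span listed above.
If the status is Opened on 2019-12-14 and Pending on 2019-12-17, I need to count it as Opened for each day from 2019-12-14 to 2019-12-17.
Ideal output: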
|2019-11-12 00:00:00 | Opened | 1 |
|2019-11-12 00:00:00 | Pending | 0 |
|2019-11-12 00:00:00 | Completed | 0 |
|2019-11-13 00:00:00 | Opened | 1 |
|2019-11-13 00:00:00 | Pending | 0 |
|2019-11-13 00:00:00 | Completed | 0 |
|2019-11-14 00:00:00 | Opened | 2 |
|2019-11-14 00:00:00 | Pending | 0 |
|2019-11-14 00:00:00 | Completed | 0 |
|2019-11-15 00:00:00 | Opened | 1 |
|2019-11-15 00:00:00 | Pending | 1 |
|2019-11-15 00:00:00 | Completed | 0 |
Any help is greatly appreciated.
Edit:
GMB's solution below is very close, but it leaves me with the following table:
| status | created_at | ended_at |
| Opened | 2019-11-12 11:46:11 | 2019-11-15 12:04:13 |
| Pending | 2019-11-15 12:04:13 | 2019-11-19 23:03:24 |
| Completed | 2019-11-19 23:03:24 | |
| Opened | 2019-11-14 11:46:11 | 2019-11-17 12:04:13 |
| Pending | 2019-11-17 12:04:13 | 2019-11-20 23:03:24 |
| Completed | 2019-11-20 23:03:24 | |
How do I fill the missing ended_at values with the end date of my range (2019-12-31)?
I would do it like this:
Use your table of every calendar day from 2019-11-01 to 2019-12-31, get the start and end of each status for each ID, and do a basic count by status and day.
with reservations_cte as
(
    select
        a.id, a.status, a.created_at::date as created_at,
        -- a status ends when the next status for the same id begins;
        -- the last status of each id has no successor, so ended_at is NULL
        lead(a.created_at::date) over (partition by a.id order by a.created_at)
            as ended_at
    from reservations a
)
select b.day, a.status, count(*)
from reservations_cte a
inner join calendar b
    on b.day >= a.created_at
    and b.day < a.ended_at
group by b.day, a.status
Consider the following query:
select
    c.dt,
    s.status,
    count(t.status)  -- counts matched rows only, so unmatched days yield 0
from
    calendar c
    -- one row per calendar day and distinct status
    cross join (select distinct status from reservations) s
    left join (
        select
            status,
            created_at,
            -- start of the next status for the same id
            lead(created_at) over(partition by id order by created_at) ended_at
        from reservations
    ) t
        on t.status = s.status
        and c.dt + interval '1 day' >= t.created_at
        and c.dt + interval '1 day' < t.ended_at
group by c.dt, s.status
order by c.dt, s.status
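This works by cross joining the calendar table with the list of distinct statuses available in the table, and then left joining the result with a subquery that uses lead() to recover the date of the next status associated with each record. If you have a table of statuses, you can use it in place of the subquery that selects the distinct statuses.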
dt | status | count
:--------------------- | :-------- | ----:
2019-11-12 00:00:00+00 | Completed | 0
2019-11-12 00:00:00+00 | Opened | 1
2019-11-12 00:00:00+00 | Pending | 0
2019-11-13 00:00:00+00 | Completed | 0
2019-11-13 00:00:00+00 | Opened | 1
2019-11-13 00:00:00+00 | Pending | 0
2019-11-14 00:00:00+00 | Completed | 0
2019-11-14 00:00:00+00 | Opened | 2
2019-11-14 00:00:00+00 | Pending | 0
2019-11-15 00:00:00+00 | Completed | 0
2019-11-15 00:00:00+00 | Opened | 1
2019-11-15 00:00:00+00 | Pending | 1
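The edit to the question asks how to fill the empty ended_at for the last status of each id. One possible approach, offered as a sketch and not as part of the original answer, is to wrap lead() in coalesce() with a sentinel end. The sentinel used here, PostgreSQL's special timestamp value 'infinity', is my assumption; the temp table only reproduces the sample rows from the question so the snippet can run on its own.

```sql
-- Sample data copied from the question.
create temp table reservations (id int, status text, created_at timestamp);

insert into reservations (id, status, created_at) values
    (1, 'Opened',    '2019-11-12 11:46:11'),
    (1, 'Pending',   '2019-11-15 12:04:13'),
    (1, 'Completed', '2019-11-19 23:03:24'),
    (2, 'Opened',    '2019-11-14 11:46:11'),
    (2, 'Pending',   '2019-11-17 12:04:13'),
    (2, 'Completed', '2019-11-20 23:03:24');

select
    id,
    status,
    created_at,
    -- Assumption: a status with no successor stays in effect indefinitely,
    -- so a NULL from lead() is replaced by the open-ended value 'infinity'.
    coalesce(
        lead(created_at) over (partition by id order by created_at),
        timestamp 'infinity'
    ) as ended_at
from reservations
order by id, created_at;
```

Substituting this expression for the bare lead() call in the subquery should let the Completed rows match every calendar day after they begin, and since the calendar stops at 2019-12-31 the counts stay within the requested range. A fixed end such as timestamp '2020-01-01' needs more care: with the comparison c.dt + interval '1 day' < t.ended_at it would not match 2019-12-31.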
Note that the DB Fiddle demonstrates how to use the handy Postgres function generate_series() to fill in the calendar table.
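For readers without access to the fiddle, here is a minimal sketch of filling a calendar table with generate_series(). The single column name dt is taken from the c.dt references in the query; the temp table and the plain timestamp type are assumptions (the result shown above suggests the fiddle used a type with a time zone).

```sql
-- Assumption: the calendar table has one timestamp column named dt.
create temp table calendar as
select dt
from generate_series(
    timestamp '2019-11-01',   -- first calendar day
    timestamp '2019-12-31',   -- last calendar day (inclusive)
    interval '1 day'          -- step
) as g(dt);

-- Sanity check: November has 30 days and December 31, so this reports 61.
select count(*) as days, min(dt) as first_day, max(dt) as last_day
from calendar;
```

Because generate_series() includes both endpoints when the step lands on the stop value, the table covers every day of the requested range.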