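Filtering a query that uses generate_series while keeping null rows
I am running a raw query in Django so that I can use PostgreSQL's generate_series function to get one row for every date in an interval, comparing each generated date against a datetime range (order_dates) on the BaseEntry model. This is not directly possible with the ORM.
I am using an unmanaged model: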
class OrderBinnedStat(models.Model):
    bin = models.DateTimeField()
    id_count = models.IntegerField()
    avg_flow = models.IntegerField()
    sum_flow = models.IntegerField()

    class Meta:
        managed = False
and I run the code below to execute the query and look at the results:
from django.utils import timezone

timezone.activate("UTC")
end_dt = timezone.datetime(year=2021, month=12, day=26, hour=0, minute=0, second=0)
end_dt = timezone.make_aware(end_dt)
start_dt = end_dt - timezone.timedelta(days=1)
minutes = 60

report = OrderBinnedStat.objects.raw("""
    SELECT row_number() OVER () AS id,
           dt_range.bin,
           count(DISTINCT t.id) AS id_count,
           avg(DISTINCT t.flow) AS avg_flow,
           sum(DISTINCT t.flow) AS sum_flow
    FROM (
        SELECT generate_series(%s, %s - interval '1 milliseconds', interval '%s min')
        FROM public.orders_baseentry t -- is this line needed?
    ) dt_range(bin)
    LEFT JOIN public.orders_baseentry t ON t.order_dates @> dt_range.bin
    -- WHERE t.status = ANY(ARRAY['new', 'cancelled'])
    GROUP BY dt_range.bin
    ORDER BY dt_range.bin;
""", [start_dt, end_dt, minutes])

for item in report:
    print(item.id, item.bin, item.id_count, item.avg_flow, item.sum_flow)
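This works well when I do not filter the baseentry table. As expected, I get 24 rows (one per hour of the requested interval). No order_dates overlap 2021-12-25 21:00:00+00:00 or later, so the values in those rows are 0 or None. Perfect!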
1 2021-12-25 00:00:00+00:00 1 0.65128747160000000000 0.6512874716
2 2021-12-25 01:00:00+00:00 1 0.65128747160000000000 0.6512874716
3 2021-12-25 02:00:00+00:00 1 0.65128747160000000000 0.6512874716
4 2021-12-25 03:00:00+00:00 1 0.65128747160000000000 0.6512874716
5 2021-12-25 04:00:00+00:00 1 0.65128747160000000000 0.6512874716
6 2021-12-25 05:00:00+00:00 1 0.65128747160000000000 0.6512874716
7 2021-12-25 06:00:00+00:00 1 0.65128747160000000000 0.6512874716
8 2021-12-25 07:00:00+00:00 1 0.65128747160000000000 0.6512874716
9 2021-12-25 08:00:00+00:00 1 0.65128747160000000000 0.6512874716
10 2021-12-25 09:00:00+00:00 1 0.65128747160000000000 0.6512874716
11 2021-12-25 10:00:00+00:00 1 0.65128747160000000000 0.6512874716
12 2021-12-25 11:00:00+00:00 1 0.65128747160000000000 0.6512874716
13 2021-12-25 12:00:00+00:00 1 0.65128747160000000000 0.6512874716
14 2021-12-25 13:00:00+00:00 1 0.65128747160000000000 0.6512874716
15 2021-12-25 14:00:00+00:00 1 0.65128747160000000000 0.6512874716
16 2021-12-25 15:00:00+00:00 1 0.65128747160000000000 0.6512874716
17 2021-12-25 16:00:00+00:00 1 0.65128747160000000000 0.6512874716
18 2021-12-25 17:00:00+00:00 1 0.65128747160000000000 0.6512874716
19 2021-12-25 18:00:00+00:00 1 0.65128747160000000000 0.6512874716
20 2021-12-25 19:00:00+00:00 1 0.65128747160000000000 0.6512874716
21 2021-12-25 20:00:00+00:00 1 0.65128747160000000000 0.6512874716
22 2021-12-25 21:00:00+00:00 0 None None
23 2021-12-25 22:00:00+00:00 0 None None
24 2021-12-25 23:00:00+00:00 0 None None
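But if I uncomment the WHERE clause to start filtering (or add any other filter), I get the following result, with the three empty rows missing: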
1 2021-12-25 00:00:00+00:00 1 0.65128747160000000000 0.6512874716
2 2021-12-25 01:00:00+00:00 1 0.65128747160000000000 0.6512874716
3 2021-12-25 02:00:00+00:00 1 0.65128747160000000000 0.6512874716
4 2021-12-25 03:00:00+00:00 1 0.65128747160000000000 0.6512874716
5 2021-12-25 04:00:00+00:00 1 0.65128747160000000000 0.6512874716
6 2021-12-25 05:00:00+00:00 1 0.65128747160000000000 0.6512874716
7 2021-12-25 06:00:00+00:00 1 0.65128747160000000000 0.6512874716
8 2021-12-25 07:00:00+00:00 1 0.65128747160000000000 0.6512874716
9 2021-12-25 08:00:00+00:00 1 0.65128747160000000000 0.6512874716
10 2021-12-25 09:00:00+00:00 1 0.65128747160000000000 0.6512874716
11 2021-12-25 10:00:00+00:00 1 0.65128747160000000000 0.6512874716
12 2021-12-25 11:00:00+00:00 1 0.65128747160000000000 0.6512874716
13 2021-12-25 12:00:00+00:00 1 0.65128747160000000000 0.6512874716
14 2021-12-25 13:00:00+00:00 1 0.65128747160000000000 0.6512874716
15 2021-12-25 14:00:00+00:00 1 0.65128747160000000000 0.6512874716
16 2021-12-25 15:00:00+00:00 1 0.65128747160000000000 0.6512874716
17 2021-12-25 16:00:00+00:00 1 0.65128747160000000000 0.6512874716
18 2021-12-25 17:00:00+00:00 1 0.65128747160000000000 0.6512874716
19 2021-12-25 18:00:00+00:00 1 0.65128747160000000000 0.6512874716
20 2021-12-25 19:00:00+00:00 1 0.65128747160000000000 0.6512874716
21 2021-12-25 20:00:00+00:00 1 0.65128747160000000000 0.6512874716
- Should my WHERE clause go somewhere else?
- What am I doing wrong?
I need to be able to filter the baseentry table while still keeping the empty rows.
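Replace WHERE t.status = ANY(ARRAY['new', 'cancelled']) with AND t.status = ANY(ARRAY['new', 'cancelled']), so that the condition becomes part of the LEFT JOIN's ON clause, and it works as expected. A WHERE clause is applied after the join, where bins without a matching entry have t.status set to NULL; the comparison is not true for those rows, so they are discarded and the LEFT JOIN effectively behaves like an inner join.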
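The difference between the two placements can be reproduced on a tiny dataset. The following is a minimal sketch, not the original query: it assumes an in-memory SQLite database as a stand-in for PostgreSQL, and the tables bins and entries (with a plain integer bin column instead of generate_series and a range type) are hypothetical names made up for illustration.

```python
import sqlite3

# Hypothetical stand-in schema: "bins" plays the role of the generate_series
# output, "entries" plays the role of orders_baseentry.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bins (bin INTEGER PRIMARY KEY);
    CREATE TABLE entries (id INTEGER PRIMARY KEY, bin INTEGER, status TEXT);
    INSERT INTO bins VALUES (0), (1), (2);
    INSERT INTO entries VALUES (1, 0, 'new'), (2, 1, 'done');
""")

# Filter in WHERE: applied after the join, so bins whose joined row has a
# NULL or non-matching status are removed from the result entirely.
where_rows = conn.execute("""
    SELECT b.bin, count(e.id)
    FROM bins b
    LEFT JOIN entries e ON e.bin = b.bin
    WHERE e.status IN ('new', 'cancelled')
    GROUP BY b.bin
    ORDER BY b.bin
""").fetchall()

# Filter in ON: only restricts which entries may match; every bin is kept
# and unmatched bins get a count of 0.
on_rows = conn.execute("""
    SELECT b.bin, count(e.id)
    FROM bins b
    LEFT JOIN entries e
           ON e.bin = b.bin AND e.status IN ('new', 'cancelled')
    GROUP BY b.bin
    ORDER BY b.bin
""").fetchall()

print(where_rows)  # [(0, 1)]
print(on_rows)     # [(0, 1), (1, 0), (2, 0)]
```

With the filter in the ON clause, every bin survives and the aggregates for unmatched bins come back as 0 (or NULL for avg and sum), which is the behaviour the question asks for.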