Postgres, get unique records per day from date range select
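I need to generate a report of logged-in users for a date range, without repeating a user within the same day (if someone logged in twice on the same day, we do not list them twice). Unfortunately, we keep the login information as JSON (yes, I can't change it to a separate table, and I don't know who designed this database).
Query to see all logged-in users: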
select a.id, username, email, ah.modified as login_date
from accounts a join
account_history ah
on modified_acc_id = a.id
where ah.data::jsonb->>'message' = 'Logon';
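modified is a timestamp with time zone and is used as the login date.
I have only found examples that count distinct IDs per day, and I don't know how to modify them to return the distinct rows per day.
Sample data: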
id | username | email | login_date
-----+-------------------------+---------------------------------+----------------------------
102 | example | example@example.com | 2018-12-06 09:30:10.573+00
102 | example | example@example.com | 2018-12-06 09:32:34.235+00
42 | rafal | rafal@example.com | 2018-12-06 09:45:24.884+00
576 | john | john@example.com | 2018-12-06 09:35:24.922+00
576 | john | john@example.com | 2018-12-07 09:58:04.253+00
Desired data:
id | username | email | login_date
-----+-------------------------+---------------------------------+----------------------------
102 | example | example@example.com | 2018-12-06 09:30:10.573+00
42 | rafal | rafal@example.com | 2018-12-06 09:45:24.884+00
576 | john | john@example.com | 2018-12-06 09:35:24.922+00
576 | john | john@example.com | 2018-12-07 09:58:04.253+00
As you can see, the second row is gone.
You seem to want the number of user-days in a period. If I understand correctly:
select count(*) as num_user_days_in_range
from (select a.username, date_trunc('day', ah.modified) as login_date
from accounts a join
account_history ah
on modified_acc_id = a.id
where ah.data::jsonb->>'message' = 'Logon'
group by a.username, login_date
) u
where login_date >= $date1 and login_date < $date2
Use the window function row_number():
select id,username,email,login_date from
(
select a.id, username, email, ah.modified as login_date,
row_number() over(partition by a.id, username, email, ah.modified::date order by ah.modified) rn
from accounts a join
account_history ah
on modified_acc_id = a.id
where ah.data::jsonb->>'message' = 'Logon'
) t where t.rn=1
DISTINCT ON gives you exactly the first row of an ordered group. In your example, the group is id and the date part of the login_date timestamp.
SELECT DISTINCT ON (id, login_date::date)
*
FROM (
-- <your query>
) s
ORDER BY id, login_date::date, login_date
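Explanation of the ORDER BY clause:
You have to order by the DISTINCT ON columns first. But in your case you do not want to order by the date alone; the time part matters too. So after ordering by the date (which is required because it is one of your DISTINCT ON columns), you also have to order by the full timestamp, so that the earliest login of each day comes first and is the row that is kept.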
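To illustrate how the last ORDER BY key decides which row of each group survives, here is a minimal sketch that uses the sample rows from the question as an inline table. The CTE name logins is only a stand-in for the original query, and the day boundary of login_date::date is assumed to follow the session time zone.

```sql
-- Sample rows from the question, inlined so the query runs without any tables.
WITH logins (id, username, email, login_date) AS (
    VALUES
        (102, 'example', 'example@example.com', '2018-12-06 09:30:10.573+00'::timestamptz),
        (102, 'example', 'example@example.com', '2018-12-06 09:32:34.235+00'::timestamptz),
        (42,  'rafal',   'rafal@example.com',   '2018-12-06 09:45:24.884+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-06 09:35:24.922+00'::timestamptz),
        (576, 'john',    'john@example.com',    '2018-12-07 09:58:04.253+00'::timestamptz)
)
SELECT DISTINCT ON (id, login_date::date)
    *
FROM logins
-- The first two keys must match the DISTINCT ON list.
-- The third key picks the survivor: DESC keeps the latest login of each day,
-- while the default ascending order keeps the earliest.
ORDER BY id, login_date::date, login_date DESC;
```

With DESC, the row kept for user 102 on 2018-12-06 should be the 09:32:34.235 login rather than the 09:30:10.573 one; dropping DESC gives the output the question asks for.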
So the whole query can be simplified to (without a subquery):
SELECT DISTINCT ON (a.id, ah.modified::date)
a.id,
username,
email,
ah.modified as login_date
FROM accounts a
JOIN account_history ah
ON modified_acc_id = a.id
WHERE ah.data::jsonb->>'message' = 'Logon'
ORDER BY a.id, ah.modified::date, ah.modified
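Since the report is meant to cover a date range, the same query can be restricted with a half-open range on ah.modified. The following is only a sketch: the two bounds are hypothetical values, and pinning the day boundary to UTC is an assumption. Casting a timestamp with time zone to date uses the session's TimeZone setting, so the explicit AT TIME ZONE makes the grouping independent of who runs the report.

```sql
-- Hypothetical report range: 2018-12-06 (inclusive) to 2018-12-08 (exclusive), in UTC.
SELECT DISTINCT ON (a.id, (ah.modified AT TIME ZONE 'UTC')::date)
    a.id,
    username,
    email,
    ah.modified AS login_date
FROM accounts a
JOIN account_history ah
    ON modified_acc_id = a.id
WHERE ah.data::jsonb->>'message' = 'Logon'
  AND ah.modified >= TIMESTAMPTZ '2018-12-06 00:00:00+00'
  AND ah.modified <  TIMESTAMPTZ '2018-12-08 00:00:00+00'
-- Same expression as in DISTINCT ON, then the timestamp to keep the earliest login per day.
ORDER BY a.id, (ah.modified AT TIME ZONE 'UTC')::date, ah.modified;
```

The half-open range (>= start, < end) avoids having to guess the last representable instant of the final day.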
It looks like when there are duplicates, you want the earliest date. If so, would this work?
select
a.id, username, email, min (ah.modified) as login_date
from accounts a join
account_history ah
on modified_acc_id = a.id
where ah.data::jsonb->>'message' = 'Logon'
group by a.id, username, email, ah.modified::date