Postgres, get unique records per day from date range select

Question

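I need to generate a report of logged-in users for a date range, without repeating a user within the same day (if someone logged in twice on the same day, they should be listed only once). Unfortunately, we keep the login information as JSON (yes, I can't change it to a separate table, and I don't know who designed this database). The query to see all logged-in users: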

select a.id, username, email, ah.modified as login_date
from accounts a join
     account_history ah
     on modified_acc_id = a.id
 where ah.data::jsonb->>'message' = 'Logon';

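modified is a timestamp with time zone and is used as the login date.

I have only found examples that count distinct IDs per day, and I don't know how to change them to return the distinct rows per day.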

Sample data:

 id  | username | email               | login_date
-----+----------+---------------------+----------------------------
 102 | example  | example@example.com | 2018-12-06 09:30:10.573+00
 102 | example  | example@example.com | 2018-12-06 09:32:34.235+00
  42 | rafal    | rafal@example.com   | 2018-12-06 09:45:24.884+00
 576 | john     | john@example.com    | 2018-12-06 09:35:24.922+00
 576 | john     | john@example.com    | 2018-12-07 09:58:04.253+00

Desired data:

 id  | username | email               | login_date
-----+----------+---------------------+----------------------------
 102 | example  | example@example.com | 2018-12-06 09:30:10.573+00
  42 | rafal    | rafal@example.com   | 2018-12-06 09:45:24.884+00
 576 | john     | john@example.com    | 2018-12-06 09:35:24.922+00
 576 | john     | john@example.com    | 2018-12-07 09:58:04.253+00

As you can see, the second row is gone.

Answer 1

You seem to want the number of user-days in a period. If I understand correctly:

select count(*) as num_user_days_in_range
from (select a.username, date_trunc('day', ah.modified) as login_date
      from accounts a join
           account_history ah
           on modified_acc_id = a.id
      where ah.data::jsonb->>'message' = 'Logon'
      group by a.username, login_date
     ) u
where login_date >= $date1 and login_date < $date2
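
The $date1 and $date2 above are placeholders, not PostgreSQL syntax. One possible way to bind them is a prepared statement with positional parameters. This is only a sketch: the statement name user_days and the example dates are assumptions for illustration, and the range is treated as half-open (start inclusive, end exclusive).

```sql
-- Hypothetical prepared statement; $1 and $2 stand in for $date1 and $date2.
PREPARE user_days (timestamptz, timestamptz) AS
select count(*) as num_user_days_in_range
from (select a.username, date_trunc('day', ah.modified) as login_date
      from accounts a join
           account_history ah
           on modified_acc_id = a.id
      where ah.data::jsonb->>'message' = 'Logon'
      group by a.username, login_date
     ) u
where login_date >= $1 and login_date < $2;

-- Count user-days for 2018-12-06 and 2018-12-07 (example dates only).
EXECUTE user_days('2018-12-06', '2018-12-08');
```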

Answer 2

Use the window function row_number()

select id, username, email, login_date
from (
  select a.id, username, email, ah.modified as login_date,
         -- partition by the calendar day too, so rn = 1 is each user's first login of that day
         row_number() over (partition by a.id, username, email, ah.modified::date
                            order by ah.modified) as rn
  from accounts a join
       account_history ah
       on modified_acc_id = a.id
  where ah.data::jsonb->>'message' = 'Logon'
) t
where t.rn = 1

Answer 3

DISTINCT ON gives you exactly the first row of each ordered group. In your case, the group is the id plus the date part of the login_date timestamp.

SELECT DISTINCT ON (id, login_date::date)
    *
FROM (
    -- <your query>
) s
ORDER BY id, login_date::date, login_date

demo:db<>fiddle

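Explanation of the ORDER BY clause:

You have to order by the DISTINCT ON columns first. In your case, though, you don't want to order by the date alone; the time part matters as well. So after ordering by the date (which is required because it is one of the DISTINCT ON expressions), you also have to order by the full timestamp, so that the earliest login of each day is the row that is kept.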
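
The effect of that ordering can be tried without the real tables. The following is a minimal sketch: the CTE name logins is hypothetical, and its rows are copied from the question's sample data for ids 102 and 576.

```sql
-- Stand-in for the real query result; "logins" is an illustrative name.
WITH logins (id, login_date) AS (
    VALUES
        (102, timestamptz '2018-12-06 09:30:10.573+00'),
        (102, timestamptz '2018-12-06 09:32:34.235+00'),
        (576, timestamptz '2018-12-06 09:35:24.922+00'),
        (576, timestamptz '2018-12-07 09:58:04.253+00')
)
SELECT DISTINCT ON (id, login_date::date)
    id, login_date
FROM logins
-- The first two sort keys must match the DISTINCT ON expressions;
-- the third decides which row of each (id, day) group comes first.
ORDER BY id, login_date::date, login_date;
```

Changing the last sort key to login_date DESC should keep the latest login of each day instead of the earliest.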

So the whole query can be simplified to (without a subquery):

SELECT DISTINCT ON (a.id, ah.modified::date) 
    a.id, 
    username, 
    email, 
    ah.modified as login_date
FROM accounts a 
JOIN account_history ah
    ON modified_acc_id = a.id
WHERE ah.data::jsonb->>'message' = 'Logon'
ORDER BY a.id, ah.modified::date, ah.modified
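
To restrict the result to a date range, as the question title asks, a filter on ah.modified can be added. The following is a sketch under two assumptions that are not in the original: the range is half-open with example bounds, and days are counted in UTC. Without the explicit AT TIME ZONE, casting a timestamp with time zone to date uses the session's TimeZone setting, so the day boundaries may shift between clients.

```sql
SELECT DISTINCT ON (a.id, (ah.modified AT TIME ZONE 'UTC')::date)
    a.id,
    username,
    email,
    ah.modified AS login_date
FROM accounts a
JOIN account_history ah
    ON modified_acc_id = a.id
WHERE ah.data::jsonb->>'message' = 'Logon'
  -- Example bounds only: from 2018-12-06 inclusive to 2018-12-08 exclusive.
  AND ah.modified >= timestamptz '2018-12-06 00:00:00+00'
  AND ah.modified <  timestamptz '2018-12-08 00:00:00+00'
ORDER BY a.id, (ah.modified AT TIME ZONE 'UTC')::date, ah.modified;
```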

Answer 4

It looks like when there are duplicates, you take the earliest timestamp. If so, would this work?

select
  a.id, username, email, min (ah.modified) as login_date
from accounts a join
     account_history ah
     on modified_acc_id = a.id
 where ah.data::jsonb->>'message' = 'Logon'
group by a.id, username, email, ah.modified::date

postgresql

json

distinct-values