按日期和其他列计算最后一个条目
Count last entry by date and other column
我目前正在处理一个统计页面,其中一个 table 让我很纠结。
+----+-----------+----------+---------------------+
| id | id_status | id_queue | datetime | // Comments
+----+-----------+----------+---------------------+
| 1 | 1 | 1 | 2021-07-01 17:03:13 | //<- last_entry: id_queue:1 for: #1# (id_status: 1)
| 2 | 1 | 2 | 2021-07-01 17:03:18 | //<- last_entry: id_queue:2 for: #1#2# (id_status: 1)
| 9 | 1 | 9 | 2021-07-01 17:03:45 |
| 10 | 1 | 10 | 2021-07-01 17:03:49 |
| 11 | 2 | 7 | 2021-07-01 17:04:10 |
| 12 | 3 | 7 | 2021-07-01 17:07:36 |
| 13 | 2 | 10 | 2021-07-01 17:07:54 |
| 14 | 3 | 10 | 2021-07-01 17:08:36 | //<- last_entry: id_queue:10 for: #1# (id_status: 3)
| 15 | 2 | 9 | 2021-07-01 17:15:04 |
| 16 | 5 | 9 | 2021-07-01 17:15:24 | //<- last_entry: id_queue:9 for: #1#2#3#4# (id_status: 5)
| 18 | 3 | 7 | 2021-07-01 17:35:58 | //<- last_entry: id_queue:7 for: #1# (id_status: 3)
//- - - - - new day #2# - - - - - -
| 19 | 2 | 7 | 2021-07-02 18:36:23 |
| 21 | 3 | 1 | 2021-07-02 18:39:49 |
| 22 | 14 | 1 | 2021-07-02 18:40:17 |
| 23 | 14 | 10 | 2021-07-02 18:40:17 |
| 24 | 2 | 1 | 2021-07-02 19:14:21 |
| 25 | 1 | 1 | 2021-07-02 19:14:32 | //<- last_entry: id_queue:1 for: #2#3#4# (id_status: 1)
| 26 | 2 | 10 | 2021-07-02 19:14:35 |
| 27 | 1 | 10 | 2021-07-02 19:14:39 | //<- last_entry: id_queue:10 for: #2#3#4# (id_status: 1)
| 28 | 1 | 7 | 2021-07-02 19:14:46 | //<- last_entry: id_queue:7 for: #2#3#4# (id_status: 1)
//- - - - - new day #3# - - - - - -
| 31 | 2 | 2 | 2021-07-05 17:20:39 |
| 32 | 3 | 2 | 2021-07-05 17:24:59 | //<- last_entry: id_queue:2 for: #3# (id_status: 3)
//- - - - - new day #4# - - - - - -
| 33 | 2 | 3 | 2021-07-06 09:38:03 |
| 34 | 3 | 3 | 2021-07-06 09:38:16 | //<- last_entry: id_queue:3 for: #4# (id_status: 3)
| 35 | 2 | 6 | 2021-07-06 10:12:18 | //<- last_entry: id_queue:6 for: #4# (id_status: 2)
| 37 | 2 | 2 | 2021-07-06 11:37:50 |
| 38 | 13 | 2 | 2021-07-06 12:02:19 |
| 39 | 2 | 2 | 2021-07-06 12:02:21 |
| 40 | 13 | 2 | 2021-07-06 12:04:12 | //<- last_entry: id_queue:2 for: #4# (id_status: 13)
+----+-----------+----------+---------------------+
我想为每个 %Y/%m/%d
从每个 id_queue.
获取每个最后条目的计数,直到 id_status 当前日期
基本上,对于每一天,它都会根据实际的 datetime
列计算最后一个条目。
基于上面前面table的输出示例:
+-----------+--------------+------------+-------------------------------------------+
| id_status | occurrences | day | HELP_COLUMN(comment id from table above) |
+-----------+--------------+------------+-------------------------------------------+
| 1 | 2 | 2021-07-01 | #1# |
| 3 | 2 | 2021-07-01 | #1# |
| 5 | 1 | 2021-07-01 | #1# |
| 2 | 4 | 2021-07-02 | #2# |
| 5 | 1 | 2021-07-02 | #2# |
| 1 | 3 | 2021-07-05 | #3# |
| 3 | 1 | 2021-07-05 | #3# |
| 5 | 1 | 2021-07-05 | #4# |
| 1 | 3 | 2021-07-05 | #4# |
| 3 | 1 | 2021-07-05 | #4# |
| 2 | 1 | 2021-07-05 | #4# |
| 13 | 1 | 2021-07-05 | #4# |
+-----------+--------------+------------+-------------------------------------------+
- 对于 #1#(第一天)取 id_queue 的最后条目(基于
最大(日期时间).
- #2#(第二天)取 id_queue 的最后条目
(基于 max(datetime))(如您所见
id_queue 在前一天)。等...
我尝试了多种方法,但我开始觉得这看起来需要用迭代器来解决...但我不能每天执行一个 SQL 查询,这会消耗太多性能.
有人知道我可以使用哪个 SQL 请求吗?
坦克.
编辑:这是另一个例子:
输入:
+-----+-----------+----------+---------------------+
| id | id_status | id_queue | datetime |
+-----+-----------+----------+---------------------+
| 61 | 5 | 1 | 2021-07-01 15:03:40 |
| 132 | 5 | 1 | 2021-07-01 16:39:13 |
| 1 | 1 | 1 | 2021-07-01 17:03:13 | <- last 1 : 1 #1#
| 2 | 1 | 2 | 2021-07-01 17:03:18 | <- last 2 : 1 #1#2#
| 3 | 1 | 3 | 2021-07-01 17:03:21 | <- last 3 : 1 #1#2#
| 4 | 1 | 4 | 2021-07-01 17:03:25 | <- last 4 : 1 #1#2#3#
| 5 | 1 | 5 | 2021-07-01 17:03:29 | <- last 5 : 1 #1#2#3#
| 6 | 1 | 6 | 2021-07-01 17:03:33 | <- last 6 : 1 #1#2#3#
| 7 | 1 | 7 | 2021-07-01 17:03:37 |
| 8 | 1 | 8 | 2021-07-01 17:03:41 | <- last 8 : 1 #1#2#3#
| 9 | 1 | 9 | 2021-07-01 17:03:45 |
| 10 | 1 | 10 | 2021-07-01 17:03:49 |
| 11 | 2 | 7 | 2021-07-01 17:04:10 |
| 12 | 3 | 7 | 2021-07-01 17:07:36 |
| 13 | 2 | 10 | 2021-07-01 17:07:54 |
| 14 | 3 | 10 | 2021-07-01 17:08:36 | <- last 10 : 3 #1#
| 15 | 2 | 9 | 2021-07-01 17:15:04 |
| 16 | 5 | 9 | 2021-07-01 17:15:24 | <- last 9 : 5 #1#2#3#
| 17 | 2 | 7 | 2021-07-01 17:35:36 |
| 18 | 3 | 7 | 2021-07-01 17:35:58 | <- last 7 : 3 #1#
| 19 | 2 | 7 | 2021-07-02 18:36:23 |
| 20 | 2 | 1 | 2021-07-02 18:36:39 |
| 21 | 3 | 1 | 2021-07-02 18:39:49 |
| 23 | 14 | 10 | 2021-07-02 18:40:17 |
| 22 | 14 | 1 | 2021-07-02 18:40:17 |
| 24 | 2 | 1 | 2021-07-02 19:14:21 |
| 25 | 1 | 1 | 2021-07-02 19:14:32 | <-- last 1 : 1 #2#3#
| 26 | 2 | 10 | 2021-07-02 19:14:35 |
| 27 | 1 | 10 | 2021-07-02 19:14:39 | <-- last 10 : 1 #2#3#
| 28 | 1 | 7 | 2021-07-02 19:14:46 | <-- last 7 : 1 #2#3#
| 29 | 2 | 3 | 2021-07-05 15:26:27 |
| 30 | 3 | 3 | 2021-07-05 15:26:48 | <--- last 3 : 3 #3#
| 31 | 2 | 2 | 2021-07-05 17:20:39 |
| 32 | 3 | 2 | 2021-07-05 17:24:59 | <--- last 2 : 3 #3#
+-----+-----------+----------+---------------------+
#1 (2021-07-01):
- 1,1,1,1,1,1,1(出现 7 次)
- 3,3(出现 2 次)
- 5(出现 1 次)
#2 (2021-07-02): : https://i.ibb.co/vDhL05q/sublime-text-or-MEzs-GFh-Q.jpg
- 1,1,1,1,1,1,1,1,1(出现 9 次)
- 5(出现 1 次)
#3 (2021-07-05):
- 1,1,1,1,1,1,1(出现 7 次)
- 3,3(出现 2 次)
- 5(出现 1 次)
输出:
+-----------+-------------+------------+
| id_status | occurences | day |
+-----------+-------------+------------+
| 1 | 7 | 2021-07-01 |
| 3 | 2 | 2021-07-01 |
| 5 | 1 | 2021-07-01 |
| 1 | 9 | 2021-07-02 |
| 5 | 1 | 2021-07-02 |
| 1 | 7 | 2021-07-05 |
| 3 | 2 | 2021-07-05 |
| 5 | 1 | 2021-07-05 |
+-----------+-------------+------------+
select date(datetime) as day, id_status, count(*) as occurrences
from (
select *,row_number() over (partition by date(datetime),id_queue order by datetime desc) rn
from tablename
) t where rn = 1
group by date(datetime) , id_status
order by date(datetime) , id_status
此查询每天对相同的行进行排序 id_queue 并按最新的顺序首先为它们提供行号顺序,然后选择第一个 (rn = 1) 所以你有最新的唯一 id_queue 在每一天然后你分组并计算 queu_ids
的数量
我通过编写这个查询成功了:
我确信它可以被优化并且可以减少到更少的子查询,如果有人有想法,我很乐意select另一个答案作为“解决答案”
SELECT id_status, COUNT(entries_status.id_status) as occurrences, entries_status.day
FROM (
SELECT history.id_status, history.id_queue, last_entries.day
FROM history
INNER JOIN(
SELECT id_queue, max(datetime) last_entry_of_day, day
FROM (
SELECT *
FROM history
LEFT JOIN (
SELECT date(datetime) AS day
FROM `history`
GROUP BY date(datetime)) as days
ON date(history.datetime) <= days.day
ORDER BY datetime ASC) entries
GROUP BY id_queue, day
) as last_entries
ON history.id_queue = last_entries.id_queue AND
history.datetime = last_entries.last_entry_of_day) as entries_status
GROUP BY entries_status.day, entries_status.id_status
我目前正在处理一个统计页面,其中一个 table 让我很纠结。
+----+-----------+----------+---------------------+
| id | id_status | id_queue | datetime | // Comments
+----+-----------+----------+---------------------+
| 1 | 1 | 1 | 2021-07-01 17:03:13 | //<- last_entry: id_queue:1 for: #1# (id_status: 1)
| 2 | 1 | 2 | 2021-07-01 17:03:18 | //<- last_entry: id_queue:2 for: #1#2# (id_status: 1)
| 9 | 1 | 9 | 2021-07-01 17:03:45 |
| 10 | 1 | 10 | 2021-07-01 17:03:49 |
| 11 | 2 | 7 | 2021-07-01 17:04:10 |
| 12 | 3 | 7 | 2021-07-01 17:07:36 |
| 13 | 2 | 10 | 2021-07-01 17:07:54 |
| 14 | 3 | 10 | 2021-07-01 17:08:36 | //<- last_entry: id_queue:10 for: #1# (id_status: 3)
| 15 | 2 | 9 | 2021-07-01 17:15:04 |
| 16 | 5 | 9 | 2021-07-01 17:15:24 | //<- last_entry: id_queue:9 for: #1#2#3#4# (id_status: 5)
| 18 | 3 | 7 | 2021-07-01 17:35:58 | //<- last_entry: id_queue:7 for: #1# (id_status: 3)
//- - - - - new day #2# - - - - - -
| 19 | 2 | 7 | 2021-07-02 18:36:23 |
| 21 | 3 | 1 | 2021-07-02 18:39:49 |
| 22 | 14 | 1 | 2021-07-02 18:40:17 |
| 23 | 14 | 10 | 2021-07-02 18:40:17 |
| 24 | 2 | 1 | 2021-07-02 19:14:21 |
| 25 | 1 | 1 | 2021-07-02 19:14:32 | //<- last_entry: id_queue:1 for: #2#3#4# (id_status: 1)
| 26 | 2 | 10 | 2021-07-02 19:14:35 |
| 27 | 1 | 10 | 2021-07-02 19:14:39 | //<- last_entry: id_queue:10 for: #2#3#4# (id_status: 1)
| 28 | 1 | 7 | 2021-07-02 19:14:46 | //<- last_entry: id_queue:7 for: #2#3#4# (id_status: 1)
//- - - - - new day #3# - - - - - -
| 31 | 2 | 2 | 2021-07-05 17:20:39 |
| 32 | 3 | 2 | 2021-07-05 17:24:59 | //<- last_entry: id_queue:2 for: #3# (id_status: 3)
//- - - - - new day #4# - - - - - -
| 33 | 2 | 3 | 2021-07-06 09:38:03 |
| 34 | 3 | 3 | 2021-07-06 09:38:16 | //<- last_entry: id_queue:3 for: #4# (id_status: 3)
| 35 | 2 | 6 | 2021-07-06 10:12:18 | //<- last_entry: id_queue:6 for: #4# (id_status: 2)
| 37 | 2 | 2 | 2021-07-06 11:37:50 |
| 38 | 13 | 2 | 2021-07-06 12:02:19 |
| 39 | 2 | 2 | 2021-07-06 12:02:21 |
| 40 | 13 | 2 | 2021-07-06 12:04:12 | //<- last_entry: id_queue:2 for: #4# (id_status: 13)
+----+-----------+----------+---------------------+
我想为每个 %Y/%m/%d
从每个 id_queue.
基本上,对于每一天,它都会根据实际的 datetime
列计算最后一个条目。
基于上面前面table的输出示例:
+-----------+--------------+------------+-------------------------------------------+
| id_status | occurrences | day | HELP_COLUMN(comment id from table above) |
+-----------+--------------+------------+-------------------------------------------+
| 1 | 2 | 2021-07-01 | #1# |
| 3 | 2 | 2021-07-01 | #1# |
| 5 | 1 | 2021-07-01 | #1# |
| 2 | 4 | 2021-07-02 | #2# |
| 5 | 1 | 2021-07-02 | #2# |
| 1 | 3 | 2021-07-05 | #3# |
| 3 | 1 | 2021-07-05 | #3# |
| 5 | 1 | 2021-07-05 | #4# |
| 1 | 3 | 2021-07-05 | #4# |
| 3 | 1 | 2021-07-05 | #4# |
| 2 | 1 | 2021-07-05 | #4# |
| 13 | 1 | 2021-07-05 | #4# |
+-----------+--------------+------------+-------------------------------------------+
- 对于 #1#(第一天)取 id_queue 的最后条目(基于 最大(日期时间).
- #2#(第二天)取 id_queue 的最后条目 (基于 max(datetime))(如您所见 id_queue 在前一天)。等...
我尝试了多种方法,但我开始觉得这看起来需要用迭代器来解决...但我不能每天执行一个 SQL 查询,这会消耗太多性能.
有人知道我可以使用哪个 SQL 请求吗? 坦克.
编辑:这是另一个例子:
输入:
+-----+-----------+----------+---------------------+
| id | id_status | id_queue | datetime |
+-----+-----------+----------+---------------------+
| 61 | 5 | 1 | 2021-07-01 15:03:40 |
| 132 | 5 | 1 | 2021-07-01 16:39:13 |
| 1 | 1 | 1 | 2021-07-01 17:03:13 | <- last 1 : 1 #1#
| 2 | 1 | 2 | 2021-07-01 17:03:18 | <- last 2 : 1 #1#2#
| 3 | 1 | 3 | 2021-07-01 17:03:21 | <- last 3 : 1 #1#2#
| 4 | 1 | 4 | 2021-07-01 17:03:25 | <- last 4 : 1 #1#2#3#
| 5 | 1 | 5 | 2021-07-01 17:03:29 | <- last 5 : 1 #1#2#3#
| 6 | 1 | 6 | 2021-07-01 17:03:33 | <- last 6 : 1 #1#2#3#
| 7 | 1 | 7 | 2021-07-01 17:03:37 |
| 8 | 1 | 8 | 2021-07-01 17:03:41 | <- last 8 : 1 #1#2#3#
| 9 | 1 | 9 | 2021-07-01 17:03:45 |
| 10 | 1 | 10 | 2021-07-01 17:03:49 |
| 11 | 2 | 7 | 2021-07-01 17:04:10 |
| 12 | 3 | 7 | 2021-07-01 17:07:36 |
| 13 | 2 | 10 | 2021-07-01 17:07:54 |
| 14 | 3 | 10 | 2021-07-01 17:08:36 | <- last 10 : 3 #1#
| 15 | 2 | 9 | 2021-07-01 17:15:04 |
| 16 | 5 | 9 | 2021-07-01 17:15:24 | <- last 9 : 5 #1#2#3#
| 17 | 2 | 7 | 2021-07-01 17:35:36 |
| 18 | 3 | 7 | 2021-07-01 17:35:58 | <- last 7 : 3 #1#
| 19 | 2 | 7 | 2021-07-02 18:36:23 |
| 20 | 2 | 1 | 2021-07-02 18:36:39 |
| 21 | 3 | 1 | 2021-07-02 18:39:49 |
| 23 | 14 | 10 | 2021-07-02 18:40:17 |
| 22 | 14 | 1 | 2021-07-02 18:40:17 |
| 24 | 2 | 1 | 2021-07-02 19:14:21 |
| 25 | 1 | 1 | 2021-07-02 19:14:32 | <-- last 1 : 1 #2#3#
| 26 | 2 | 10 | 2021-07-02 19:14:35 |
| 27 | 1 | 10 | 2021-07-02 19:14:39 | <-- last 10 : 1 #2#3#
| 28 | 1 | 7 | 2021-07-02 19:14:46 | <-- last 7 : 1 #2#3#
| 29 | 2 | 3 | 2021-07-05 15:26:27 |
| 30 | 3 | 3 | 2021-07-05 15:26:48 | <--- last 3 : 3 #3#
| 31 | 2 | 2 | 2021-07-05 17:20:39 |
| 32 | 3 | 2 | 2021-07-05 17:24:59 | <--- last 2 : 3 #3#
+-----+-----------+----------+---------------------+
#1 (2021-07-01):
- 1,1,1,1,1,1,1(出现 7 次)
- 3,3(出现 2 次)
- 5(出现 1 次)
#2 (2021-07-02): : https://i.ibb.co/vDhL05q/sublime-text-or-MEzs-GFh-Q.jpg
- 1,1,1,1,1,1,1,1,1(出现 9 次)
- 5(出现 1 次)
#3 (2021-07-05):
- 1,1,1,1,1,1,1(出现 7 次)
- 3,3(出现 2 次)
- 5(出现 1 次)
输出:
+-----------+-------------+------------+
| id_status | occurences | day |
+-----------+-------------+------------+
| 1 | 7 | 2021-07-01 |
| 3 | 2 | 2021-07-01 |
| 5 | 1 | 2021-07-01 |
| 1 | 9 | 2021-07-02 |
| 5 | 1 | 2021-07-02 |
| 1 | 7 | 2021-07-05 |
| 3 | 2 | 2021-07-05 |
| 5 | 1 | 2021-07-05 |
+-----------+-------------+------------+
select date(datetime) as day, id_status, count(*) as occurrences
from (
select *,row_number() over (partition by date(datetime),id_queue order by datetime desc) rn
from tablename
) t where rn = 1
group by date(datetime) , id_status
order by date(datetime) , id_status
此查询每天对相同的行进行排序 id_queue 并按最新的顺序首先为它们提供行号顺序,然后选择第一个 (rn = 1) 所以你有最新的唯一 id_queue 在每一天然后你分组并计算 queu_ids
的数量我通过编写这个查询成功了:
我确信它可以被优化并且可以减少到更少的子查询,如果有人有想法,我很乐意select另一个答案作为“解决答案”
SELECT id_status, COUNT(entries_status.id_status) as occurrences, entries_status.day
FROM (
SELECT history.id_status, history.id_queue, last_entries.day
FROM history
INNER JOIN(
SELECT id_queue, max(datetime) last_entry_of_day, day
FROM (
SELECT *
FROM history
LEFT JOIN (
SELECT date(datetime) AS day
FROM `history`
GROUP BY date(datetime)) as days
ON date(history.datetime) <= days.day
ORDER BY datetime ASC) entries
GROUP BY id_queue, day
) as last_entries
ON history.id_queue = last_entries.id_queue AND
history.datetime = last_entries.last_entry_of_day) as entries_status
GROUP BY entries_status.day, entries_status.id_status