SQL query to get date pairs within id
I have a table with the following rows:

| item_id | change_type | change_date | change_id | other columns... |
| :------ | :---------- | :---------- | :-------- | :--------------- |
| 123 | off | 2019-06-04 | 321 |
| 123 | on | 2019-07-11 | 741 |
| 123 | off | 2019-07-13 | 987 |
| 123 | on | 2019-08-01 | 951 |
| 123 | off | 2019-08-07 | 357 |
| 456 | off | 2019-08-01 | 125 |
| 456 | on | 2019-11-18 | 878 |
| 789 | on | 2019-12-18 | 373 |
| 012 | off | 2019-12-25 | 654 |
| 698 | off | 2019-08-01 | 741 |
| 698 | on | 2018-01-03 | 147 |

I am trying to write a query that produces the following result:

| item_id | on_date | off_date | on_id | off_id | other columns... |
| :------ | :--------- | :--------- | :---- | :----- | :--------------- |
| 123 | | 2019-06-04 | | 321 |
| 123 | 2019-07-11 | 2019-07-13 | 741 | 987 |
| 123 | 2019-08-01 | 2019-08-07 | 951 | 357 |
| 456 | | 2019-08-01 | | 125 |
| 456 | 2019-11-18 | | 878 | |
| 789 | 2019-12-18 | | 373 | |
| 012 | | 2019-12-25 | | 654 |
| 698 | 2018-01-03 | 2019-08-01 | 147 | 741 |

The result I need is a table in which the "on" dates and "off" dates are listed in descending order (grouped by item_id), with each "off" date on the same row as the previous (in time) "on" date.

The closest I have come are the following variations:

Attempt one:

    SELECT
        changes_main.item_id,
        `on_date`,
        `off_date`,
        `on_id`,
        `off_id`
    FROM (
        SELECT DISTINCT `item_id`
        FROM item_changes
    ) AS changes_main
    LEFT OUTER JOIN (
        SELECT
            `item_id`, -- for joining purposes only
            `change_date` AS `on_date`,
            `change_id` AS `on_id`
        FROM item_changes
        WHERE `change_type` = 'on'
    ) AS changes_ons ON changes_ons.item_id = changes_main.item_id
    RIGHT OUTER JOIN ( -- although LEFT or RIGHT doesn't seem to matter
        SELECT
            `item_id`, -- for joining purposes only
            `change_date` AS `off_date`,
            `change_id` AS `off_id`
        FROM item_changes
        WHERE `change_type` = 'off'
    ) AS changes_offs ON changes_offs.item_id = changes_main.item_id
    ;

However, this effectively produces a CROSS JOIN between on_date and off_date within each item_id.
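
To make the cross-join effect concrete, here is a minimal sketch that counts the rows produced when the "on" and "off" rows are joined on item_id alone. It assumes an in-memory SQLite database driven from Python's sqlite3 module as a stand-in for MySQL (the question does not describe a test setup), and it loads only the sample rows for item 123.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_changes ("
    "item_id TEXT, change_type TEXT, change_date TEXT, change_id INTEGER)"
)
# Sample rows for item 123 from the question: 2 'on' rows and 3 'off' rows.
conn.executemany(
    "INSERT INTO item_changes VALUES (?, ?, ?, ?)",
    [
        ("123", "off", "2019-06-04", 321),
        ("123", "on", "2019-07-11", 741),
        ("123", "off", "2019-07-13", 987),
        ("123", "on", "2019-08-01", 951),
        ("123", "off", "2019-08-07", 357),
    ],
)

# Joining only on item_id pairs every 'on' row with every 'off' row.
pairs = conn.execute(
    """
    SELECT ons.change_date AS on_date, offs.change_date AS off_date
    FROM item_changes AS ons
    JOIN item_changes AS offs ON offs.item_id = ons.item_id
    WHERE ons.change_type = 'on' AND offs.change_type = 'off'
    ORDER BY on_date, off_date
    """
).fetchall()

for on_date, off_date in pairs:
    print(on_date, off_date)
print(len(pairs))  # 6 rows: 2 'on' rows x 3 'off' rows
```

Nothing in the join condition ties an "on" date to a particular "off" date, so the number of rows per item is the product of the two counts.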

The only change in the second attempt is an added WHERE clause. I got the idea from this question.

Attempt two:

    -- Same exact query as the above, however with the following
    -- WHERE clause placed where the semicolon is above:
    WHERE
        `off_date` = (
            SELECT MIN(offs2.change_date)
            FROM item_changes AS offs2
            WHERE offs2.change_type = 'off' AND
                  offs2.change_date > changes_ons.on_date
        )
    ;

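The problem is that if the number of "on"/"off" rows within an item_id is not even, the extra "on" or "off" gets filtered out.

I have tried variations of the WHERE clause above, including OR off_date IS NULL, OR on_date IS NULL, and so on.

Update:

The third attempt uses UNION and some scalar subqueries. This is the closest I have come to the result I need. However, it still falls short (for example, it does not include change_id, and it does not produce perfect matches).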

    SELECT
        changes_on.item_id,
        changes_on.change_date AS `on_date`,
        (SELECT MIN(offs2.change_date)
         FROM item_changes AS offs2
         WHERE offs2.change_type = 'off' AND
               offs2.change_date > changes_on.change_date
        ) AS `off_date`,
        changes_on.change_id AS `on_id`,
        NULL AS `off_id` -- odd
    FROM item_changes AS changes_on
    WHERE `change_type` = 'on'
    UNION
    SELECT
        changes_offs.item_id,
        changes_offs.change_date AS `off_date`,
        (SELECT MIN(ons2.change_date)
         FROM item_changes AS ons2
         WHERE ons2.change_type = 'on' AND
               ons2.change_date < changes_offs.change_date
        ) AS `off_date`,
        NULL AS `on_id`, -- odd
        changes_offs.change_id AS `off_id`
    FROM item_changes AS changes_offs
    WHERE `change_type` = 'off'
    ;

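Any assistance, input, or guidance would be greatly appreciated.

Assign each row to a group based on the number of "on" rows up to and including it, counted per item_id in change_date order. Then use conditional aggregation: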

    select item_id,
           max(case when change_type = 'on' then change_date end) as on_date,
           max(case when change_type = 'on' then change_id end) as on_change_id,
           max(case when change_type = 'off' then change_date end) as off_date,
           max(case when change_type = 'off' then change_id end) as off_change_id
    from (select t.*,
                 -- running count of 'on' rows within each item_id, in date order
                 sum(case when change_type = 'on' then 1 else 0 end) over (partition by item_id order by change_date) as grp
          from item_changes t
         ) t
    group by item_id, grp;

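
As a rough check of the grouping idea, the sketch below runs the same logic against the sample rows from the question. It is written under some assumptions: SQLite (3.25 or newer, for window functions) accessed through Python's sqlite3 module stands in for MySQL, dates are stored as ISO-8601 text so that string comparison matches date order, and the ORDER BY clauses are added only to make the printed output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_changes ("
    "item_id TEXT, change_type TEXT, change_date TEXT, change_id INTEGER)"
)
conn.executemany(
    "INSERT INTO item_changes VALUES (?, ?, ?, ?)",
    [
        ("123", "off", "2019-06-04", 321),
        ("123", "on", "2019-07-11", 741),
        ("123", "off", "2019-07-13", 987),
        ("123", "on", "2019-08-01", 951),
        ("123", "off", "2019-08-07", 357),
        ("456", "off", "2019-08-01", 125),
        ("456", "on", "2019-11-18", 878),
        ("789", "on", "2019-12-18", 373),
        ("012", "off", "2019-12-25", 654),
        ("698", "off", "2019-08-01", 741),
        ("698", "on", "2018-01-03", 147),
    ],
)

# Step 1: the running count of 'on' rows is the group number (item 123 only).
groups = conn.execute(
    """
    SELECT change_type, change_date,
           SUM(CASE WHEN change_type = 'on' THEN 1 ELSE 0 END)
               OVER (PARTITION BY item_id ORDER BY change_date) AS grp
    FROM item_changes
    WHERE item_id = '123'
    ORDER BY change_date
    """
).fetchall()
for row in groups:
    print(row)  # grp runs 0, 1, 1, 2, 2: each 'on' starts a new group

# Step 2: conditional aggregation collapses each group into one row.
pairs = conn.execute(
    """
    SELECT item_id,
           MAX(CASE WHEN change_type = 'on'  THEN change_date END) AS on_date,
           MAX(CASE WHEN change_type = 'on'  THEN change_id   END) AS on_id,
           MAX(CASE WHEN change_type = 'off' THEN change_date END) AS off_date,
           MAX(CASE WHEN change_type = 'off' THEN change_id   END) AS off_id
    FROM (SELECT c.*,
                 SUM(CASE WHEN change_type = 'on' THEN 1 ELSE 0 END)
                     OVER (PARTITION BY item_id ORDER BY change_date) AS grp
          FROM item_changes AS c) AS g
    GROUP BY item_id, grp
    ORDER BY item_id, grp
    """
).fetchall()
for row in pairs:
    print(row)
```

An "off" row that precedes any "on" row lands in group 0 by itself, and an "on" row with no later "off" ends up alone in its group, which is how the unpaired rows survive with NULLs on the missing side.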
Edit:

In earlier versions of MySQL (before 8.0, which do not support window functions), you can express this as:

    select item_id,
           max(case when change_type = 'on' then change_date end) as on_date,
           max(case when change_type = 'on' then change_id end) as on_change_id,
           max(case when change_type = 'off' then change_date end) as off_date,
           max(case when change_type = 'off' then change_id end) as off_change_id
    from (select t.*,
                 -- count of 'on' rows for the same item up to and including this row's date
                 (select count(*)
                  from item_changes t2
                  where t2.item_id = t.item_id and
                        t2.change_date <= t.change_date and
                        t2.change_type = 'on'
                 ) as grp
          from item_changes t
         ) t
    group by item_id, grp;

This does not perform as well as the window-function version, but an index on (item_id, change_type, change_date) will help.
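
The correlated-subquery form can be tried out the same way. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than MySQL, the index name idx_item_type_date is hypothetical, and only the sample rows for items 456 and 698 are loaded (one item with unpaired rows, one whose rows were inserted out of date order).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_changes ("
    "item_id TEXT, change_type TEXT, change_date TEXT, change_id INTEGER)"
)
conn.executemany(
    "INSERT INTO item_changes VALUES (?, ?, ?, ?)",
    [
        ("456", "off", "2019-08-01", 125),
        ("456", "on", "2019-11-18", 878),
        ("698", "off", "2019-08-01", 741),
        ("698", "on", "2018-01-03", 147),
    ],
)

# Index matching the suggestion above; the name is made up for this example.
conn.execute(
    "CREATE INDEX idx_item_type_date "
    "ON item_changes (item_id, change_type, change_date)"
)

# grp = number of 'on' rows for the same item dated on or before this row.
# No window functions are needed, so this also runs on older engines.
pairs = conn.execute(
    """
    SELECT item_id,
           MAX(CASE WHEN change_type = 'on'  THEN change_date END) AS on_date,
           MAX(CASE WHEN change_type = 'on'  THEN change_id   END) AS on_id,
           MAX(CASE WHEN change_type = 'off' THEN change_date END) AS off_date,
           MAX(CASE WHEN change_type = 'off' THEN change_id   END) AS off_id
    FROM (SELECT c.*,
                 (SELECT COUNT(*)
                  FROM item_changes AS c2
                  WHERE c2.item_id = c.item_id
                    AND c2.change_type = 'on'
                    AND c2.change_date <= c.change_date) AS grp
          FROM item_changes AS c) AS g
    GROUP BY item_id, grp
    ORDER BY item_id, grp
    """
).fetchall()

for row in pairs:
    print(row)
```

The three equality and range predicates in the subquery line up with the index columns in order, which is why that particular column order is the one suggested.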