Optimizing group by query with union
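I have a MySQL table that looks like this:
I want to find a query that groups my table like this:
Details:
a_id = a delimited area on the map
is_flag = 1 if the sensor is inside the area, 0 if the sensor is not inside the area
Basically, the first table records which area my sensor is in at each timestamp.
The second table tells me how long my sensor stayed in each area.
I run the query below once for each areas_id and combine the results with UNION ALL, so that a single table shows the time periods over which my sensor moved between areas and how long it stayed in or out of each area.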
select t.a_id,
       min(t.timestamp) starttime,
       max(t.timestamp) endtime,
       t.is_flag
from (SELECT *,
             ROW_NUMBER() OVER(ORDER BY a.timestamp)
             - ROW_NUMBER() OVER(PARTITION BY a.is_flag ORDER BY a.timestamp) as GRP
      FROM tablename a
      where areas_id=25) t
group by is_flag, GRP, a_id
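Here is my dbfiddle: https://www.db-fiddle.com/f/5pHiYKyx4yHoirRbGX4kP4/0
My query does what I need, but it takes a whole day to run.
More information (such as sample data and why the query falls short) would help, but it looks like you can group as follows.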
select a_id, is_flag, min(timestamp) as starttime, max(timestamp) as endtime
from tablename
group by a_id, is_flag
WITH
cte1 AS (SELECT CAST(JSON_UNQUOTE(`timestamp`) AS DATETIME) ts,
                areas_id,
                is_in_or_out
         FROM inouts),
cte2 AS (SELECT ts,
                areas_id,
                is_in_or_out,
                -- the difference of the two row numbers is constant within a run of equal flags
                CAST(ROW_NUMBER() OVER (PARTITION BY areas_id ORDER BY ts ASC) AS SIGNED)
                - CAST(ROW_NUMBER() OVER (PARTITION BY areas_id ORDER BY is_in_or_out, ts ASC) AS SIGNED) AS grp
         FROM cte1)
SELECT areas_id,
       is_in_or_out,
       MIN(ts) min_ts,
       MAX(ts) max_ts
FROM cte2
-- is_in_or_out is part of the key because a run of 0s and a run of 1s can share the same grp value
GROUP BY areas_id,
         is_in_or_out,
         grp
ORDER BY areas_id, min_ts;
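PS1. The source data was slightly altered.
PS2. The CAST is needed in MySQL because ROW_NUMBER() produces an unsigned BIGINT, and subtracting two unsigned values raises an out-of-range error when the result would be negative. The CAST can be replaced with 0.0 + ... .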
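To see why the row-number difference identifies each stay, here is a minimal Python sketch of the same idea. It is an illustration under stated assumptions, not the query itself: the readings are made up, the function name `islands` is hypothetical, and the per-flag counter plays the role of ROW_NUMBER() partitioned by the flag, as in the question's query.

```python
# Hypothetical readings for a single area: (timestamp, flag), sorted by timestamp.
rows = [(1, 1), (2, 1), (3, 0), (4, 0), (5, 0), (6, 1), (7, 1)]

def islands(rows):
    """Return (starttime, endtime, flag) for each run of consecutive equal flags."""
    per_flag_rn = {}   # running row number within each flag value
    groups = {}        # (flag, grp) -> timestamps, like GROUP BY is_flag, grp
    for overall_rn, (ts, flag) in enumerate(rows, start=1):
        per_flag_rn[flag] = per_flag_rn.get(flag, 0) + 1
        # Both counters advance together inside a run, so grp stays constant there
        # and grows once rows with the other flag have been seen in between.
        grp = overall_rn - per_flag_rn[flag]
        groups.setdefault((flag, grp), []).append(ts)
    return sorted((min(ts), max(ts), flag) for (flag, _), ts in groups.items())

print(islands(rows))  # [(1, 2, 1), (3, 5, 0), (6, 7, 1)]
```

The flag is kept in the grouping key alongside grp, since grp on its own only distinguishes runs of the same flag.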
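This is SQL Server syntax, but it should be much the same in the major DBMSs (in MySQL, replace the two-argument ISNULL with IFNULL or COALESCE).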
with
x as (
    -- find start/end of each period
    select areas_id, is_in_or_out is_flag, timestamp t1
         , ISNULL(ABS(is_in_or_out - LAG(is_in_or_out, 1) over (partition by areas_id order by timestamp)), 1) T_START
         , ISNULL(ABS(is_in_or_out - LEAD(is_in_or_out, 1) over (partition by areas_id order by timestamp)), 1) T_END
    from inouts
),
y as (
    select *, LEAD(t1, 1) over (partition by areas_id order by t1) t2
    from x
    WHERE T_START<>0 OR T_END<>0
)
select areas_id, is_flag, t1 starttime, t2 endtime
from y
WHERE T_START<>0
order by areas_id, t1
That should do it.
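The change-point idea behind the LAG comparison can be sketched in a few lines of Python. This is only an approximation of the query above, with made-up data and a hypothetical function name: a new period starts whenever the flag differs from the previous row, and, as a simplifying assumption, each period ends at its own last timestamp.

```python
def runs_by_change_points(rows):
    """rows: (timestamp, flag) pairs for one area, sorted by timestamp."""
    runs = []
    prev_flag = None
    for ts, flag in rows:
        if not runs or flag != prev_flag:   # first row or flag changed: a period starts
            runs.append([ts, ts, flag])
        else:                               # same flag as before: extend the current period
            runs[-1][1] = ts
        prev_flag = flag
    return [tuple(r) for r in runs]

print(runs_by_change_points([(1, 1), (2, 1), (3, 0), (4, 1), (5, 1)]))
# [(1, 2, 1), (3, 3, 0), (4, 5, 1)]
```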
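What am I missing here? Is it possible you are overthinking this? The SQL below gives the same result set as your sample db-fiddle (I tested it on a copy), and it is much simpler and runs faster. It returns one row per areas_id/is_in_or_out combination (per the GROUP BY). I don't quite see why you need UNION and ROW_NUMBER() OVER to complicate the query. Hope this helps. Try it yourself and let me know if you have any questions!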
SELECT areas_id,
       starttime,
       endtime,
       is_in_or_out
FROM (SELECT areas_id,
             MIN(timestamp) starttime,
             MAX(timestamp) endtime,
             is_in_or_out
      FROM inouts
      GROUP BY is_in_or_out,
               areas_id) x
ORDER BY starttime;
P.S. I think MBeale's solution is actually correct too (although it is missing an ORDER BY).
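Whether a plain GROUP BY matches the per-stay output depends on the data. The Python sketch below mimics one row per flag with MIN/MAX on made-up readings (the data and the function name `plain_group_by` are assumptions for illustration): when the sensor leaves an area and comes back, the two visits collapse into a single span.

```python
def plain_group_by(rows):
    """Mimic GROUP BY flag with MIN(timestamp) and MAX(timestamp) for one area."""
    spans = {}
    for ts, flag in rows:
        lo, hi = spans.get(flag, (ts, ts))
        spans[flag] = (min(lo, ts), max(hi, ts))
    return sorted((lo, hi, flag) for flag, (lo, hi) in spans.items())

# Hypothetical readings: inside at 1-2, outside at 3-4, inside again at 5-6.
rows = [(1, 1), (2, 1), (3, 0), (4, 0), (5, 1), (6, 1)]
print(plain_group_by(rows))  # [(1, 6, 1), (3, 4, 0)]
```

Here the two inside visits (1 to 2 and 5 to 6) are reported as one span from 1 to 6, which overlaps the outside span. If each sensor visits each area at most once, the two approaches agree.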