How to select the first occurrence of each column value before it changes in MySQL 8

I have a MySQL 8 table with the columns Event_TimeStamp and FinalState, and it looks like this:

+---------------------------+---------------+
|"Event_TimeStamp"          |"FinalState"   |
+---------------------------+---------------+
|"2020-03-09 04:57:45.729"  |"Available"    |
|"2020-03-09 05:14:59.659"  |"Available"    |
|"2020-03-09 05:27:56.341"  |"Available"    |
|"2020-03-09 05:41:01.554"  |"Available"    |
|"2020-03-09 05:58:07.803"  |"Available"    |
|"2020-03-09 06:06:09.745"  |"Available"    |
|"2020-03-09 06:18:07.663"  |"Available"    |
|"2020-03-09 06:26:24.273"  |"Available"    |
|"2020-03-09 09:29:53.165"  |"Offline"      |
|"2020-03-09 10:28:00.514"  |"Available"    |
|"2020-03-09 12:47:54.130"  |"Available"    |
|"2020-03-09 13:01:30.117"  |"Available"    |
|"2020-03-09 13:01:59.774"  |"Offline"      |
|"2020-03-09 13:19:15.772"  |"Available"    |
|"2020-03-09 14:19:51.521"  |"Available"    |
|"2020-03-09 14:50:16.872"  |"Offline"      |
+---------------------------+---------------+

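I need to extract rows from the above so that I get only the first row of each consecutive run of "Available" and "Offline". The output should look like this: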

+---------------------------+---------------+
|"Event_TimeStamp"          |"FinalState"   |
+---------------------------+---------------+
|"2020-03-09 04:57:45.729"  |"Available"    |
|"2020-03-09 09:29:53.165"  |"Offline"      |
|"2020-03-09 10:28:00.514"  |"Available"    |
|"2020-03-09 13:01:59.774"  |"Offline"      |
|"2020-03-09 13:19:15.772"  |"Available"    |
|"2020-03-09 14:50:16.872"  |"Offline"      |
+---------------------------+---------------+

I tried several approaches with GROUP BY, but I only get the first entry for each FinalState overall, not the first entry of each later run.

Is there a way to do this with a query, or should I write it in PHP?

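You can use joins. I use events as an example table name. Note that the query relies on an integer id column whose values are consecutive and follow the Event_TimeStamp order.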

SELECT x.* 
  FROM events x
  JOIN
     (
       -- ids of rows that start a run: no row with the same FinalState
       -- exists at the immediately preceding id
       SELECT a.id 
         FROM events a
         LEFT 
         JOIN events b 
           ON b.FinalState = a.FinalState 
          AND b.id = a.id - 1 
        WHERE b.id IS NULL 
     ) y
    ON y.id = x.id
 ORDER
    BY x.id;
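
The table shown in the question has no id column, so the join approach needs some way to number the rows. Below is a minimal sketch, assuming no such column exists, that derives a gap-free row number with ROW_NUMBER() and then applies the same "no matching predecessor" anti-join. The CTE names events_raw and numbered and the alias rn are made up for illustration, and the four sample rows are a subset of the question's data.

```sql
-- Hypothetical inline sample; in practice this would be the real table.
WITH events_raw (Event_TimeStamp, FinalState) AS (
    SELECT '2020-03-09 06:26:24.273', 'Available' UNION ALL
    SELECT '2020-03-09 09:29:53.165', 'Offline'   UNION ALL
    SELECT '2020-03-09 10:28:00.514', 'Available' UNION ALL
    SELECT '2020-03-09 12:47:54.130', 'Available'
),
numbered AS (
    -- rn is a consecutive position in timestamp order (assumed name)
    SELECT r.*, ROW_NUMBER() OVER (ORDER BY Event_TimeStamp) AS rn
    FROM events_raw r
)
SELECT a.Event_TimeStamp, a.FinalState
FROM numbered a
LEFT JOIN numbered b
       ON b.rn = a.rn - 1                 -- the immediately preceding row
      AND b.FinalState = a.FinalState     -- with the same state
WHERE b.rn IS NULL                        -- keep rows that start a new run
ORDER BY a.Event_TimeStamp;
```

With this sample, the 12:47:54.130 row is filtered out because the row just before it is also 'Available'; the other three rows each start a run and are kept.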

You can use the LAG() window function to check whether the state has changed from the previous row:

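WITH cte AS (
    -- Sample rows; DATETIME(3) keeps the milliseconds that a plain DATETIME cast would round away
    SELECT CAST('2020-03-09 04:57:45.729' AS DATETIME(3)) AS Event_timestamp, 'Available' AS Finalstate UNION ALL
    SELECT '2020-03-09 05:14:59.659', 'Available' UNION ALL
    SELECT '2020-03-09 09:29:53.165', 'Offline' UNION ALL
    SELECT '2020-03-09 10:28:00.514', 'Available'
)
SELECT x.Event_timestamp, x.Finalstate
FROM (
    -- lag_status holds the Finalstate of the previous row in timestamp order
    SELECT cte.*, LAG(Finalstate) OVER (ORDER BY Event_timestamp) AS lag_status
    FROM cte
) x
-- The first row has no predecessor (lag is NULL), so substitute a dummy value 'Z' to keep it
WHERE x.Finalstate <> COALESCE(x.lag_status, 'Z');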

Here is a db-fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=cb59ec79e7be1363961971cea4308dff

Hope this helps.
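
As a possible variation on the query above, LAG() in MySQL 8 also accepts an offset and a default value as second and third arguments, which can stand in for the COALESCE() wrapper. The sketch below reuses the same four sample rows and the same dummy value 'Z'; it is an illustration under those assumptions, not a replacement for the answer's query.

```sql
WITH cte (Event_timestamp, Finalstate) AS (
    SELECT '2020-03-09 04:57:45.729', 'Available' UNION ALL
    SELECT '2020-03-09 05:14:59.659', 'Available' UNION ALL
    SELECT '2020-03-09 09:29:53.165', 'Offline'   UNION ALL
    SELECT '2020-03-09 10:28:00.514', 'Available'
)
SELECT x.Event_timestamp, x.Finalstate
FROM (
    SELECT cte.*,
           -- offset 1 = previous row; 'Z' is returned when no previous row exists
           LAG(Finalstate, 1, 'Z') OVER (ORDER BY Event_timestamp) AS lag_status
    FROM cte
) x
WHERE x.Finalstate <> x.lag_status
ORDER BY x.Event_timestamp;
```

The default only applies when there is no row at the given offset. If a previous row exists but its Finalstate is itself NULL, LAG() still returns NULL, so COALESCE() remains the safer choice when the column is nullable.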

You can use lag() and lead() to show the records whose final_state differs from that of the previous or the next row:

select
    event_timestamp,
    final_state
from (
    select 
        t.*, 
        lag(final_state) over(order by event_timestamp)  lag_final_state,
        lead(final_state) over(order by event_timestamp) lead_final_state
    from mytable t
) t
where final_state <> lag_final_state or final_state <> lead_final_state

This query solved my problem. Thanks @GMB

SELECT
    Event_TimeStamp,
    FinalState
FROM (
    SELECT
        t.*,
        COALESCE(LAG(FinalState) over(ORDER BY Event_TimeStamp), 'offline')  lag_final_state,
        COALESCE(lead(FinalState) over(ORDER BY Event_TimeStamp), 'offline') lead_final_state
    FROM (
        SELECT
            Event_TimeStamp, 
            FinalState
        FROM AgentTraceData
        WHERE Event_TimeStamp BETWEEN '2019-11-17' AND '2020-03-10 23:59:59.999' AND username = 'xxxx' ORDER BY Event_TimeStamp
        ) t
    ) t
WHERE FinalState <> lag_final_state

The result is as follows:

+---------------------------+-------------+
|"Event_TimeStamp"          | "FinalState"|
+---------------------------+-------------+
|"2019-11-18 02:01:16.395"  |"online"     |
|"2019-11-18 04:34:59.739"  |"offline"    |
|"2019-11-18 04:45:08.354"  |"online"     |
|"2019-11-18 07:30:13.909"  |"offline"    |
|"2019-11-18 08:00:20.647"  |"online"     |
|"2019-11-18 10:30:08.698"  |"offline"    |
+---------------------------+-------------+

I'll explain as much as I can.

First, run the query with SELECT * in the outer select to see all the columns it produces.

The output looks like this:

+---------------------------+---------------+-------------------+------------------+
|"Event_TimeStamp"          |"FinalState"   |"lag_final_state"  |"lead_final_state"|
+---------------------------+---------------+-------------------+------------------+
|"2019-11-18 02:01:16.395"  |"online"       |"offline"          |"online"          |
|"2019-11-18 04:34:59.739"  |"offline"      |"online"           |"online"          |
|"2019-11-18 04:45:08.354"  |"online"       |"offline"          |"online"          |
+---------------------------+---------------+-------------------+------------------+

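For my requirement I only need to know where the next FinalState value begins, that is, the rows whose FinalState differs from the previous row, so the condition final_state <> lead_final_state is not needed.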

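I added COALESCE() because LAG() and LEAD() return NULL for the first and last rows of the inner SELECT result, where no previous or next row exists. A comparison such as FinalState <> NULL evaluates to NULL rather than true, so without COALESCE() those rows could never pass the WHERE filter.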
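
To make the NULL handling concrete, here is a small sketch that puts three ways of comparing with the previous row side by side. The sample rows, the CTE names t and w, and the three alias names are made up for illustration; the first row is deliberately 'offline' to show one consequence of using 'offline' as the COALESCE() fallback.

```sql
-- Made-up sample rows; the first one has no predecessor.
WITH t (Event_TimeStamp, FinalState) AS (
    SELECT '2019-11-18 01:00:00.000', 'offline' UNION ALL
    SELECT '2019-11-18 02:01:16.395', 'online'  UNION ALL
    SELECT '2019-11-18 03:00:00.000', 'online'  UNION ALL
    SELECT '2019-11-18 04:34:59.739', 'offline'
),
w AS (
    SELECT t.*, LAG(FinalState) OVER (ORDER BY Event_TimeStamp) AS lag_final_state
    FROM t
)
SELECT
    Event_TimeStamp,
    FinalState,
    -- NULL on the first row, so a WHERE on this would drop it
    FinalState <> lag_final_state                      AS plain_compare,
    -- 0 on the first row here, because the fallback equals its own state
    FinalState <> COALESCE(lag_final_state, 'offline') AS coalesce_compare,
    -- 1 on the first row: <=> is MySQL's NULL-safe equality operator
    NOT (FinalState <=> lag_final_state)               AS null_safe_compare
FROM w
ORDER BY Event_TimeStamp;
```

With the 'offline' fallback, a first row that is itself 'offline' is treated as "no change" and filtered out, which may or may not be what is wanted. If the first row should always be reported, a filter of the form WHERE NOT (FinalState <=> lag_final_state) is one option to consider.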