Remove duplicates from a large MySQL table based on a time difference of x seconds

I have looked at other questions on similar topics, but none of them solves the problem I have with the following table.

action_table(actionid,cookieid,intime,page)

I have data like the following:

235470, 994341855.1473047915, 2016-09-05 07:01:57, index.aspx
235471, 994341855.1473047915, 2016-09-05 07:02:00, index.aspx
235472, 994341855.1473047915, 2016-09-05 07:02:02, index.aspx
235473, 994341855.1473047915, 2016-09-05 07:02:12, home.aspx
235474, 994341855.1473047915, 2016-09-05 07:04:12, index.aspx

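A user can refresh the page any number of times, which produces repeated rows like those above, where only the auto-increment column (actionid) and intime differ. I want to fetch only the rows below: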

235470, 994341855.1473047915, 2016-09-05 07:01:57, index.aspx
235473, 994341855.1473047915, 2016-09-05 07:02:12, home.aspx
235474, 994341855.1473047915, 2016-09-05 07:04:12, index.aspx

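That is, consecutive entries with the same cookie id and the same page should be collapsed into one, but if any other page was visited between two visits to the same page, the later visit should count as a new entry.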

How can I write a select query that does this? Is there some kind of GROUP BY I can use? Please help.

Here is a solution using the analytic LEAD function in Oracle:

WITH input_data AS (
  -- Sample rows from the question; HH24 parses the hour on a 24-hour clock
  SELECT 235470 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:01:57', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235471 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:02:00', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235472 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:02:02', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235473 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:02:12', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'home.aspx' AS page  FROM DUAL
  UNION ALL
  SELECT 235474 AS actionid, 994341855.1473047915 AS cookieid, TO_DATE('2016-09-05 07:04:12', 'yyyy-mm-dd HH24:MI:SS') AS intime, 'index.aspx' AS page  FROM DUAL
)
SELECT MIN(actionid) AS action_id, cookieid, MIN(intime) AS intime, page
FROM (
  -- next_page is the page of the following row in time order
  SELECT input_data.*, LEAD(page, 1) OVER (ORDER BY intime) AS next_page
  FROM input_data
)
-- keep a row only when the following row has a different page (or there is no following row)
WHERE page <> NVL(next_page, 'NULL')
GROUP BY cookieid, page, next_page
ORDER BY MIN(actionid)
;

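Output (this query keeps the last row of each run of identical pages, so the first row is 235472 rather than 235470):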

ACTION_ID  COOKIEID     INTIME            PAGE
235472     994341855.1  05/09/2016 07:02  index.aspx
235473     994341855.1  05/09/2016 07:02  home.aspx
235474     994341855.1  05/09/2016 07:04  index.aspx
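
The query above keeps the last row of each run, whereas the question asks for the first. A minimal sketch of the same idea that looks backwards with LAG instead is shown here. It assumes MySQL 8.0 or later (earlier versions have no window functions) and reads directly from the asker's action_table; the PARTITION BY cookieid clause and the actionid tie-breaker are additions of this sketch, not part of the original answer.

```sql
-- Keep a row only when it is the first row for its cookie
-- or when its page differs from the previous row's page.
SELECT actionid, cookieid, intime, page
FROM (
    SELECT t.*,
           -- page of the previous row for the same cookie, in time order;
           -- actionid breaks ties between rows with the same intime
           LAG(page) OVER (PARTITION BY cookieid
                           ORDER BY intime, actionid) AS prev_page
    FROM action_table t
) x
WHERE prev_page IS NULL      -- first row seen for this cookie
   OR prev_page <> page      -- page changed since the previous row
ORDER BY actionid;
```

On the five sample rows this should return actionid 235470, 235473 and 235474, which matches the result requested in the question.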

Schema

create table action_table
(   actionid int not null,
    cookieid decimal(20,10) not null,
    intime datetime not null,
    page varchar(100) not null
)charset=utf8 engine=InnoDB;

insert action_table values
(235470 ,994341855.1473047915, '2016-09-05 07:01:57', 'index.aspx'),
(235471, 994341855.1473047915, '2016-09-05 07:02:00', 'index.aspx'),
(235472, 994341855.1473047915, '2016-09-05 07:02:02', 'index.aspx'),
(235473, 994341855.1473047915, '2016-09-05 07:02:12', 'home.aspx'),
(235474, 994341855.1473047915, '2016-09-05 07:04:12', 'index.aspx');

Query

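select actionid,cookieid,intime,page
from
(   select actionid,cookieid,intime,page,
    -- thePage is 2 when this row repeats the previous row's page, otherwise 1
    @num := if(@page = page, 2, 1) as thePage,
    -- remember this row's page for the comparison on the next row
    @page := `page` as dummy
    from action_table
    -- initialize the user variables once
    cross join (select @page:='',@num:=0) xParams
    order by actionid,cookieid,intime,page
) as x
-- keep only the first row of each run of identical pages
where x.thePage=1
order by actionid,cookieid,intime,page;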
+----------+----------------------+---------------------+------------+
| actionid | cookieid             | intime              | page       |
+----------+----------------------+---------------------+------------+
|   235470 | 994341855.1473047915 | 2016-09-05 07:01:57 | index.aspx |
|   235473 | 994341855.1473047915 | 2016-09-05 07:02:12 | home.aspx  |
|   235474 | 994341855.1473047915 | 2016-09-05 07:04:12 | index.aspx |
+----------+----------------------+---------------------+------------+

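This uses MySQL user variables and a derived table x. A row is selected for the final output only when @num, exposed as the column thePage, is 1, meaning its page differs from the page of the previous row.

The cross join is there only to initialize the variables at the start.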
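
The user-variable approach has some caveats worth knowing: MySQL does not guarantee the order in which expressions that read and assign user variables are evaluated, assigning user variables inside expressions is deprecated as of MySQL 8.0.13, and the query as written does not reset the comparison when cookieid changes. For servers without window functions, a variable-free alternative can be sketched with a self-join. This is only a sketch under two assumptions that are not stated in the question: that actionid increases with intime, and that an index such as the hypothetical idx_cookie_action exists so the correlated subquery stays affordable on a large table.

```sql
-- Hypothetical supporting index (the name is an assumption of this sketch):
-- CREATE INDEX idx_cookie_action ON action_table (cookieid, actionid);

SELECT a.actionid, a.cookieid, a.intime, a.page
FROM action_table AS a
LEFT JOIN action_table AS p
       -- p is the immediately preceding row for the same cookie,
       -- assuming actionid order matches chronological order
       ON p.actionid = (
            SELECT MAX(q.actionid)
            FROM action_table AS q
            WHERE q.cookieid = a.cookieid
              AND q.actionid < a.actionid
          )
WHERE p.actionid IS NULL     -- no earlier row for this cookie
   OR p.page <> a.page       -- page changed relative to the previous row
ORDER BY a.actionid;
```

On the sample data this should keep actionid 235470, 235473 and 235474. Because each row is compared only with its immediate predecessor for the same cookie, a return to index.aspx after visiting home.aspx is still treated as a new entry.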