MySQL: delete duplicate comments?

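I want to clean up the duplicates in my comment table (1 million rows), where a user posted the same comment twice (or more). However, I want to keep one instance of every duplicated comment.

Here is the query I came up with to find and group these comments: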

SELECT author, body, COUNT(*) as count
FROM  db.comment
GROUP BY body
HAVING COUNT(*) > 1;

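But I don't know how to delete the duplicate rows while leaving exactly one of each untouched. I have seen similar questions, but none of them worked for me, so I would appreciate your hints.

Update: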

mysql> describe comment;
+---------+-------------+------+-----+---------+----------------+
| Field   | Type        | Null | Key | Default | Extra          |
+---------+-------------+------+-----+---------+----------------+
| id      | int(11)     | NO   | PRI | NULL    | auto_increment |
| created | datetime    | NO   |     | NULL    |                |
| author  | varchar(60) | NO   |     | NULL    |                |
| body    | longtext    | NO   |     | NULL    |                |
| post_id | int(11)     | NO   | MUL | NULL    |                |
+---------+-------------+------+-----+---------+----------------+

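Unlike most other DBMSs, MySQL can select all fields from a table while grouping by only one of them, provided the ONLY_FULL_GROUP_BY SQL mode is disabled (it is enabled by default since MySQL 5.7.5). In that case only one record per group is selected. In practice this is often the first one, but MySQL is free to pick any row of the group, so do not rely on which one you get.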
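
As a small illustration of the grouping behaviour described above, the sketch below checks the active SQL mode and then picks one id per group explicitly with MIN(id) instead of relying on whichever row MySQL happens to return. The aliases keep_id and copies are names made up for this example; the table and columns are the ones from the question.

```sql
-- Show the active SQL modes; look for ONLY_FULL_GROUP_BY in the result.
SELECT @@sql_mode;

-- One row per (author, body) group. MIN(id) makes the surviving row
-- explicit, so the query also runs with ONLY_FULL_GROUP_BY enabled.
SELECT MIN(id) AS keep_id, author, body, COUNT(*) AS copies
FROM db.comment
GROUP BY author, body;
```

Groups with copies = 1 are comments that have no duplicate; their only id is the one that is kept.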

Do the job in two steps:

Save the IDs you want to keep in a temporary table:

INSERT INTO temp_comment(id)
SELECT MIN(id)  -- keep the lowest id of each (author, body) group
FROM db.comment
GROUP BY author, body;

Delete all rows except the saved ones:

DELETE FROM db.comment WHERE id NOT IN (SELECT id FROM temp_comment);

Of course, the temp_comment table has to exist before you run this.

Is this what you want?
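
The answer does not say how temp_comment is defined, so the following is only a minimal sketch of one possible definition, assuming a single id column is all the two steps need. The rows_to_delete alias is likewise a name invented for this example.

```sql
-- Hypothetical definition: run this before the INSERT step.
-- A TEMPORARY table is visible only to the current session and is
-- dropped automatically when the session ends.
CREATE TEMPORARY TABLE temp_comment (
  id INT NOT NULL PRIMARY KEY
);

-- Optional check between the INSERT and the DELETE step:
-- how many rows the DELETE is going to remove.
SELECT (SELECT COUNT(*) FROM db.comment)
     - (SELECT COUNT(*) FROM temp_comment) AS rows_to_delete;
```

Because the table is session-scoped, all steps have to run over the same connection.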

SELECT * FROM comments
WHERE id NOT IN (
  -- the one id to keep for every duplicated (author, body) pair
  SELECT MIN(id)
  FROM  comments
  GROUP BY author, body
  HAVING COUNT(*) > 1
 )
AND (author, body) IN (
  -- the duplicated (author, body) pairs, matched as pairs rather than
  -- as two independent lists
  SELECT author, body
  FROM  comments
  GROUP BY author, body
  HAVING COUNT(*) > 1
  );

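To delete the duplicate rows, change SELECT * to DELETE. Be aware that MySQL rejects a DELETE whose subquery reads directly from the table being deleted from (error 1093), so each subquery has to be wrapped in a derived table, or the statement rewritten as a join.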


Update

To improve the query performance, you can try this:

SELECT c.*
FROM comments c
INNER JOIN
(
  -- one row per duplicated (author, body) pair; keep_id is the row to keep
  SELECT MIN(id) AS keep_id, author, body
  FROM  comments
  GROUP BY author, body
  HAVING COUNT(*) > 1
 ) AS t
ON c.id <> t.keep_id AND c.author = t.author AND c.body = t.body;
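
The join-based lookup above can also be turned into an actual delete. The following is a hedged sketch rather than the answerer's own statement: it uses MySQL's multi-table DELETE syntax, the keep_id and copies aliases are invented for illustration, and the table name comments follows this answer (the question's table is db.comment). Since the grouped subquery is a derived table, it does not run into the restriction on reading the target table in a plain subquery.

```sql
-- Remove every duplicated row except the one with the lowest id.
DELETE c
FROM comments AS c
INNER JOIN (
  SELECT MIN(id) AS keep_id, author, body
  FROM comments
  GROUP BY author, body
  HAVING COUNT(*) > 1
) AS t
  ON c.author = t.author
 AND c.body = t.body
WHERE c.id <> t.keep_id;

-- Verification: this should now return no rows.
SELECT author, body, COUNT(*) AS copies
FROM comments
GROUP BY author, body
HAVING COUNT(*) > 1;
```

On a table with a million rows it is safer to try this on a copy of the table first, or to run the SELECT version and inspect the result before deleting.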