SQL :获取重复行以及其他变量
SQL : Getting duplicate rows along with other variables
我正在研究 Terradata SQL。我想获得重复字段及其计数和其他变量。我只能找到获得计数的方法,但也不能完全找到变量。
可用输入
+---------+----------+----------------------+
| id | name | Date |
+---------+----------+----------------------+
| 1 | abc | 21.03.2015 |
| 1 | def | 22.04.2015 |
| 2 | ajk | 22.03.2015 |
| 3 | ghi | 23.03.2015 |
| 3 | ghi | 23.03.2015 |
预期输出:
+---------+----------+----------------------+
| id | name | count | // Other fields
+---------+----------+----------------------+
| 1 | abc | 2 |
| 1 | def | 2 |
| 2 | ajk | 1 |
| 3 | ghi | 2 |
| 3 | ghi | 2 |
我在找什么:
我正在寻找所有重复的行,其中重复由 ID 决定,并检索重复的行。
我现在只有:
SELECT
id, name, other-variables, COUNT(*)
FROM
Table_NAME
GROUP BY
id, name
HAVING
COUNT(*) > 1
这显示的数据不正确。谢谢你。
您可以使用 window aggregate function,像这样:
SELECT *
FROM (
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
) AS sub
WHERE duplicates > 1
使用 teradata 对 ISO SQL 语法的扩展,您可以将上面的内容简化为:
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
QUALIFY duplicates > 1
试试这个,
SELECT
id, COUNT(id)
FROM
Table_NAME
GROUP BY
id
HAVING
COUNT(id) > 1
作为已接受且完全正确的答案的替代方案,您可以使用:
SELECT {all your required 'variables' (they are not variables, but attributes)}
, cnt.Count_Dups
FROM Table_NAME TN
INNER JOIN (
SELECT id
, COUNT(1) Count_Dups
GROUP BY id
HAVING COUNT(1) > 1 -- If you want only duplicates
) cnt
ON cnt.id = TN.id
编辑:根据您的编辑,重复项仅在 id
上。相应地编辑了我的查询。
我正在研究 Terradata SQL。我想获得重复字段及其计数和其他变量。我只能找到获得计数的方法,但也不能完全找到变量。
可用输入
+---------+----------+----------------------+
| id | name | Date |
+---------+----------+----------------------+
| 1 | abc | 21.03.2015 |
| 1 | def | 22.04.2015 |
| 2 | ajk | 22.03.2015 |
| 3 | ghi | 23.03.2015 |
| 3 | ghi | 23.03.2015 |
预期输出:
+---------+----------+----------------------+
| id | name | count | // Other fields
+---------+----------+----------------------+
| 1 | abc | 2 |
| 1 | def | 2 |
| 2 | ajk | 1 |
| 3 | ghi | 2 |
| 3 | ghi | 2 |
我在找什么:
我正在寻找所有重复的行,其中重复由 ID 决定,并检索重复的行。
我现在只有:
SELECT
id, name, other-variables, COUNT(*)
FROM
Table_NAME
GROUP BY
id, name
HAVING
COUNT(*) > 1
这显示的数据不正确。谢谢你。
您可以使用 window aggregate function,像这样:
SELECT *
FROM (
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
) AS sub
WHERE duplicates > 1
使用 teradata 对 ISO SQL 语法的扩展,您可以将上面的内容简化为:
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
QUALIFY duplicates > 1
试试这个,
SELECT
id, COUNT(id)
FROM
Table_NAME
GROUP BY
id
HAVING
COUNT(id) > 1
作为已接受且完全正确的答案的替代方案,您可以使用:
SELECT {all your required 'variables' (they are not variables, but attributes)}
, cnt.Count_Dups
FROM Table_NAME TN
INNER JOIN (
SELECT id
, COUNT(1) Count_Dups
GROUP BY id
HAVING COUNT(1) > 1 -- If you want only duplicates
) cnt
ON cnt.id = TN.id
编辑:根据您的编辑,重复项仅在 id
上。相应地编辑了我的查询。