MySQL: combine duplicate rows from a LEFT JOIN into one row with all data
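I am trying to format a SQL query with multiple left joins into a readable CSV file. When I do the left joins, the query returns multiple rows for the same ID (as it should, since it joins multiple non-unique rows), but that is not acceptable for a readable CSV file.
What I want is to combine each set of duplicate rows into a single row, with multiple columns for the repeated data.
Example of the current join result: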
SELECT *
FROM people
LEFT JOIN attributes on ID
ID | Name | Attributes
1 | Ken | Tall
1 | Ken | Slender
1 | Ken | Blonde
2 | John | Short
Desired result (to be exported as CSV):
ID | Name | Attribute 1 | Attribute 2 | Attribute 3
1 | Ken | Tall | Slender | Blonde
2 | John | Short | |
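I have also tried grouping by ID, but when I do that it returns only one of the attributes for each ID, which is not acceptable either.
Maybe I'm not looking in the right place, but I can't seem to find any function that helps me do this.
Thanks in advance!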
You need conditional aggregation, for example:
SELECT ID, Name,
MAX( CASE WHEN rn = 1 THEN attributes END ) AS attribute1,
MAX( CASE WHEN rn = 2 THEN attributes END ) AS attribute2,
MAX( CASE WHEN rn = 3 THEN attributes END ) AS attribute3
FROM
(
SELECT p.ID, Name, attributes,
ROW_NUMBER() OVER (PARTITION BY p.ID ORDER BY a.ID) AS rn
FROM people AS p
LEFT JOIN attributes AS a
ON a.ID = p.attribute_ID
) AS pa
GROUP BY ID, Name
where attribute_ID is an assumed column of the people table. The query uses a window function, so it requires MySQL 8.0 or later.
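To try the conditional-aggregation idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module (it needs SQLite 3.25 or later for window functions). The schema is an assumption made purely for illustration: people(id, name) and attributes(id, person_id, attribute), loaded with the sample rows from the question. SQLite's dialect is close to MySQL 8.0 for this query, but it is not identical.

```python
import sqlite3

# Hypothetical schema for illustration only; the real tables may differ.
SCHEMA_AND_DATA = """
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attributes (
    id INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES people(id),
    attribute TEXT
);
INSERT INTO people VALUES (1, 'Ken'), (2, 'John');
INSERT INTO attributes VALUES
    (1, 1, 'Tall'), (2, 1, 'Slender'), (3, 1, 'Blonde'), (4, 2, 'Short');
"""

# Number each person's attributes (rn), then move attribute number N into
# column N with MAX(CASE ...). MAX ignores the NULLs from non-matching rows.
PIVOT_SQL = """
SELECT id, name,
       MAX(CASE WHEN rn = 1 THEN attribute END) AS attribute1,
       MAX(CASE WHEN rn = 2 THEN attribute END) AS attribute2,
       MAX(CASE WHEN rn = 3 THEN attribute END) AS attribute3
FROM (
    SELECT p.id, p.name, a.attribute,
           ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY a.id) AS rn
    FROM people AS p
    LEFT JOIN attributes AS a ON a.person_id = p.id
) AS pa
GROUP BY id, name
ORDER BY id
"""


def pivot_people():
    """Build the sample database in memory and return the pivoted rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA_AND_DATA)
        return conn.execute(PIVOT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pivot_people():
        print(row)
```

Ordering the window by a.id keeps the attribute columns in a stable order. A person with fewer than three attributes gets NULL in the unused columns.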
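Another option is a dynamic pivot, where you don't need to know in advance how many attributes each person has, nor write out every conditional aggregate by hand, for example: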
SET @sql = NULL;
SELECT GROUP_CONCAT(
DISTINCT
CONCAT(
'MAX(CASE WHEN rn =', rn,' THEN attributes END) AS attribute',rn
)
)
INTO @sql
FROM
(
SELECT DISTINCT ROW_NUMBER() OVER (PARTITION BY id) AS rn
FROM people
ORDER BY rn
) AS r;
SET @sql = CONCAT('SELECT ID, Name, ',@sql,
' FROM
(
SELECT p.ID, Name, attributes,
ROW_NUMBER() OVER (PARTITION BY p.ID ORDER BY a.ID) AS rn
FROM people AS p
LEFT JOIN attributes AS a
ON a.ID = p.attribute_ID
) AS pa
GROUP BY ID, Name');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
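The same dynamic-pivot idea can also be sketched on the client side: ask the database for the largest attribute count, then generate one MAX(CASE ...) column per slot. The example below is only a sketch under assumptions. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) instead of MySQL prepared statements, and the tables people(id, name) and attributes(id, person_id, attribute) and the helper names are hypothetical.

```python
import sqlite3


def demo_connection():
    """In-memory database with a hypothetical schema and the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE attributes (
            id INTEGER PRIMARY KEY, person_id INTEGER, attribute TEXT);
        INSERT INTO people VALUES (1, 'Ken'), (2, 'John');
        INSERT INTO attributes VALUES
            (1, 1, 'Tall'), (2, 1, 'Slender'), (3, 1, 'Blonde'), (4, 2, 'Short');
    """)
    return conn


def build_pivot_sql(max_attrs):
    """Return a pivot query with one MAX(CASE ...) column per attribute slot."""
    select_list = ["id", "name"] + [
        # int() guards the generated SQL: only a number is ever interpolated.
        f"MAX(CASE WHEN rn = {int(i)} THEN attribute END) AS attribute{int(i)}"
        for i in range(1, max_attrs + 1)
    ]
    return (
        "SELECT " + ", ".join(select_list) + " FROM ("
        " SELECT p.id, p.name, a.attribute,"
        " ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY a.id) AS rn"
        " FROM people AS p"
        " LEFT JOIN attributes AS a ON a.person_id = p.id"
        ") AS pa GROUP BY id, name ORDER BY id"
    )


def dynamic_pivot(conn):
    """Return (header, rows), with as many attribute columns as the data needs."""
    (max_attrs,) = conn.execute(
        "SELECT COALESCE(MAX(n), 0) FROM "
        "(SELECT COUNT(*) AS n FROM attributes GROUP BY person_id)"
    ).fetchone()
    cur = conn.execute(build_pivot_sql(max_attrs))
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()


if __name__ == "__main__":
    header, rows = dynamic_pivot(demo_connection())
    print(header)
    for row in rows:
        print(row)
```

Because the column list is derived from the data, adding a fourth attribute for any person adds an attribute4 column without touching the query text.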
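You can also use GROUP_CONCAT, which does not require knowing the maximum number of attributes, and the result can still be imported as CSV.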
SELECT id,
name,
GROUP_CONCAT(attributes SEPARATOR ';') AS attributes
FROM people
GROUP BY id,
name;
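GROUP_CONCAT yields a single delimited column. If the export goes through a script anyway and separate CSV columns are wanted, the duplicate rows can also be collapsed on the client side. This is a different technique from the SQL above, sketched here in Python under the assumption that the joined rows arrive as (id, name, attribute) tuples already sorted by id; the function name rows_to_csv is made up for this example.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def rows_to_csv(joined_rows):
    """Collapse (id, name, attribute) rows into one CSV line per person.

    joined_rows must be sorted by id, because groupby only merges
    adjacent rows that share the same key.
    """
    grouped = []
    for (person_id, name), group in groupby(joined_rows, key=itemgetter(0, 1)):
        # A LEFT JOIN yields attribute = None for people without attributes.
        attrs = [attr for _, _, attr in group if attr is not None]
        grouped.append((person_id, name, attrs))

    # The widest person decides how many attribute columns the file gets.
    width = max((len(attrs) for _, _, attrs in grouped), default=0)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "Name"] + [f"Attribute {i}" for i in range(1, width + 1)])
    for person_id, name, attrs in grouped:
        # Pad short rows so every line has the same number of fields.
        writer.writerow([person_id, name] + attrs + [""] * (width - len(attrs)))
    return out.getvalue()


if __name__ == "__main__":
    sample = [
        (1, "Ken", "Tall"),
        (1, "Ken", "Slender"),
        (1, "Ken", "Blonde"),
        (2, "John", "Short"),
    ]
    print(rows_to_csv(sample), end="")
```

The csv module takes care of quoting, so an attribute that itself contains a comma does not break the file, which is a risk when a delimited GROUP_CONCAT string is split afterwards.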
You can also easily combine GROUP_CONCAT with SUBSTRING_INDEX, for example:
SELECT p.*, t.name
, SUBSTRING_INDEX(t.attribs, ',', 1) as attribute1
, SUBSTRING_INDEX( SUBSTRING_INDEX(t.attribs, ',', 2), ',' , -1) as attribute2
, SUBSTRING_INDEX( SUBSTRING_INDEX(t.attribs, ',', 3), ',' , -1) as attribute3
, SUBSTRING_INDEX( SUBSTRING_INDEX(t.attribs, ',', 4), ',' , -1) as attribute4
FROM people p
LEFT JOIN (
SELECT uid, name , CONCAT(GROUP_CONCAT(attribute),',,,,') as attribs FROM attributes GROUP BY uid, name
) as t ON t.uid = p.uid;
Sample:
MariaDB [bernd]> select * from people;
+----+------+
| id | uid |
+----+------+
| 1 | 1 |
| 2 | 2 |
+----+------+
2 rows in set (0.00 sec)
MariaDB [bernd]> select * from attributes;
+----+------+------+-----------+
| id | uid | name | attribute |
+----+------+------+-----------+
| 1 | 1 | Ken | Tall |
| 2 | 1 | Ken | Slender |
| 3 | 1 | Ken | Blonde |
| 4 | 2 | John | Short |
+----+------+------+-----------+
4 rows in set (0.00 sec)
MariaDB [bernd]> SELECT p.*, t.name
-> , SUBSTRING_INDEX(t.attribs, ',', 1) as attribute1
-> , SUBSTRING_INDEX( SUBSTRING_INDEX(t.attribs, ',', 2), ',' , -1) as attribute2
-> , SUBSTRING_INDEX( SUBSTRING_INDEX(t.attribs, ',', 3), ',' , -1) as attribute3
-> , SUBSTRING_INDEX( SUBSTRING_INDEX(t.attribs, ',', 4), ',' , -1) as attribute4
-> FROM people p
-> LEFT JOIN (
-> SELECT uid, name , CONCAT(GROUP_CONCAT(attribute),',,,,') as attribs FROM attributes GROUP BY name
-> ) as t ON t.uid = p.uid;
+----+------+------+------------+------------+------------+------------+
| id | uid | name | attribute1 | attribute2 | attribute3 | attribute4 |
+----+------+------+------------+------------+------------+------------+
| 1 | 1 | Ken | Tall | Slender | Blonde | |
| 2 | 2 | John | Short | | | |
+----+------+------+------------+------------+------------+------------+
2 rows in set (0.01 sec)
MariaDB [bernd]>
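The reason for appending ',,,,' to the concatenated string is easier to see with a small emulation. The sketch below re-implements the documented behavior of SUBSTRING_INDEX in Python for illustration; the function names are invented here and the emulation is not part of MySQL or MariaDB.

```python
def substring_index(s, delim, count):
    """Emulate SUBSTRING_INDEX(s, delim, count).

    Positive count: everything before the count-th delimiter from the left.
    Negative count: everything after the count-th delimiter from the right.
    If there are fewer delimiters than requested, the whole string is returned.
    """
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])


def nth_field(s, n):
    """The nested call used in the query: take the first n fields, keep the last."""
    return substring_index(substring_index(s, ",", n), ",", -1)


if __name__ == "__main__":
    # Without padding, a missing second attribute repeats the first one.
    print(repr(nth_field("Short", 2)))
    # With the padding, the missing attribute becomes an empty string.
    print(repr(nth_field("Short,,,,", 2)))
    print(repr(nth_field("Tall,Slender,Blonde,,,,", 3)))
```

Without the padding, a person with a single attribute would show that attribute in every column, because SUBSTRING_INDEX returns the whole string when it finds fewer delimiters than requested.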
Maybe this will help (this is Oracle-style PIVOT syntax; MySQL itself has no PIVOT operator):
SELECT *
FROM (SELECT ID, Name, Attributes,'Attribute ' || ROW_NUMBER () OVER (PARTITION BY Name ORDER BY ID) as columns
FROM people) ppl
PIVOT (MAX(Attributes) FOR columns IN ('Attribute 1', 'Attribute 2', 'Attribute 3'))
ORDER BY ID