Inner Join between table and a subquery on the same table
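SQL Server 2012.
Edit:
My original query was more complex than it needed to be, because I was trying to run a DISTINCT over a subset of the table's fields and then join that back to the table itself to get another (text) field. The following query also does the job: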
SELECT DISTINCT
p1.id
,p1.Name
,CAST( p1.[Description] AS nvarchar(max)) AS Description
,( SELECT [Category] + ', '
FROM [dbo].[Company] AS p2
WHERE p2.Id = p1.Id
ORDER BY Name
FOR XML PATH('') ) AS Categories
FROM [dbo].[Company] AS p1
ORDER BY p1.Id
I have a table with data similar to this (multiple records per company, identical except for the Category field):
+----+------+-----------------+----------+
| Id | Name | Description | Category |
+----+------+-----------------+----------+
| 1 | AAA | <loads of text> | cat1 |
| 1 | AAA | <loads of text> | cat2 |
| 2 | BBB | <even more text>| cat1 |
| 2 | BBB | <even more text>| cat3 |
+----+------+-----------------+----------+
I am trying to write a query that returns this result (one record per company, with its categories aggregated into a single field):
| 1 | AAA | <loads of text> | cat1, cat2 |
| 2 | BBB | <even more text>| cat1, cat3 |
Using information from various threads on SO, I came up with this:
SELECT
t1.Id
,t2.Name
,t2.[Description]
,t1.Category
FROM [dbo].[Company] AS t2
INNER JOIN (SELECT DISTINCT p1.Id
,( SELECT [Category] + ', '
FROM [dbo].[Company] AS p2
WHERE p2.Id = p1.Id
ORDER BY Name
FOR XML PATH('') ) AS Category
FROM [dbo].[Company] AS p1
) AS t1 ON t1.Id = t2.Id
ORDER BY t1.Id
The result has one row for every record in the Company table, with the categories aggregated into the Category field:
+----+------+-----------------+------------+
| Id | Name | Description | Category |
+----+------+-----------------+------------+
| 1 | AAA | <loads of text> | cat1, cat2 |
| 1 | AAA | <loads of text> | cat1, cat2 |
| 2 | BBB | <even more text>| cat1, cat3 |
| 2 | BBB | <even more text>| cat1, cat3 |
+----+------+-----------------+------------+
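I thought an INNER JOIN would only select rows where both tables match.
The subquery on its own produces the expected result (one record per Id, with the categories aggregated). I tried adding another GROUP BY clause to the overall query, but that fails because I cannot include the Description field in the GROUP BY clause, since it is a text-type field.
What am I missing?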
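I would try using DISTINCT in the outer query. That should fix your problem, unless the Description or Name values differ between rows for the same Id. That is entirely possible, since your database table should probably be two tables, and you have likely never written any code to ensure that the description and name stay the same for each Id. If you have a composite unique index on id, name, and description, then you are probably fine.
If you do run into multiple descriptions, you will need to use aggregation in the outer query to resolve it. Or you will need to fix the data and add a unique index to prevent it from happening again.
As for why you are seeing this: the join is working correctly, but what you have is a one-to-many relationship between the derived table and the other table. That is why you get multiple records, and why DISTINCT should fix it, unless the name and description data differ for a given id.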
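A small sketch of that one-to-many effect. It uses a table variable, `@Company`, which is an assumption made only so the example is self-contained; it is filled with the sample rows from the question. The derived table has one row per Id, but each of those rows matches every base row with the same Id, so the join returns as many rows as the base table.

```sql
-- Illustrative stand-in for dbo.Company; the column types are assumptions.
DECLARE @Company TABLE (
    Id            int,
    Name          varchar(10),
    [Description] varchar(max),
    Category      varchar(10)
);

INSERT INTO @Company (Id, Name, [Description], Category)
VALUES (1, 'AAA', 'loads of text',  'cat1'),
       (1, 'AAA', 'loads of text',  'cat2'),
       (2, 'BBB', 'even more text', 'cat1'),
       (2, 'BBB', 'even more text', 'cat3');

-- The derived table on its own: one row per Id (2 rows for this data).
SELECT COUNT(*) AS DerivedRows
FROM (SELECT DISTINCT Id FROM @Company) AS t1;

-- Joined back to the base table: every base row finds its match (4 rows for this data).
SELECT COUNT(*) AS JoinedRows
FROM @Company AS t2
INNER JOIN (SELECT DISTINCT Id FROM @Company) AS t1
    ON t1.Id = t2.Id;
```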
Try this:
SELECT DISTINCT
t1.Id
,t2.Name
,cast(t2.[Description] as nvarchar(max)) AS [Description]
,t1.Category
FROM [dbo].[Company] AS t2
INNER JOIN (SELECT DISTINCT p1.Id
,( SELECT [Category] + ', '
FROM [dbo].[Company] AS p2
WHERE p2.Id = p1.Id
ORDER BY Name
FOR XML PATH('') ) AS Category
FROM [dbo].[Company] AS p1
) AS t1 ON t1.Id = t2.Id
ORDER BY t1.Id
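If names or descriptions do differ within an Id, DISTINCT still returns several rows for that Id. The aggregation route mentioned above could look roughly like the sketch below. Picking MAX is an arbitrary choice made for illustration; it assumes that any one of the differing values is acceptable, which only you can judge for your data.

```sql
-- Sketch: collapse to one row per Id by aggregating the columns that may vary.
-- MAX() is an arbitrary tie-breaker, not something the data model guarantees is correct.
SELECT
    t1.Id
   ,MAX(t2.Name)                                 AS Name
   ,MAX(CAST(t2.[Description] AS nvarchar(max))) AS [Description]
   ,t1.Category
FROM [dbo].[Company] AS t2
INNER JOIN (SELECT DISTINCT p1.Id
                  ,( SELECT [Category] + ', '
                     FROM [dbo].[Company] AS p2
                     WHERE p2.Id = p1.Id
                     ORDER BY [Category]
                     FOR XML PATH('') ) AS Category
            FROM [dbo].[Company] AS p1
           ) AS t1 ON t1.Id = t2.Id
GROUP BY t1.Id, t1.Category
ORDER BY t1.Id;
```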
Alternatively, you could fix the bad table design.
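One possible shape for such a redesign, as a rough sketch only. The table names `dbo.CompanyNormalized` and `dbo.CompanyCategory`, the column lengths, and the keys are all hypothetical and not taken from the original schema. With the company attributes stored once per Id, the category list can be built without DISTINCT or a self-join.

```sql
-- Hypothetical normalized layout; every name and length here is illustrative.
CREATE TABLE dbo.CompanyNormalized (
    Id            int           NOT NULL PRIMARY KEY,
    Name          nvarchar(100) NOT NULL,
    [Description] nvarchar(max) NULL
);

CREATE TABLE dbo.CompanyCategory (
    CompanyId int          NOT NULL REFERENCES dbo.CompanyNormalized (Id),
    Category  nvarchar(50) NOT NULL,
    PRIMARY KEY (CompanyId, Category)   -- one row per company/category pair
);

-- One row per company, categories concatenated by the correlated subquery.
SELECT
    c.Id
   ,c.Name
   ,c.[Description]
   ,( SELECT cc.Category + ', '
      FROM dbo.CompanyCategory AS cc
      WHERE cc.CompanyId = c.Id
      ORDER BY cc.Category
      FOR XML PATH('') ) AS Categories
FROM dbo.CompanyNormalized AS c
ORDER BY c.Id;
```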
SELECT *
FROM ( SELECT t1.Id ,
t2.Name ,
t2.[Description] ,
t1.Category ,
ROW_NUMBER() OVER ( PARTITION BY t1.Id, t2.Name,
t1.Category ORDER BY t1.id ) row_num
FROM [dbo].[Company] AS t2
INNER JOIN ( SELECT DISTINCT
p1.Id ,
( SELECT [Category] + ', '
FROM [dbo].[Company] AS p2
WHERE p2.Id = p1.Id
ORDER BY Name
FOR
XML PATH('')
) AS Category
FROM [dbo].[Company] AS p1
) AS t1 ON t1.Id = t2.Id
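) t1
WHERE t1.row_num = 1 -- keep one row per Id/Name/Category group; without this filter the duplicates remain
ORDER BY t1.Id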
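To get around the awkwardness of text as a data type, you can cast the column when you select it. If possible, I would change the column to varchar(max) permanently.
Like this: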
SELECT
t1.Id
, t2.Name
, cast(t2.[Description] as varchar(max)) as Description
, t1.Category
FROM [dbo].[Company] AS t2
INNER JOIN (SELECT DISTINCT p1.Id
,( SELECT [Category] + ', '
FROM [dbo].[Company] AS p2
WHERE p2.Id = p1.Id
ORDER BY Name
FOR XML PATH('') ) AS Category
FROM [dbo].[Company] AS p1
) AS t1 ON t1.Id = t2.Id
GROUP BY t1.Id
, t2.Name
, cast(t2.[Description] as varchar(max))
, t1.Category
ORDER BY t1.Id
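One detail the queries above share: building the list with `[Category] + ', '` leaves a trailing separator, so the value comes back as `cat1, cat2, ` rather than the `cat1, cat2` shown in the desired result. A common way to avoid that, sketched here under the assumption that the separator is always the two characters `', '`, is to put the separator in front of each value and strip the first one with STUFF. The table variable `@Company` is again only an illustrative stand-in for dbo.Company.

```sql
-- Illustrative stand-in for dbo.Company; the column types are assumptions.
DECLARE @Company TABLE (
    Id            int,
    Name          varchar(10),
    [Description] varchar(max),
    Category      varchar(10)
);

INSERT INTO @Company (Id, Name, [Description], Category)
VALUES (1, 'AAA', 'loads of text',  'cat1'),
       (1, 'AAA', 'loads of text',  'cat2'),
       (2, 'BBB', 'even more text', 'cat1'),
       (2, 'BBB', 'even more text', 'cat3');

SELECT DISTINCT
    p1.Id
   ,p1.Name
   ,p1.[Description]
    -- The subquery yields ', cat1, cat2'; STUFF replaces the first 2 characters with ''.
   ,STUFF(( SELECT ', ' + p2.Category
            FROM @Company AS p2
            WHERE p2.Id = p1.Id
            ORDER BY p2.Category
            FOR XML PATH('') ), 1, 2, '') AS Categories
FROM @Company AS p1
ORDER BY p1.Id;
-- Categories is 'cat1, cat2' for Id 1 and 'cat1, cat3' for Id 2.
```

Note that FOR XML PATH escapes characters such as `&`, `<` and `>` in the concatenated text, so category values containing them would need extra handling.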