Inner Join between table and a subquery on the same table

SQL Server 2012.

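Edit: My original query was more complicated than it needed to be, because I was trying to run a DISTINCT over a subset of the table's fields and then join that back to the table itself to get another (text) field. The following query also solves the problem: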

SELECT DISTINCT
    p1.id
    ,p1.Name
    ,CAST( p1.[Description] AS nvarchar(max)) AS Description
    ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Categories
  FROM [dbo].[Company] AS p1
  ORDER BY p1.Id
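
Note that the query above leaves a trailing ', ' at the end of each list. A common way around that, sketched here under assumptions, is to put the separator first and strip the leading one with STUFF. The table variable @Company and its sample rows are hypothetical stand-ins for [dbo].[Company], with Description declared as nvarchar(max) rather than text so that no cast is needed.

```sql
-- Hypothetical stand-in for [dbo].[Company]; Description is assumed to be nvarchar(max) here.
DECLARE @Company TABLE
(
    Id            int,
    Name          nvarchar(50),
    [Description] nvarchar(max),
    Category      nvarchar(50)
);

INSERT INTO @Company (Id, Name, [Description], Category)
VALUES (1, N'AAA', N'<loads of text>',  N'cat1'),
       (1, N'AAA', N'<loads of text>',  N'cat2'),
       (2, N'BBB', N'<even more text>', N'cat1'),
       (2, N'BBB', N'<even more text>', N'cat3');

SELECT DISTINCT
    p1.Id
    ,p1.Name
    ,p1.[Description]
    -- Separator goes first; STUFF then removes the leading ', ' (2 characters).
    ,STUFF(( SELECT ', ' + p2.Category
               FROM @Company AS p2
              WHERE p2.Id = p1.Id
              ORDER BY p2.Category
                FOR XML PATH('') ), 1, 2, '') AS Categories
  FROM @Company AS p1
  ORDER BY p1.Id;
```

One caveat: FOR XML PATH escapes characters such as & and <, so category values containing them would come back as XML entities unless the result is unwrapped with TYPE and .value().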

I have a table with data similar to this (the multiple records for each company are identical except for the Category field):

+----+------+-----------------+----------+
| Id | Name | Description     | Category |
+----+------+-----------------+----------+
| 1  | AAA  | <loads of text> | cat1     |
| 1  | AAA  | <loads of text> | cat2     |
| 2  | BBB  | <even more text>| cat1     |
| 2  | BBB  | <even more text>| cat3     |
+----+------+-----------------+----------+

I am trying to write a query that returns this result (one unique record per company, with the categories aggregated into a single field):

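+----+------+-----------------+------------+
| Id | Name | Description     | Category   |
+----+------+-----------------+------------+
| 1  | AAA  | <loads of text> | cat1, cat2 |
| 2  | BBB  | <even more text>| cat1, cat3 |
+----+------+-----------------+------------+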

Using information from various threads on SO, I came up with this:

SELECT 
    t1.Id
    ,t2.Name
    ,t2.[Description]
    ,t1.Category
FROM  [dbo].[Company] AS t2 
INNER JOIN (SELECT DISTINCT p1.Id
      ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS p1
  ) AS t1 ON t1.Id = t2.Id
  ORDER BY t1.Id

The result has one record for every record in the Company table, with the categories aggregated into the Category field:

+----+------+-----------------+------------+
| Id | Name | Description     | Category   |
+----+------+-----------------+------------+
| 1  | AAA  | <loads of text> | cat1, cat2 |
| 1  | AAA  | <loads of text> | cat1, cat2 |
| 2  | BBB  | <even more text>| cat1, cat3 |
| 2  | BBB  | <even more text>| cat1, cat3 |
+----+------+-----------------+------------+

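I thought an INNER JOIN would only select rows that match in both tables. The subquery on its own produces the expected result (one record per Id, with the categories aggregated). I tried adding another GROUP BY clause to the whole query, but that failed because I cannot include the Description field in the GROUP BY clause, since it is a text-type field.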

What am I missing?

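I would try using DISTINCT in the outer query. That should fix your problem, unless the descriptions or names differ for some rows. That is entirely possible, since your database table should probably be two tables and you have likely never written any code to ensure that the description and name stay the same for each ID. If you have a composite unique index on id, name and description, then you are probably fine.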

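If you do run into the problem of multiple descriptions, you will need to use aggregation in the outer query to resolve it. Otherwise you will need to fix the data and add a unique index to prevent it from happening again.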
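
One way to read "use aggregation in the outer query", as a sketch only: group by Id and pick a single Name and Description per company with MAX. Which value should win when they differ is an assumption on my part; MAX simply takes the last one in sort order. With this shape the join to the derived table is no longer needed.

```sql
SELECT
    c.Id
    ,MAX(c.Name) AS Name
    -- text cannot be aggregated directly, so cast it first.
    ,MAX(CAST(c.[Description] AS nvarchar(max))) AS [Description]
    ,( SELECT p2.[Category] + ', '
         FROM [dbo].[Company] AS p2
        WHERE p2.Id = c.Id
        ORDER BY p2.[Category]
          FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS c
  GROUP BY c.Id
  ORDER BY c.Id;
```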

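As for why you are seeing this: the join is working correctly, but what you have is a one-to-many relationship between the derived table and the other table. That is why you get multiple records, and why DISTINCT should fix it, unless the name and description differ between rows for the same id.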
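
To make the one-to-many effect concrete, here is a minimal sketch using a hypothetical table variable loaded with the sample rows from the question. The derived table has one row per Id, but each of those rows matches two rows in the base table, so the join returns two rows per Id.

```sql
-- Hypothetical stand-in for [dbo].[Company], reduced to the columns that matter here.
DECLARE @Company TABLE (Id int, Name varchar(10), Category varchar(10));

INSERT INTO @Company (Id, Name, Category)
VALUES (1, 'AAA', 'cat1'),
       (1, 'AAA', 'cat2'),
       (2, 'BBB', 'cat1'),
       (2, 'BBB', 'cat3');

-- d has one row per Id; c has two. The join yields every matching pair.
SELECT d.Id, COUNT(*) AS JoinedRows
  FROM ( SELECT DISTINCT Id FROM @Company ) AS d
 INNER JOIN @Company AS c ON c.Id = d.Id
 GROUP BY d.Id
 ORDER BY d.Id;
-- JoinedRows is 2 for Id 1 and 2 for Id 2.
```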

Try this:

SELECT DISTINCT
    t1.Id
    ,t2.Name
    ,cast(t2.[Description] as nvarchar(max)) AS [Description]
    ,t1.Category
FROM  [dbo].[Company] AS t2 
INNER JOIN (SELECT DISTINCT p1.Id
      ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS p1
  ) AS t1 ON t1.Id = t2.Id
  ORDER BY t1.Id

Alternatively, you could fix the flawed table design.
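
As a rough sketch of what a fixed design might look like: one table with a single row per company, and a second table with one row per company/category pair. The names CompanyInfo and CompanyCategory, and the column sizes, are hypothetical and not taken from the question.

```sql
-- Hypothetical normalized design: company details are stored exactly once.
CREATE TABLE dbo.CompanyInfo
(
    Id            int           NOT NULL PRIMARY KEY,
    Name          nvarchar(100) NOT NULL,
    [Description] nvarchar(max) NULL
);

-- One row per (company, category); the primary key prevents duplicate pairs.
CREATE TABLE dbo.CompanyCategory
(
    CompanyId int          NOT NULL REFERENCES dbo.CompanyInfo (Id),
    Category  nvarchar(50) NOT NULL,
    CONSTRAINT PK_CompanyCategory PRIMARY KEY (CompanyId, Category)
);

-- The report then needs neither DISTINCT nor a self-join.
SELECT
    ci.Id
    ,ci.Name
    ,ci.[Description]
    ,( SELECT cc.Category + ', '
         FROM dbo.CompanyCategory AS cc
        WHERE cc.CompanyId = ci.Id
        ORDER BY cc.Category
          FOR XML PATH('') ) AS Categories
  FROM dbo.CompanyInfo AS ci
  ORDER BY ci.Id;
```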

SELECT  *
FROM    ( SELECT    t1.Id ,
                    t2.Name ,
                    t2.[Description] ,
                    t1.Category ,
                    ROW_NUMBER() OVER ( PARTITION BY t1.Id, t2.Name,
                                        t1.Category ORDER BY t1.id ) row_num
          FROM      [dbo].[Company] AS t2
                    INNER JOIN ( SELECT DISTINCT
                                        p1.Id ,
                                        ( SELECT    [Category] + ', '
                                          FROM      [dbo].[Company] AS p2
                                          WHERE     p2.Id = p1.Id
                                          ORDER BY  Name
                                        FOR
                                          XML PATH('')
                                        ) AS Category
                                 FROM   [dbo].[Company] AS p1
                               ) AS t1 ON t1.Id = t2.Id
        ) t1
WHERE   t1.row_num = 1
ORDER BY t1.Id

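To get around the awkwardness of text as a data type, you can cast the column when you select it. If possible, I would change the column to varchar(max) permanently.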

Like this:

SELECT 
    t1.Id
    , t2.Name
    , cast(t2.[Description] as varchar(max)) as Description
    , t1.Category
FROM  [dbo].[Company] AS t2 
INNER JOIN (SELECT DISTINCT p1.Id
      ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS p1
  ) AS t1 ON t1.Id = t2.Id
  GROUP BY t1.Id
    , t2.Name
    , cast(t2.[Description] as varchar(max))
    , t1.Category
  ORDER BY t1.Id
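
For the permanent change, a minimal sketch might look like the following. It assumes the column allows NULLs and that nothing else depends on it being text, so try it on a copy of the table first.

```sql
-- Assumption: [Description] is currently text and nullable.
ALTER TABLE [dbo].[Company]
    ALTER COLUMN [Description] varchar(max) NULL;
```

After that, the casts in the SELECT and GROUP BY above should no longer be needed.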