Aggregate dynamic columns in SQL Server
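I have a narrow table containing a unique key and the source system:
Unique_Key | System |
---|---|
1 | IT |
1 | ACCOUNTS |
1 | PAYROLL |
2 | IT |
2 | PAYROLL |
3 | IT |
4 | HR |
5 | PAYROLL |
I want to be able to pick one system as the base (IT in this example) and then build a dynamic SQL query that counts:
- the distinct unique keys in the selected system
- the proportion of those unique keys that are shared with each other system. The set of systems is dynamic, and there can be far more than 4 of them
I am thinking of using dynamic SQL and PIVOT to first pick out all system names other than IT, and then, using IT as the base, join to that table to get the information.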
select distinct Unique_Key, System_Name
into #staging
from dbo.data
where System_Name <> 'IT'
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX);
SET @cols = STUFF((SELECT distinct ',' + QUOTENAME(System_Name)
FROM #staging
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT Unique_Key, ' + @cols + ' into dbo.temp from
(
select Unique_Key, System_Name
from #staging
) x
pivot
(
count(System_Name)
for System_Name in (' + @cols + ')
) p '
execute(@query)
select *
from
(
select distinct Unique_Key
from dbo.data
where System_Name = 'IT'
) a
left join dbo.temp b
on a.Unique_Key = b.Unique_Key
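So the resulting table is:
Unique_Key | PAYROLL | ACCOUNTS | HR |
---|---|---|---|
1 | 1 | 1 | 0 |
2 | 1 | 0 | 0 |
3 | 0 | 0 | 0 |
What I want is to go one step further:
Distinct Count IT Key | PAYROLL | ACCOUNTS | HR |
---|---|---|---|
3 | 67% | 33% | 0% |
I can do a simple join with specific case when/sum statements, but I would like to know whether there is a way to do this dynamically, so that I don't have to specify each system name.
Any tips/hints are appreciated.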
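You can try dynamic SQL as shown below. I would use conditional aggregation to get the pivoted values, and then add an OUTER JOIN or EXISTS condition inside the dynamic SQL.
I would use sp_executesql instead of exec, so that the base system is passed as a parameter, which helps avoid SQL injection.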
DECLARE @System_Name NVARCHAR(50) = 'IT'
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX),
@parameter AS NVARCHAR(MAX);
SET @parameter = '@System_Name NVARCHAR(50)'
select DISTINCT System_Name
into #staging
from dbo.data t1
WHERE t1.System_Name <> @System_Name
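-- One percentage expression per non-base system: rows for that system divided by
-- the number of base-system rows (assumes one row per key and system).
-- Quotes inside a system name are doubled before it is embedded as a literal.
SET @cols = STUFF((SELECT distinct ', SUM(IIF(System_Name = '''+ REPLACE(System_Name, '''', '''''') +''',1,0)) * 100.0 / SUM(IIF(System_Name = @System_Name,1,0)) ' + QUOTENAME(System_Name)
FROM #staging
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
-- The first column counts the base-system rows, i.e. IIF(..., 1, 0), not the other systems' rows.
set @query = 'SELECT SUM(IIF(System_Name = @System_Name,1,0)) [Distinct Count IT Key], ' + @cols + ' from dbo.data t1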
WHERE EXISTS (
SELECT 1
FROM dbo.data tt
WHERE tt.Unique_Key = t1.Unique_Key
AND tt.System_Name = @System_Name
) '
EXECUTE sp_executesql @query, @parameter, @System_Name
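The point about sp_executesql can be illustrated in isolation. The sketch below is not part of the answer above; the names @stmt and @base and the inline VALUES rows are made up for the example. It shows a value being passed as a typed parameter instead of being concatenated into the statement text.

```sql
-- Hypothetical names (@stmt, @base) and inline rows, used only for illustration.
declare @stmt nvarchar(max) =
    N'select count(*) as matching_rows
      from (values (N''IT''), (N''HR''), (N''IT'')) v(System_Name)
      where v.System_Name = @base;';

-- The value travels as a typed parameter and is never spliced into the SQL text.
exec sp_executesql @stmt, N'@base nvarchar(50)', @base = N'IT';
```

Column names cannot be passed this way, which is why the generated column aliases still go through QUOTENAME.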
This solution changes the aggregate function of the PIVOT itself.
First, add a column [has_it] to #staging that tracks whether each Unique_Key has an IT row:
select Unique_Key, System_Name, case when exists(select 1 from data d2 where d2.Unique_Key=d1.Unique_Key and d2.System_Name='IT') then 1 else 0 end as has_it
into #staging
from data d1
where System_Name <> 'IT'
group by Unique_Key, System_Name
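Now the per-system aggregate (sum) of this column, divided by the total number of distinct keys in the base system (3 in the sample case), returns the requested share as a fraction. Change the PIVOT to the following and the result is ready as-is, with no further query needed: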
set @query = ' select *
from
(
select System_Name,cnt as [Distinct Count IT Key],has_it*1.0/cnt as divcnt
from #staging
cross join
(
select count(distinct Unique_Key) as cnt
from dbo.data
where System_Name = ''IT''
)y
) x
pivot
(
sum(divcnt)
for System_Name in (' + @cols + ')
) p'
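The snippet above only assigns @query and reuses @cols from the question. As a rough end-to-end sketch of the same idea, the script below can be run on its own. It assumes a temp table #data standing in for dbo.data, with the column names used in the queries, and it multiplies by 100.0 to get percentages. The name pct is illustrative.

```sql
-- #data is a stand-in for dbo.data (assumed schema), loaded with the sample rows.
create table #data (Unique_Key int not null, System_Name varchar(50) not null);
insert into #data (Unique_Key, System_Name)
values (1, 'IT'), (1, 'ACCOUNTS'), (1, 'PAYROLL'),
       (2, 'IT'), (2, 'PAYROLL'), (3, 'IT'), (4, 'HR'), (5, 'PAYROLL');

-- One row per (key, system) outside IT, flagged when the key also exists in IT.
select d1.Unique_Key, d1.System_Name,
       case when exists (select 1 from #data d2
                         where d2.Unique_Key = d1.Unique_Key
                           and d2.System_Name = 'IT') then 1 else 0 end as has_it
into #staging
from #data d1
where d1.System_Name <> 'IT'
group by d1.Unique_Key, d1.System_Name;

declare @cols nvarchar(max), @query nvarchar(max);

-- Column list for the PIVOT ... IN (...) clause.
set @cols = stuff((select distinct ',' + quotename(System_Name)
                   from #staging
                   for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '');

set @query = N'
select *
from
(
    select System_Name, cnt as [Distinct Count IT Key], has_it * 100.0 / cnt as pct
    from #staging
    cross join (select count(distinct Unique_Key) as cnt
                from #data
                where System_Name = ''IT'') y
) x
pivot (sum(pct) for System_Name in (' + @cols + N')) p;';

-- Temp tables created in this session are visible inside the dynamic batch.
exec sp_executesql @query;
```

For the sample rows this should give a single row with 3 IT keys, roughly 33.33 for ACCOUNTS, 0 for HR and roughly 66.67 for PAYROLL.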
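When writing a dynamic query, start with a non-dynamic query. Make sure the query returns the correct result before converting it to a dynamic one.
For the result you need, the query would be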
with cte as
(
select it.Unique_Key, ot.System_Name
from data it
left join data ot on it.Unique_Key = ot.Unique_Key
and ot.System_Name <> 'IT'
where it.System_Name = 'IT'
)
select [ITKey] = count(distinct Unique_Key),
[ACCOUNTS] = count(case when System_Name = 'ACCOUNTS' then Unique_Key end) * 100.0
/ count(distinct Unique_Key),
[HR] = count(case when System_Name = 'HR' then Unique_Key end) * 100.0
/ count(distinct Unique_Key),
[PAYROLL] = count(case when System_Name = 'PAYROLL' then Unique_Key end) * 100.0
/ count(distinct Unique_Key)
from cte;
Once you get the correct result, converting it to a dynamic query is not difficult. Use string_agg() or FOR XML PATH to generate those repeating rows
declare @sql nvarchar(max);
; with cte as
(
select distinct System_Name
from data
where System_Name <> 'IT'
)
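-- cast to nvarchar(max) so string_agg does not fail once the list exceeds 8000 bytes
select @sql = string_agg(cast(sql1 + ' / ' + sql2 as nvarchar(max)), ',' + char(13))
from cte
cross apply
(
-- double any quotes in the name before embedding it as a string literal
select sql1 = char(9) + quotename(System_Name) + ' = '
+ 'count(case when System_Name = ''' + replace(System_Name, '''', '''''') + ''' then Unique_Key end) * 100.0 ',
sql2 = 'count(distinct Unique_Key)'
) a;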
select @sql = 'with cte as' + char(13)
+ '(' + char(13)
+ ' select it.Unique_Key, ot.System_Name' + char(13)
+ ' from data it' + char(13)
+ ' left join data ot on it.Unique_Key = ot.Unique_Key' + char(13)
+ ' and ot.System_Name <> ''IT''' + char(13)
+ ' where it.System_Name = ''IT''' + char(13)
+ ')' + char(13)
+ 'select [ITKey] = count(distinct Unique_Key), ' + char(13)
+ @sql + char(13)
+ 'from cte;' + char(13)
print @sql;
exec sp_executesql @sql;
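string_agg() requires SQL Server 2017 or later. On older versions the repeating expressions could be built with FOR XML PATH instead, as mentioned above. The following is only a sketch: @list is a hypothetical variable, and the inline VALUES rows stand in for the distinct system names read from the data table.

```sql
-- @list is a hypothetical variable; the VALUES rows stand in for
-- "select distinct System_Name from data where System_Name <> 'IT'".
declare @list nvarchar(max);

select @list = stuff((
        select ', ' + quotename(s.System_Name) + ' = '
             + 'count(case when System_Name = '''
             + replace(s.System_Name, '''', '''''')
             + ''' then Unique_Key end) * 100.0 / count(distinct Unique_Key)'
        from (values (N'ACCOUNTS'), (N'HR'), (N'PAYROLL')) s(System_Name)
        order by s.System_Name
        for xml path(''), type
    ).value('.', 'nvarchar(max)'), 1, 2, '');  -- strip the leading ', '

print @list;
```

The printed list can then take the place of the string_agg() result when the final statement is assembled.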