SQL: Distinct Medications to Columns Using Pivot
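I know this question has been asked many times, but since I'm still new to SQL, I'm finding it hard to adapt previous answers to my purpose. I've solved most of the problem, but I'm struggling to get a pivot to work while also excluding duplicate cases. I'm just not familiar enough with the syntax yet to adjust it properly.
My current data looks like this (simplified version):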
| Medication | Patient_ID |
|------------|------------|
| Amlopidine | 100123     |
| Lisinopril | 100123     |
| Eprosartan | 200415     |
I would like to get a result like this:
| Patient_ID | MED_1      | MED_2      |
|------------|------------|------------|
| 100123     | Amlopidine | Lisinopril |
| 200415     | Eprosartan | NULL       |
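The problem I'm running into is that, over the years, a patient may have been given the same medication many times, which produces a large number of duplicates in the table. That is what I'm trying to avoid.
My code so far (IndicatorValue = Medication):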
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)

SELECT @cols = STUFF((SELECT ',' + QUOTENAME(col+'_'+cast(rn as varchar(10)))
                      FROM
                      (
                          SELECT row_number() OVER(PARTITION BY Patient_ID
                                                   ORDER BY IndicatorValue) rn
                          FROM dbo.DiseaseCaseIndicator
                      ) t
                      cross join
                      (
                          select DISTINCT 'IndicatorValue' col
                      ) c
                      group by col, rn
                      order by rn, col
                      FOR XML PATH(''), TYPE
                     ).value('.', 'NVARCHAR(MAX)')
                     ,1,1,'')

set @query = 'SELECT Patient_ID,' + @cols + '
    from
    (
        select Patient_ID,
               col+''_''+cast(rn as varchar(10)) col,
               value
        from
        (
            select DISTINCT IndicatorValue, Patient_ID,
                   row_number() over(partition by Patient_ID
                                     order by IndicatorValue) rn
            from dbo.DiseaseCaseIndicator
            WHERE Patient_ID IN (SELECT Patient_ID FROM dbo.HTPatients)
              AND IndicatorType = ''Medication''
              AND Disease = ''Hypertension''
        ) d
        cross apply
        (
            values (''IndicatorValue'', IndicatorValue)
        ) c (col, value)
    ) t
    pivot
    (
        max(value)
        for col in (' + @cols + ')
    ) p '

execute(@query);
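It's rough, I know, but I still have a lot of SQL to learn!
So the main problem is getting rid of those pesky duplicates. I also end up with a lot of columns, because I'm still not quite clear on how the row_number() function works. I know I only need at most 10 medication columns, since only a few patients have that many unique medications. Also: the table has to be in this format because my supervisor asked for it.
Any insight you can offer would be greatly appreciated!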
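Here is a dynamic SQL query that pivots using aggregation:
-- Build one MAX(CASE ...) column per medication slot, then run the statement.
-- tblName stands for the source table. The DISTINCT derived table removes
-- repeated (Patient_ID, Medication) pairs before ROW_NUMBER() numbers them;
-- ORDER BY [RN] keeps the generated columns in slot order.
DECLARE @SQL NVARCHAR(MAX) = 'SELECT [Patient_ID]'
    + STUFF((SELECT ', MAX(CASE WHEN RN = ' + CAST([RN] AS NVARCHAR(10)) + ' THEN [Medication] END) Med' + CAST([RN] AS NVARCHAR(10))
             FROM (SELECT ROW_NUMBER() OVER (PARTITION BY [Patient_ID] ORDER BY [Medication]) [RN]
                   FROM (SELECT DISTINCT [Patient_ID], [Medication] FROM tblName) D) A
             GROUP BY [RN]
             ORDER BY [RN]
             FOR XML PATH ('')), 1, 0, '')
    + ' FROM (SELECT [Medication], [Patient_ID], ROW_NUMBER() OVER (PARTITION BY [Patient_ID] ORDER BY [Medication]) [RN]
              FROM (SELECT DISTINCT [Patient_ID], [Medication] FROM tblName) D) A
        GROUP BY [Patient_ID]'
EXEC(@SQL)
The idea is to generate the CASE aggregates inside the STUFF statement.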
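To make the generated statement more concrete, here is a rough static sketch of what it amounts to for the sample data in the question. The table variable `@Meds` and its rows are assumptions standing in for the real source table, and the multi-row `VALUES` list assumes SQL Server 2008 or later. A repeated prescription is included on purpose, and a `DISTINCT` derived table collapses it before `ROW_NUMBER()` assigns the slots.

```sql
-- Hypothetical sample data; @Meds stands in for the real source table.
DECLARE @Meds TABLE (Medication VARCHAR(50), Patient_ID INT);

INSERT INTO @Meds (Medication, Patient_ID)
VALUES ('Amlopidine', 100123),
       ('Amlopidine', 100123),  -- repeated prescription
       ('Lisinopril', 100123),
       ('Eprosartan', 200415);

-- Static form of the generated query: one MAX(CASE ...) per medication slot.
SELECT [Patient_ID],
       MAX(CASE WHEN [RN] = 1 THEN [Medication] END) AS Med1,
       MAX(CASE WHEN [RN] = 2 THEN [Medication] END) AS Med2
FROM (SELECT [Medication], [Patient_ID],
             ROW_NUMBER() OVER (PARTITION BY [Patient_ID] ORDER BY [Medication]) AS [RN]
      -- DISTINCT collapses the repeated row before it is numbered
      FROM (SELECT DISTINCT [Medication], [Patient_ID] FROM @Meds) AS D) AS A
GROUP BY [Patient_ID];
```

This returns Amlopidine and Lisinopril for patient 100123, and Eprosartan and NULL for patient 200415. Since the question says at most 10 medication columns are needed, writing the list out to Med10 by hand would be one way to avoid dynamic SQL entirely.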
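Here is one way to do it using a dynamic crosstab:
-- DENSE_RANK() gives repeated medications the same rn,
-- so duplicates collapse into a single column.
DECLARE @sql NVARCHAR(MAX) = N''

-- Start of the generated statement
SELECT @sql =
'SELECT
    Patient_ID' + CHAR(10)

-- One MAX(CASE ...) column per distinct rank value
SELECT @sql = @sql +
'    , MAX(CASE WHEN rn = ' + CONVERT(VARCHAR(10), rn) + ' THEN Medication END) AS '
    + QUOTENAME('MED_' + CONVERT(VARCHAR(10), rn)) + CHAR(10)
FROM (
    SELECT DISTINCT rn = DENSE_RANK() OVER(PARTITION BY Patient_ID ORDER BY Medication)
    FROM tbl
) t

-- Source rows with their rank, grouped to one row per patient
SELECT @sql = @sql +
'FROM (
    SELECT *,
        rn = DENSE_RANK() OVER(PARTITION BY Patient_ID ORDER BY Medication)
    FROM tbl
) t
GROUP BY t.Patient_ID
ORDER BY t.Patient_ID'

PRINT (@sql)  -- inspect the generated statement
EXEC (@sql)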
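Why DENSE_RANK() deals with the duplicates where ROW_NUMBER() does not is easiest to see side by side. The following is only a small sketch: the table variable `@tbl` and its rows are hypothetical, with one prescription repeated on purpose, and the multi-row `VALUES` list assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data with one repeated prescription.
DECLARE @tbl TABLE (Medication VARCHAR(50), Patient_ID INT);

INSERT INTO @tbl (Medication, Patient_ID)
VALUES ('Amlopidine', 100123),
       ('Amlopidine', 100123),
       ('Lisinopril', 100123),
       ('Eprosartan', 200415);

-- Compare the two numbering functions over the same window.
SELECT Patient_ID,
       Medication,
       ROW_NUMBER() OVER (PARTITION BY Patient_ID ORDER BY Medication) AS row_num,
       DENSE_RANK() OVER (PARTITION BY Patient_ID ORDER BY Medication) AS dense_rnk
FROM @tbl
ORDER BY Patient_ID, Medication, row_num;

-- Patient_ID  Medication  row_num  dense_rnk
-- 100123      Amlopidine  1        1
-- 100123      Amlopidine  2        1
-- 100123      Lisinopril  3        2
-- 200415      Eprosartan  1        1
```

ROW_NUMBER() gives every row its own number, so the repeated Amlopidine would occupy two columns after pivoting. DENSE_RANK() gives equal medications the same number without leaving gaps, so MAX(CASE WHEN rn = ...) picks up each distinct medication exactly once.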