Selecting data within multiple DATE (year) ranges into named columns in SQL Server

I have to select data from a SQL Server table for multiple date ranges, namely:

1990-1994,  1992-1996,  1994-1998,  1996-2000,  1998-2002,  2000-2004,  
2002-2006,  2004-2008,  2006-2010,  2008-2012,  2010-2014  

I use this query to get the data without any date ranges:

SELECT 
    aid, research_area_category_id, 
    CAST(research_area as VARCHAR(100)) [research_area],
    COUNT(*) [Counting]
FROM 
    sub_aminer_paper
GROUP BY 
    CAST(research_area as VARCHAR(100)), aid, research_area_category_id
HAVING 
    aid = 12403 
ORDER BY 
    Counting DESC

This gives the output shown in the image:

Now, for each date range, I have to display the data in the column that corresponds to that range, filtering with a WHERE clause. I used this query:

SELECT 
    aid, research_area_category_id, 
    [research_area] = CAST(research_area as VARCHAR(100)), 
    [Counting] = COUNT(*),
    [1990 - 1994] = SUM(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN 1 ELSE 0 END),
    [1992 - 1996] = SUM(CASE WHEN p_year BETWEEN 1992 AND 1996 THEN 1 ELSE 0 END),
    [1994 - 1998] = SUM(CASE WHEN p_year BETWEEN 1994 AND 1998 THEN 1 ELSE 0 END),
    [1996 - 2000] = SUM(CASE WHEN p_year BETWEEN 1996 AND 2000 THEN 1 ELSE 0 END),
    [1998 - 2002] = SUM(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN 1 ELSE 0 END),
    [2000 - 2004] = SUM(CASE WHEN p_year BETWEEN 2000 AND 2004 THEN 1 ELSE 0 END),
    [2002 - 2006] = SUM(CASE WHEN p_year BETWEEN 2002 AND 2006 THEN 1 ELSE 0 END),
    [2004 - 2008] = SUM(CASE WHEN p_year BETWEEN 2004 AND 2008 THEN 1 ELSE 0 END),
    [2006 - 2010] = SUM(CASE WHEN p_year BETWEEN 2006 AND 2010 THEN 1 ELSE 0 END),
    [2008 - 2012] = SUM(CASE WHEN p_year BETWEEN 2008 AND 2012 THEN 1 ELSE 0 END),
    [2010 - 2014] = SUM(CASE WHEN p_year BETWEEN 2010 AND 2014 THEN 1 ELSE 0 END)
FROM 
    sub_aminer_paper
WHERE 
    aid = 2937  
    AND p_year BETWEEN 1990 AND 2014            
GROUP BY
    aid, CAST(research_area AS VARCHAR(100)), research_area_category_id
ORDER BY aid ASC, Counting DESC

This query outputs:

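But I need the research_area_category_id values under these columns (1990-1994, 1992-1996, 1994-1998, and so on). For example, the 1990 - 1994 column should show the corresponding research_area_category_id 1132 instead of the Counting 111, and likewise the 1998 - 2002 column should show 33 instead of 2, and so on for the other columns.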

Please help, and thanks in advance.

Tab Alleman already mentioned the best approach in the comments, but I'll be cheeky and add it as an answer.

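You've made it clear that you want the pivoted date columns to show the values from the research_area_category_id column. So the first step is to make each CASE expression return research_area_category_id instead of the integer 1: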

CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id ELSE 0 END

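If you run your code with only this change, you'll find that the SUM function makes the output a multiple of the research_area_category_id value. For example, the first row has the value 66 (twice 33) in the 1998 - 2002 column.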

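This tells us that you no longer want to use SUM. However, you still want to aggregate (group) the data from all the rows with different p_year values, so you have to use some kind of aggregate function instead. Otherwise SQL Server will throw an error, because you are not grouping by p_year.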

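The simplest aggregate function to use in this case is MAX, which takes the largest value from the set of rows being grouped into a single row. The official documentation has some simple examples.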
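To see the difference between SUM and MAX on a small scale, here is a minimal sketch using a table variable. The @papers table and its two rows are made up for illustration (the p_year values are guesses); only the column names come from the question.

```sql
-- Hypothetical sample data: two papers in the same category (33),
-- both published inside the 1998 - 2002 range.
DECLARE @papers TABLE (
    aid INT,
    research_area_category_id INT,
    p_year INT
);

INSERT INTO @papers (aid, research_area_category_id, p_year)
VALUES (2937, 33, 1999),
       (2937, 33, 2001);

SELECT
    aid,
    research_area_category_id,
    -- SUM adds the id once per matching row: 33 + 33 = 66
    [SUM 1998 - 2002] = SUM(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END),
    -- MAX keeps a single value from the group: 33
    [MAX 1998 - 2002] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END)
FROM
    @papers
GROUP BY
    aid, research_area_category_id;
```

With these two rows the query returns a single group: the SUM column holds 66 and the MAX column holds 33.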

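Note that this only works in your case if all values of research_area_category_id are positive (greater than the 0 that the CASE expression falls back to), which they appear to be.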
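If zero or negative ids were ever possible, one option is to leave out the ELSE 0 branch. This is a sketch of an alternative, not something the original query does: a CASE expression without ELSE returns NULL when no branch matches, and MAX ignores NULLs. The @papers table and its rows, including the negative id, are hypothetical.

```sql
-- Hypothetical sample data: one category with a negative id (-5),
-- one paper inside the 1998 - 2002 range and one outside it.
DECLARE @papers TABLE (
    aid INT,
    research_area_category_id INT,
    p_year INT
);

INSERT INTO @papers (aid, research_area_category_id, p_year)
VALUES (2937, -5, 1999),
       (2937, -5, 2005);

SELECT
    aid,
    -- The 2005 row contributes 0, and MAX(-5, 0) = 0, so the id is lost
    [With ELSE 0] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END),
    -- The 2005 row contributes NULL, which MAX ignores, so the result is -5
    [Without ELSE] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id END),
    -- No row falls in this range, so every input is NULL and the result is NULL
    [Empty range] = MAX(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id END)
FROM
    @papers
GROUP BY
    aid, research_area_category_id;
```

The trade-off is that ranges with no matching rows come back as NULL rather than 0, and SQL Server may print a warning that a NULL value was eliminated by an aggregate.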

Combining the change to the CASE expression with the change from SUM to MAX gives the following version of the query:

SELECT 
    aid, research_area_category_id, 
    [research_area] = CAST(research_area as VARCHAR(100)), 
    [Counting] = COUNT(*),
    [1990 - 1994] = MAX(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id ELSE 0 END),
    [1992 - 1996] = MAX(CASE WHEN p_year BETWEEN 1992 AND 1996 THEN research_area_category_id ELSE 0 END),
    [1994 - 1998] = MAX(CASE WHEN p_year BETWEEN 1994 AND 1998 THEN research_area_category_id ELSE 0 END),
    [1996 - 2000] = MAX(CASE WHEN p_year BETWEEN 1996 AND 2000 THEN research_area_category_id ELSE 0 END),
    [1998 - 2002] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END),
    [2000 - 2004] = MAX(CASE WHEN p_year BETWEEN 2000 AND 2004 THEN research_area_category_id ELSE 0 END),
    [2002 - 2006] = MAX(CASE WHEN p_year BETWEEN 2002 AND 2006 THEN research_area_category_id ELSE 0 END),
    [2004 - 2008] = MAX(CASE WHEN p_year BETWEEN 2004 AND 2008 THEN research_area_category_id ELSE 0 END),
    [2006 - 2010] = MAX(CASE WHEN p_year BETWEEN 2006 AND 2010 THEN research_area_category_id ELSE 0 END),
    [2008 - 2012] = MAX(CASE WHEN p_year BETWEEN 2008 AND 2012 THEN research_area_category_id ELSE 0 END),
    [2010 - 2014] = MAX(CASE WHEN p_year BETWEEN 2010 AND 2014 THEN research_area_category_id ELSE 0 END)
FROM 
    sub_aminer_paper
WHERE 
    aid = 2937  
    AND p_year BETWEEN 1990 AND 2014            
GROUP BY
    aid, CAST(research_area AS VARCHAR(100)), research_area_category_id
ORDER BY aid ASC, Counting DESC

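If you're interested, I mocked up a few rows of data in this SQL fiddle to test the query before answering. (I was guessing at the values of p_year, but they demonstrate the principle, unless I've misunderstood your requirement.)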