Selecting data within multiple DATE (year) ranges into named columns in SQL Server
I have to select data for multiple date ranges from a SQL Server table, i.e.
1990-1994, 1992-1996, 1994-1998, 1996-2000, 1998-2002, 2000-2004,
2002-2006, 2004-2008, 2006-2010, 2008-2012, 2010-2014
I use this query to get the data without any date range:
SELECT
aid, research_area_category_id,
CAST(research_area as VARCHAR(100)) [research_area],
COUNT(*) [Counting]
FROM
sub_aminer_paper
GROUP BY
CAST(research_area as VARCHAR(100)), aid, research_area_category_id
HAVING
aid = 12403
ORDER BY
Counting DESC
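This gives the output shown in the image:
Now, using a WHERE clause, I have to display the data for each date range in the column that corresponds to that range. I used this query: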
SELECT
aid, research_area_category_id,
[research_area] = CAST(research_area as VARCHAR(100)),
[Counting] = COUNT(*),
[1990 - 1994] = SUM(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN 1 ELSE 0 END),
[1992 - 1996] = SUM(CASE WHEN p_year BETWEEN 1992 AND 1996 THEN 1 ELSE 0 END),
[1994 - 1998] = SUM(CASE WHEN p_year BETWEEN 1994 AND 1998 THEN 1 ELSE 0 END),
[1996 - 2000] = SUM(CASE WHEN p_year BETWEEN 1996 AND 2000 THEN 1 ELSE 0 END),
[1998 - 2002] = SUM(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN 1 ELSE 0 END),
[2000 - 2004] = SUM(CASE WHEN p_year BETWEEN 2000 AND 2004 THEN 1 ELSE 0 END),
[2002 - 2006] = SUM(CASE WHEN p_year BETWEEN 2002 AND 2006 THEN 1 ELSE 0 END),
[2004 - 2008] = SUM(CASE WHEN p_year BETWEEN 2004 AND 2008 THEN 1 ELSE 0 END),
[2006 - 2010] = SUM(CASE WHEN p_year BETWEEN 2006 AND 2010 THEN 1 ELSE 0 END),
[2008 - 2012] = SUM(CASE WHEN p_year BETWEEN 2008 AND 2012 THEN 1 ELSE 0 END),
[2010 - 2014] = SUM(CASE WHEN p_year BETWEEN 2010 AND 2014 THEN 1 ELSE 0 END)
FROM
sub_aminer_paper
WHERE
aid = 2937
AND p_year BETWEEN 1990 AND 2014
GROUP BY
aid, CAST(research_area AS VARCHAR(100)), research_area_category_id
ORDER BY aid ASC, Counting DESC
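This query outputs:
But I need the research_area_category_id values under these columns (1990-1994, 1992-1996, 1994-1998, and so on). For example, in the 1990 - 1994 column it should show the corresponding research_area_category_id values, i.e. 1, 1 and 32, instead of the Counting values, i.e. 1, 1 and 1. Similarly, it should show 33 in the 1998 - 2002 column instead of 2, and likewise for the other columns.
Please help, and thanks in advance.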
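Tab Alleman has already mentioned the best approach in the comments, but I'll be cheeky and add it as an answer.
You've made it clear that you want the pivoted date columns to show the values from the research_area_category_id column. So the first step is to make each CASE expression return research_area_category_id rather than the integer 1: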
CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id ELSE 0 END
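If you run your code with only this change, you'll find that the SUM function makes the output a multiple of the research_area_category_id value. For example, the first row of 1998 - 2002 has the value 66 (twice 33).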
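The doubling can be reproduced with a small sketch. The table variable @paper and its three rows are hypothetical stand-ins for sub_aminer_paper (the p_year values are guesses), so treat this as an illustration rather than the real data:

```sql
-- Hypothetical sample rows; the real sub_aminer_paper table has more columns.
DECLARE @paper TABLE (aid INT, research_area_category_id INT, p_year INT);

INSERT INTO @paper (aid, research_area_category_id, p_year)
VALUES (2937, 33, 1999),
       (2937, 33, 2001),
       (2937,  1, 1992);

SELECT
    aid, research_area_category_id,
    [Counting] = COUNT(*),
    [1998 - 2002] = SUM(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END)
FROM
    @paper
GROUP BY
    aid, research_area_category_id
ORDER BY
    Counting DESC;
-- The group for category 33 contains two rows inside 1998 - 2002,
-- so SUM adds 33 + 33 and the column shows 66 instead of 33.
```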
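So this tells us you no longer want the SUM function. However, you still want to aggregate (group) the data from all rows that have different p_year values, so you have to use some other kind of aggregate function instead. Otherwise SQL Server will throw an error, because you are not grouping by p_year.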
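The simplest aggregate function to use here is MAX, which takes the largest value from the set of rows being grouped into one row. The official documentation has some simple examples.
This only works in your case if all values of research_area_category_id are positive (greater than the CASE expression's default of 0), which they appear to be.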
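If the positivity assumption ever fails, one possible variation (not part of the approach above, just a sketch) is to drop the ELSE branch so that non-matching rows produce NULL, which MAX ignores. The table variable @paper, the negative id -5 and the p_year values are made up for illustration:

```sql
-- Hypothetical sample rows, including a non-positive category id.
DECLARE @paper TABLE (aid INT, research_area_category_id INT, p_year INT);

INSERT INTO @paper (aid, research_area_category_id, p_year)
VALUES (2937, 33, 1999),
       (2937, 33, 2001),
       (2937, -5, 1992),
       (2937, -5, 2005);

SELECT
    aid, research_area_category_id,
    -- No ELSE branch: rows outside the range yield NULL, and MAX skips NULLs.
    [1990 - 1994] = MAX(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id END),
    [1998 - 2002] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id END)
FROM
    @paper
GROUP BY
    aid, research_area_category_id;
-- Category 33: [1990 - 1994] is NULL, [1998 - 2002] is 33.
-- Category -5: [1990 - 1994] is -5 (with ELSE 0 it would have been 0), [1998 - 2002] is NULL.
```

The trade-off is that empty ranges show NULL rather than 0, and SQL Server may print a warning that a NULL value was eliminated by an aggregate. Wrapping the MAX in ISNULL(..., 0) would restore the zeros if that matters for the report.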
Combining the change to the CASE expression with the change from SUM to MAX gives the following version of the query:
SELECT
aid, research_area_category_id,
[research_area] = CAST(research_area as VARCHAR(100)),
[Counting] = COUNT(*),
[1990 - 1994] = MAX(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id ELSE 0 END),
[1992 - 1996] = MAX(CASE WHEN p_year BETWEEN 1992 AND 1996 THEN research_area_category_id ELSE 0 END),
[1994 - 1998] = MAX(CASE WHEN p_year BETWEEN 1994 AND 1998 THEN research_area_category_id ELSE 0 END),
[1996 - 2000] = MAX(CASE WHEN p_year BETWEEN 1996 AND 2000 THEN research_area_category_id ELSE 0 END),
[1998 - 2002] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END),
[2000 - 2004] = MAX(CASE WHEN p_year BETWEEN 2000 AND 2004 THEN research_area_category_id ELSE 0 END),
[2002 - 2006] = MAX(CASE WHEN p_year BETWEEN 2002 AND 2006 THEN research_area_category_id ELSE 0 END),
[2004 - 2008] = MAX(CASE WHEN p_year BETWEEN 2004 AND 2008 THEN research_area_category_id ELSE 0 END),
[2006 - 2010] = MAX(CASE WHEN p_year BETWEEN 2006 AND 2010 THEN research_area_category_id ELSE 0 END),
[2008 - 2012] = MAX(CASE WHEN p_year BETWEEN 2008 AND 2012 THEN research_area_category_id ELSE 0 END),
[2010 - 2014] = MAX(CASE WHEN p_year BETWEEN 2010 AND 2014 THEN research_area_category_id ELSE 0 END)
FROM
sub_aminer_paper
WHERE
aid = 2937
AND p_year BETWEEN 1990 AND 2014
GROUP BY
aid, CAST(research_area AS VARCHAR(100)), research_area_category_id
ORDER BY aid ASC, Counting DESC
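If you're interested, I mocked up a few rows of data in this SQL fiddle to test the query before answering. (I was guessing at the values of p_year, but they demonstrate the principle, unless I've misunderstood your requirement.)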