SQL: Group by date and sum values in a column
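I have a database accessed through JDBC (DB2 specifically, but I'm looking for something database-agnostic, or at least something that works on both DB2 and Oracle). It has a table into which records are inserted every 10 minutes, holding statistics about the APIs an application is running. It looks like: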
StatKey, StartDate, EndDate, APIName, StatName, StatValue
201505071498224437562706 2015-05-07 14:12:44.0 2015-05-07 14:22:44.0 API5 Invocations 34
201505071498161437466684 2015-05-07 14:06:14.0 2015-05-07 14:16:14.0 API4 Invocations 79
201505071498060937466556 2015-05-07 13:56:08.0 2015-05-07 14:06:08.0 API4 Average 26,264.37
201505071497263437627286 2015-05-07 14:16:33.0 2015-05-07 14:26:34.0 API2 Invocations 24
201505071497262137620812 2015-05-07 14:16:19.0 2015-05-07 14:26:20.0 API2 Invocations 24
201505071497024537466378 2015-05-07 13:52:43.0 2015-05-07 14:02:44.0 API1 Average 6,830,050
201505071497023337466368 2015-05-07 13:52:31.0 2015-05-07 14:02:32.0 API3 Average 31,523
201505071496023337466361 2015-05-07 13:52:31.0 2015-05-07 14:02:32.0 API2 Invocations 1
201505071494263837628892 2015-05-07 14:16:36.0 2015-05-07 14:26:37.0 API5 Invocations 68
201505071493124437466656 2015-05-07 14:02:44.0 2015-05-07 14:12:44.0 API1 Invocations 2
201505071492263037625304 2015-05-07 14:16:29.0 2015-05-07 14:26:30.0 API3 Average 179,223.29
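Every 10 minutes, any API that was executed during that period gets an entry like the ones above. However, multiple JVMs write to the same database, so the start and end times do not fall on simple 10-minute boundaries, and there can be more than 6 entries per hour.
What I would like to do is write a SQL query that groups all invocations of every API by each hour of run time. For example: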
Date&Hour, API, Invocations
2015-05-07 12:00, API1, 100
2015-05-07 12:00, API2, 150
2015-05-07 13:00, API2, 200
etc...
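I have tried a GROUP BY on a SUBSTR of the primary key at the hour mark (the key is always a timestamp plus some random digits, with 2 random digits between the hour and the minute), but I'm not sure how to add up all of the StatName=Invocations rows for each hour.
Can someone offer some ideas on how I could accomplish this?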
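Not quite sure whether this is what you're after?
Essentially, it looks at the first 10 characters of the key (YYYYMMDDHH), since they contain the value to group by, and then sums only the Invocations rows.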
SELECT substr(statKey,1,10) as DH, APIName, Sum(Statvalue) Invocations
FROM TableName
WHERE StatName = 'Invocations'
GROUP BY substr(statKey,1,10), APIName, StatName
Example:
WITH CTE AS
(SELECT '201505071498224437562706' AS StatKey,
'2015-05-07 14:12:44.0' AS StartDate,
'2015-05-07 14:22:44.0' AS EndDate,
'API5' AS APIName,
'Invocations' AS StatName,
34 AS statvalue
FROM dual
UNION ALL
SELECT '201505071498161437466684',
'2015-05-07 14:06:14.0',
'2015-05-07 14:16:14.0',
'API4',
'Invocations',
79
FROM dual
UNION ALL
SELECT '201505071498060937466556',
'2015-05-07 13:56:08.0',
'2015-05-07 14:06:08.0',
'API4',
'Average',
26264.37
FROM dual
)
SELECT substr(statKey,1,10) as DH, APIName, StatName, Sum(Statvalue)
FROM CTE
WHERE StatName = 'Invocations'
GROUP BY substr(statKey,1,10), APIName, StatName
At least for DB2, why not:
select date(startdate) as start_date
, hour(startdate) as start_hour
, APIName as API
, sum(statvalue) as Invocations
from mytbl
where statname = 'Invocations'
group by date(startdate), hour(startdate), APIName
I'll leave it as an exercise for you to recombine the date and hour into a timestamp, if you really want that...
Another possible solution:
select to_char(StartDate,'rrrr-mm-dd HH24:')||'00' as DateHour,
APIName as API,
sum(StatValue) as Invocations
from STATISTICS
where StatName = 'Invocations'
group by to_char(StartDate,'rrrr-mm-dd HH24:')||'00', APIName
There are several ways to do this.
Good luck!
Oracle 11g R2 Schema Setup:
CREATE TABLE Data AS
SELECT '201505071498224437562706' AS StatKey, TO_DATE( '2015-05-07 14:12:44', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:22:44', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API5' AS APIName, 'Invocations' AS StatName, 34 AS StatValue FROM DUAL
UNION ALL SELECT '201505071498161437466684' AS StatKey, TO_DATE( '2015-05-07 14:06:14', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:16:14', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API4' AS APIName, 'Invocations' AS StatName, 79 AS StatValue FROM DUAL
UNION ALL SELECT '201505071498060937466556' AS StatKey, TO_DATE( '2015-05-07 13:56:08', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:06:08', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API4' AS APIName, 'Average' AS StatName, 26264.37 AS StatValue FROM DUAL
UNION ALL SELECT '201505071497263437627286' AS StatKey, TO_DATE( '2015-05-07 14:16:33', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:34', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API2' AS APIName, 'Invocations' AS StatName, 24 AS StatValue FROM DUAL
UNION ALL SELECT '201505071497262137620812' AS StatKey, TO_DATE( '2015-05-07 14:16:19', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:20', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API2' AS APIName, 'Invocations' AS StatName, 24 AS StatValue FROM DUAL
UNION ALL SELECT '201505071497024537466378' AS StatKey, TO_DATE( '2015-05-07 13:52:43', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:02:44', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API1' AS APIName, 'Average' AS StatName, 6830050 AS StatValue FROM DUAL
UNION ALL SELECT '201505071497023337466368' AS StatKey, TO_DATE( '2015-05-07 13:52:31', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:02:32', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API3' AS APIName, 'Average' AS StatName, 31523 AS StatValue FROM DUAL
UNION ALL SELECT '201505071496023337466361' AS StatKey, TO_DATE( '2015-05-07 13:52:31', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:02:32', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API2' AS APIName, 'Invocations' AS StatName, 1 AS StatValue FROM DUAL
UNION ALL SELECT '201505071494263837628892' AS StatKey, TO_DATE( '2015-05-07 14:16:36', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:37', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API5' AS APIName, 'Invocations' AS StatName, 68 AS StatValue FROM DUAL
UNION ALL SELECT '201505071493124437466656' AS StatKey, TO_DATE( '2015-05-07 14:02:44', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:12:44', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API1' AS APIName, 'Invocations' AS StatName, 2 AS StatValue FROM DUAL
UNION ALL SELECT '201505071492263037625304' AS StatKey, TO_DATE( '2015-05-07 14:16:29', 'YYYY-MM-DD HH24:MI:SS' ) AS StartDate, TO_DATE( '2015-05-07 14:26:30', 'YYYY-MM-DD HH24:MI:SS' ) AS EndDate, 'API3' AS APIName, 'Average' AS StatName, 179223.29 AS StatValue FROM DUAL;
Query 1:
SELECT TRUNC( EndDate, 'HH' ) AS "Date&Hour",
APIName,
SUM( StatValue ) AS Invocations
FROM Data
WHERE StatName = 'Invocations'
GROUP BY TRUNC( EndDate, 'HH' ),
APIName
| Date&Hour | APINAME | INVOCATIONS |
|-----------------------|---------|-------------|
| May, 07 2015 14:00:00 | API2 | 49 |
| May, 07 2015 14:00:00 | API5 | 102 |
| May, 07 2015 14:00:00 | API1 | 2 |
| May, 07 2015 14:00:00 | API4 | 79 |
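As a cross-check on the totals above, here is a minimal Python sketch (not part of the original answer) that buckets the seven sample Invocations rows by hour on the client side. The names `invocation_rows` and `hourly_totals` are made up for illustration. It also shows that the result depends on which column is truncated: Query 1 uses EndDate, while some of the other answers use StartDate, and one API2 row starts at 13:52 but ends at 14:02.

```python
from collections import defaultdict
from datetime import datetime

# (StartDate, EndDate, APIName, StatValue) for the sample rows
# whose StatName is 'Invocations'.
invocation_rows = [
    ("2015-05-07 14:12:44", "2015-05-07 14:22:44", "API5", 34),
    ("2015-05-07 14:06:14", "2015-05-07 14:16:14", "API4", 79),
    ("2015-05-07 14:16:33", "2015-05-07 14:26:34", "API2", 24),
    ("2015-05-07 14:16:19", "2015-05-07 14:26:20", "API2", 24),
    ("2015-05-07 13:52:31", "2015-05-07 14:02:32", "API2", 1),
    ("2015-05-07 14:16:36", "2015-05-07 14:26:37", "API5", 68),
    ("2015-05-07 14:02:44", "2015-05-07 14:12:44", "API1", 2),
]


def hourly_totals(rows, use_end=True):
    """Sum StatValue per (hour bucket, API), truncating EndDate or StartDate."""
    totals = defaultdict(int)
    for start, end, api, value in rows:
        ts = datetime.strptime(end if use_end else start, "%Y-%m-%d %H:%M:%S")
        # Truncate to the hour, like TRUNC(EndDate, 'HH') in Query 1.
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        totals[(bucket.strftime("%Y-%m-%d %H:%M"), api)] += value
    return dict(totals)


by_end = hourly_totals(invocation_rows, use_end=True)
by_start = hourly_totals(invocation_rows, use_end=False)

for key in sorted(by_end):
    print(key, by_end[key])
```

Bucketing by EndDate reproduces the four rows of the result table. Bucketing by StartDate instead splits API2 into 1 invocation at 13:00 and 48 at 14:00, so it is worth deciding up front which column defines "the hour".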
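Date functions seem hard to implement in a database-agnostic way.
For a database-agnostic solution, I would suggest creating views in the database to hide the database-specific code, which lets you use a plain SELECT without any syntax problems.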
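A rough sketch of the view idea, using Python's built-in sqlite3 module purely as a stand-in engine (an assumption for illustration; the question is about DB2 and Oracle). The view name `HourlyStats` and its `DateHour` column are hypothetical. Only the view body would differ per database, for example `TRUNC(EndDate, 'HH')` on Oracle as in the earlier answer, while the query the application sends stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Data (StatKey TEXT, StartDate TEXT, EndDate TEXT,
                   APIName TEXT, StatName TEXT, StatValue REAL);
INSERT INTO Data VALUES
 ('201505071498224437562706','2015-05-07 14:12:44','2015-05-07 14:22:44','API5','Invocations',34),
 ('201505071498161437466684','2015-05-07 14:06:14','2015-05-07 14:16:14','API4','Invocations',79),
 ('201505071498060937466556','2015-05-07 13:56:08','2015-05-07 14:06:08','API4','Average',26264.37),
 ('201505071497263437627286','2015-05-07 14:16:33','2015-05-07 14:26:34','API2','Invocations',24),
 ('201505071497262137620812','2015-05-07 14:16:19','2015-05-07 14:26:20','API2','Invocations',24),
 ('201505071496023337466361','2015-05-07 13:52:31','2015-05-07 14:02:32','API2','Invocations',1),
 ('201505071494263837628892','2015-05-07 14:16:36','2015-05-07 14:26:37','API5','Invocations',68),
 ('201505071493124437466656','2015-05-07 14:02:44','2015-05-07 14:12:44','API1','Invocations',2);

-- The only engine-specific piece: truncating a timestamp to the hour.
-- This body uses SQLite's strftime; each database gets its own version.
CREATE VIEW HourlyStats AS
SELECT strftime('%Y-%m-%d %H:00', EndDate) AS DateHour,
       APIName, StatName, StatValue
FROM Data;
""")

# The application-side query uses only plain SQL against the view.
PORTABLE_QUERY = """
SELECT DateHour, APIName, SUM(StatValue) AS Invocations
FROM HourlyStats
WHERE StatName = 'Invocations'
GROUP BY DateHour, APIName
ORDER BY DateHour, APIName
"""

rows = conn.execute(PORTABLE_QUERY).fetchall()
for row in rows:
    print(row)
```

The trade-off is that the view has to be created and maintained once per database, but the JDBC code then never contains a vendor-specific date function.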