Using the SUM OVER clause, how do I check the sum over a period only while the output has not exceeded a certain value, and otherwise use the current month's value?

Sample data:

select date, agent, sales
from agentsales

date                    agent   sales
2021-01-03 00:00:00.000 Agent A 10
2021-02-05 00:00:00.000 Agent A 15
2021-03-10 00:00:00.000 Agent A 10
2021-01-05 00:00:00.000 Agent B 5
2021-02-06 00:00:00.000 Agent B 28
2021-03-10 00:00:00.000 Agent B 5
2021-01-02 00:00:00.000 Agent C 35
2021-02-04 00:00:00.000 Agent C 25
2021-03-08 00:00:00.000 Agent C 15
2021-01-01 00:00:00.000 Agent D 5
2021-02-02 00:00:00.000 Agent D 35
2021-03-10 00:00:00.000 Agent D 31

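I want to get the count of agents who have crossed 30 sales, such that if an agent has never crossed 30 sales before, the sum of the current and previous months is considered; otherwise only the current month is considered.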

Expected output:

YrMon  Count_Agent_more_than_30_sales
Jan21  1
Feb21  2
Mar21  2

Logic:

Jan21 - 1 since only C has crossed 30 sales
Feb21 - 2 since B and D have crossed 30 sales. Agent D has crossed the 30 mark in the month, and B has crossed over period for first time. C is not considered as it previously crossed the 30 mark.
Mar21 - 2 since A and D have crossed 30 sales. Agent A has crossed mark over period for 1st time. D has crossed for the month. B is not considered as periodic case was already considered in last month. C is not considered as it already crossed 30 mark last month.

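As described above, I want the count of agents who have crossed 30 sales: if an agent has never crossed 30 sales, consider the sum of the current and previous months; otherwise consider only the current month.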

My query to compute the sum over the period:

;WITH CTE AS (SELECT CAST(YEAR([DATE]) AS VARCHAR)+' '+CAST(MONTH([DATE]) AS VARCHAR) YRMON, AGENT, SUM(SALES) SALES
  FROM AgentSales
  GROUP BY  CAST(YEAR([DATE]) AS VARCHAR)+' '+CAST(MONTH([DATE]) AS VARCHAR), AGENT
  )
  SELECT *, SUM(SALES) OVER(PARTITION BY AGENT ORDER BY YRMON) SUMOVERPERIOD FROM CTE
  ORDER BY 2,1

YRMON   AGENT   SALES   SUMOVERPERIOD
2021 1  Agent A 10      10
2021 2  Agent A 15      25
2021 3  Agent A 10      35
2021 1  Agent B 5       5
2021 2  Agent B 28      33
2021 3  Agent B 5       38
2021 1  Agent C 35      35
2021 2  Agent C 25      60
2021 3  Agent C 15      75
2021 1  Agent D 5       5
2021 2  Agent D 35      40
2021 3  Agent D 31      71

Now I try to apply the logic to the calculated sum:

   ;WITH CTE AS (SELECT CAST(YEAR([DATE]) AS VARCHAR)+' '+CAST(MONTH([DATE]) AS VARCHAR) YRMON, AGENT, SUM(SALES) SALES
  FROM AgentSales
  GROUP BY  CAST(YEAR([DATE]) AS VARCHAR)+' '+CAST(MONTH([DATE]) AS VARCHAR), AGENT
  )
  SELECT *, SUM(SALES) OVER(PARTITION BY AGENT ORDER BY YRMON) SUMOVERPERIOD,
  CASE WHEN SUM(SALES) OVER(PARTITION BY AGENT ORDER BY YRMON)>30 THEN 1 ELSE 0 END AS CALC
  FROM CTE
  ORDER BY 2,1

YRMON   AGENT   SALES   SUMOVERPERIOD   CALC
2021 1  Agent A 10      10              0
2021 2  Agent A 15      25              0
2021 3  Agent A 10      35              1
2021 1  Agent B 5       5               0
2021 2  Agent B 28      33              1
2021 3  Agent B 5       38              1
2021 1  Agent C 35      35              1
2021 2  Agent C 25      60              1
2021 3  Agent C 15      75              1
2021 1  Agent D 5       5               0
2021 2  Agent D 35      40              1
2021 3  Agent D 31      71              1

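This query always considers the sum of the current and previous periods.

How can I check whether the sales have already crossed the 30-sales mark, and in that case skip the sum over the period? For example, can we apply LAG to the result of the SUM OVER column?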

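It looks like this should work for you:

  • You need to pre-aggregate the sales per agent per month, then take a running sum of that aggregate.
  • Then, for each row, simply check whether the mark was crossed in that month by comparing the current month's sales with the running sum.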
SELECT
  YrMon = FORMAT(Month, 'yyyy MM'),
  Count_Agent_more_than_30_sales =
        COUNT(CASE WHEN (SumOverPeriod > 30 AND SumOverPeriod - sales <= 30) -- running total first exceeds 30 this month
                     OR sales > 30                                          -- or this month alone exceeds 30
                   THEN 1 END)
FROM (
    SELECT
      Month = EOMONTH(date),
      agent,
      sales = SUM(sales),
      SumOverPeriod = SUM(SUM(sales)) OVER (PARTITION BY agent ORDER BY EOMONTH(date)
          ROWS UNBOUNDED PRECEDING)
    FROM AgentSales
    GROUP BY EOMONTH(date), agent
) sales
GROUP BY Month
ORDER BY Month;
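
For anyone who wants to sanity-check the condition without a SQL Server instance, here is a minimal sketch that ports the idea to SQLite through Python's built-in sqlite3 module. It is an approximation rather than the query above: strftime('%Y-%m', ...) stands in for EOMONTH and FORMAT, the names monthly, running, sum_over_period and count_agents are made up for illustration, the comparison uses the strict "more than 30" reading from the question, and window functions assume SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question: (date, agent, sales).
ROWS = [
    ("2021-01-03", "Agent A", 10), ("2021-02-05", "Agent A", 15), ("2021-03-10", "Agent A", 10),
    ("2021-01-05", "Agent B", 5),  ("2021-02-06", "Agent B", 28), ("2021-03-10", "Agent B", 5),
    ("2021-01-02", "Agent C", 35), ("2021-02-04", "Agent C", 25), ("2021-03-08", "Agent C", 15),
    ("2021-01-01", "Agent D", 5),  ("2021-02-02", "Agent D", 35), ("2021-03-10", "Agent D", 31),
]

QUERY = """
WITH monthly AS (              -- one row per agent per month
    SELECT strftime('%Y-%m', date) AS yrmon, agent, SUM(sales) AS sales
    FROM AgentSales
    GROUP BY yrmon, agent
),
running AS (                   -- running total of the monthly totals
    SELECT yrmon, agent, sales,
           SUM(sales) OVER (PARTITION BY agent ORDER BY yrmon
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS sum_over_period
    FROM monthly
)
SELECT yrmon,
       COUNT(CASE WHEN (sum_over_period > :limit
                        AND sum_over_period - sales <= :limit)  -- first crossing
                    OR sales > :limit                            -- month alone crosses
                  THEN 1 END) AS agents
FROM running
GROUP BY yrmon
ORDER BY yrmon
"""


def count_agents(rows, limit=30):
    """Return [(yrmon, agent_count), ...] for the given sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE AgentSales (date TEXT, agent TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO AgentSales VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY, {"limit": limit}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for yrmon, agents in count_agents(ROWS):
        print(yrmon, agents)
```

On the sample data this gives 1, 2 and 2 for January, February and March, which matches the expected output in the question. Note that sum_over_period - sales is simply the previous month's running total, so no separate LAG step is needed.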


Please check whether one of these fits your needs (I find the description confusing).

Option 1

-- If you want to count only the first time [agent] crossed 30 sales
;With MyCTE01 as (
    SELECT 
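        [date] = DATEADD(DAY, 1, EOMONTH([date], -1)), -- first day of the row's own month (EOMONTH(..., -1) alone would label it with the previous month)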
        [agent],[sales], 
        S = SUM([sales]) OVER (PARTITION BY [agent] ORDER BY [date] ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW)
    FROM [AgentSales]
),
MyCTE02 as (
    SELECT [date],[agent],[sales], S
    FROM MyCTE01
    -- The idea of using "and S - [sales] < 30" instead of ROW_NUMBER came from @Charlieface, but it is better to do the work on DATE data type and not on string
    WHERE S > 30 and S - [sales] <= 30
)
SELECT DATENAME(month,[Date]), YEAR([Date]), COUNT(*) 
FROM MyCTE02
GROUP BY [date]
GO

Option 2

-- If you want to count all the [agent] values that have crossed 30 sales so far
;With MyCTE01 as (
    SELECT 
        [date] = DATEADD(DAY, 1, EOMONTH([date], -1)),
        [agent],[sales], 
        S = SUM([sales]) OVER (PARTITION BY [agent] ORDER BY [date] ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW)
    FROM [AgentSales]
)
,MyCTE02 as (
    SELECT [date],[agent],[sales], S
    FROM MyCTE01
    WHERE S > 30
)
SELECT DATENAME(month,[Date]), YEAR([Date]), COUNT(*) 
FROM MyCTE02
GROUP BY [date]
GO

Option 3

-- If you want to count only the first time [agent] crossed 30 sales, or any month where the sales are over 30
;With MyCTE01 as (
    SELECT 
        [date] = DATEADD(DAY,1,EOMONTH([date], -1)),
        [agent],[sales], 
        S = SUM([sales]) OVER (PARTITION BY [agent] ORDER BY [date] ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW)
    FROM [AgentSales]
)
,MyCTE02 as (
    SELECT [date],[agent],[sales], S
    FROM MyCTE01
    -- The idea of using "and S - [sales] < 30" instead of ROW_NUMBER came from @Charlieface, but it is better to do the work on DATE data type and not on string
    WHERE (S > 30 and S - [sales] <= 30) or sales > 30
)
SELECT DATENAME(month,[Date]), YEAR([Date]), COUNT(*) 
FROM MyCTE02
GROUP BY [date]
GO
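
To see how the three options differ on the sample data, here is a rough side-by-side sketch, again using SQLite through Python's sqlite3 module rather than SQL Server. Several things here are assumptions for illustration: the filters are evaluated on monthly totals instead of raw rows (the two coincide in the sample, which has one row per agent per month), strftime('%Y-%m', ...) replaces the EOMONTH-based month key, the "not crossed before" test is written as `<= 30` so that a previous total of exactly 30 still counts as not yet crossed, and the names compare_options, option_1, option_2 and option_3 are hypothetical. It relies on SQLite evaluating comparisons to 0 or 1, and needs SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question: (date, agent, sales).
ROWS = [
    ("2021-01-03", "Agent A", 10), ("2021-02-05", "Agent A", 15), ("2021-03-10", "Agent A", 10),
    ("2021-01-05", "Agent B", 5),  ("2021-02-06", "Agent B", 28), ("2021-03-10", "Agent B", 5),
    ("2021-01-02", "Agent C", 35), ("2021-02-04", "Agent C", 25), ("2021-03-08", "Agent C", 15),
    ("2021-01-01", "Agent D", 5),  ("2021-02-02", "Agent D", 35), ("2021-03-10", "Agent D", 31),
]

QUERY = """
WITH monthly AS (              -- one row per agent per month
    SELECT strftime('%Y-%m', date) AS yrmon, agent, SUM(sales) AS sales
    FROM AgentSales
    GROUP BY yrmon, agent
),
running AS (                   -- s = running total per agent
    SELECT yrmon, agent, sales,
           SUM(sales) OVER (PARTITION BY agent ORDER BY yrmon
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS s
    FROM monthly
)
SELECT yrmon,
       -- Option 1: first time the running total crosses 30
       SUM(s > 30 AND s - sales <= 30)                 AS option_1,
       -- Option 2: running total is above 30 in this month
       SUM(s > 30)                                     AS option_2,
       -- Option 3: first crossing, or the month alone is above 30
       SUM((s > 30 AND s - sales <= 30) OR sales > 30) AS option_3
FROM running
GROUP BY yrmon
ORDER BY yrmon
"""


def compare_options(rows):
    """Return [(yrmon, option_1, option_2, option_3), ...]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE AgentSales (date TEXT, agent TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO AgentSales VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in compare_options(ROWS):
        print(row)
```

On the sample data the three columns come out as 1/2/1, 1/3/4 and 1/2/2 for January/February/March, so only the Option 3 filter reproduces the 1, 2, 2 the question expects.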

DDL+DML

USE tempdb
GO

DROP TABLE IF EXISTS [AgentSales]
GO
CREATE TABLE [AgentSales](id INT IDENTITY(1,1), [date] DATE, agent VARCHAR(100), sales INT)
GO
INSERT [AgentSales]([date],[agent],[sales]) VALUES
('2021-01-03 00:00:00.000','Agent A', 10),
('2021-02-05 00:00:00.000','Agent A', 15),
('2021-03-10 00:00:00.000','Agent A',10),
('2021-01-05 00:00:00.000','Agent B',5 ),
('2021-02-06 00:00:00.000','Agent B',28),
('2021-03-10 00:00:00.000','Agent B',5 ),
('2021-01-02 00:00:00.000','Agent C',35),
('2021-02-04 00:00:00.000','Agent C',25),
('2021-03-08 00:00:00.000','Agent C',15),
('2021-01-01 00:00:00.000','Agent D',5 ),
('2021-02-02 00:00:00.000','Agent D',35),
('2021-03-10 00:00:00.000','Agent D',31)
GO

SELECT [id],[date],[agent],[sales]
FROM [AgentSales]
GO