如何根据特定的复杂规则计算行数?

How to COUNT rows according to specific complicated rules?

我有以下 table:

custid  custname  channelid  channel  dateViewed
--------------------------------------------------------------
1          A       1         ABSS     2016-01-09  
2          B       2         STHHG    2016-01-19 
3          C       4         XGGTS    2016-01-09 
6          D       4         XGGTS    2016-01-09 
2          B       2         STHHG    2016-01-26 
2          B       2         STHHG    2016-01-28 
1          A       3         SSJ      2016-01-28 
1          A       1         ABSS     2016-01-28 
2          B       2         STHHG    2016-02-02
2          B       7         UUJKS    2016-02-10
2          B       8         AKKDC    2016-02-10
2          B       9         GGSK     2016-02-10
2          B       9         GGSK     2016-02-11
2          B       7         UUJKS    2016-02-27

我希望结果是:

custid  custname  month count  
------------------------------
1          A       1     1
2          B       1     1      
2          B       2     4     
3          C       1     1
6          D       1     1

根据以下规则:

这可以在 SQL 服务器上完成吗?

每当我尝试用复杂的标准计算事物时,我都会使用求和和案例语句。如下所示:

SELECT custid, custname, 
    SUM(CASE WHEN somecriteria
        THEN 1
        ELSE 0
        END) As CriteriaCount
FROM whateverTable
GROUP BY custid, custname

您可以根据需要使 somecriteria 变量成为一个复杂的语句,只要它 returns 是一个布尔值。如果通过,则此行 returns 为 1。如果失败,则该行返回 0,然后我们将返回的值相加以获得计数。

@Sturgus what if I want to define it in the code? Any other alternatives besides defining it in the table? How to write a query that can be run every month to generate the monthly invoice. – saturday 15 mins ago

好吧,无论如何,您都必须保存每个客户的账单开始日期(最少)。如果您想完全在 SQL 中而不是 'editing the database' 中执行此操作,则类似以下的内容应该有效。这种方法的缺点是您需要每月手动编辑 "INSERT INTO" 语句以满足您的需要。如果允许您编辑现有客户 table 或创建新客户,则可以减少这种手动工作量。

DECLARE @CustomerBillingPeriodsTVP AS Table(
   custID int UNIQUE,
   BillingCycleID int,
   BillingStartDate Date,
   BillingEndDate Date
);
INSERT INTO @CustomerBillingPeriodsTVP (custID, BillingCycleID, BillingStartDate, BillingEndDate) VALUES
 (1, 1, '2016-01-03', '2016-01-18'), (2, 1, '2016-01-18', '2016-02-03'), (3, 1, '2016-01-15', '2016-01-30'), (6, 1, '2016-01-14', '2016-01-29');

SELECT A.custid, A.custname, B.BillingCycleID AS [month], COUNT(DISTINCT A.channelid) AS [count]
FROM dbo.tblCustomerChannelViews AS A INNER JOIN @CustomerBillingPeriodsTVP AS B ON A.custid = B.CustID
GROUP BY A.custid, A.custname, B.BillingCycleID;
GO

您从哪里获得客户的账单开始日期?

一般来说,这就是您如何获得从给定日期(本例中为@dd)开始的固定 15 天间隔的任意数量(本例中为 10)。

DECLARE @dd date = CAST('2016-01-19 17:30' AS DATE);

WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), 
E4(N) AS (SELECT 1 FROM E2 a, E2 b),  --10,000 rows max
tally(N) AS (SELECT TOP (10) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT
    startd = DATEADD(D,(N-1)*15, @dd),
    endd = DATEADD(D, N*15-1, @dd)    
FROM tally

使其适应定义必须如何为用户(可能还有香奈儿)计算开始日期的规则。

我不确定这个解决方案将如何扩展 - 但有一些好的索引候选和体面的数据管理,它会起作用..

您将需要一些额外的信息供初学者使用,并规范化您的数据。您需要知道每个客户的第一个收费期开始日期。所以将其存储在客户 table.

这是我使用的table:

create table #channelViews
(
    custId int, channelId int, viewDate datetime
)
create table #channel 
(
    channelId int, channelName varchar(max)
)
create table #customer 
(
    custId int, custname varchar(max), chargingStartDate datetime
)

我将填充一些数据。我不会得到与您的示例输出相同的结果,因为我没有适合每个客户的开始日期。不过客户 2 会没事的。

insert into #channel (channelId, channelName)
select 1, 'ABSS'
union select 2, 'STHHG'
union select 4, 'XGGTS'
union select 3, 'SSJ'
union select 7, 'UUJKS'
union select 8, 'AKKDC'
union select 9, 'GGSK'

insert into #customer (custId, custname, chargingStartDate)
select 1, 'A', '4 Jan 2016'
union select 2, 'B', '19 Jan 2016'
union select 3, 'C', '5 Jan 2016'
union select 6, 'D', '5 Jan 2016'

insert into #channelViews (custId, channelId, viewDate)
select 1,1,'2016-01-09'  
union select 2,2,'2016-01-19' 
union select 3,4,'2016-01-09' 
union select 6,4,'2016-01-09' 
union select 2,2,'2016-01-26' 
union select 2,2,'2016-01-28' 
union select 1,3,'2016-01-28' 
union select 1,1,'2016-01-28' 
union select 2,2,'2016-02-02'
union select 2,7,'2016-02-10'
union select 2,8,'2016-02-10'
union select 2,9,'2016-02-10'
union select 2,9,'2016-02-11'
union select 2,7,'2016-02-27'

这里是一个有点笨拙的查询,在一个语句中。 这两个底层子查询实际上是相同的数据,因此可能有更合适/更有效的方式来生成它们。

我们需要从计费中排除在上个月的同一收费期 C 中收费的任何频道。这就是联结的本质。我使用了右连接,以便可以从结果中排除所有此类匹配项(使用 old.custId is null)。

select c.custId, c.[custname], [month], count(*) [count] from 
(
    select new.custId, new.channelId, new.month, new.chargingPeriod
    from 
        (
        select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
        from #channelViews cv join #customer c on cv.custId = c.custId
        ) old
    right join 
        (
        select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
        from #channelViews cv join #customer c on cv.custId = c.custId
        )  new
    on old.custId = new.custId 
        and old.channelId = new.channelId 
        and old.month = new.Month -1
        and old.chargingPeriod = new.chargingPeriod

    where old.custId is null
    group by new.custId, new.month, new.chargingPeriod, new.channelId

) filteredResults
join #customer c on c.custId = filteredResults.custId
group by c.custId, [month], c.custname
order by c.custId, [month], c.custname

最后是我的结果:

custId custname month count
1      A        1     3
2      B        1     1
2      B        2     4
3      C        1     1
6      D        1     1

这个查询做同样的事情:

select c.custId, c.custname, [month], count(*) from 
(
    select cv.custId, min(month(viewdate)) [month], cv.channelId
    from #channelViews cv join #customer c on cv.custId = c.custId
    group by cv.custId, cv.channelId, (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15
) x
join #customer c 
 on c.custId = x.custId
group by c.custId, c.custname, x.[month]
order by custId, [month]
;WITH cte AS (
    SELECT  custid,
            custname, 
            channelid, 
            channel, 
            dateViewed,
            CAST(DATEADD(day,15,dateViewed) as date) as dateEnd,
            ROW_NUMBER() OVER (PARTITION BY custid, channelid ORDER BY dateViewed) AS rn
    FROM (VALUES
        (1, 'A', 1, 'ABSS', '2016-01-09'),(2, 'B', 2, 'STHHG', '2016-01-19'),
        (3, 'C', 4, 'XGGTS', '2016-01-09'),(6, 'D', 4, 'XGGTS', '2016-01-09'),
        (2, 'B', 2, 'STHHG', '2016-01-26'),(2, 'B', 2, 'STHHG', '2016-01-28'),
        (1, 'A', 3, 'SSJ', '2016-01-28'),(1, 'A', 1, 'ABSS', '2016-01-28'),
        (2, 'B', 2, 'STHHG', '2016-02-02'),(2, 'B', 7, 'UUJKS', '2016-02-10'),
        (2, 'B', 8, 'AKKDC', '2016-02-10'),(2, 'B', 9, 'GGSK', '2016-02-10'),
        (2, 'B', 9, 'GGSK', '2016-02-11'),(2, 'B', 7, 'UUJKS', '2016-02-27')
        ) as t(custid, custname, channelid, channel, dateViewed)
), res AS (
    SELECT custid, channelid, dateViewed, dateEnd, 1 as Lev
    FROM cte
    WHERE rn = 1
    UNION ALL
    SELECT c.custid, c.channelid, c.dateViewed, c.dateEnd, lev + 1
    FROM res r
    INNER JOIN cte c ON c.dateViewed > r.dateEnd and c.custid = r.custid and c.channelid = r.channelid
), final AS (
    SELECT  * ,
            ROW_NUMBER() OVER (PARTITION BY custid, channelid, lev ORDER BY dateViewed) rn,
            DENSE_RANK() OVER (ORDER BY custid, channelid, dateEnd) dr
    FROM res
)

SELECT  b.custid,
        b.custname, 
        MONTH(f.dateViewed) as [month],
        COUNT(distinct dr) as [count]
FROM cte b
LEFT JOIN final f
    ON b.channelid = f.channelid and b.custid = f.custid and b.dateViewed between f.dateViewed and f.dateEnd
WHERE f.rn = 1
GROUP BY b.custid,
        b.custname, 
        MONTH(f.dateViewed)

输出:

custid      custname month       count
----------- -------- ----------- -----------
1           A        1           3
2           B        1           1
2           B        2           4
3           C        1           1
6           D        1           1

(5 row(s) affected)

我不知道为什么客户 Acount 字段中得到 1。他得到了:

ABSS     2016-01-09 +1 to count (+15 days = 2016-01-24)
SSJ      2016-01-28 +1 to count
ABSS     2016-01-28 +1 to count (28-01 > 24.01)

所以一月份肯定有count = 3.