How to COUNT rows according to specific complicated rules?
I have the following table:
custid  custname  channelid  channel  dateViewed
------------------------------------------------
1       A         1          ABSS     2016-01-09
2       B         2          STHHG    2016-01-19
3       C         4          XGGTS    2016-01-09
6       D         4          XGGTS    2016-01-09
2       B         2          STHHG    2016-01-26
2       B         2          STHHG    2016-01-28
1       A         3          SSJ      2016-01-28
1       A         1          ABSS     2016-01-28
2       B         2          STHHG    2016-02-02
2       B         7          UUJKS    2016-02-10
2       B         8          AKKDC    2016-02-10
2       B         9          GGSK     2016-02-10
2       B         9          GGSK     2016-02-11
2       B         7          UUJKS    2016-02-27
I want the result to be:
custid  custname  month  count
------------------------------
1       A         1      1
2       B         1      1
2       B         2      4
3       C         1      1
6       D         1      1
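Based on the following rules:
- All channel view subscriptions are billed every 15 days. If the customer watched the same channel within those 15 days, he is billed only once for that channel. For example, for custid 2, custname B, the billing cycles are 19 Jan-3 Feb (one billing cycle), 4 Feb-20 Feb (one billing cycle), and so on. He is therefore billed only 1 time in January, since he watched the same channel throughout the billing cycle. He is billed 4 times in February: for watching channelid 7, 8 and 9, and for watching channelid 7 again on 27 Feb (this falls in another billing cycle, so customer B is charged here too). Customer B is not charged for watching channel 2 on 2 Feb, since he was already billed for it in the 19 Jan-3 Feb billing period.
- An invoice is generated every month for each customer, so the result should display the 'Month' and the 'Count' of channels viewed for each customer.
Can this be done in SQL Server?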
Whenever I need to count things by complex criteria, I use SUM with a CASE expression. Something like this:
SELECT custid, custname,
SUM(CASE WHEN somecriteria
THEN 1
ELSE 0
END) As CriteriaCount
FROM whateverTable
GROUP BY custid, custname
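You can make somecriteria as complicated a condition as you need, as long as it evaluates to true or false. If the condition passes, the row yields 1. If it fails, the row yields 0. Summing those values then gives the count.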
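As a concrete illustration of the pattern, here is a minimal sketch that counts only the February views per customer. The table variable @Views and its sample rows are assumptions made up for this example, not the asker's real table.

```sql
-- Hypothetical sample data; @Views stands in for the real views table.
DECLARE @Views TABLE (custid int, custname varchar(10), channelid int, dateViewed date);
INSERT INTO @Views VALUES
    (1, 'A', 1, '2016-01-09'),
    (1, 'A', 3, '2016-01-28'),
    (2, 'B', 2, '2016-01-19'),
    (2, 'B', 7, '2016-02-10');

-- Each row contributes 1 when the condition holds and 0 otherwise,
-- so the SUM is the number of rows that satisfy the condition.
SELECT custid, custname,
       SUM(CASE WHEN dateViewed >= '2016-02-01' AND dateViewed < '2016-03-01'
                THEN 1
                ELSE 0
           END) AS FebruaryViews
FROM @Views
GROUP BY custid, custname;
```

With these rows, customer A gets 0 and customer B gets 1. Note that this counts raw views; it does not yet apply the 15-day billing rule.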
@Sturgus what if I want to define it in the code? Any other
alternatives besides defining it in the table? How to write a query
that can be run every month to generate the monthly invoice. –
saturday 15 mins ago
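Well, either way you will have to store the billing start date for each customer (at a minimum). If you want to do this entirely in SQL rather than by 'editing the database', something like the following should work. The downside of this approach is that you need to edit the "INSERT INTO" statement by hand every month to suit your needs. If you are allowed to edit the existing customer table or create a new one, that manual effort can be reduced.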
DECLARE @CustomerBillingPeriodsTVP AS Table(
custID int UNIQUE,
BillingCycleID int,
BillingStartDate Date,
BillingEndDate Date
);
INSERT INTO @CustomerBillingPeriodsTVP (custID, BillingCycleID, BillingStartDate, BillingEndDate) VALUES
(1, 1, '2016-01-03', '2016-01-18'), (2, 1, '2016-01-18', '2016-02-03'), (3, 1, '2016-01-15', '2016-01-30'), (6, 1, '2016-01-14', '2016-01-29');
SELECT A.custid, A.custname, B.BillingCycleID AS [month], COUNT(DISTINCT A.channelid) AS [count]
FROM dbo.tblCustomerChannelViews AS A INNER JOIN @CustomerBillingPeriodsTVP AS B ON A.custid = B.CustID
GROUP BY A.custid, A.custname, B.BillingCycleID;
GO
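One way to make the billing windows actually restrict which views are counted is to put the date range into the join condition. The following is only a sketch under assumptions: @Views stands in for dbo.tblCustomerChannelViews, @Periods is a hypothetical variant of the billing table that allows several cycles per customer, and the window dates are invented 15-day ranges starting on 19 Jan.

```sql
-- Hypothetical sample data for customer B only.
DECLARE @Views TABLE (custid int, custname varchar(10), channelid int, dateViewed date);
INSERT INTO @Views VALUES
    (2, 'B', 2, '2016-01-19'), (2, 'B', 2, '2016-01-26'),
    (2, 'B', 2, '2016-02-02'), (2, 'B', 7, '2016-02-10'),
    (2, 'B', 8, '2016-02-10'), (2, 'B', 7, '2016-02-27');

-- Assumed billing windows: one row per customer and cycle, 15 days each.
DECLARE @Periods TABLE (
    custid int,
    BillingCycleID int,
    BillingStartDate date,
    BillingEndDate date,
    UNIQUE (custid, BillingCycleID)
);
INSERT INTO @Periods VALUES
    (2, 1, '2016-01-19', '2016-02-02'),
    (2, 2, '2016-02-03', '2016-02-17'),
    (2, 3, '2016-02-18', '2016-03-03');

-- A view is matched to the cycle whose date range contains it, and each
-- channel is counted once per cycle.
SELECT v.custid, v.custname, p.BillingCycleID,
       COUNT(DISTINCT v.channelid) AS channelsBilled
FROM @Views AS v
INNER JOIN @Periods AS p
    ON p.custid = v.custid
   AND v.dateViewed BETWEEN p.BillingStartDate AND p.BillingEndDate
GROUP BY v.custid, v.custname, p.BillingCycleID
ORDER BY v.custid, p.BillingCycleID;
```

With this sample data the counts per cycle are 1, 2 and 1. Rolling those per-cycle charges up into calendar months would still need an extra step.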
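Where do you get the customer's billing start date from?
In general, this is how you can generate any number (10 in this example) of fixed 15-day intervals starting from a given date (@dd in this example).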
DECLARE @dd date = CAST('2016-01-19 17:30' AS DATE);
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
E2(N) AS (SELECT 1 FROM E1 a, E1 b),
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000 rows max
tally(N) AS (SELECT TOP (10) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT
startd = DATEADD(D,(N-1)*15, @dd),
endd = DATEADD(D, N*15-1, @dd)
FROM tally
Adapt it to whatever rules define how the start date must be calculated for the user (and possibly the channel).
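A possible adaptation, sketched under assumptions, is to cross join the numbers against a per-customer start date. The @Customers table variable and its start dates are hypothetical, and a short VALUES list replaces the tally CTE to keep the example small.

```sql
-- Hypothetical per-customer billing start dates.
DECLARE @Customers TABLE (custid int PRIMARY KEY, startDate date);
INSERT INTO @Customers VALUES (1, '2016-01-04'), (2, '2016-01-19');

-- Three consecutive 15-day intervals per customer.
WITH tally(N) AS (
    SELECT N FROM (VALUES (1), (2), (3)) AS v(N)
)
SELECT c.custid,
       t.N AS periodNo,
       DATEADD(DAY, (t.N - 1) * 15, c.startDate) AS startd,
       DATEADD(DAY, t.N * 15 - 1, c.startDate)   AS endd
FROM @Customers AS c
CROSS JOIN tally AS t
ORDER BY c.custid, t.N;
```

For customer 2 this yields the intervals 19 Jan to 2 Feb, 3 Feb to 17 Feb, and 18 Feb to 3 Mar.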
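I'm not sure how well this solution will scale, but with some good index candidates and decent data housekeeping it will work.
For starters, you will need some additional information, and you should normalize your data. You need to know the start date of the first charging period for each customer, so store that in a customer table.
Here are the tables I used: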
create table #channelViews
(
custId int, channelId int, viewDate datetime
)
create table #channel
(
channelId int, channelName varchar(max)
)
create table #customer
(
custId int, custname varchar(max), chargingStartDate datetime
)
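I'll populate them with some data. I won't get the same results as your sample output, because I don't have the proper start dates for each customer. Customer 2 will be fine, though.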
insert into #channel (channelId, channelName)
select 1, 'ABSS'
union select 2, 'STHHG'
union select 4, 'XGGTS'
union select 3, 'SSJ'
union select 7, 'UUJKS'
union select 8, 'AKKDC'
union select 9, 'GGSK'
insert into #customer (custId, custname, chargingStartDate)
select 1, 'A', '4 Jan 2016'
union select 2, 'B', '19 Jan 2016'
union select 3, 'C', '5 Jan 2016'
union select 6, 'D', '5 Jan 2016'
insert into #channelViews (custId, channelId, viewDate)
select 1,1,'2016-01-09'
union select 2,2,'2016-01-19'
union select 3,4,'2016-01-09'
union select 6,4,'2016-01-09'
union select 2,2,'2016-01-26'
union select 2,2,'2016-01-28'
union select 1,3,'2016-01-28'
union select 1,1,'2016-01-28'
union select 2,2,'2016-02-02'
union select 2,7,'2016-02-10'
union select 2,8,'2016-02-10'
union select 2,9,'2016-02-10'
union select 2,9,'2016-02-11'
union select 2,7,'2016-02-27'
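Here is a somewhat clunky query, in one statement.
The two underlying subqueries produce exactly the same data, so there may be a more appropriate or more efficient way to generate them.
We need to exclude from billing any channel that was already charged in the same charging period in the previous month. That is the essence of the join. I used a right join so that all such matches can be excluded from the results (using old.custId is null).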
select c.custId, c.[custname], [month], count(*) [count] from
(
select new.custId, new.channelId, new.month, new.chargingPeriod
from
(
select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
from #channelViews cv join #customer c on cv.custId = c.custId
) old
right join
(
select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
from #channelViews cv join #customer c on cv.custId = c.custId
) new
on old.custId = new.custId
and old.channelId = new.channelId
and old.month = new.Month -1
and old.chargingPeriod = new.chargingPeriod
where old.custId is null
group by new.custId, new.month, new.chargingPeriod, new.channelId
) filteredResults
join #customer c on c.custId = filteredResults.custId
group by c.custId, [month], c.custname
order by c.custId, [month], c.custname
And finally, my results:
custId  custname  month  count
1       A         1      3
2       B         1      1
2       B         2      4
3       C         1      1
6       D         1      1
This query does the same thing:
select c.custId, c.custname, [month], count(*) from
(
select cv.custId, min(month(viewdate)) [month], cv.channelId
from #channelViews cv join #customer c on cv.custId = c.custId
group by cv.custId, cv.channelId, (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15
) x
join #customer c
on c.custId = x.custId
group by c.custId, c.custname, x.[month]
order by custId, [month]
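Both queries above bucket each view by dividing the number of days since the customer's charging start date by 15. The sketch below shows that arithmetic for customer B, using DATEDIFF instead of convert(int, ...). That substitution is my own assumption rather than part of the answer: as far as I know, converting a datetime to int rounds to the nearest day, so values with an afternoon time component could land in the wrong bucket, while DATEDIFF counts day boundaries.

```sql
-- Assumed start date for customer B, taken from the sample data.
DECLARE @chargingStartDate date = '2016-01-19';

-- Integer division by 15 maps days 0-14 to period 0, 15-29 to period 1, etc.
SELECT v.viewDate,
       DATEDIFF(DAY, @chargingStartDate, v.viewDate)      AS daysSinceStart,
       DATEDIFF(DAY, @chargingStartDate, v.viewDate) / 15 AS chargingPeriod
FROM (VALUES
        (CAST('2016-01-19' AS date)),
        (CAST('2016-02-02' AS date)),
        (CAST('2016-02-10' AS date)),
        (CAST('2016-02-27' AS date))
     ) AS v(viewDate)
ORDER BY v.viewDate;
```

The four dates are 0, 14, 22 and 39 days after the start, which gives periods 0, 0, 1 and 2. The 2 Feb view therefore falls in the same period as the 19 Jan view and is not charged again.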
;WITH cte AS (
SELECT custid,
custname,
channelid,
channel,
dateViewed,
CAST(DATEADD(day,15,dateViewed) as date) as dateEnd,
ROW_NUMBER() OVER (PARTITION BY custid, channelid ORDER BY dateViewed) AS rn
FROM (VALUES
(1, 'A', 1, 'ABSS', '2016-01-09'),(2, 'B', 2, 'STHHG', '2016-01-19'),
(3, 'C', 4, 'XGGTS', '2016-01-09'),(6, 'D', 4, 'XGGTS', '2016-01-09'),
(2, 'B', 2, 'STHHG', '2016-01-26'),(2, 'B', 2, 'STHHG', '2016-01-28'),
(1, 'A', 3, 'SSJ', '2016-01-28'),(1, 'A', 1, 'ABSS', '2016-01-28'),
(2, 'B', 2, 'STHHG', '2016-02-02'),(2, 'B', 7, 'UUJKS', '2016-02-10'),
(2, 'B', 8, 'AKKDC', '2016-02-10'),(2, 'B', 9, 'GGSK', '2016-02-10'),
(2, 'B', 9, 'GGSK', '2016-02-11'),(2, 'B', 7, 'UUJKS', '2016-02-27')
) as t(custid, custname, channelid, channel, dateViewed)
), res AS (
SELECT custid, channelid, dateViewed, dateEnd, 1 as Lev
FROM cte
WHERE rn = 1
UNION ALL
SELECT c.custid, c.channelid, c.dateViewed, c.dateEnd, lev + 1
FROM res r
INNER JOIN cte c ON c.dateViewed > r.dateEnd and c.custid = r.custid and c.channelid = r.channelid
), final AS (
SELECT * ,
ROW_NUMBER() OVER (PARTITION BY custid, channelid, lev ORDER BY dateViewed) rn,
DENSE_RANK() OVER (ORDER BY custid, channelid, dateEnd) dr
FROM res
)
SELECT b.custid,
b.custname,
MONTH(f.dateViewed) as [month],
COUNT(distinct dr) as [count]
FROM cte b
LEFT JOIN final f
ON b.channelid = f.channelid and b.custid = f.custid and b.dateViewed between f.dateViewed and f.dateEnd
WHERE f.rn = 1
GROUP BY b.custid,
b.custname,
MONTH(f.dateViewed)
Output:
custid      custname month       count
----------- -------- ----------- -----------
1           A        1           3
2           B        1           1
2           B        2           4
3           C        1           1
6           D        1           1
(5 row(s) affected)
I don't know why customer A gets 1 in the count field in your expected result. He has:
ABSS 2016-01-09 +1 to count (+15 days = 2016-01-24)
SSJ 2016-01-28 +1 to count
ABSS 2016-01-28 +1 to count (28-01 > 24-01)
So for January the count must be 3.
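The date arithmetic behind that reasoning can be checked with a small sketch. The variable names are hypothetical. The comparison mirrors the c.dateViewed > r.dateEnd condition in the recursive part of the query above.

```sql
DECLARE @firstView date = '2016-01-09';  -- first ABSS view by customer A
DECLARE @nextView  date = '2016-01-28';  -- second ABSS view by customer A

-- A later view is charged again only if it falls after the 15-day window
-- that started with the first charged view.
SELECT DATEADD(DAY, 15, @firstView) AS windowEnd,
       CASE WHEN @nextView > DATEADD(DAY, 15, @firstView)
            THEN 'charged again'
            ELSE 'covered by first charge'
       END AS verdict;
```

Here windowEnd is 2016-01-24, so the 28 Jan view is charged again.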