Running Distinct Count with a Partition

I want a running distinct count, partitioned by year, for the following data:

DROP TABLE IF EXISTS #FACT;
CREATE TABLE #FACT("Year" INT,"Month" INT, "Acc" varchar(5));
INSERT INTO #FACT
    values 
        (2015, 1, 'A'),
        (2015, 1, 'B'),
        (2015, 1, 'B'),
        (2015, 1, 'C'),
        (2015, 2, 'D'),
        (2015, 2, 'E'),
        (2015, 3, 'E'),
        (2016, 1, 'A'),
        (2016, 1, 'A'),
        (2016, 2, 'B'),
        (2016, 2, 'C');
SELECT * FROM #FACT;    

The following returns the correct answer, but is there a more concise and efficient way?

WITH 
dnsRnk AS
(
    SELECT 
        "Year"
        , "Month"
        , DenseR  = DENSE_RANK() OVER(PARTITION BY "Year", "Month" ORDER BY "Acc")
    FROM #FACT
),
mxPerMth AS
(
    SELECT
        "Year"
        , "Month"
        , RunningTotal = MAX(DenseR)
    FROM dnsRnk
    GROUP BY 
        "Year"
        , "Month"
)
SELECT 
    "Year"
    , "Month"
    , X = SUM(RunningTotal) OVER (PARTITION BY "Year" ORDER BY "Month")
FROM mxPerMth
ORDER BY 
    "Year"
    , "Month";

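The query above returns the following; an answer should return exactly the same table:

Year  Month  X
2015  1      3
2015  2      5
2015  3      6
2016  1      1
2016  2      3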

If you want a running count of distinct accounts:

SELECT f.*,
    sum(case when seqnum = 1 then 1 else 0 end) over (partition by year order by month) as cume_distinct_acc
FROM (
    SELECT 
        f.*
        ,row_number() over (partition by acc order by year, month) as seqnum
    FROM #fact f
) f;

This counts each account in the first month in which it appears.
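To see how the "first appearance" flag works, it can help to run the inner query on its own. This is only a sketch against the #FACT table from the question (so the account column is Acc, which is an assumption about what the answer's query is meant to reference):

```sql
-- Inspect the sequence number assigned to each row per account.
-- seqnum = 1 marks the earliest (year, month) row for that account.
SELECT
    f.*,
    ROW_NUMBER() OVER (PARTITION BY f.Acc ORDER BY f."Year", f."Month") AS seqnum
FROM #FACT AS f
ORDER BY f.Acc, f."Year", f."Month";
```

Duplicate rows such as the two (2015, 1, 'B') entries get different sequence numbers (1 and 2, in an arbitrary order between the ties), so only one of them carries seqnum = 1 and the account is counted once.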

EDIT:

Oops. The above does not aggregate by year and month and then restart for each year. Here is the correct solution:

SELECT 
    year
    ,month
    ,sum( sum(case when seqnum = 1 then 1 else 0 end)
        ) over (partition by year order by month) as cume_distinct_acc
FROM (
    SELECT 
        f.*
        ,row_number() over (partition by acc, year order by month) as seqnum
    FROM #fact f
) f
group by year, month
order by year, month;
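The nested sum(sum(...)) over (...) can be hard to read at first: the inner sum is an ordinary GROUP BY aggregate and the outer sum is a window function evaluated over the grouped rows. A rough two-step equivalent, assuming the #FACT table from the question (the CTE names firstSeen and newPerMonth and the column new_acc are made up for illustration), might look like this:

```sql
-- Step 1: number each account's rows within a year, earliest month first.
WITH firstSeen AS (
    SELECT
        "Year",
        "Month",
        ROW_NUMBER() OVER (PARTITION BY Acc, "Year" ORDER BY "Month") AS seqnum
    FROM #FACT
),
-- Step 2: per (year, month), count accounts seen for the first time that year.
newPerMonth AS (
    SELECT
        "Year",
        "Month",
        SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) AS new_acc
    FROM firstSeen
    GROUP BY "Year", "Month"
)
-- Step 3: running total of the new accounts, restarting each year.
SELECT
    "Year",
    "Month",
    SUM(new_acc) OVER (PARTITION BY "Year" ORDER BY "Month") AS cume_distinct_acc
FROM newPerMonth
ORDER BY "Year", "Month";
```

The single-statement version simply folds steps 2 and 3 together.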

Also, SQL Fiddle is not working, but here is an example:

with FACT as (
    SELECT yyyy, mm, account
    FROM (values 
        (2015, 1, 'A'),
        (2015, 1, 'B'),
        (2015, 1, 'B'),
        (2015, 1, 'C'),
        (2015, 2, 'D'),
        (2015, 2, 'E'),
        (2015, 3, 'E'),
        (2016, 1, 'A'),
        (2016, 1, 'A'),
        (2016, 2, 'B'),
        (2016, 2, 'C')) v(yyyy, mm, account)
)
SELECT 
    yyyy
    ,mm
    ,sum(sum(case when seqnum = 1 then 1 else 0 end)) over (partition by yyyy order by mm) as cume_distinct_acc
FROM (
    SELECT 
        f.*
        ,row_number() over (partition by account, yyyy order by mm) as seqnum
    FROM fact f
) f
group by yyyy, mm
order by yyyy, mm;

Demo Here:

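;with cte as (
    -- distinct accounts per (year, month)
    SELECT "Year", "Month", count(distinct Acc) as cnt
    FROM #fact
    GROUP BY "Year", "Month"
)
SELECT 
    "Year"
    ,"Month"
    -- running total of the monthly counts, restarting each year
    ,sum(cnt) over (partition by "Year" order by "Month" rows unbounded preceding) as x
FROM cte
ORDER BY "Year", "Month";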
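One thing worth checking before picking an approach: summing the per-month distinct counts and counting each account only once per year are not the same calculation. On the sample data they differ for 2015, month 3, because account E appears in both month 2 and month 3. The sketch below puts the two side by side, assuming the #FACT table from the question; the column aliases are made up for illustration.

```sql
SELECT
    "Year",
    "Month",
    -- running sum of each month's distinct accounts (an account can count again in a later month)
    SUM(COUNT(DISTINCT Acc)) OVER (
        PARTITION BY "Year" ORDER BY "Month" ROWS UNBOUNDED PRECEDING
    ) AS sum_of_monthly_distinct,
    -- running count of accounts seen for the first time that year
    SUM(SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END)) OVER (
        PARTITION BY "Year" ORDER BY "Month" ROWS UNBOUNDED PRECEDING
    ) AS running_distinct
FROM (
    SELECT
        f.*,
        ROW_NUMBER() OVER (PARTITION BY Acc, "Year" ORDER BY "Month") AS seqnum
    FROM #FACT AS f
) AS f
GROUP BY "Year", "Month"
ORDER BY "Year", "Month";
-- Sample data, sum_of_monthly_distinct: 2015 -> 3, 5, 6 ; 2016 -> 1, 3
-- Sample data, running_distinct:        2015 -> 3, 5, 5 ; 2016 -> 1, 3
```

The first column matches what the query in the question returns; the second is the stricter "each account once per year" reading.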