Prevent duplicate values using group by and count distinct simultaneously?

Question

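I have a simple table with a year and a customer ID. I want to group by year and count the distinct customers in each year. That part is easy and works fine. My problem is that I don't want customers from year 1 to be counted again in year 2; I only want to see the new customers in each year. How can I do that?

I tried using count distinct with group by, and even not in, but it doesn't seem to work; it always gives me the repeated values.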

select count(distinct customerID), Year
from customers
group by Year

Suppose I have 100 customers from 2015 to 2019. Right now I get:

Year No of Customers
2015   30
2016   35
2017   40
2018   30
2019   10

That totals 145, which is 45 more than 100. What I want is:

Year  No of Customers
2015   30
2016   30
2017   20
2018   20
2019   10
Total  100

Answer 1

If you only want to count each customer in the first year they appear, use two levels of aggregation:

select min_year, count(*)
from (select customerid, min(year) as min_year
      from customers c
      group by customerid
     ) c
group by min_year
order by min_year;
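
To see the difference between the two queries, here is a minimal sketch that runs them against a tiny in-memory table. It assumes SQLite through Python's sqlite3 module purely for convenience (the question itself is tagged sql-server), and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (customerid, year).
# Customers 1, 2 and 3 come back in later years; customer 4 is new in 2017.
rows = [
    (1, 2015), (2, 2015),
    (1, 2016), (3, 2016),
    (2, 2017), (3, 2017), (4, 2017),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER, year INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# The query from the question: returning customers are counted in every year.
naive = conn.execute("""
    SELECT year, COUNT(DISTINCT customerid)
    FROM customers
    GROUP BY year
    ORDER BY year
""").fetchall()

# Two levels of aggregation: first find each customer's first year,
# then count customers per first year.
first_year = conn.execute("""
    SELECT min_year, COUNT(*)
    FROM (SELECT customerid, MIN(year) AS min_year
          FROM customers
          GROUP BY customerid) AS c
    GROUP BY min_year
    ORDER BY min_year
""").fetchall()

print(naive)       # [(2015, 2), (2016, 2), (2017, 3)]
print(first_year)  # [(2015, 2), (2016, 1), (2017, 1)]
```

The naive counts add up to 7 even though there are only 4 customers, while the first-year counts add up to exactly 4.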

To get the total, you can use grouping sets or rollup (not all databases support these). A typical approach is:

select min_year, count(*)
from (select customerid, min(year) as min_year
      from customers c
      group by customerid
     ) c
group by min_year with rollup;
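
On a database without rollup, one possible fallback is to append the total with union all. The sketch below is not part of the original answer; it assumes SQLite (which has no rollup) through Python's sqlite3 module, and the sample rows and the `first_seen` and `label` names are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (customerid, year).
rows = [
    (1, 2015), (2, 2015),
    (1, 2016), (3, 2016),
    (2, 2017), (3, 2017), (4, 2017),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER, year INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Per-year counts of new customers, plus one extra 'Total' row.
# The year is cast to text so both halves of the union share a column type;
# ORDER BY label puts 'Total' last because digits sort before letters.
with_total = conn.execute("""
    WITH first_seen AS (
        SELECT customerid, MIN(year) AS min_year
        FROM customers
        GROUP BY customerid
    )
    SELECT CAST(min_year AS TEXT) AS label, COUNT(*) AS n
    FROM first_seen
    GROUP BY min_year
    UNION ALL
    SELECT 'Total', COUNT(*)
    FROM first_seen
    ORDER BY label
""").fetchall()

print(with_total)  # [('2015', 2), ('2016', 1), ('2017', 1), ('Total', 4)]
```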

Answer 2

Maybe something like this:

SELECT count (distinct c1.customerID), c1.Year 
FROM customers c1
WHERE c1.customerID not in (
    SELECT c2.customerID
    FROM customers c2
    WHERE c2.year < c1.year
)
GROUP BY year
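
As a quick check, the sketch below runs this query on a small hypothetical table, next to a not exists variant that is not in the original answer. It assumes SQLite through Python's sqlite3 module rather than SQL Server. The variant is shown because not in matches nothing when the subquery returns a NULL, which not exists avoids; on data without NULLs the two should agree.

```python
import sqlite3

# Hypothetical sample data: (customerID, year), no NULL ids.
rows = [
    (1, 2015), (2, 2015),
    (1, 2016), (3, 2016),
    (2, 2017), (3, 2017), (4, 2017),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerID INTEGER, year INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# The query from the answer: keep a row only if the customer
# has no row in an earlier year.
not_in = conn.execute("""
    SELECT c1.year, COUNT(DISTINCT c1.customerID)
    FROM customers c1
    WHERE c1.customerID NOT IN (
        SELECT c2.customerID
        FROM customers c2
        WHERE c2.year < c1.year
    )
    GROUP BY c1.year
    ORDER BY c1.year
""").fetchall()

# Same idea written with NOT EXISTS.
not_exists = conn.execute("""
    SELECT c1.year, COUNT(DISTINCT c1.customerID)
    FROM customers c1
    WHERE NOT EXISTS (
        SELECT 1
        FROM customers c2
        WHERE c2.customerID = c1.customerID
          AND c2.year < c1.year
    )
    GROUP BY c1.year
    ORDER BY c1.year
""").fetchall()

print(not_in)      # [(2015, 2), (2016, 1), (2017, 1)]
print(not_exists)  # [(2015, 2), (2016, 1), (2017, 1)]
```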


database

sql-server

group-by

count

distinct