SQL count duplicates in another column based on one field per row

Question

I'm building a customer retention report. We identify customers by email. Here is some sample data from our table:

+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
|           Email            | BrandNewCustomer | RecurringCustomer | ReactivatedCustomer | OrderCount | TotalOrders | Date_Created | Customer_Name | Customer_Address | Customer_City | Customer_State | Customer_Zip | Customer_Country |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
| zyw@marketplace.amazon.com |                1 |                 0 |                   0 |          1 |           1 | 41:50.0      | Sha           |              990 | BRO           | NY             |          112 | US               |
| zyu@gmail.com              |                1 |                 0 |                   0 |          1 |           1 | 57:25.0      | Zyu           |              181 | Mia           | FL             |          330 | US               |
| ZyR@aol.com                |                1 |                 0 |                   0 |          1 |           1 | 10:19.0      | Day           |              581 | Myr           | SC             |          295 | US               |
| zyr@gmail.com              |                1 |                 0 |                   0 |          1 |           1 | 25:19.0      | Nic           |              173 | Was           | DC             |          200 | US               |
| zy@gmail.com               |                1 |                 0 |                   0 |          1 |           1 | 19:18.0      | Kim           |              675 | MIA           | FL             |          331 | US               |
| zyou@gmail.com             |                1 |                 0 |                   0 |          1 |           1 | 40:29.0      | zoe           |              160 | Mob           | AL             |          366 | US               |
| zyon@yahoo.com             |                1 |                 0 |                   0 |          1 |           1 | 17:21.0      | Zyo           |              84  | Sta           | CT             |          690 | US               |
| zyo@gmail.com              |                1 |                 0 |                   0 |          2 |           2 | 02:03.0      | Zyo           |              432 | Ell           | GA             |          302 | US               |
| zyo@gmail.com              |                1 |                 0 |                   0 |          1 |           2 | 12:54.0      | Zyo           |              432 | Ell           | GA             |          302 | US               |
| zyn@icloud.com             |                1 |                 0 |                   0 |          1 |           1 | 54:56.0      | Zyn           |              916 | Nor           | CA             |          913 | US               |
| zyl@gmail.com              |                0 |                 1 |                   0 |          3 |           3 | 31:27.0      | Ser           |              123 | Mia           | FL             |          331 | US               |
| zyk@marketplace.amazon.com |                1 |                 0 |                   0 |          1 |           1 | 44:00.0      | Myr           |              101 | MIA           | FL             |          331 | US               |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+

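We define our customers by email. All orders with the same email are marked as belonging to one customer, and we base our calculations on that.

Now I'm trying to find customers whose email has changed. To do that, we want to try lining customers up by address.

So for each row (that is, with the data still separated by email), I'd like another column called Orders_With_Same_Address_Different_Email. How would I do that?

I've tried doing something with DENSE_RANK, but it doesn't seem to work: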

SELECT DISTINCT
Email
,BrandNewCustomer
,RecurringCustomer
,ReactivatedCustomer
,OrderCount
,TotalOrders
,Date_Created
,Customer_Name
,Customer_Address
,Customer_City
,Customer_State
,Customer_Zip
,Customer_Country
,(DENSE_RANK() over (partition by Email order by (case when email <> email then Customer_Address end)  asc) 
+DENSE_RANK() over ( partition by Email order by (case when email <> email then Customer_Address end)  desc) 
- 1) as Orders_With_Same_Name_Different_Email
--*
FROM Customers

Answer 1

Try counting emails partitioned by address rather than by email:

select   Email,
         -- ...

         -- 1 when more than one row shares this address, otherwise 0
         Orders_With_Same_Name_Different_Email = iif(
             count(Email) over (partition by Customer_Address) > 1,
         1, 0)

from     Customers;
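
If what you need is the actual number of orders at the same address placed under a different email, rather than a 0/1 flag, one possible sketch is to subtract two windowed counts. This assumes each row in Customers represents one order, which is how the sample data reads but is not confirmed in the question:

```sql
-- Sketch only: assumes one row per order in Customers.
SELECT   Email,
         Customer_Address,
         -- all rows at this address ...
         COUNT(*) OVER (PARTITION BY Customer_Address)
         -- ... minus the rows at this address that use this row's own email
       - COUNT(*) OVER (PARTITION BY Customer_Address, Email)
           AS Orders_With_Same_Address_Different_Email
FROM     Customers;
```

A row whose address is only ever used with its own email gets 0. Whether emails that differ only in case count as the same email depends on the column's collation.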

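But this is a lesson in why you shouldn't use email as a customer identifier. Address is a bad idea too. Use something that doesn't change. That usually means creating an internal identifier, such as an auto-incrementing column: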

alter table #customers
add customerId int identity(1,1) primary key not null

Now customerId = 1 will always refer to that particular customer.

Answer 2

You can group by Customer_Address and look at the count. This assumes that each customer has a single address.

select *
from   Customers
where  Customer_Address in (
           -- addresses that appear with more than one distinct email
           select   Customer_Address
           from     Customers
           group by Customer_Address
           having   count(distinct Email) > 1
       )
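
If you would rather keep every row and attach the number of distinct emails seen at its address, note that SQL Server does not allow DISTINCT inside a windowed aggregate. A common workaround, sketched here with an illustrative column alias that is not part of the original schema, is to add an ascending and a descending DENSE_RANK over the same partition:

```sql
-- Sketch only: Distinct_Emails_At_Address is an illustrative alias.
-- Ascending rank + descending rank - 1 equals the number of distinct
-- Email values within each Customer_Address partition.
SELECT   Email,
         Customer_Address,
         DENSE_RANK() OVER (PARTITION BY Customer_Address ORDER BY Email ASC)
       + DENSE_RANK() OVER (PARTITION BY Customer_Address ORDER BY Email DESC)
       - 1 AS Distinct_Emails_At_Address
FROM     Customers;
```

This is close to the attempt in the question, except that the partition is the address and the ordering is the email. A NULL email is counted as one distinct value by this approach, so filter those rows out first if that matters.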

Answer 3

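If I understand what you're trying to do, this is how I would solve it:

Note that you don't need the HAVING clause in the CTE, but depending on your data it may make the query faster (that is, if you have a large data set).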

WITH email2addr AS
(
  -- emails that appear with more than one distinct address
  select email, count(distinct customer_address) as addr_cnt
  from customers
  group by email
  having count(distinct customer_address) > 1
)

SELECT 
    customers.Email
    ,BrandNewCustomer
    ,RecurringCustomer
    ,ReactivatedCustomer
    ,OrderCount
    ,TotalOrders
    ,Date_Created
    ,Customer_Name
    ,Customer_Address
    ,Customer_City
    ,Customer_State
    ,Customer_Zip
    ,Customer_Country
    ,CASE when coalesce(email2addr.addr_cnt,1) > 1 then 'Y' ELSE 'N' END as has_more_than_1_email 
from customers
left join email2addr on customers.email = email2addr.email
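
The query above counts addresses per email. Since the question asks about the opposite direction (one address used with several emails), a variant keyed by address might look like the following sketch. The names addr2email, email_cnt, and address_has_multiple_emails are illustrative and do not come from the question:

```sql
-- Sketch only: CTE and alias names are illustrative.
WITH addr2email AS
(
  -- number of distinct emails seen at each address
  SELECT   Customer_Address,
           COUNT(DISTINCT Email) AS email_cnt
  FROM     Customers
  GROUP BY Customer_Address
)
SELECT   c.Email
        ,c.Customer_Name
        ,c.Customer_Address
        ,CASE WHEN a.email_cnt > 1 THEN 'Y' ELSE 'N' END AS address_has_multiple_emails
FROM     Customers AS c
LEFT JOIN addr2email AS a
       ON a.Customer_Address = c.Customer_Address;
```

Rows with a NULL address find no match in the join and are therefore flagged 'N'.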

sql

sql-server

dense-rank