How do I COUNT the same email being used, but a different domain in this MySQL database?

Junior data analyst here. I have three fields: clean_email, email, and email_domain.

clean_email is the part before the domain. So if the email is dataguy@yahoo.com, this field holds just dataguy.

email is the whole address: dataguy@yahoo.com

email_domain is just the domain: yahoo.com

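I need to count how many different domains each clean email appears with. We've noticed that some emails may be dataguy@yahoo.com, dataguy@hotmail.com, or dataguy@outlook.com. The part before the @ is the same but the domain differs, so we're trying to identify when this happens. For this person the total domain count would be 3, and I need those specific domains listed.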

My current query is:

   SELECT clean_email, email, COUNT(DISTINCT email_domain)
    FROM email
GROUP BY clean_email, email

I've tried using COUNT in several ways, but it doesn't return the results I need. It usually returns 1 row.

You can use substring_index():

SELECT substring_index(email, '@', 1) as clean_email,
       COUNT(DISTINCT substring_index(email, '@', -1))
FROM email
GROUP BY clean_email;
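
If you want to sanity-check the grouping logic without touching the real table, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample addresses and the function name count_domains are made up for illustration. SQLite has no substring_index(), so the sketch swaps in substr()/instr(), which split at the first '@'; for addresses with a single '@' that gives the same two pieces.

```python
import sqlite3

# Hypothetical sample data; only the `email` column from the question is used.
SAMPLE_EMAILS = [
    "dataguy@yahoo.com",
    "dataguy@hotmail.com",
    "dataguy@outlook.com",
    "other@yahoo.com",
]


def count_domains(emails):
    """Return {local part: number of distinct domains} for the given addresses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE email (email TEXT)")
    conn.executemany("INSERT INTO email (email) VALUES (?)", [(e,) for e in emails])
    # Assumption: SQLite stands in for MySQL here, so substr()/instr()
    # replace substring_index(email, '@', 1) and substring_index(email, '@', -1).
    rows = conn.execute(
        """
        SELECT substr(email, 1, instr(email, '@') - 1) AS clean_email,
               COUNT(DISTINCT substr(email, instr(email, '@') + 1)) AS domain_count
        FROM email
        GROUP BY clean_email
        """
    ).fetchall()
    conn.close()
    return dict(rows)


print(count_domains(SAMPLE_EMAILS))
```

The key point is that the grouping is on the local part only; grouping on the full email as well (as in the question's query) puts every address in its own group, so the distinct-domain count can never exceed 1.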

EDIT:

If you want the domains, use GROUP_CONCAT():

SELECT substring_index(email, '@', 1) as clean_email,
       COUNT(DISTINCT substring_index(email, '@', -1)),
       GROUP_CONCAT(DISTINCT substring_index(email, '@', -1))
FROM email
GROUP BY clean_email
HAVING COUNT(DISTINCT substring_index(email, '@', -1)) > 1;
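
To see what the HAVING filter and the concatenated list look like on sample data, here is a rough sketch, again with Python's sqlite3 standing in for MySQL. The function name and sample addresses are hypothetical, substr()/instr() replace substring_index(), and SQLite's group_concat(DISTINCT ...) plays the role of MySQL's GROUP_CONCAT(DISTINCT ...).

```python
import sqlite3


def domains_for_shared_local_parts(emails):
    """Return {local part: sorted list of domains}, keeping only local parts
    that appear with more than one distinct domain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE email (email TEXT)")
    conn.executemany("INSERT INTO email (email) VALUES (?)", [(e,) for e in emails])
    # Assumption: SQLite syntax, not MySQL. group_concat(DISTINCT x) joins
    # the distinct values with ',' in no guaranteed order.
    rows = conn.execute(
        """
        SELECT substr(email, 1, instr(email, '@') - 1) AS clean_email,
               group_concat(DISTINCT substr(email, instr(email, '@') + 1)) AS domains
        FROM email
        GROUP BY clean_email
        HAVING COUNT(DISTINCT substr(email, instr(email, '@') + 1)) > 1
        """
    ).fetchall()
    conn.close()
    # Sort in Python because the concatenation order is not guaranteed.
    return {local: sorted(domains.split(",")) for local, domains in rows}


# Hypothetical sample: one duplicated address and one single-domain user.
sample = ["dataguy@yahoo.com", "dataguy@yahoo.com", "dataguy@hotmail.com", "other@yahoo.com"]
print(domains_for_shared_local_parts(sample))
```

One thing to watch on the MySQL side: GROUP_CONCAT() output is truncated at group_concat_max_len, which defaults to 1024, so very long domain lists may need that setting raised.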

How about a CTE?...

WITH multiple_domain AS
(
    SELECT 
        clean_email,
        COUNT(*) AS domain_count
    FROM email
    GROUP BY clean_email
    HAVING COUNT(*) > 1
)
SELECT 
    e.clean_email,
    e.email_domain,
    md.domain_count
FROM email AS e
JOIN multiple_domain AS md ON md.clean_email = e.clean_email
ORDER BY e.email;

Or, if you're not on MySQL 8.0 yet, a derived table does the same job...

SELECT 
    e.clean_email,
    e.email_domain,
    md.domain_count
FROM email AS e
JOIN 
(
    SELECT 
        clean_email,
        COUNT(*) AS domain_count
    FROM email
    GROUP BY clean_email
    HAVING COUNT(*) > 1
) AS md ON md.clean_email = e.clean_email
ORDER BY e.email;
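
For anyone who wants to try the join locally, here is a small sketch that runs the derived-table query against an in-memory SQLite database via Python's sqlite3 module. SQLite is only a stand-in for MySQL, and the sample rows and the function name are invented for illustration; the three columns are the ones described in the question.

```python
import sqlite3

# Hypothetical sample rows: (clean_email, email, email_domain).
SAMPLE_ROWS = [
    ("dataguy", "dataguy@yahoo.com", "yahoo.com"),
    ("dataguy", "dataguy@hotmail.com", "hotmail.com"),
    ("dataguy", "dataguy@outlook.com", "outlook.com"),
    ("other", "other@yahoo.com", "yahoo.com"),
]


def addresses_sharing_local_part(rows):
    """Return (clean_email, email_domain, domain_count) for every address whose
    local part occurs on more than one row, ordered by full address."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE email (clean_email TEXT, email TEXT, email_domain TEXT)"
    )
    conn.executemany("INSERT INTO email VALUES (?, ?, ?)", rows)
    # Same shape as the derived-table query: count rows per local part,
    # keep the ones with more than one row, then join back to list each domain.
    result = conn.execute(
        """
        SELECT e.clean_email, e.email_domain, md.domain_count
        FROM email AS e
        JOIN (
            SELECT clean_email, COUNT(*) AS domain_count
            FROM email
            GROUP BY clean_email
            HAVING COUNT(*) > 1
        ) AS md ON md.clean_email = e.clean_email
        ORDER BY e.email
        """
    ).fetchall()
    conn.close()
    return result


for row in addresses_sharing_local_part(SAMPLE_ROWS):
    print(row)
```

This gives one row per address rather than one row per person, which is handy when each domain should sit on its own line. Note that COUNT(*) counts rows, so it equals the number of domains only if each address appears once in the table; if the same address can repeat, COUNT(DISTINCT email_domain) is the variant to try.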
