Is it possible to use two COUNTs and two JOINs in a single SQL query across 3 tables?
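What I'm trying to do here is produce a report on how many e-mails different users have sent (using a MailChimp-like application), but I want two different metrics in a single query. I want to know how many e-mails each user sent in total: if they sent 3 e-mails to 100 contacts each, that should show 300. I also want to know how many unique e-mails were sent, which in that case should show 3.
What I want looks something like this: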
-------------------------------------------------------------
| Full Name | Username | Total Sent | Unique Mails |
|-------------|-----------------|------------|--------------|
| John Doe | jdoe@mail.com | 12000 | 4 |
| James Smith | jsmith@mail.com | 6000 | 12 |
| Jane Jones | jjones@mail.com | 4000 | 2 |
| ... | ... | ... | ... |
-------------------------------------------------------------
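That way I can tell that John sent a few e-mails to a lot of contacts, while James sent more e-mails to fewer contacts.
My query is shown below. I have changed the table and column names, but otherwise it is an exact representation of it.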
SELECT
CONCAT(Usernames.FirstName, ' ', Usernames.LastName) AS 'Full Name',
Usernames.Username,
COUNT(Sent_Mail_Contacts.IDContact) AS `Total Sent`,
COUNT(Mass_Mail.IDMass_Mail) AS `Individual E-Mails`
FROM Usernames
LEFT JOIN Sent_Mail_Contacts ON Usernames.Username = Sent_Mail_Contacts.Username
LEFT JOIN Mass_Mail ON Usernames.Username = Mass_Mail.Username
GROUP BY Usernames.Username
ORDER BY `Total Sent`
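I have one table with the usernames, one table with the individual contacts each e-mail was sent to, and one table with the unique e-mails.
So does my query make sense? Is this even possible? Because right now, when I run it, it gives me something like this: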
-------------------------------------------------------------
| Full Name | Username | Total Sent | Unique Mails |
|-------------|-----------------|------------|--------------|
| John Doe | jdoe@mail.com | 12000 | 12000 |
| James Smith | jsmith@mail.com | 6000 | 6000 |
| Jane Jones | jjones@mail.com | 4000 | 4000 |
| ... | ... | ... | ... |
-------------------------------------------------------------
It just gives me the same number in both columns, and it takes 7 minutes to run.
Here is a sample of each of the 3 tables, in case it helps:
Usernames
------------------------------------------------
| Username | FirstName | LastName | ... |
|-----------------|-----------|----------|-----|
| jdoe@mail.com | John | Doe | ... |
| jsmith@mail.com | James | Smith | ... |
| jjones@mail.com | Jane | Jones | ... |
| ... | ... | ... | ... |
------------------------------------------------
Mass_Mail
----------------------------------------------------
| ID_Mass_Mail | Username | Date | ... |
|--------------|----------------|------------|-----|
| 1 | jdoe@mail.com | 2019-01-16 | ... |
| 2 | jdoe@mail.com | 2019-01-29 | ... |
| 3 | jjones@mail.com| 2019-02-14 | ... |
| ... | ... | ... | ... |
----------------------------------------------------
Sent_Mail_Contacts
---------------------------------------------------------------------
| ID_Mass_Mail | Username | Contact_ID | Contact_Email | ... |
|--------------|----------------|------------|----------------|-----|
| 1 | jdoe@mail.com | 1 | bob@mail.com | ... |
| 1 | jdoe@mail.com | 2 | jim@mail.com | ... |
| 1 | jdoe@mail.com | 3 | cindy@mail.com | ... |
| ... | ... | ... | ... | ... |
| 2 | jdoe@mail.com | 4 | mike@mail.com | ... |
| 2 | jdoe@mail.com | 2 | jim@mail.com | ... |
| 2 | jdoe@mail.com | 3 | cindy@mail.com | ... |
| ... | ... | ... | ... | ... |
---------------------------------------------------------------------
Use COUNT(DISTINCT ...):
SELECT
CONCAT(Usernames.FirstName, ' ', Usernames.LastName) AS 'Full Name',
Usernames.Username,
COUNT(Sent_Mail_Contacts.IDContact) AS `Total Sent`,
COUNT(DISTINCT Mass_Mail.IDMass_Mail) AS `Individual E-Mails`
FROM Usernames
LEFT JOIN Sent_Mail_Contacts ON Usernames.Username = Sent_Mail_Contacts.Username
LEFT JOIN Mass_Mail ON Usernames.Username = Mass_Mail.Username
GROUP BY Usernames.Username
ORDER BY `Total Sent`
Note: this will not make the query any faster, though. As a first step, you should at least make sure the JOINs use primary/foreign key relationships: Usernames(Username), Sent_Mail_Contacts(Username), Mass_Mail(Username).
Assuming each value in IDMass_Mail identifies one unique e-mail, you only need to edit the last COUNT to use the DISTINCT keyword.
COUNT(DISTINCT Mass_Mail.IDMass_Mail) AS `Individual E-Mails`
This returns the number of unique values within each Username group.
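To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the asker's actual database. The `joined` table is hypothetical and simply stands in for the rows one user would have after the joins, one row per (mass mail, contact) pair.

```python
import sqlite3

# Hypothetical stand-in for the joined rows of a single user:
# one row per (mass mail, contact) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE joined (Username TEXT, IDMass_Mail INTEGER, IDContact INTEGER)"
)
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [
        ("jdoe@mail.com", 1, 1),
        ("jdoe@mail.com", 1, 2),
        ("jdoe@mail.com", 1, 3),
        ("jdoe@mail.com", 2, 4),
        ("jdoe@mail.com", 2, 2),
        ("jdoe@mail.com", 2, 3),
    ],
)

# COUNT(col) counts every non-NULL row in the group;
# COUNT(DISTINCT col) counts each distinct value once.
plain_count, distinct_count = conn.execute(
    "SELECT COUNT(IDMass_Mail), COUNT(DISTINCT IDMass_Mail) "
    "FROM joined GROUP BY Username"
).fetchone()

print(plain_count, distinct_count)  # 6 2
```

The plain count sees all six rows, while the DISTINCT count collapses them to the two mass mails.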
If you are able to add indexes on the Username column of the Sent_Mail_Contacts and Mass_Mail tables, you should see a performance improvement as well.
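As a rough sketch of what adding those indexes could look like, the snippet below uses Python's sqlite3 module with simplified table definitions. The index names `idx_smc_username` and `idx_mm_username` are made up for illustration; the `CREATE INDEX name ON table (column)` form itself is also accepted by MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sent_Mail_Contacts (ID_Mass_Mail INTEGER, Username TEXT, Contact_ID INTEGER);
CREATE TABLE Mass_Mail (ID_Mass_Mail INTEGER PRIMARY KEY, Username TEXT);
-- Hypothetical index names; use whatever fits your naming scheme.
CREATE INDEX idx_smc_username ON Sent_Mail_Contacts (Username);
CREATE INDEX idx_mm_username ON Mass_Mail (Username);
""")

# List the indexes that now exist.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx%'"
    )
)

# Ask the planner how it would look up one user's rows; the last column
# of each EXPLAIN QUERY PLAN row is a human-readable description.
plan_details = [
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT COUNT(*) FROM Sent_Mail_Contacts WHERE Username = ?",
        ("jdoe@mail.com",),
    )
]

print(index_names)
print(plan_details)
```

In MySQL the equivalent check would be `EXPLAIN` on the real query; how much the indexes help depends on the data and on the rest of the query.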
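First of all: why do both Mass_Mail and Sent_Mail_Contacts contain a Username? That looks redundant. Or can Sent_Mail_Contacts.ID_Mass_Mail be null?
At least for this query, I think we can ignore the Username in Sent_Mail_Contacts entirely. What really links the two tables is ID_Mass_Mail, and you forgot that join condition in your query.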
select
concat_ws(' ', u.firstname, u.lastname) as full_name,
u.username,
count(smc.id_mass_mail) as total_sent, -- one joined row per contact reached
count(distinct mm.id_mass_mail) as individual_e_mails -- each mass mail counted once
from usernames u
left join mass_mail mm on mm.username = u.username
-- link each contact row to its mass mail, not directly to the user
left join sent_mail_contacts smc on smc.id_mass_mail = mm.id_mass_mail
group by u.username
order by total_sent;
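To illustrate why the join condition matters, here is a small sketch that loads only the sample rows shown in the question into an in-memory SQLite database (via Python's sqlite3, so not the asker's real engine or data) and compares two variants: joining Sent_Mail_Contacts through ID_Mass_Mail, and joining both tables to Usernames on Username. The variable names `by_mail_id` and `by_username` are just for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Usernames (Username TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Mass_Mail (ID_Mass_Mail INTEGER PRIMARY KEY, Username TEXT);
CREATE TABLE Sent_Mail_Contacts (ID_Mass_Mail INTEGER, Username TEXT, Contact_ID INTEGER);

-- Only the rows visible in the question; the elided "..." rows are left out.
INSERT INTO Usernames VALUES
  ('jdoe@mail.com', 'John', 'Doe'),
  ('jsmith@mail.com', 'James', 'Smith'),
  ('jjones@mail.com', 'Jane', 'Jones');
INSERT INTO Mass_Mail VALUES
  (1, 'jdoe@mail.com'), (2, 'jdoe@mail.com'), (3, 'jjones@mail.com');
INSERT INTO Sent_Mail_Contacts VALUES
  (1, 'jdoe@mail.com', 1), (1, 'jdoe@mail.com', 2), (1, 'jdoe@mail.com', 3),
  (2, 'jdoe@mail.com', 4), (2, 'jdoe@mail.com', 2), (2, 'jdoe@mail.com', 3);
""")

# Variant 1: contacts are joined to their mass mail via ID_Mass_Mail.
by_mail_id = {
    user: (total_sent, unique_mails)
    for user, total_sent, unique_mails in conn.execute("""
        SELECT u.Username,
               COUNT(smc.ID_Mass_Mail),
               COUNT(DISTINCT mm.ID_Mass_Mail)
        FROM Usernames u
        LEFT JOIN Mass_Mail mm ON mm.Username = u.Username
        LEFT JOIN Sent_Mail_Contacts smc ON smc.ID_Mass_Mail = mm.ID_Mass_Mail
        GROUP BY u.Username
    """)
}

# Variant 2: both tables are joined on Username, so every contact row
# is paired with every mass mail of the same user.
by_username = {
    user: (total_sent, unique_mails)
    for user, total_sent, unique_mails in conn.execute("""
        SELECT u.Username,
               COUNT(smc.Contact_ID),
               COUNT(DISTINCT mm.ID_Mass_Mail)
        FROM Usernames u
        LEFT JOIN Sent_Mail_Contacts smc ON smc.Username = u.Username
        LEFT JOIN Mass_Mail mm ON mm.Username = u.Username
        GROUP BY u.Username
    """)
}

print(by_mail_id["jdoe@mail.com"])   # (6, 2)
print(by_username["jdoe@mail.com"])  # (12, 2)
```

On this sample, John's six contact rows are counted once each when joined through ID_Mass_Mail, but twelve times when both tables hang off Username, because each of the six rows is repeated for each of his two mass mails. That row multiplication would also explain part of the long run time on large tables.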
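I managed to do it with a single query that (apart from the actual table and column names, which I changed for privacy reasons) looks exactly like this.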
SELECT
Account.Account_Name AS `account`,
Usernames.Username AS `username`,
COUNT(Mass_Mail_Reached_Contacts.ID_Contact) AS `total_emails`,
COUNT(Mass_Mail_Reached_Contacts.ID_Mass_Mail) /
(
SELECT COUNT(*)
FROM
Mass_Mail_Reached_Contacts
WHERE
Mass_Mail_Reached_Contacts.DATE >= '2019-02-01'
AND
Mass_Mail_Reached_Contacts.DATE <= '2019-02-28'
)
* 100 AS `%`,
COUNT(DISTINCT Mass_Mail.ID_Mass_Mail) AS `unique_emails`,
COUNT(Mass_Mail_Reached_Contacts.ID_Mass_Mail) /
COUNT(DISTINCT Mass_Mail.ID_Mass_Mail)
AS `avg_contacts_per_email`
FROM
Usernames
LEFT JOIN Mass_Mail_Reached_Contacts ON Mass_Mail_Reached_Contacts.Username = Usernames.Username
LEFT JOIN Account ON Account.ID_Account = Usernames.ID_Account
LEFT JOIN Mass_Mail ON Mass_Mail.ID_Mass_Mail = Mass_Mail_Reached_Contacts.ID_mass_mail
WHERE
Mass_Mail_Reached_Contacts.DATE >= '2019-02-01'
AND
Mass_Mail_Reached_Contacts.DATE <= '2019-02-28'
GROUP BY
Usernames.Username
HAVING COUNT(DISTINCT Mass_Mail.ID_Mass_Mail) > 0
ORDER BY
`total_emails` DESC
Now I can get a table that looks like this:
Emails Stats
--------------------------------------------------------------------------------------
| account | username | total_emails | % | unique_emails | avg_contacts_per_email |
|----------|--------------|--------------|-------|---------------|------------------------|
| Bob inc. | bob@mail.com | 28,550 | 14.52 | 12 | 2379.17 |
| ... | ... | ... | ... | ... | ... |
--------------------------------------------------------------------------------------
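For anyone who wants to see the percentage and average columns in isolation, here is a stripped-down sketch on made-up data, run through Python's sqlite3 module. The users `a@mail.com` and `b@mail.com` and their rows are invented, and the date filter and the other joins are omitted. One difference from MySQL to be aware of: SQLite truncates integer division, so the sketch multiplies by `100.0` and `1.0` to force floating-point results, whereas MySQL's `/` already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Mass_Mail_Reached_Contacts (
  ID_Mass_Mail INTEGER, Username TEXT, ID_Contact INTEGER
);
-- Invented sample: user a sends 2 mails to 3 contacts each, user b sends 1 mail to 2.
INSERT INTO Mass_Mail_Reached_Contacts VALUES
  (1, 'a@mail.com', 1), (1, 'a@mail.com', 2), (1, 'a@mail.com', 3),
  (2, 'a@mail.com', 1), (2, 'a@mail.com', 2), (2, 'a@mail.com', 3),
  (3, 'b@mail.com', 1), (3, 'b@mail.com', 2);
""")

rows = conn.execute("""
    SELECT
      Username,
      COUNT(ID_Contact) AS total_emails,
      -- share of all sent e-mails, using a scalar subquery for the grand total
      COUNT(ID_Contact) * 100.0
        / (SELECT COUNT(*) FROM Mass_Mail_Reached_Contacts) AS pct,
      COUNT(DISTINCT ID_Mass_Mail) AS unique_emails,
      COUNT(ID_Contact) * 1.0 / COUNT(DISTINCT ID_Mass_Mail) AS avg_contacts_per_email
    FROM Mass_Mail_Reached_Contacts
    GROUP BY Username
    ORDER BY total_emails DESC
""").fetchall()

for row in rows:
    print(row)
```

Since the reached-contacts table already carries ID_Mass_Mail, the unique count can be taken from it directly in this simplified version.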