GROUP BY & COUNT with multiple parameters
I have a simple setup: two tables linked by a many-to-many relationship, which gives me three tables in total.
Table author:
idAuthor INT,
name VARCHAR
Table publication:
idPublication INT,
title VARCHAR,
date YEAR,
type VARCHAR,
conference VARCHAR,
journal VARCHAR
Table author_has_publication:
Author_idAuthor,
Publication_idPublication
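I am trying to get the names of all authors who have published at least 2 papers at the SIGMOD conference and at least 2 at the PVLDB conference.
I have gotten this far, but I still get duplicated rows. My query: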
SELECT author.name, publication.journal, COUNT(*)
FROM author
INNER JOIN author_has_publication
ON author.idAuthor = author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication = publication.idPublication
GROUP BY publication.journal, author.name
HAVING COUNT(*) >= 2
AND (publication.journal = 'PVLDB' OR publication.journal = 'SIGMOD');
returns
+-------+---------+----------+
| name | journal | COUNT(*) |
+-------+---------+----------+
| Renee | PVLDB | 2 |
| Renee | SIGMOD | 2 |
+-------+---------+----------+
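As you can see, the result is correct but doubled: I want each name to appear only once.
A second question: how can I change the count threshold for just one conference, for example to get all authors with at least 3 SIGMOD publications and at least 1 PVLDB publication?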
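If you don't care about journal, don't select it or group by it: that is what splits your result into one row per journal. Also, ordinary row filters belong in the WHERE clause, not the HAVING clause: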
SELECT author.name, COUNT(*)
FROM author
INNER JOIN author_has_publication
ON author.idAuthor = author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication =
publication.idPublication
WHERE publication.journal IN('PVLDB','SIGMOD')
GROUP BY author.name
HAVING COUNT(CASE WHEN publication.journal = 'SIGMOD' THEN 1 END) >= 2
AND COUNT(CASE WHEN publication.journal = 'PVLDB' THEN 1 END) >= 2;
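To check the query above without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the second author 'Sam', and the use of SQLite are assumptions made for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (idAuthor INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE publication (idPublication INTEGER PRIMARY KEY, title TEXT, journal TEXT);
CREATE TABLE author_has_publication (
    Author_idAuthor INTEGER,
    Publication_idPublication INTEGER
);
INSERT INTO author VALUES (1, 'Renee'), (2, 'Sam');
INSERT INTO publication VALUES
    (1, 'p1', 'PVLDB'), (2, 'p2', 'PVLDB'),
    (3, 'p3', 'SIGMOD'), (4, 'p4', 'SIGMOD'),
    (5, 'p5', 'SIGMOD'), (6, 'p6', 'VLDBJ');
-- Renee: 2 PVLDB, 2 SIGMOD, 1 other; Sam: 3 SIGMOD only.
INSERT INTO author_has_publication VALUES
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 6),
    (2, 3), (2, 4), (2, 5);
""")

QUERY = """
SELECT author.name, COUNT(*)
FROM author
INNER JOIN author_has_publication
    ON author.idAuthor = author_has_publication.Author_idAuthor
INNER JOIN publication
    ON author_has_publication.Publication_idPublication = publication.idPublication
WHERE publication.journal IN ('PVLDB', 'SIGMOD')
GROUP BY author.name
HAVING COUNT(CASE WHEN publication.journal = 'SIGMOD' THEN 1 END) >= 2
   AND COUNT(CASE WHEN publication.journal = 'PVLDB' THEN 1 END) >= 2
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('Renee', 4)]
```

With this data each qualifying author appears once, and COUNT(*) is the combined number of SIGMOD and PVLDB papers, because the WHERE clause has already dropped rows from other journals. Note that grouping by name alone would merge two different authors who share a name; grouping by idAuthor as well would avoid that.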
For the second question, use this HAVING clause:
HAVING COUNT(CASE WHEN publication.journal = 'SIGMOD' THEN 1 END) >= 3
AND COUNT(CASE WHEN publication.journal = 'PVLDB' THEN 1 END) >= 1;
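Since only the two thresholds change between the two questions, they can be passed in as parameters. The sketch below assumes an in-memory SQLite database driven from Python; the helper name `authors_meeting_thresholds`, the sample rows, and the authors 'Sam' and 'Ana' are hypothetical and not part of the original thread.

```python
import sqlite3

def authors_meeting_thresholds(conn, sigmod_min, pvldb_min):
    """Return names of authors with at least sigmod_min SIGMOD papers
    and at least pvldb_min PVLDB papers (hypothetical helper)."""
    query = """
    SELECT author.name
    FROM author
    INNER JOIN author_has_publication
        ON author.idAuthor = author_has_publication.Author_idAuthor
    INNER JOIN publication
        ON author_has_publication.Publication_idPublication = publication.idPublication
    WHERE publication.journal IN ('PVLDB', 'SIGMOD')
    GROUP BY author.name
    HAVING COUNT(CASE WHEN publication.journal = 'SIGMOD' THEN 1 END) >= ?
       AND COUNT(CASE WHEN publication.journal = 'PVLDB' THEN 1 END) >= ?
    ORDER BY author.name
    """
    return [name for (name,) in conn.execute(query, (sigmod_min, pvldb_min))]

# Hypothetical sample data; SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (idAuthor INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE publication (idPublication INTEGER PRIMARY KEY, title TEXT, journal TEXT);
CREATE TABLE author_has_publication (
    Author_idAuthor INTEGER,
    Publication_idPublication INTEGER
);
INSERT INTO author VALUES (1, 'Renee'), (2, 'Sam'), (3, 'Ana');
INSERT INTO publication VALUES
    (1, 'p1', 'PVLDB'), (2, 'p2', 'PVLDB'),
    (3, 'p3', 'SIGMOD'), (4, 'p4', 'SIGMOD'), (5, 'p5', 'SIGMOD');
-- Renee: 2 PVLDB + 2 SIGMOD; Sam: 1 PVLDB + 3 SIGMOD; Ana: 3 SIGMOD only.
INSERT INTO author_has_publication VALUES
    (1, 1), (1, 2), (1, 3), (1, 4),
    (2, 1), (2, 3), (2, 4), (2, 5),
    (3, 3), (3, 4), (3, 5);
""")

first_question = authors_meeting_thresholds(conn, 2, 2)
second_question = authors_meeting_thresholds(conn, 3, 1)
print(first_question)   # ['Renee']
print(second_question)  # ['Sam']
```

The thresholds are bound as query parameters rather than formatted into the SQL string, so the same statement serves both questions.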