MySQL query with GROUP BY and HAVING

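I have a table with 3 columns: id, sentence, and language. A sentence can be in English or German, and the same id is assigned to sentences that have the same meaning in different languages, for example: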

ID | sentence | language
1  | Hello    | en
1  | Hallo    | de
2  | Sorry    | en

Some sentences may exist in only one language. Now I want to find all sentences that are available in both languages. I can do this:

SELECT 
    *
FROM
    `sentences`
WHERE
    LENGTH(sentence) > 0
        AND (language = 'en' OR language = 'de')
GROUP BY id
HAVING COUNT(language) = 2

But then the result contains only the German sentences. So instead I do:

SELECT 
    *
FROM
    sentences
WHERE
    id IN (SELECT 
            id
        FROM
            `sentences`
        WHERE
            LENGTH(sentence) > 0
                AND (language = 'en' OR language = 'de')
        GROUP BY id
        HAVING COUNT(language) = 2)

This should work, but the query takes a very long time. My question: is there a smarter way to do this?

An INNER JOIN is usually faster than an IN clause with a subquery:

SELECT en.id, 
       en.sentence as en_sentence,
       de.sentence as de_sentence,
       en.language as en_language,
       de.language as de_language
FROM sentences en
INNER JOIN sentences de ON en.ID = de.ID AND en.language = 'en' AND de.language = 'de'
WHERE length(en.sentence) > 0
AND length(de.sentence) > 0
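
To try the self-join on the sample rows from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory database standing in for MySQL, and the column types are guesses; the SELECT itself follows the join above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (id INTEGER, sentence TEXT, language TEXT)")

# The three sample rows from the question.
conn.executemany(
    "INSERT INTO sentences VALUES (?, ?, ?)",
    [(1, "Hello", "en"), (1, "Hallo", "de"), (2, "Sorry", "en")],
)

# Join the table to itself: alias en holds English rows, alias de holds German rows.
rows = conn.execute(
    """
    SELECT en.id, en.sentence AS en_sentence, de.sentence AS de_sentence
    FROM sentences en
    INNER JOIN sentences de
        ON en.id = de.id AND en.language = 'en' AND de.language = 'de'
    WHERE length(en.sentence) > 0
      AND length(de.sentence) > 0
    """
).fetchall()

print(rows)  # [(1, 'Hello', 'Hallo')]
```

Only id 1 comes back, because id 2 has no German row to join against.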

If your data allows it, delete the sentences with length 0. Make a backup before running this:

DELETE FROM sentences WHERE LENGTH(SENTENCE) = 0

Replace SELECT * with only the columns you need. If you don't already have one, add a composite index on language and id.
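
A composite index of that kind could be created roughly as follows. This is a sketch under assumptions: the index name idx_language_id is made up for illustration, and sqlite3 again stands in for MySQL. The CREATE INDEX statement itself is also valid MySQL syntax.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (id INTEGER, sentence TEXT, language TEXT)")

# Hypothetical index name. language comes first because the queries filter on it;
# id comes second because it is the join and grouping column.
conn.execute("CREATE INDEX idx_language_id ON sentences (language, id)")

# List the indexes on the table to confirm the index was created.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('sentences')")]
print(index_names)
```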

This leaves you with:

SELECT 
    ID, sentence, language
FROM
    `sentences`
WHERE
    language = 'en' OR language = 'de'
GROUP BY id
HAVING COUNT(language) = 2
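
Keep in mind that a grouped query returns one row per id, so it identifies which ids exist in both languages rather than listing both sentences. MySQL also rejects non-aggregated columns such as sentence next to GROUP BY id when ONLY_FULL_GROUP_BY is enabled. A minimal sketch that selects only the id, assuming sqlite3 as a stand-in for MySQL and at most one row per id and language:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (id INTEGER, sentence TEXT, language TEXT)")

# The three sample rows from the question.
conn.executemany(
    "INSERT INTO sentences VALUES (?, ?, ?)",
    [(1, "Hello", "en"), (1, "Hallo", "de"), (2, "Sorry", "en")],
)

# Select only the grouping column, so every selected column is well defined.
# COUNT(language) = 2 assumes each id has at most one row per language.
ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM sentences
        WHERE language = 'en' OR language = 'de'
        GROUP BY id
        HAVING COUNT(language) = 2
        """
    )
]

print(ids)  # [1]
```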