SQLite "LIKE" operator is very slow compared to the "=" operator

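When I use the LIKE operator in SQLite, it is very slow compared to using =. With the = operator the query takes about 14 ms, but with LIKE it takes about 440 ms. I am testing this in DB Browser for SQLite. This is the query that runs fast: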

SELECT re.ENTRY_ID, 
       GROUP_CONCAT(re.READING_ELEMENT, '§') AS read_element,
       GROUP_CONCAT(re.FURIGANA_BOTTOM, '§') AS furigana_bottom,
       GROUP_CONCAT(re.FURIGANA_TOP, '§') AS furigana_top,
       GROUP_CONCAT(re.NO_KANJI, '§') AS no_kanji,
       GROUP_CONCAT(re.READING_COMMONNESS, '§') AS read_commonness, 
       GROUP_CONCAT(re.READING_RELATION, '§') AS read_rel,
       GROUP_CONCAT(se.SENSE_ID, '§') AS sense_id, 
       GROUP_CONCAT(se.GLOSS, '§') AS gloss, 
       GROUP_CONCAT(se.POS, '§') AS pos, 
       GROUP_CONCAT(se.FIELD, '§') AS field,
       GROUP_CONCAT(se.DIALECT, '§') AS dialect, 
       GROUP_CONCAT(se.INFORMATION, '§') AS info 
FROM Jmdict_Reading_Element AS re
LEFT JOIN Jmdict_Sense_Element AS se ON re.ENTRY_ID = se.ENTRY_ID
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT = 'example') OR 
      re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS = 'example')
GROUP BY re.ENTRY_ID

The query becomes slow when I change

WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT = 'example') OR 
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS = 'example')

to

WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT LIKE 'example') OR 
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS LIKE 'example')

I need LIKE so that I can use wildcards, for example:

WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT LIKE 'example%') OR 
re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS LIKE 'example%')

Here is a link to the database itself: https://www.mediafire.com/file/hyuymc84022gzq7/dictionary.db/file

Thanks

I wonder whether using HAVING would speed up your query:

SELECT re.ENTRY_ID, 
       GROUP_CONCAT(re.READING_ELEMENT, '§') AS read_element,
       GROUP_CONCAT(re.FURIGANA_BOTTOM, '§') AS furigana_bottom,
       GROUP_CONCAT(re.FURIGANA_TOP, '§') AS furigana_top,
       GROUP_CONCAT(re.NO_KANJI, '§') AS no_kanji,
       GROUP_CONCAT(re.READING_COMMONNESS, '§') AS read_commonness, 
       GROUP_CONCAT(re.READING_RELATION, '§') AS read_rel,
       GROUP_CONCAT(se.SENSE_ID, '§') AS sense_id, 
       GROUP_CONCAT(se.GLOSS, '§') AS gloss, 
       GROUP_CONCAT(se.POS, '§') AS pos, 
       GROUP_CONCAT(se.FIELD, '§') AS field,
       GROUP_CONCAT(se.DIALECT, '§') AS dialect, 
       GROUP_CONCAT(se.INFORMATION, '§') AS info 
FROM Jmdict_Reading_Element re LEFT JOIN 
     Jmdict_Sense_Element se
     ON re.ENTRY_ID = se.ENTRY_ID
GROUP BY re.ENTRY_ID
HAVING SUM(CASE WHEN re.READING_ELEMENT = 'example' THEN 1 ELSE 0 END) > 0 OR
       SUM(CASE WHEN se.GLOSS = 'example' THEN 1 ELSE 0 END) > 0;
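To check that the HAVING form selects the same entries as the original IN form, here is a minimal sketch using Python's built-in sqlite3 module on an in-memory database. The table and column names come from the question; the four rows per table are invented for illustration, and the select list is trimmed to two columns. It only checks equivalence on toy data; HAVING filters after every entry has been grouped, so whether it is actually faster on the real database would need to be measured.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Toy data (invented): only the columns needed for the filter are created.
con.executescript("""
CREATE TABLE Jmdict_Reading_Element (ENTRY_ID INTEGER, READING_ELEMENT TEXT);
CREATE TABLE Jmdict_Sense_Element (ENTRY_ID INTEGER, GLOSS TEXT);
INSERT INTO Jmdict_Reading_Element VALUES (1,'example'),(1,'rei'),(2,'mihon'),(3,'other');
INSERT INTO Jmdict_Sense_Element VALUES (1,'instance'),(2,'example'),(2,'sample'),(3,'unrelated');
""")

# Shared SELECT/FROM part, trimmed from the question's query.
base = """
SELECT re.ENTRY_ID, GROUP_CONCAT(se.GLOSS, '§') AS gloss
FROM Jmdict_Reading_Element AS re
LEFT JOIN Jmdict_Sense_Element AS se ON re.ENTRY_ID = se.ENTRY_ID
"""

# Original form: filter rows with IN subqueries, then group.
in_query = base + """
WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Reading_Element WHERE READING_ELEMENT = 'example')
   OR re.ENTRY_ID IN (SELECT ENTRY_ID FROM Jmdict_Sense_Element WHERE GLOSS = 'example')
GROUP BY re.ENTRY_ID
ORDER BY re.ENTRY_ID
"""

# HAVING form: group everything, then keep groups containing a match.
having_query = base + """
GROUP BY re.ENTRY_ID
HAVING SUM(CASE WHEN re.READING_ELEMENT = 'example' THEN 1 ELSE 0 END) > 0
    OR SUM(CASE WHEN se.GLOSS = 'example' THEN 1 ELSE 0 END) > 0
ORDER BY re.ENTRY_ID
"""

in_ids = [row[0] for row in con.execute(in_query)]
having_ids = [row[0] for row in con.execute(having_query)]
print(in_ids, having_ids)
```

Timing both forms against the real dictionary.db would show whether the extra grouping work outweighs the saved subqueries.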

FTS?

Using DB Browser for SQLite on Windows 10:

  • The "fast" query returns 25 rows in 84 ms
  • The "slow" query (using LIKE "example%") returns 33 rows in 1025 ms

I created FTS4 tables like this:

create virtual table jre_fts using FTS4(entry_id,reading_element);
insert into jre_fts select entry_id, reading_element from Jmdict_Reading_Element;
create virtual table jse_fts using FTS4(entry_id,gloss);
insert into jse_fts select entry_id, gloss from Jmdict_Sense_Element;

This took 7390 ms, and the database grew from 70,296 KB to 110,708 KB.

Then I modified the WHERE clause like this:

 WHERE re.ENTRY_ID IN (SELECT ENTRY_ID FROM jre_fts WHERE READING_ELEMENT MATCH '^example') OR 
re.ENTRY_ID IN (SELECT ENTRY_ID FROM jse_fts WHERE GLOSS MATCH '^example')

The query returned 33 rows in 60 ms.

I was not able to test or analyze how FTS behaves on the reading_element column, but the approach may show promise.
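One thing worth checking before relying on the FTS4 query: in FTS4, a leading ^ means the term must be the first token of the column, which is not the same as the prefix match that LIKE 'example%' performs. A trailing * is the FTS prefix operator. The sketch below, using Python's sqlite3 module, compares the variants on a toy jse_fts table. The four glosses are invented, and it assumes the SQLite build bundled with Python has FTS4 enabled, which is common but not guaranteed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Toy stand-in for jse_fts from the answer; the glosses are invented.
con.execute("CREATE VIRTUAL TABLE jse_fts USING fts4(entry_id, gloss)")
con.executemany(
    "INSERT INTO jse_fts (entry_id, gloss) VALUES (?, ?)",
    [(1, "example"), (2, "example sentence"), (3, "examples"), (4, "an example")],
)

def ids(where_clause):
    """Return the entry_id values of rows matching the given WHERE clause."""
    sql = "SELECT entry_id FROM jse_fts WHERE " + where_clause + " ORDER BY docid"
    # int() normalises the value in case the FTS table hands it back as text.
    return [int(row[0]) for row in con.execute(sql)]

first_token = ids("gloss MATCH '^example'")          # first token is exactly "example"
any_prefix = ids("gloss MATCH 'example*'")           # any token starting with "example"
first_token_prefix = ids("gloss MATCH '^example*'")  # first token starts with "example"
like_prefix = ids("gloss LIKE 'example%'")           # whole string starts with "example"

print(first_token, any_prefix, first_token_prefix, like_prefix)
```

On this toy data, '^example*' is the variant that lines up with LIKE 'example%'; plain '^example' skips glosses such as "examples". The matching row counts reported above may therefore be a property of the data rather than of the query.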

Try putting a full-text index on the columns you are using.

Full Text Indexing

Create the catalog

USE {yourDB}  
GO  
CREATE FULLTEXT CATALOG {catalogName}
WITH ACCENT_SENSITIVITY = OFF

Create the index

USE {yourDB}  
GO  
CREATE FULLTEXT INDEX ON {someTable} ({col1}, {col2})
KEY INDEX {uniqueIndexName}
ON {catalogName}

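Note: this is more convenient, but check whether your collation is case-insensitive, i.e. whether 'a' = 'A'. For example, a collation name will usually contain something like ci_utf8 (ci = case-insensitive). I do this for the convenience of users and programmers.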
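The statements above use SQL Server syntax (USE, GO, CREATE FULLTEXT CATALOG), so they will not run in SQLite as written. The collation remark does carry over, though. SQLite's LIKE is case-insensitive for ASCII by default, and its LIKE optimization can only turn a prefix pattern such as 'example%' into an index range search when the index uses the NOCASE collation (or when PRAGMA case_sensitive_like is turned on). The sketch below, using Python's sqlite3 module, compares query plans before and after adding such an index. The table name reading and both index names are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical miniature of Jmdict_Reading_Element; rows are invented.
con.execute("CREATE TABLE reading (entry_id INTEGER, reading_element TEXT)")
con.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [(1, "example"), (2, "Example text"), (3, "sample")],
)
# An ordinary index (default BINARY collation), the kind that makes '=' fast.
con.execute("CREATE INDEX idx_reading_binary ON reading(reading_element)")

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as a single string."""
    return " | ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

like_sql = "SELECT entry_id FROM reading WHERE reading_element LIKE 'example%'"

# With only the BINARY index, the case-insensitive LIKE cannot use it.
plan_before = plan(like_sql)

# A NOCASE index matches LIKE's default case-insensitive comparison.
con.execute(
    "CREATE INDEX idx_reading_nocase ON reading(reading_element COLLATE NOCASE)"
)
plan_after = plan(like_sql)

matches = [row[0] for row in con.execute(like_sql + " ORDER BY entry_id")]

print("before:", plan_before)
print("after: ", plan_after)
print("rows:  ", matches)
```

Unlike the FTS4 tables, this keeps LIKE semantics unchanged, but it only helps patterns without a leading wildcard; a pattern like '%example' would still need a full scan.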