How will MySQL use the FT index when searching with the asterisk (wildcard) operator?

Question

Basically, I have a huge table (about 30 million rows) with a fulltext index on one of its columns.

The search query looks like this:

... WHERE MATCH(body) AGAINST('+Hello +my*' IN BOOLEAN MODE) ...

My storage engine is InnoDB, so some restrictions apply:

The minimum word length is 3 characters.

However, the documentation says this:

If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short (as determined from the ft_min_word_len setting) or a stopword. This occurs because the word is not seen as too short or a stopword, but as a prefix that must be present in the document in the form of a word that begins with the prefix. Suppose that ft_min_word_len=4.

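The question is: how will MySQL use the FT index in this case? The word my should not be present in any index, because it does not meet the minimum word length requirement. Will a query like this perhaps be a bit slower?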

Answer 1

The query may be slightly slower, but not for the reason you suggest.

The minimum word length setting is also applied when the index is built, so InnoDB does not index words shorter than the minimum.
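One way to see this for yourself is to look at the tokens InnoDB actually stored. The following is only a sketch: the schema `test` and the table `ft_demo` are hypothetical scratch names, the default stopword list is assumed, and the account needs enough privileges to set a global variable and read the INFORMATION_SCHEMA full-text tables. Note that InnoDB reads its limit from innodb_ft_min_token_size, while the ft_min_word_len mentioned in the quoted documentation is the MyISAM counterpart.

```sql
-- Minimum token length for InnoDB full-text indexes (default 3).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Hypothetical scratch table; the names are illustrative only.
CREATE TABLE test.ft_demo (
  id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT KEY ft_body (body)
) ENGINE=InnoDB;

INSERT INTO test.ft_demo (body) VALUES
  ('Hello my friend'),
  ('Hello myself'),
  ('Hello mysql users');

-- Point the INFORMATION_SCHEMA full-text tables at the scratch table
-- (the value has the form 'schema/table').
SET GLOBAL innodb_ft_aux_table = 'test/ft_demo';

-- Tokens from recently inserted rows are visible in the index cache.
SELECT word, doc_id, position
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
WHERE word LIKE 'my%';
```

With the default minimum of 3, the token my should be missing from the result, while longer tokens such as myself and mysql are listed.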

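When a full-text search runs, InnoDB again checks the length of each search term against the minimum word length and drops any term shorter than the limit, because such terms cannot be found in the index. So if your search contained 'my' (note: no asterisk), InnoDB would ignore it.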

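However, when you use the wildcard operator on a term shorter than the limit (such as your my*, which has only two characters), the term is still included in the search, because InnoDB treats it as a pattern rather than as a whole word. The prefix can then match indexed words that are long enough, such as myself, even though my itself was never indexed.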
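To observe the difference on your own data, you could compare the two forms of the query side by side. This is a rough sketch; `posts` is a hypothetical stand-in for the table in the question, and the counts you get will depend on your data, stopword list, and server version.

```sql
-- `posts` is a hypothetical name for the table from the question.

-- Plain term: "my" is shorter than the default InnoDB minimum of 3 characters.
SELECT COUNT(*)
FROM posts
WHERE MATCH(body) AGAINST('+Hello +my' IN BOOLEAN MODE);

-- Prefix term: "my*" is kept as a prefix despite its length.
SELECT COUNT(*)
FROM posts
WHERE MATCH(body) AGAINST('+Hello +my*' IN BOOLEAN MODE);
```

If the two counts differ, that difference comes from how the short term is handled, since the rest of the query is identical.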

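Obviously, checking only for exact matches is faster than checking for both exact matches and word prefixes, but the difference in speed will not be significant.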
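If you want to confirm that the prefix query still goes through the full-text index rather than scanning the table, EXPLAIN is a quick check. As a sketch, with `posts` and `id` as hypothetical table and column names:

```sql
-- `posts` and `id` are hypothetical names; substitute your own.
EXPLAIN
SELECT id
FROM posts
WHERE MATCH(body) AGAINST('+Hello +my*' IN BOOLEAN MODE);
```

An access type of fulltext in the `type` column, with your FULLTEXT index named in the `key` column, indicates that the index is being used. Timing both variants of the query on your own table is the only reliable way to measure the actual cost of the prefix search.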


mysql

innodb

full-text-indexing