Sphinx query takes too much time

Question

I built an index on a table that has about 90,000,000 rows. Full-text search must run on a varchar field named email. I also set parent_id as an attribute.

When I run queries that search for emails matching a word with a small number of hits, they return instantly:

mysql> SELECT count(*) FROM users WHERE MATCH('diedsmiling');
+----------+
| count(*) |
+----------+
|       26 |
+----------+
1 row in set (0.00 sec)

mysql> show meta;
+---------------+-------------+
| Variable_name | Value       |
+---------------+-------------+
| total         | 1           |
| total_found   | 1           |
| time          | 0.000       |
| keyword[0]    | diedsmiling |
| docs[0]       | 26          |
| hits[0]       | 26          |
+---------------+-------------+
6 rows in set (0.00 sec)

Things get complicated when I search for emails matching a word with a large number of hits:

mysql> SELECT count(*) FROM users WHERE MATCH('mail');
+----------+
| count(*) |
+----------+
| 33237994 |
+----------+
1 row in set (9.21 sec)

mysql> show meta;
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| total         | 1        |
| total_found   | 1        |
| time          | 9.210    |
| keyword[0]    | mail     |
| docs[0]       | 33237994 |
| hits[0]       | 33253762 |
+---------------+----------+
6 rows in set (0.00 sec)

Filtering on the parent_id attribute does not give any benefit:

mysql> SELECT count(*) FROM users WHERE MATCH('mail') AND parent_id = 62003;
+----------+
| count(*) |
+----------+
|    21404 |
+----------+
1 row in set (8.66 sec)

mysql> show meta;
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| total         | 1        |
| total_found   | 1        |
| time          | 8.666    |
| keyword[0]    | mail     |
| docs[0]       | 33237994 |
| hits[0]       | 33253762 |
+---------------+----------+

Here is my sphinx configuration:

source src1
{
    type            = mysql
    sql_host        = HOST
    sql_user        = USER
    sql_pass        = PASS
    sql_db          = DATABASE
    sql_port        = 3306  # optional, default is 3306

    sql_query       = \
             SELECT id, parent_id, email \
                FROM users

    sql_attr_uint   = parent_id     

}    

index test1
{       
    source          = src1
    path            = /var/lib/sphinx/test1

}

The query I need to run looks like this:

SELECT * FROM users WHERE MATCH('mail') AND parent_id = 62003;

I need to get all emails that match a certain word and have a certain parent_id.

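My questions are: Is there a way to optimize the situation described above? Is there perhaps a more suitable matching mode for this type of query? If I migrate to a server with SSD disks, will the performance gain be significant?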

Answer 1

To get just the count, you can run:

 SELECT id FROM index WHERE MATCH(...) LIMIT 0 OPTION ranker=none; SHOW META;

and read the count from total_found.

This will be more efficient than count(*), which invokes GROUP BY.
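
A minimal sketch of reading total_found from client code, assuming a Python DB-API cursor connected to searchd's SphinxQL listener with the %s paramstyle (pymysql and MySQLdb work this way; the question does not say which client is used). The helper name count_matches is hypothetical. Both statements have to run on the same connection, because SHOW META reports on the previous query in that session.

```python
import re


def count_matches(cursor, index, query):
    """Return total_found for a full-text query without fetching any rows.

    `cursor` is assumed to be a DB-API cursor using the %s paramstyle.
    """
    # Index names cannot be bound as parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", index):
        raise ValueError("unexpected index name: %r" % (index,))

    # LIMIT 0 returns no rows; ranker=none skips relevance ranking.
    cursor.execute(
        "SELECT id FROM " + index + " WHERE MATCH(%s) LIMIT 0 OPTION ranker=none",
        (query,),
    )
    cursor.fetchall()  # drain the (empty) result set

    # SHOW META yields (Variable_name, Value) rows for the previous query.
    cursor.execute("SHOW META")
    meta = dict(cursor.fetchall())
    return int(meta["total_found"])
```

With pymysql, this might be called as count_matches(conn.cursor(), 'users', 'mail') on a connection opened against the SphinxQL port; host and port depend on your searchd configuration.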

If there is only one word, you can even use call keywords('word','index',1);.
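
The single-word shortcut can be sketched the same way, again assuming a DB-API cursor with the %s paramstyle. The helper name keyword_docs is hypothetical, and the sketch assumes the statistics column is named docs, as in the SHOW META output above; columns are looked up by name rather than by position in case the layout differs between versions.

```python
def keyword_docs(cursor, index, word):
    """Return the document count that CALL KEYWORDS reports for `word`."""
    # The third argument (1) asks searchd to include docs/hits statistics.
    cursor.execute("CALL KEYWORDS(%s, %s, 1)", (word, index))

    # cursor.description holds one entry per column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    if not rows:
        return 0

    # One row is returned per tokenized keyword; a single word gives one row.
    return int(rows[0][columns.index("docs")])
```

Note that these are index-wide keyword statistics, so this shortcut cannot account for an attribute filter such as parent_id = 62003.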
