PostgreSQL: tsquery doesn't work with part of a string

Question

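I'm using Postgres's tsquery function to search a field that may contain letters from several languages as well as digits. In every case, the search seems to work only up to some part of the searched phrase and then stops matching until you type the full phrase.

For example: searching for the name '15339' returns the correct row when the search term is '15339', but not when it is '153'.

Searching for Al-Alamya: if the term is 'al-', it works and returns the row, but adding letters after that, for example 'al-alam', returns nothing until I finish typing the full name ('Al-Alamya').

My query: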

SELECT *
FROM (SELECT DISTINCT ON ("consumer_api_spot"."id") "consumer_api_spot"."id",
                                                    "consumer_api_spot"."name"

      FROM "consumer_api_spot"
               INNER JOIN "consumer_api_account" ON ("consumer_api_spot"."account_id" = "consumer_api_account"."id")
               INNER JOIN "users_user" ON ("consumer_api_account"."id" = "users_user"."account_id")

      WHERE (
                    users_user.id = 53 AND consumer_api_spot.active
                    AND
                    "consumer_api_spot"."vectorized_name" @@ tsquery('153')
                )
      GROUP BY "consumer_api_spot"."id"
     ) AS "Q"
LIMIT 50 OFFSET 0

Answer 1

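If you check the documentation, you'll find more information about what can be specified as a tsquery. A tsquery supports grouping, combining terms with Boolean operators, and prefix matching, which is probably what you want. An example from the documentation: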

Also, lexemes in a tsquery can be labeled with * to specify prefix matching:
SELECT 'super:*'::tsquery;
This query will match any word in a tsvector that begins with “super”.

So in your query, you should change the tsquery('153') part to tsquery('153:*').
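As a quick sanity check of the prefix operator, here is a minimal sketch that compares an exact lexeme with a prefix lexeme against a literal value. It uses the built-in 'simple' configuration purely for illustration (an assumption, not something taken from the question's schema), so no table is needed to run it:

```sql
-- Compare exact matching with prefix matching on a literal tsvector.
-- 'simple' is a built-in configuration that does no stemming.
SELECT to_tsvector('simple', '15339') @@ to_tsquery('simple', '153')   AS exact_match,
       to_tsvector('simple', '15339') @@ to_tsquery('simple', '153:*') AS prefix_match;
-- exact_match = f, prefix_match = t
```

Without `:*`, a lexeme in the tsquery has to equal a whole lexeme in the tsvector, which is why '153' alone does not find '15339'.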

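By the way, I don't know how you built your database schema, but you can index the tsvector with a GIN index. I assume you generate the "consumer_api_spot"."vectorized_name" column from the "consumer_api_spot"."name" column. If that's the case, you can create a tsvector index on that column like this: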

CREATE INDEX gin_name ON consumer_api_spot USING gin (to_tsvector('english', name));

Then you can change this condition:

"consumer_api_spot"."vectorized_name" @@ tsquery('153')

into this:

to_tsvector('english', "consumer_api_spot"."name") @@ to_tsquery('english', '153:*')

and get a potential speed benefit, because the query can then use the index.

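A note on 'english': you can't omit the configuration when creating the index, because only the two-argument form of to_tsvector is immutable and can therefore be used in an index expression. The choice generally doesn't get in the way of numeric queries or queries on text in other languages, although the 'english' configuration does apply English stemming and stop-word removal. Be careful, though: the to_tsvector call in the query must use the same configuration as the index definition, otherwise PostgreSQL cannot use the index.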
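To make the matching requirement concrete, here is a sketch of two queries against the expression index created with CREATE INDEX above. It assumes the consumer_api_spot table has id and name columns as in the question; prefixing each statement with EXPLAIN is one way to check which plan is actually chosen on your data:

```sql
-- Same expression as the index definition: to_tsvector('english', name).
-- The planner can consider the gin_name index for this condition.
SELECT id, name
FROM consumer_api_spot
WHERE to_tsvector('english', name) @@ to_tsquery('english', '153:*');

-- Different configuration in to_tsvector: the expression no longer matches
-- the index definition, so the gin_name index cannot be used here.
SELECT id, name
FROM consumer_api_spot
WHERE to_tsvector('simple', name) @@ to_tsquery('simple', '153:*');
```

Whether the index is actually used also depends on table size and statistics, so a small test table may still show a sequential scan for the first query.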

sql

postgresql

text-search

tsvector