PostgreSQL full text search abbreviations
I set up PostgreSQL full text search using the 'german' configuration. How can I configure it so that a search for "Bezirk" also matches rows containing "Bez." (and vice versa)?
Try using a wildcard in your search.
For example:
tableName.column LIKE 'Bez%'
The % wildcard matches any sequence of zero or more characters after Bez.
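As a rough sketch of how that behaves in a complete query (the places/name relation below is a made-up inline data set, not something from the question):

```sql
-- Hypothetical sample data, inlined so the query runs without any table.
WITH places(name) AS ( VALUES
    ('Bez. Mitte'),
    ('Bezirk Mitte'),
    ('Bezahlung'),
    ('Stadt Mitte')
)
SELECT name
FROM places
WHERE name LIKE 'Bez%';  -- keeps every value that starts with 'Bez'
```

Note that this keeps 'Bezahlung' as well, so a prefix pattern is broader than an abbreviation match, and it only matches values that start with Bez rather than words anywhere in the text.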
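The description is rather vague, so it is hard to tell what you are trying to achieve, but it looks like you need simple pattern matching, since you are looking for abbreviations (so there is no need for stemming as in full text search). I would go with pg_trgm for this purpose: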
WITH t(word) AS ( VALUES
('Bez'),
('Bezi'),
('Bezir')
)
SELECT word, similarity(word, 'Bezirk') AS similarity
FROM t
WHERE word % 'Bezirk'
ORDER BY similarity DESC;
Result:
word | similarity
-------+------------
Bezir | 0.625
Bezi | 0.5
Bez | 0.375
(3 rows)
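To run the same kind of query against a real table, something along these lines might work. This is only a sketch: the table places, its columns and the index name are hypothetical, and it assumes the pg_trgm contrib module is available on the server.

```sql
-- pg_trgm has to be enabled once per database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical table, for illustration only.
CREATE TABLE places (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

-- A GIN index with the gin_trgm_ops operator class lets the % operator use the index.
CREATE INDEX places_name_trgm_idx ON places USING gin (name gin_trgm_ops);

-- Same shape as the query above, but reading from the table.
SELECT name, similarity(name, 'Bezirk') AS similarity
FROM places
WHERE name % 'Bezirk'
ORDER BY similarity DESC;
```

The % operator only keeps rows whose similarity is above the configured threshold (0.3 by default), which is why 'Bez' at 0.375 still shows up in the result above. similarity() compares the whole column value, so scores drop quickly when the column holds longer text than a single word.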
@pozs is right. You need to use a synonym dictionary.
1 - In the directory $SHAREDIR/tsearch_data, create a file named german.syn with the following content:
Bez Bezirk
2 - Run the following statements:
CREATE TEXT SEARCH DICTIONARY german_syn (
template = synonym,
synonyms = german);
CREATE TEXT SEARCH CONFIGURATION german_syn(COPY='simple');
ALTER TEXT SEARCH CONFIGURATION german_syn
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
word, hword, hword_part
WITH german_syn, german_stem;
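If the setup does not seem to take effect, the dictionaries can be inspected on their own. A small sketch using the built-in ts_lexize and ts_debug functions, assuming the dictionary and configuration were created exactly as above:

```sql
-- What the synonym dictionary returns for the abbreviation.
SELECT ts_lexize('german_syn', 'Bez');

-- What the stemmer returns for the long form. The synonym output is not
-- passed on to the stemmer, so these two results need to agree.
SELECT ts_lexize('german_stem', 'Bezirk');

-- Token-by-token view of how the whole configuration handles some input.
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('german_syn', 'Bez. Bezirk');
```

For words whose stem differs from the dictionary form, the right-hand side in german.syn would have to be the stemmed lexeme rather than the full word.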
Now you can test it. Run the queries:
test=# SELECT to_tsvector('german_syn', 'Bezirk') @@ to_tsquery('german_syn', 'Bezirk & Bez');
?column?
----------
t
(1 row)
test=# SELECT to_tsvector('german_syn', 'Bez Bez.') @@ to_tsquery('german_syn', 'Bezirk');
?column?
----------
t
(1 row)
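Applied to a table, the same configuration could be used roughly like this. The table docs, its columns and the index name are hypothetical, chosen only for illustration:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body text NOT NULL
);

INSERT INTO docs (body) VALUES
    ('Bez. Mitte'),
    ('Bezirk Mitte'),
    ('Stadt Mitte');

-- Expression index; a query has to use the same expression to benefit from it.
CREATE INDEX docs_body_fts_idx ON docs
    USING gin (to_tsvector('german_syn', body));

-- Searching for the abbreviation or for the long form should find the same rows.
SELECT id, body
FROM docs
WHERE to_tsvector('german_syn', body) @@ to_tsquery('german_syn', 'Bez');
```

Given the two tests above, this should return the 'Bez. Mitte' and 'Bezirk Mitte' rows. Any tsvector column or index that was built with the plain 'german' configuration would need to be rebuilt with 'german_syn' before the synonym takes effect there.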