Postgres Find and Return Keywords From List Within Select

Question

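I have a simple postgres table that contains a comments (text) column. Within a view, I need to search that comments field for a list of words and then return a comma-separated list of the words found as a column (along with a bunch of normal columns).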

The defined keyword list contains about 20 words, e.g. apple, banana, pear, peach, plum.

Ideally, the result would look like this:

id | comments                    | keywords
-----------------------------------------------------
1  | I like bananas!             | bananas
2  | I like apples.              | apples
3  | I don't like fruit          | 
4  | I like apples and bananas!  | apples,bananas

I think I need a subquery with array_agg? Or possibly 'where in'. But I can't figure out how to put them together.

Many thanks, Steve

Answer 1

You can use the full-text search facilities to get this result:

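Set up a new ispell dictionary containing your word list.
Create a full-text search configuration based on that dictionary. Don't forget to remove all other dictionaries from the configuration, because in your case every other word is effectively a stop word.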
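
The two steps above might look roughly like the following sketch. The names fruit_ispell and fruit_cfg, and the fruit.dict / fruit.affix files (which would have to exist in the server's $SHAREDIR/tsearch_data directory), are assumptions made for illustration and are not part of the original answer.

```sql
-- Hypothetical dictionary built from fruit.dict and fruit.affix
-- (both files are assumed to exist in $SHAREDIR/tsearch_data).
CREATE TEXT SEARCH DICTIONARY fruit_ispell (
    TEMPLATE = ispell,
    DictFile = fruit,
    AffFile  = fruit
);

-- Start from an empty configuration so that no other dictionary is mapped.
CREATE TEXT SEARCH CONFIGURATION fruit_cfg (PARSER = pg_catalog."default");

-- Route plain ASCII words to the keyword dictionary only.
-- Tokens that no mapped dictionary recognizes are discarded.
ALTER TEXT SEARCH CONFIGURATION fruit_cfg
    ADD MAPPING FOR asciiword WITH fruit_ispell;
```

Creating the configuration from the bare parser, rather than copying english, is one way to guarantee that no other dictionaries remain mapped.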

After that, when you run

select plainto_tsquery('<your config name>', 'I like apples and bananas!')

you will get only your keywords: 'apples' & 'bananas', or even 'apple' & 'banana' if you set up the dictionary properly.

By default, the english configuration uses a Snowball dictionary, which stems words by stripping their endings, so if you run

select plainto_tsquery('english', 'I like apples and bananas!')

you will get

'like' & 'appl' & 'banana'

which is not quite what you need.
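
To produce the comma-separated column the question asks for, one possible sketch is to convert each comment to a tsvector with the custom configuration and join its lexemes. This assumes PostgreSQL 9.6 or later (for tsvector_to_array), a hypothetical configuration named fruit_cfg built as in the steps above, and a hypothetical table fruit_comments(id, comments).

```sql
-- Hypothetical view: one row per comment plus the keywords found in it.
CREATE VIEW fruit_comments_with_keywords AS
SELECT id,
       comments,
       -- to_tsvector keeps only lexemes recognized by the configuration;
       -- tsvector_to_array turns them into text[], which is then joined.
       array_to_string(
           tsvector_to_array(to_tsvector('fruit_cfg', comments)),
           ','
       ) AS keywords
  FROM fruit_comments;
```

A tsvector stores its lexemes sorted and without duplicates, so each keyword appears once per row, in sorted order rather than in the order it occurs in the comment.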

Answer 2

Another, simpler (but slower) approach:

Create a dictionary table:

create table keywords (nm text);

insert into keywords (nm)
values ('apples'), ('bananas');

Run the following script against your text to extract the keywords:

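-- Split on whitespace, strip non-letter characters, keep words found in keywords.
-- Note the doubled backslash: in an E'' string, E'\s+' would collapse to 's+'.
select string_agg(regexp_replace(foo, '[^a-zA-Z\-]*', '', 'ig'), ',') s
  from regexp_split_to_table('I like apples and bananas!', E'\\s+') foo
 where regexp_replace(foo, '[^a-zA-Z\-]*', '', 'ig') in (select nm from keywords)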

This solution is semantically weaker: words are matched literally, so "banana" and "bananas" are treated as different keywords.
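
Applied per row inside a view, the same idea could be sketched as follows. The table name fruit_comments and the view name are hypothetical. The lower() call is an addition that makes matching case-insensitive, provided the keywords table holds lowercase entries, and coalesce turns the no-match case into an empty string instead of NULL.

```sql
-- Hypothetical view over a table fruit_comments(id, comments).
CREATE VIEW fruit_comments_keywords AS
SELECT c.id,
       c.comments,
       coalesce(
           (SELECT string_agg(k.nm, ',' ORDER BY k.nm)
              FROM keywords k
             WHERE k.nm IN (
                   -- split the comment on whitespace, strip punctuation, lowercase
                   SELECT lower(regexp_replace(w, '[^a-zA-Z-]+', '', 'g'))
                     FROM regexp_split_to_table(c.comments, E'\\s+') AS w)),
           '') AS keywords
  FROM fruit_comments c;
```

Aggregating over the keywords table rather than over the split words means each keyword is listed at most once per comment, as long as nm is unique in keywords.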
