How to do an accent-insensitive TrigramSimilarity search in Django?

How can I add accent-insensitive search to the following snippet from the Django docs:

>>> from django.contrib.postgres.search import TrigramSimilarity
>>> Author.objects.create(name='Katy Stevens')
>>> Author.objects.create(name='Stephen Keats')
>>> test = 'Katie Stephens'
>>> Author.objects.annotate(
...     similarity=TrigramSimilarity('name', test),
... ).filter(similarity__gt=0.3).order_by('-similarity')
[<Author: Katy Stevens>, <Author: Stephen Keats>]

How could this be made to match test = 'Kâtié Stéphèns'?

There is an unaccent lookup:

The unaccent lookup allows you to perform accent-insensitive lookups using a dedicated PostgreSQL extension.
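To make it concrete why accents hurt the match, here is a rough pure-Python sketch of what unaccenting followed by a trigram comparison does. It only approximates PostgreSQL's pg_trgm and unaccent extensions (the real ones depend on the database locale and dictionary), and the helper names strip_accents, trigrams and similarity are made up for illustration.

```python
import re
import unicodedata


def strip_accents(text):
    """Drop combining marks, a rough stand-in for PostgreSQL's unaccent()."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def trigrams(text):
    """Collect the trigrams of each lowercased word, padded roughly like pg_trgm."""
    found = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        found.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return found


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


stored = "Katy Stevens"
test = "Kâtié Stéphèns"

# Compare the raw accented term, then the unaccented one, against the stored name.
print(similarity(stored, test))
print(similarity(stored, strip_accents(test)))
```

In this sketch the accented term scores below the 0.3 threshold used in the docs example, while the unaccented term scores above it, which is the behaviour the question is after.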

In addition, the aggregation section of the Django docs says the following:

When specifying the field to be aggregated in an aggregate function, Django will allow you to use the same double underscore notation that is used when referring to related fields in filters. Django will then handle any table joins that are required to retrieve and aggregate the related value.


From the above it follows that:

You can use the trigram_similar lookup combined with unaccent, and then annotate the result:

Author.objects.filter(
    name__unaccent__trigram_similar=test
).annotate(
    similarity=TrigramSimilarity('name__unaccent', test),
).filter(similarity__gt=0.3).order_by('-similarity')

If you want to keep it as close as possible to the original sample (and avoid running one potentially slow filter followed by another):

Author.objects.annotate(
    similarity=TrigramSimilarity('name__unaccent', test),
).filter(similarity__gt=0.3).order_by('-similarity')

These only work with Django versions >= 1.10.


EDIT:

Although the above should work, @Private reported that it raises this error:

Cannot resolve keyword 'unaccent' into a field. Join on 'unaccented' not permitted.

This may be a bug, or unaccent may not be intended to work this way. The following code runs without errors:

Author.objects.filter(
    name__unaccent__trigram_similar=test
).annotate(
    similarity=TrigramSimilarity('name', test),
).filter(similarity__gt=0.3).order_by('-similarity')
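Note that the query above filters on the unaccented name but still computes the similarity score from the accented one. A possible alternative, offered only as an untested sketch, is to wrap both sides in the Unaccent transform from django.contrib.postgres.lookups instead of writing 'name__unaccent' as a field path. This assumes a recent Django version in which TrigramSimilarity accepts an expression as its second argument, and a database with the unaccent and pg_trgm extensions installed. The helper name accent_insensitive_matches is hypothetical.

```python
def accent_insensitive_matches(queryset, term, threshold=0.3):
    """Return rows of `queryset` whose unaccented `name` resembles `term`.

    Hypothetical helper; assumes the model has a text field called `name`.
    """
    # Imported lazily so the helper only needs Django's PostgreSQL support
    # at the moment it is called.
    from django.contrib.postgres.lookups import Unaccent
    from django.contrib.postgres.search import TrigramSimilarity
    from django.db.models import Value

    return (
        queryset.annotate(
            # Unaccent both the column and the search term before comparing.
            similarity=TrigramSimilarity(Unaccent("name"), Unaccent(Value(term))),
        )
        .filter(similarity__gt=threshold)
        .order_by("-similarity")
    )
```

It would be called as accent_insensitive_matches(Author.objects.all(), test). On a Django version where TrigramSimilarity only accepts a plain string, the accents would have to be stripped from the search term in Python before passing it in.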